NASA Astrophysics Data System (ADS)
Bamiah, Mervat Adib; Brohi, Sarfraz Nawaz; Chuprat, Suriayati
2012-01-01
Virtualization is one of the hottest research topics nowadays. Several academic researchers and developers from IT industry are designing approaches for solving security and manageability issues of Virtual Machines (VMs) residing on virtualized cloud infrastructures. Moving the application from a physical to a virtual platform increases the efficiency, flexibility and reduces management cost as well as effort. Cloud computing is adopting the paradigm of virtualization, using this technique, memory, CPU and computational power is provided to clients' VMs by utilizing the underlying physical hardware. Beside these advantages there are few challenges faced by adopting virtualization such as management of VMs and network traffic, unexpected additional cost and resource allocation. Virtual Machine Monitor (VMM) or hypervisor is the tool used by cloud providers to manage the VMs on cloud. There are several heterogeneous hypervisors provided by various vendors that include VMware, Hyper-V, Xen and Kernel Virtual Machine (KVM). Considering the challenge of VM management, this paper describes several techniques to monitor and manage virtualized cloud infrastructures.
USDA-ARS?s Scientific Manuscript database
Infrastructure-as-a-service (IaaS) clouds provide a new medium for deployment of environmental modeling applications. Harnessing advancements in virtualization, IaaS clouds can provide dynamic scalable infrastructure to better support scientific modeling computational demands. Providing scientific m...
Cloud Infrastructure & Applications - CloudIA
NASA Astrophysics Data System (ADS)
Sulistio, Anthony; Reich, Christoph; Doelitzscher, Frank
The idea behind Cloud Computing is to deliver Infrastructure-as-a-Services and Software-as-a-Service over the Internet on an easy pay-per-use business model. To harness the potentials of Cloud Computing for e-Learning and research purposes, and to small- and medium-sized enterprises, the Hochschule Furtwangen University establishes a new project, called Cloud Infrastructure & Applications (CloudIA). The CloudIA project is a market-oriented cloud infrastructure that leverages different virtualization technologies, by supporting Service-Level Agreements for various service offerings. This paper describes the CloudIA project in details and mentions our early experiences in building a private cloud using an existing infrastructure.
Scaling the CERN OpenStack cloud
NASA Astrophysics Data System (ADS)
Bell, T.; Bompastor, B.; Bukowiec, S.; Castro Leon, J.; Denis, M. K.; van Eldik, J.; Fermin Lobo, M.; Fernandez Alvarez, L.; Fernandez Rodriguez, D.; Marino, A.; Moreira, B.; Noel, B.; Oulevey, T.; Takase, W.; Wiebalck, A.; Zilli, S.
2015-12-01
CERN has been running a production OpenStack cloud since July 2013 to support physics computing and infrastructure services for the site. In the past year, CERN Cloud Infrastructure has seen a constant increase in nodes, virtual machines, users and projects. This paper will present what has been done in order to make the CERN cloud infrastructure scale out.
Grids, virtualization, and clouds at Fermilab
Timm, S.; Chadwick, K.; Garzoglio, G.; ...
2014-06-11
Fermilab supports a scientific program that includes experiments and scientists located across the globe. To better serve this community, in 2004, the (then) Computing Division undertook the strategy of placing all of the High Throughput Computing (HTC) resources in a Campus Grid known as FermiGrid, supported by common shared services. In 2007, the FermiGrid Services group deployed a service infrastructure that utilized Xen virtualization, LVS network routing and MySQL circular replication to deliver highly available services that offered significant performance, reliability and serviceability improvements. This deployment was further enhanced through the deployment of a distributed redundant network core architecture andmore » the physical distribution of the systems that host the virtual machines across multiple buildings on the Fermilab Campus. In 2010, building on the experience pioneered by FermiGrid in delivering production services in a virtual infrastructure, the Computing Sector commissioned the FermiCloud, General Physics Computing Facility and Virtual Services projects to serve as platforms for support of scientific computing (FermiCloud 6 GPCF) and core computing (Virtual Services). Lastly, this work will present the evolution of the Fermilab Campus Grid, Virtualization and Cloud Computing infrastructure together with plans for the future.« less
Grids, virtualization, and clouds at Fermilab
NASA Astrophysics Data System (ADS)
Timm, S.; Chadwick, K.; Garzoglio, G.; Noh, S.
2014-06-01
Fermilab supports a scientific program that includes experiments and scientists located across the globe. To better serve this community, in 2004, the (then) Computing Division undertook the strategy of placing all of the High Throughput Computing (HTC) resources in a Campus Grid known as FermiGrid, supported by common shared services. In 2007, the FermiGrid Services group deployed a service infrastructure that utilized Xen virtualization, LVS network routing and MySQL circular replication to deliver highly available services that offered significant performance, reliability and serviceability improvements. This deployment was further enhanced through the deployment of a distributed redundant network core architecture and the physical distribution of the systems that host the virtual machines across multiple buildings on the Fermilab Campus. In 2010, building on the experience pioneered by FermiGrid in delivering production services in a virtual infrastructure, the Computing Sector commissioned the FermiCloud, General Physics Computing Facility and Virtual Services projects to serve as platforms for support of scientific computing (FermiCloud 6 GPCF) and core computing (Virtual Services). This work will present the evolution of the Fermilab Campus Grid, Virtualization and Cloud Computing infrastructure together with plans for the future.
AstroCloud, a Cyber-Infrastructure for Astronomy Research: Cloud Computing Environments
NASA Astrophysics Data System (ADS)
Li, C.; Wang, J.; Cui, C.; He, B.; Fan, D.; Yang, Y.; Chen, J.; Zhang, H.; Yu, C.; Xiao, J.; Wang, C.; Cao, Z.; Fan, Y.; Hong, Z.; Li, S.; Mi, L.; Wan, W.; Wang, J.; Yin, S.
2015-09-01
AstroCloud is a cyber-Infrastructure for Astronomy Research initiated by Chinese Virtual Observatory (China-VO) under funding support from NDRC (National Development and Reform commission) and CAS (Chinese Academy of Sciences). Based on CloudStack, an open source software, we set up the cloud computing environment for AstroCloud Project. It consists of five distributed nodes across the mainland of China. Users can use and analysis data in this cloud computing environment. Based on GlusterFS, we built a scalable cloud storage system. Each user has a private space, which can be shared among different virtual machines and desktop systems. With this environments, astronomer can access to astronomical data collected by different telescopes and data centers easily, and data producers can archive their datasets safely.
AstroCloud, a Cyber-Infrastructure for Astronomy Research: Overview
NASA Astrophysics Data System (ADS)
Cui, C.; Yu, C.; Xiao, J.; He, B.; Li, C.; Fan, D.; Wang, C.; Hong, Z.; Li, S.; Mi, L.; Wan, W.; Cao, Z.; Wang, J.; Yin, S.; Fan, Y.; Wang, J.
2015-09-01
AstroCloud is a cyber-Infrastructure for Astronomy Research initiated by Chinese Virtual Observatory (China-VO) under funding support from NDRC (National Development and Reform commission) and CAS (Chinese Academy of Sciences). Tasks such as proposal submission, proposal peer-review, data archiving, data quality control, data release and open access, Cloud based data processing and analyzing, will be all supported on the platform. It will act as a full lifecycle management system for astronomical data and telescopes. Achievements from international Virtual Observatories and Cloud Computing are adopted heavily. In this paper, backgrounds of the project, key features of the system, and latest progresses are introduced.
An adaptive process-based cloud infrastructure for space situational awareness applications
NASA Astrophysics Data System (ADS)
Liu, Bingwei; Chen, Yu; Shen, Dan; Chen, Genshe; Pham, Khanh; Blasch, Erik; Rubin, Bruce
2014-06-01
Space situational awareness (SSA) and defense space control capabilities are top priorities for groups that own or operate man-made spacecraft. Also, with the growing amount of space debris, there is an increase in demand for contextual understanding that necessitates the capability of collecting and processing a vast amount sensor data. Cloud computing, which features scalable and flexible storage and computing services, has been recognized as an ideal candidate that can meet the large data contextual challenges as needed by SSA. Cloud computing consists of physical service providers and middleware virtual machines together with infrastructure, platform, and software as service (IaaS, PaaS, SaaS) models. However, the typical Virtual Machine (VM) abstraction is on a per operating systems basis, which is at too low-level and limits the flexibility of a mission application architecture. In responding to this technical challenge, a novel adaptive process based cloud infrastructure for SSA applications is proposed in this paper. In addition, the details for the design rationale and a prototype is further examined. The SSA Cloud (SSAC) conceptual capability will potentially support space situation monitoring and tracking, object identification, and threat assessment. Lastly, the benefits of a more granular and flexible cloud computing resources allocation are illustrated for data processing and implementation considerations within a representative SSA system environment. We show that the container-based virtualization performs better than hypervisor-based virtualization technology in an SSA scenario.
Dynamic Extension of a Virtualized Cluster by using Cloud Resources
NASA Astrophysics Data System (ADS)
Oberst, Oliver; Hauth, Thomas; Kernert, David; Riedel, Stephan; Quast, Günter
2012-12-01
The specific requirements concerning the software environment within the HEP community constrain the choice of resource providers for the outsourcing of computing infrastructure. The use of virtualization in HPC clusters and in the context of cloud resources is therefore a subject of recent developments in scientific computing. The dynamic virtualization of worker nodes in common batch systems provided by ViBatch serves each user with a dynamically virtualized subset of worker nodes on a local cluster. Now it can be transparently extended by the use of common open source cloud interfaces like OpenNebula or Eucalyptus, launching a subset of the virtual worker nodes within the cloud. This paper demonstrates how a dynamically virtualized computing cluster is combined with cloud resources by attaching remotely started virtual worker nodes to the local batch system.
Managing a tier-2 computer centre with a private cloud infrastructure
NASA Astrophysics Data System (ADS)
Bagnasco, Stefano; Berzano, Dario; Brunetti, Riccardo; Lusso, Stefano; Vallero, Sara
2014-06-01
In a typical scientific computing centre, several applications coexist and share a single physical infrastructure. An underlying Private Cloud infrastructure eases the management and maintenance of such heterogeneous applications (such as multipurpose or application-specific batch farms, Grid sites, interactive data analysis facilities and others), allowing dynamic allocation resources to any application. Furthermore, the maintenance of large deployments of complex and rapidly evolving middleware and application software is eased by the use of virtual images and contextualization techniques. Such infrastructures are being deployed in some large centres (see e.g. the CERN Agile Infrastructure project), but with several open-source tools reaching maturity this is becoming viable also for smaller sites. In this contribution we describe the Private Cloud infrastructure at the INFN-Torino Computer Centre, that hosts a full-fledged WLCG Tier-2 centre, an Interactive Analysis Facility for the ALICE experiment at the CERN LHC and several smaller scientific computing applications. The private cloud building blocks include the OpenNebula software stack, the GlusterFS filesystem and the OpenWRT Linux distribution (used for network virtualization); a future integration into a federated higher-level infrastructure is made possible by exposing commonly used APIs like EC2 and OCCI.
Migrating EO/IR sensors to cloud-based infrastructure as service architectures
NASA Astrophysics Data System (ADS)
Berglie, Stephen T.; Webster, Steven; May, Christopher M.
2014-06-01
The Night Vision Image Generator (NVIG), a product of US Army RDECOM CERDEC NVESD, is a visualization tool used widely throughout Army simulation environments to provide fully attributed synthesized, full motion video using physics-based sensor and environmental effects. The NVIG relies heavily on contemporary hardware-based acceleration and GPU processing techniques, which push the envelope of both enterprise and commodity-level hypervisor support for providing virtual machines with direct access to hardware resources. The NVIG has successfully been integrated into fully virtual environments where system architectures leverage cloudbased technologies to various extents in order to streamline infrastructure and service management. This paper details the challenges presented to engineers seeking to migrate GPU-bound processes, such as the NVIG, to virtual machines and, ultimately, Cloud-Based IAS architectures. In addition, it presents the path that led to success for the NVIG. A brief overview of Cloud-Based infrastructure management tool sets is provided, and several virtual desktop solutions are outlined. A discrimination is made between general purpose virtual desktop technologies compared to technologies that expose GPU-specific capabilities, including direct rendering and hard ware-based video encoding. Candidate hypervisor/virtual machine configurations that nominally satisfy the virtualized hardware-level GPU requirements of the NVIG are presented , and each is subsequently reviewed in light of its implications on higher-level Cloud management techniques. Implementation details are included from the hardware level, through the operating system, to the 3D graphics APls required by the NVIG and similar GPU-bound tools.
The StratusLab cloud distribution: Use-cases and support for scientific applications
NASA Astrophysics Data System (ADS)
Floros, E.
2012-04-01
The StratusLab project is integrating an open cloud software distribution that enables organizations to setup and provide their own private or public IaaS (Infrastructure as a Service) computing clouds. StratusLab distribution capitalizes on popular infrastructure virtualization solutions like KVM, the OpenNebula virtual machine manager, Claudia service manager and SlipStream deployment platform, which are further enhanced and expanded with additional components developed within the project. The StratusLab distribution covers the core aspects of a cloud IaaS architecture, namely Computing (life-cycle management of virtual machines), Storage, Appliance management and Networking. The resulting software stack provides a packaged turn-key solution for deploying cloud computing services. The cloud computing infrastructures deployed using StratusLab can support a wide range of scientific and business use cases. Grid computing has been the primary use case pursued by the project and for this reason the initial priority has been the support for the deployment and operation of fully virtualized production-level grid sites; a goal that has already been achieved by operating such a site as part of EGI's (European Grid Initiative) pan-european grid infrastructure. In this area the project is currently working to provide non-trivial capabilities like elastic and autonomic management of grid site resources. Although grid computing has been the motivating paradigm, StratusLab's cloud distribution can support a wider range of use cases. Towards this direction, we have developed and currently provide support for setting up general purpose computing solutions like Hadoop, MPI and Torque clusters. For what concerns scientific applications the project is collaborating closely with the Bioinformatics community in order to prepare VM appliances and deploy optimized services for bioinformatics applications. In a similar manner additional scientific disciplines like Earth Science can take advantage of StratusLab cloud solutions. Interested users are welcomed to join StratusLab's user community by getting access to the reference cloud services deployed by the project and offered to the public.
Data Intensive Scientific Workflows on a Federated Cloud: CRADA Final Report
DOE Office of Scientific and Technical Information (OSTI.GOV)
Garzoglio, Gabriele
The Fermilab Scientific Computing Division and the KISTI Global Science Experimental Data Hub Center have built a prototypical large-scale infrastructure to handle scientific workflows of stakeholders to run on multiple cloud resources. The demonstrations have been in the areas of (a) Data-Intensive Scientific Workflows on Federated Clouds, (b) Interoperability and Federation of Cloud Resources, and (c) Virtual Infrastructure Automation to enable On-Demand Services.
Integrating multiple scientific computing needs via a Private Cloud infrastructure
NASA Astrophysics Data System (ADS)
Bagnasco, S.; Berzano, D.; Brunetti, R.; Lusso, S.; Vallero, S.
2014-06-01
In a typical scientific computing centre, diverse applications coexist and share a single physical infrastructure. An underlying Private Cloud facility eases the management and maintenance of heterogeneous use cases such as multipurpose or application-specific batch farms, Grid sites catering to different communities, parallel interactive data analysis facilities and others. It allows to dynamically and efficiently allocate resources to any application and to tailor the virtual machines according to the applications' requirements. Furthermore, the maintenance of large deployments of complex and rapidly evolving middleware and application software is eased by the use of virtual images and contextualization techniques; for example, rolling updates can be performed easily and minimizing the downtime. In this contribution we describe the Private Cloud infrastructure at the INFN-Torino Computer Centre, that hosts a full-fledged WLCG Tier-2 site and a dynamically expandable PROOF-based Interactive Analysis Facility for the ALICE experiment at the CERN LHC and several smaller scientific computing applications. The Private Cloud building blocks include the OpenNebula software stack, the GlusterFS filesystem (used in two different configurations for worker- and service-class hypervisors) and the OpenWRT Linux distribution (used for network virtualization). A future integration into a federated higher-level infrastructure is made possible by exposing commonly used APIs like EC2 and by using mainstream contextualization tools like CloudInit.
Virtual Facility at Fermilab: Infrastructure and Services Expand to Public Clouds
Timm, Steve; Garzoglio, Gabriele; Cooper, Glenn; ...
2016-02-18
In preparation for its new Virtual Facility Project, Fermilab has launched a program of work to determine the requirements for running a computation facility on-site, in public clouds, or a combination of both. This program builds on the work we have done to successfully run experimental workflows of 1000-VM scale both on an on-site private cloud and on Amazon AWS. To do this at scale we deployed dynamically launched and discovered caching services on the cloud. We are now testing the deployment of more complicated services on Amazon AWS using native load balancing and auto scaling features they provide. Themore » Virtual Facility Project will design and develop a facility including infrastructure and services that can live on the site of Fermilab, off-site, or a combination of both. We expect to need this capacity to meet the peak computing requirements in the future. The Virtual Facility is intended to provision resources on the public cloud on behalf of the facility as a whole instead of having each experiment or Virtual Organization do it on their own. We will describe the policy aspects of a distributed Virtual Facility, the requirements, and plans to make a detailed comparison of the relative cost of the public and private clouds. Furthermore, this talk will present the details of the technical mechanisms we have developed to date, and the plans currently taking shape for a Virtual Facility at Fermilab.« less
Virtual Facility at Fermilab: Infrastructure and Services Expand to Public Clouds
DOE Office of Scientific and Technical Information (OSTI.GOV)
Timm, Steve; Garzoglio, Gabriele; Cooper, Glenn
In preparation for its new Virtual Facility Project, Fermilab has launched a program of work to determine the requirements for running a computation facility on-site, in public clouds, or a combination of both. This program builds on the work we have done to successfully run experimental workflows of 1000-VM scale both on an on-site private cloud and on Amazon AWS. To do this at scale we deployed dynamically launched and discovered caching services on the cloud. We are now testing the deployment of more complicated services on Amazon AWS using native load balancing and auto scaling features they provide. Themore » Virtual Facility Project will design and develop a facility including infrastructure and services that can live on the site of Fermilab, off-site, or a combination of both. We expect to need this capacity to meet the peak computing requirements in the future. The Virtual Facility is intended to provision resources on the public cloud on behalf of the facility as a whole instead of having each experiment or Virtual Organization do it on their own. We will describe the policy aspects of a distributed Virtual Facility, the requirements, and plans to make a detailed comparison of the relative cost of the public and private clouds. Furthermore, this talk will present the details of the technical mechanisms we have developed to date, and the plans currently taking shape for a Virtual Facility at Fermilab.« less
The Integration of CloudStack and OCCI/OpenNebula with DIRAC
NASA Astrophysics Data System (ADS)
Méndez Muñoz, Víctor; Fernández Albor, Víctor; Graciani Diaz, Ricardo; Casajús Ramo, Adriàn; Fernández Pena, Tomás; Merino Arévalo, Gonzalo; José Saborido Silva, Juan
2012-12-01
The increasing availability of Cloud resources is arising as a realistic alternative to the Grid as a paradigm for enabling scientific communities to access large distributed computing resources. The DIRAC framework for distributed computing is an easy way to efficiently access to resources from both systems. This paper explains the integration of DIRAC with two open-source Cloud Managers: OpenNebula (taking advantage of the OCCI standard) and CloudStack. These are computing tools to manage the complexity and heterogeneity of distributed data center infrastructures, allowing to create virtual clusters on demand, including public, private and hybrid clouds. This approach has required to develop an extension to the previous DIRAC Virtual Machine engine, which was developed for Amazon EC2, allowing the connection with these new cloud managers. In the OpenNebula case, the development has been based on the CernVM Virtual Software Appliance with appropriate contextualization, while in the case of CloudStack, the infrastructure has been kept more general, which permits other Virtual Machine sources and operating systems being used. In both cases, CernVM File System has been used to facilitate software distribution to the computing nodes. With the resulting infrastructure, the cloud resources are transparent to the users through a friendly interface, like the DIRAC Web Portal. The main purpose of this integration is to get a system that can manage cloud and grid resources at the same time. This particular feature pushes DIRAC to a new conceptual denomination as interware, integrating different middleware. Users from different communities do not need to care about the installation of the standard software that is available at the nodes, nor the operating system of the host machine which is transparent to the user. This paper presents an analysis of the overhead of the virtual layer, doing some tests to compare the proposed approach with the existing Grid solution. License Notice: Published under licence in Journal of Physics: Conference Series by IOP Publishing Ltd.
The Next Generation of Lab and Classroom Computing - The Silver Lining
2016-12-01
desktop infrastructure (VDI) solution, as well as the computing solutions at three universities, was selected as the basis for comparison. The research... infrastructure , VDI, hardware cost, software cost, manpower, availability, cloud computing, private cloud, bring your own device, BYOD, thin client...virtual desktop infrastructure (VDI) solution, as well as the computing solutions at three universities, was selected as the basis for comparison. The
DOE Office of Scientific and Technical Information (OSTI.GOV)
Garzoglio, Gabriele
The Fermilab Grid and Cloud Computing Department and the KISTI Global Science experimental Data hub Center propose a joint project. The goals are to enable scientific workflows of stakeholders to run on multiple cloud resources by use of (a) Virtual Infrastructure Automation and Provisioning, (b) Interoperability and Federat ion of Cloud Resources , and (c) High-Throughput Fabric Virtualization. This is a matching fund project in which Fermilab and KISTI will contribute equal resources .
Infrastructures for Distributed Computing: the case of BESIII
NASA Astrophysics Data System (ADS)
Pellegrino, J.
2018-05-01
The BESIII is an electron-positron collision experiment hosted at BEPCII in Beijing and aimed to investigate Tau-Charm physics. Now BESIII has been running for several years and gathered more than 1PB raw data. In order to analyze these data and perform massive Monte Carlo simulations, a large amount of computing and storage resources is needed. The distributed computing system is based up on DIRAC and it is in production since 2012. It integrates computing and storage resources from different institutes and a variety of resource types such as cluster, grid, cloud or volunteer computing. About 15 sites from BESIII Collaboration from all over the world joined this distributed computing infrastructure, giving a significant contribution to the IHEP computing facility. Nowadays cloud computing is playing a key role in the HEP computing field, due to its scalability and elasticity. Cloud infrastructures take advantages of several tools, such as VMDirac, to manage virtual machines through cloud managers according to the job requirements. With the virtually unlimited resources from commercial clouds, the computing capacity could scale accordingly in order to deal with any burst demands. General computing models have been discussed in the talk and are addressed herewith, with particular focus on the BESIII infrastructure. Moreover new computing tools and upcoming infrastructures will be addressed.
ERIC Educational Resources Information Center
Sukwong, Orathai
2013-01-01
Virtualization enables the ability to consolidate multiple servers on a single physical machine, increasing the infrastructure utilization. Maximizing the ratio of server virtual machines (VMs) to physical machines, namely the consolidation ratio, becomes an important goal toward infrastructure cost saving in a cloud. However, the consolidation…
Infrastructure Suitability Assessment Modeling for Cloud Computing Solutions
2011-09-01
Virtualization vs . Para-Virtualization .......................................................10 Figure 4. Modeling alternatives in relation to model...the conceptual difference between full virtualization and para-virtualization. Figure 3. Full Virtualization vs . Para-Virtualization 2. XEN...Besides Microsoft’s own client implementations, dubbed “Remote Desktop Con- nection Client” for Windows® and Apple ® operating systems, various open
Identity federation in OpenStack - an introduction to hybrid clouds
NASA Astrophysics Data System (ADS)
Denis, Marek; Castro Leon, Jose; Ormancey, Emmanuel; Tedesco, Paolo
2015-12-01
We are evaluating cloud identity federation available in the OpenStack ecosystem that allows for on premise bursting into remote clouds with use of local identities (i.e. domain accounts). Further enhancements to identity federation are a clear way to hybrid cloud architectures - virtualized infrastructures layered across independent private and public clouds.
Network Monitoring in the age of the Cloud
NASA Astrophysics Data System (ADS)
Ciuffoletti, Augusto
Network virtualization plays a relevant role in provisioning an Infrastructure as a Service (IaaS), implementing the fabric that interconnects virtual components. We identify the standard protocol IEEE802.1Q, that describes Virtual LAN (VLAN) functionalities, as a cornerstone in this architecture.
Efficient operating system level virtualization techniques for cloud resources
NASA Astrophysics Data System (ADS)
Ansu, R.; Samiksha; Anju, S.; Singh, K. John
2017-11-01
Cloud computing is an advancing technology which provides the servcies of Infrastructure, Platform and Software. Virtualization and Computer utility are the keys of Cloud computing. The numbers of cloud users are increasing day by day. So it is the need of the hour to make resources available on demand to satisfy user requirements. The technique in which resources namely storage, processing power, memory and network or I/O are abstracted is known as Virtualization. For executing the operating systems various virtualization techniques are available. They are: Full System Virtualization and Para Virtualization. In Full Virtualization, the whole architecture of hardware is duplicated virtually. No modifications are required in Guest OS as the OS deals with the VM hypervisor directly. In Para Virtualization, modifications of OS is required to run in parallel with other OS. For the Guest OS to access the hardware, the host OS must provide a Virtual Machine Interface. OS virtualization has many advantages such as migrating applications transparently, consolidation of server, online maintenance of OS and providing security. This paper briefs both the virtualization techniques and discusses the issues in OS level virtualization.
Cloud BioLinux: pre-configured and on-demand bioinformatics computing for the genomics community.
Krampis, Konstantinos; Booth, Tim; Chapman, Brad; Tiwari, Bela; Bicak, Mesude; Field, Dawn; Nelson, Karen E
2012-03-19
A steep drop in the cost of next-generation sequencing during recent years has made the technology affordable to the majority of researchers, but downstream bioinformatic analysis still poses a resource bottleneck for smaller laboratories and institutes that do not have access to substantial computational resources. Sequencing instruments are typically bundled with only the minimal processing and storage capacity required for data capture during sequencing runs. Given the scale of sequence datasets, scientific value cannot be obtained from acquiring a sequencer unless it is accompanied by an equal investment in informatics infrastructure. Cloud BioLinux is a publicly accessible Virtual Machine (VM) that enables scientists to quickly provision on-demand infrastructures for high-performance bioinformatics computing using cloud platforms. Users have instant access to a range of pre-configured command line and graphical software applications, including a full-featured desktop interface, documentation and over 135 bioinformatics packages for applications including sequence alignment, clustering, assembly, display, editing, and phylogeny. Each tool's functionality is fully described in the documentation directly accessible from the graphical interface of the VM. Besides the Amazon EC2 cloud, we have started instances of Cloud BioLinux on a private Eucalyptus cloud installed at the J. Craig Venter Institute, and demonstrated access to the bioinformatic tools interface through a remote connection to EC2 instances from a local desktop computer. Documentation for using Cloud BioLinux on EC2 is available from our project website, while a Eucalyptus cloud image and VirtualBox Appliance is also publicly available for download and use by researchers with access to private clouds. Cloud BioLinux provides a platform for developing bioinformatics infrastructures on the cloud. An automated and configurable process builds Virtual Machines, allowing the development of highly customized versions from a shared code base. This shared community toolkit enables application specific analysis platforms on the cloud by minimizing the effort required to prepare and maintain them.
Cloud BioLinux: pre-configured and on-demand bioinformatics computing for the genomics community
2012-01-01
Background A steep drop in the cost of next-generation sequencing during recent years has made the technology affordable to the majority of researchers, but downstream bioinformatic analysis still poses a resource bottleneck for smaller laboratories and institutes that do not have access to substantial computational resources. Sequencing instruments are typically bundled with only the minimal processing and storage capacity required for data capture during sequencing runs. Given the scale of sequence datasets, scientific value cannot be obtained from acquiring a sequencer unless it is accompanied by an equal investment in informatics infrastructure. Results Cloud BioLinux is a publicly accessible Virtual Machine (VM) that enables scientists to quickly provision on-demand infrastructures for high-performance bioinformatics computing using cloud platforms. Users have instant access to a range of pre-configured command line and graphical software applications, including a full-featured desktop interface, documentation and over 135 bioinformatics packages for applications including sequence alignment, clustering, assembly, display, editing, and phylogeny. Each tool's functionality is fully described in the documentation directly accessible from the graphical interface of the VM. Besides the Amazon EC2 cloud, we have started instances of Cloud BioLinux on a private Eucalyptus cloud installed at the J. Craig Venter Institute, and demonstrated access to the bioinformatic tools interface through a remote connection to EC2 instances from a local desktop computer. Documentation for using Cloud BioLinux on EC2 is available from our project website, while a Eucalyptus cloud image and VirtualBox Appliance is also publicly available for download and use by researchers with access to private clouds. Conclusions Cloud BioLinux provides a platform for developing bioinformatics infrastructures on the cloud. An automated and configurable process builds Virtual Machines, allowing the development of highly customized versions from a shared code base. This shared community toolkit enables application specific analysis platforms on the cloud by minimizing the effort required to prepare and maintain them. PMID:22429538
Storing and using health data in a virtual private cloud.
Regola, Nathan; Chawla, Nitesh V
2013-03-13
Electronic health records are being adopted at a rapid rate due to increased funding from the US federal government. Health data provide the opportunity to identify possible improvements in health care delivery by applying data mining and statistical methods to the data and will also enable a wide variety of new applications that will be meaningful to patients and medical professionals. Researchers are often granted access to health care data to assist in the data mining process, but HIPAA regulations mandate comprehensive safeguards to protect the data. Often universities (and presumably other research organizations) have an enterprise information technology infrastructure and a research infrastructure. Unfortunately, both of these infrastructures are generally not appropriate for sensitive research data such as HIPAA, as they require special accommodations on the part of the enterprise information technology (or increased security on the part of the research computing environment). Cloud computing, which is a concept that allows organizations to build complex infrastructures on leased resources, is rapidly evolving to the point that it is possible to build sophisticated network architectures with advanced security capabilities. We present a prototype infrastructure in Amazon's Virtual Private Cloud to allow researchers and practitioners to utilize the data in a HIPAA-compliant environment.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pais Pitta de Lacerda Ruivo, Tiago; Bernabeu Altayo, Gerard; Garzoglio, Gabriele
2014-11-11
has been widely accepted that software virtualization has a big negative impact on high-performance computing (HPC) application performance. This work explores the potential use of Infiniband hardware virtualization in an OpenNebula cloud towards the efficient support of MPI-based workloads. We have implemented, deployed, and tested an Infiniband network on the FermiCloud private Infrastructure-as-a-Service (IaaS) cloud. To avoid software virtualization towards minimizing the virtualization overhead, we employed a technique called Single Root Input/Output Virtualization (SRIOV). Our solution spanned modifications to the Linux’s Hypervisor as well as the OpenNebula manager. We evaluated the performance of the hardware virtualization on up to 56more » virtual machines connected by up to 8 DDR Infiniband network links, with micro-benchmarks (latency and bandwidth) as well as w a MPI-intensive application (the HPL Linpack benchmark).« less
Multi-Dimensional Optimization for Cloud Based Multi-Tier Applications
ERIC Educational Resources Information Center
Jung, Gueyoung
2010-01-01
Emerging trends toward cloud computing and virtualization have been opening new avenues to meet enormous demands of space, resource utilization, and energy efficiency in modern data centers. By being allowed to host many multi-tier applications in consolidated environments, cloud infrastructure providers enable resources to be shared among these…
Design and Implement of Astronomical Cloud Computing Environment In China-VO
NASA Astrophysics Data System (ADS)
Li, Changhua; Cui, Chenzhou; Mi, Linying; He, Boliang; Fan, Dongwei; Li, Shanshan; Yang, Sisi; Xu, Yunfei; Han, Jun; Chen, Junyi; Zhang, Hailong; Yu, Ce; Xiao, Jian; Wang, Chuanjun; Cao, Zihuang; Fan, Yufeng; Liu, Liang; Chen, Xiao; Song, Wenming; Du, Kangyu
2017-06-01
Astronomy cloud computing environment is a cyber-Infrastructure for Astronomy Research initiated by Chinese Virtual Observatory (China-VO) under funding support from NDRC (National Development and Reform commission) and CAS (Chinese Academy of Sciences). Based on virtualization technology, astronomy cloud computing environment was designed and implemented by China-VO team. It consists of five distributed nodes across the mainland of China. Astronomer can get compuitng and storage resource in this cloud computing environment. Through this environments, astronomer can easily search and analyze astronomical data collected by different telescopes and data centers , and avoid the large scale dataset transportation.
ERIC Educational Resources Information Center
Schaffhauser, Dian
2012-01-01
Half of servers in higher ed are virtualized. But that number's not high enough for Link Alander, interim vice chancellor and CIO at the Lone Star College System (Texas). He aspires to see 100 percent of the system's infrastructure requirements delivered as IT services from its own virtualized data centers or other cloud-based operators. Back in…
Cloud Computing with iPlant Atmosphere.
McKay, Sheldon J; Skidmore, Edwin J; LaRose, Christopher J; Mercer, Andre W; Noutsos, Christos
2013-10-15
Cloud Computing refers to distributed computing platforms that use virtualization software to provide easy access to physical computing infrastructure and data storage, typically administered through a Web interface. Cloud-based computing provides access to powerful servers, with specific software and virtual hardware configurations, while eliminating the initial capital cost of expensive computers and reducing the ongoing operating costs of system administration, maintenance contracts, power consumption, and cooling. This eliminates a significant barrier to entry into bioinformatics and high-performance computing for many researchers. This is especially true of free or modestly priced cloud computing services. The iPlant Collaborative offers a free cloud computing service, Atmosphere, which allows users to easily create and use instances on virtual servers preconfigured for their analytical needs. Atmosphere is a self-service, on-demand platform for scientific computing. This unit demonstrates how to set up, access and use cloud computing in Atmosphere. Copyright © 2013 John Wiley & Sons, Inc.
Enterprise Cloud Architecture for Chinese Ministry of Railway
NASA Astrophysics Data System (ADS)
Shan, Xumei; Liu, Hefeng
Enterprise like PRC Ministry of Railways (MOR), is facing various challenges ranging from highly distributed computing environment and low legacy system utilization, Cloud Computing is increasingly regarded as one workable solution to address this. This article describes full scale cloud solution with Intel Tashi as virtual machine infrastructure layer, Hadoop HDFS as computing platform, and self developed SaaS interface, gluing virtual machine and HDFS with Xen hypervisor. As a result, on demand computing task application and deployment have been tackled per MOR real working scenarios at the end of article.
AstroCloud, a Cyber-Infrastructure for Astronomy Research: Data Archiving and Quality Control
NASA Astrophysics Data System (ADS)
He, B.; Cui, C.; Fan, D.; Li, C.; Xiao, J.; Yu, C.; Wang, C.; Cao, Z.; Chen, J.; Yi, W.; Li, S.; Mi, L.; Yang, S.
2015-09-01
AstroCloud is a cyber-Infrastructure for Astronomy Research initiated by Chinese Virtual Observatory (China-VO) under funding support from NDRC (National Development and Reform commission) and CAS (Chinese Academy of Sciences)1(Cui et al. 2014). To archive the astronomical data in China, we present the implementation of the astronomical data archiving system (ADAS). Data archiving and quality control are the infrastructure for the AstroCloud. Throughout the data of the entire life cycle, data archiving system standardized data, transferring data, logging observational data, archiving ambient data, And storing these data and metadata in database. Quality control covers the whole process and all aspects of data archiving.
Storing and Using Health Data in a Virtual Private Cloud
Regola, Nathan
2013-01-01
Electronic health records are being adopted at a rapid rate due to increased funding from the US federal government. Health data provide the opportunity to identify possible improvements in health care delivery by applying data mining and statistical methods to the data and will also enable a wide variety of new applications that will be meaningful to patients and medical professionals. Researchers are often granted access to health care data to assist in the data mining process, but HIPAA regulations mandate comprehensive safeguards to protect the data. Often universities (and presumably other research organizations) have an enterprise information technology infrastructure and a research infrastructure. Unfortunately, both of these infrastructures are generally not appropriate for sensitive research data such as HIPAA, as they require special accommodations on the part of the enterprise information technology (or increased security on the part of the research computing environment). Cloud computing, which is a concept that allows organizations to build complex infrastructures on leased resources, is rapidly evolving to the point that it is possible to build sophisticated network architectures with advanced security capabilities. We present a prototype infrastructure in Amazon’s Virtual Private Cloud to allow researchers and practitioners to utilize the data in a HIPAA-compliant environment. PMID:23485880
A Cloud-based Infrastructure and Architecture for Environmental System Research
NASA Astrophysics Data System (ADS)
Wang, D.; Wei, Y.; Shankar, M.; Quigley, J.; Wilson, B. E.
2016-12-01
The present availability of high-capacity networks, low-cost computers and storage devices, and the widespread adoption of hardware virtualization and service-oriented architecture provide a great opportunity to enable data and computing infrastructure sharing between closely related research activities. By taking advantage of these approaches, along with the world-class high computing and data infrastructure located at Oak Ridge National Laboratory, a cloud-based infrastructure and architecture has been developed to efficiently deliver essential data and informatics service and utilities to the environmental system research community, and will provide unique capabilities that allows terrestrial ecosystem research projects to share their software utilities (tools), data and even data submission workflow in a straightforward fashion. The infrastructure will minimize large disruptions from current project-based data submission workflows for better acceptances from existing projects, since many ecosystem research projects already have their own requirements or preferences for data submission and collection. The infrastructure will eliminate scalability problems with current project silos by provide unified data services and infrastructure. The Infrastructure consists of two key components (1) a collection of configurable virtual computing environments and user management systems that expedite data submission and collection from environmental system research community, and (2) scalable data management services and system, originated and development by ORNL data centers.
NASA Astrophysics Data System (ADS)
Cox, S. J.; Wyborn, L. A.; Fraser, R.; Rankine, T.; Woodcock, R.; Vote, J.; Evans, B.
2012-12-01
The Virtual Geophysics Laboratory (VGL) is web portal that provides geoscientists with an integrated online environment that: seamlessly accesses geophysical and geoscience data services from the AuScope national geoscience information infrastructure; loosely couples these data to a variety of gesocience software tools; and provides large scale processing facilities via cloud computing. VGL is a collaboration between CSIRO, Geoscience Australia, National Computational Infrastructure, Monash University, Australian National University and the University of Queensland. The VGL provides a distributed system whereby a user can enter an online virtual laboratory to seamlessly connect to OGC web services for geoscience data. The data is supplied in open standards formats using international standards like GeoSciML. A VGL user uses a web mapping interface to discover and filter the data sources using spatial and attribute filters to define a subset. Once the data is selected the user is not required to download the data. VGL collates the service query information for later in the processing workflow where it will be staged directly to the computing facilities. The combination of deferring data download and access to Cloud computing enables VGL users to access their data at higher resolutions and to undertake larger scale inversions, more complex models and simulations than their own local computing facilities might allow. Inside the Virtual Geophysics Laboratory, the user has access to a library of existing models, complete with exemplar workflows for specific scientific problems based on those models. For example, the user can load a geological model published by Geoscience Australia, apply a basic deformation workflow provided by a CSIRO scientist, and have it run in a scientific code from Monash. Finally the user can publish these results to share with a colleague or cite in a paper. This opens new opportunities for access and collaboration as all the resources (models, code, data, processing) are shared in the one virtual laboratory. VGL provides end users with access to an intuitive, user-centered interface that leverages cloud storage and cloud and cluster processing from both the research communities and commercial suppliers (e.g. Amazon). As the underlying data and information services are agnostic of the scientific domain, they can support many other data types. This fundamental characteristic results in a highly reusable virtual laboratory infrastructure that could also be used for example natural hazards, satellite processing, soil geochemistry, climate modeling, agriculture crop modeling.
Services for domain specific developments in the Cloud
NASA Astrophysics Data System (ADS)
Schwichtenberg, Horst; Gemuend, André
2015-04-01
We will discuss and demonstrate the possibilities of new Cloud Services where the complete development of code is in the Cloud. We will discuss the possibilities of such services where the complete development cycle from programing to testing is in the cloud. This can be also combined with dedicated research domain specific services and hide the burden of accessing available infrastructures. As an example, we will show a service that is intended to complement the services of the VERCE projects infrastructure, a service that utilizes Cloud resources to offer simplified execution of data pre- and post-processing scripts. It offers users access to the ObsPy seismological toolbox for processing data with the Python programming language, executed on virtual Cloud resources in a secured sandbox. The solution encompasses a frontend with a modern graphical user interface, a messaging infrastructure as well as Python worker nodes for background processing. All components are deployable in the Cloud and have been tested on different environments based on OpenStack and OpenNebula. Deployments on commercial, public Clouds will be tested in the future.
Quantitative Investigation of the Technologies That Support Cloud Computing
ERIC Educational Resources Information Center
Hu, Wenjin
2014-01-01
Cloud computing is dramatically shaping modern IT infrastructure. It virtualizes computing resources, provides elastic scalability, serves as a pay-as-you-use utility, simplifies the IT administrators' daily tasks, enhances the mobility and collaboration of data, and increases user productivity. We focus on providing generalized black-box…
Phenomenology tools on cloud infrastructures using OpenStack
NASA Astrophysics Data System (ADS)
Campos, I.; Fernández-del-Castillo, E.; Heinemeyer, S.; Lopez-Garcia, A.; Pahlen, F.; Borges, G.
2013-04-01
We present a new environment for computations in particle physics phenomenology employing recent developments in cloud computing. On this environment users can create and manage "virtual" machines on which the phenomenology codes/tools can be deployed easily in an automated way. We analyze the performance of this environment based on "virtual" machines versus the utilization of physical hardware. In this way we provide a qualitative result for the influence of the host operating system on the performance of a representative set of applications for phenomenology calculations.
Scientific Services on the Cloud
NASA Astrophysics Data System (ADS)
Chapman, David; Joshi, Karuna P.; Yesha, Yelena; Halem, Milt; Yesha, Yaacov; Nguyen, Phuong
Scientific Computing was one of the first every applications for parallel and distributed computation. To this date, scientific applications remain some of the most compute intensive, and have inspired creation of petaflop compute infrastructure such as the Oak Ridge Jaguar and Los Alamos RoadRunner. Large dedicated hardware infrastructure has become both a blessing and a curse to the scientific community. Scientists are interested in cloud computing for much the same reason as businesses and other professionals. The hardware is provided, maintained, and administrated by a third party. Software abstraction and virtualization provide reliability, and fault tolerance. Graduated fees allow for multi-scale prototyping and execution. Cloud computing resources are only a few clicks away, and by far the easiest high performance distributed platform to gain access to. There may still be dedicated infrastructure for ultra-scale science, but the cloud can easily play a major part of the scientific computing initiative.
Architectural Principles and Experimentation of Distributed High Performance Virtual Clusters
ERIC Educational Resources Information Center
Younge, Andrew J.
2016-01-01
With the advent of virtualization and Infrastructure-as-a-Service (IaaS), the broader scientific computing community is considering the use of clouds for their scientific computing needs. This is due to the relative scalability, ease of use, advanced user environment customization abilities, and the many novel computing paradigms available for…
Implementation of Grid Tier 2 and Tier 3 facilities on a Distributed OpenStack Cloud
NASA Astrophysics Data System (ADS)
Limosani, Antonio; Boland, Lucien; Coddington, Paul; Crosby, Sean; Huang, Joanna; Sevior, Martin; Wilson, Ross; Zhang, Shunde
2014-06-01
The Australian Government is making a AUD 100 million investment in Compute and Storage for the academic community. The Compute facilities are provided in the form of 30,000 CPU cores located at 8 nodes around Australia in a distributed virtualized Infrastructure as a Service facility based on OpenStack. The storage will eventually consist of over 100 petabytes located at 6 nodes. All will be linked via a 100 Gb/s network. This proceeding describes the development of a fully connected WLCG Tier-2 grid site as well as a general purpose Tier-3 computing cluster based on this architecture. The facility employs an extension to Torque to enable dynamic allocations of virtual machine instances. A base Scientific Linux virtual machine (VM) image is deployed in the OpenStack cloud and automatically configured as required using Puppet. Custom scripts are used to launch multiple VMs, integrate them into the dynamic Torque cluster and to mount remote file systems. We report on our experience in developing this nation-wide ATLAS and Belle II Tier 2 and Tier 3 computing infrastructure using the national Research Cloud and storage facilities.
NASA Astrophysics Data System (ADS)
Lescinsky, D. T.; Wyborn, L. A.; Evans, B. J. K.; Allen, C.; Fraser, R.; Rankine, T.
2014-12-01
We present collaborative work on a generic, modular infrastructure for virtual laboratories (VLs, similar to science gateways) that combine online access to data, scientific code, and computing resources as services that support multiple data intensive scientific computing needs across a wide range of science disciplines. We are leveraging access to 10+ PB of earth science data on Lustre filesystems at Australia's National Computational Infrastructure (NCI) Research Data Storage Infrastructure (RDSI) node, co-located with NCI's 1.2 PFlop Raijin supercomputer and a 3000 CPU core research cloud. The development, maintenance and sustainability of VLs is best accomplished through modularisation and standardisation of interfaces between components. Our approach has been to break up tightly-coupled, specialised application packages into modules, with identified best techniques and algorithms repackaged either as data services or scientific tools that are accessible across domains. The data services can be used to manipulate, visualise and transform multiple data types whilst the scientific tools can be used in concert with multiple scientific codes. We are currently designing a scalable generic infrastructure that will handle scientific code as modularised services and thereby enable the rapid/easy deployment of new codes or versions of codes. The goal is to build open source libraries/collections of scientific tools, scripts and modelling codes that can be combined in specially designed deployments. Additional services in development include: provenance, publication of results, monitoring, workflow tools, etc. The generic VL infrastructure will be hosted at NCI, but can access alternative computing infrastructures (i.e., public/private cloud, HPC).The Virtual Geophysics Laboratory (VGL) was developed as a pilot project to demonstrate the underlying technology. This base is now being redesigned and generalised to develop a Virtual Hazards Impact and Risk Laboratory (VHIRL); any enhancements and new capabilities will be incorporated into a generic VL infrastructure. At same time, we are scoping seven new VLs and in the process, identifying other common components to prioritise and focus development.
Angiuoli, Samuel V; Matalka, Malcolm; Gussman, Aaron; Galens, Kevin; Vangala, Mahesh; Riley, David R; Arze, Cesar; White, James R; White, Owen; Fricke, W Florian
2011-08-30
Next-generation sequencing technologies have decentralized sequence acquisition, increasing the demand for new bioinformatics tools that are easy to use, portable across multiple platforms, and scalable for high-throughput applications. Cloud computing platforms provide on-demand access to computing infrastructure over the Internet and can be used in combination with custom built virtual machines to distribute pre-packaged with pre-configured software. We describe the Cloud Virtual Resource, CloVR, a new desktop application for push-button automated sequence analysis that can utilize cloud computing resources. CloVR is implemented as a single portable virtual machine (VM) that provides several automated analysis pipelines for microbial genomics, including 16S, whole genome and metagenome sequence analysis. The CloVR VM runs on a personal computer, utilizes local computer resources and requires minimal installation, addressing key challenges in deploying bioinformatics workflows. In addition CloVR supports use of remote cloud computing resources to improve performance for large-scale sequence processing. In a case study, we demonstrate the use of CloVR to automatically process next-generation sequencing data on multiple cloud computing platforms. The CloVR VM and associated architecture lowers the barrier of entry for utilizing complex analysis protocols on both local single- and multi-core computers and cloud systems for high throughput data processing.
Dong, Yu-Shuang; Xu, Gao-Chao; Fu, Xiao-Dong
2014-01-01
The cloud platform provides various services to users. More and more cloud centers provide infrastructure as the main way of operating. To improve the utilization rate of the cloud center and to decrease the operating cost, the cloud center provides services according to requirements of users by sharding the resources with virtualization. Considering both QoS for users and cost saving for cloud computing providers, we try to maximize performance and minimize energy cost as well. In this paper, we propose a distributed parallel genetic algorithm (DPGA) of placement strategy for virtual machines deployment on cloud platform. It executes the genetic algorithm parallelly and distributedly on several selected physical hosts in the first stage. Then it continues to execute the genetic algorithm of the second stage with solutions obtained from the first stage as the initial population. The solution calculated by the genetic algorithm of the second stage is the optimal one of the proposed approach. The experimental results show that the proposed placement strategy of VM deployment can ensure QoS for users and it is more effective and more energy efficient than other placement strategies on the cloud platform. PMID:25097872
Dong, Yu-Shuang; Xu, Gao-Chao; Fu, Xiao-Dong
2014-01-01
The cloud platform provides various services to users. More and more cloud centers provide infrastructure as the main way of operating. To improve the utilization rate of the cloud center and to decrease the operating cost, the cloud center provides services according to requirements of users by sharding the resources with virtualization. Considering both QoS for users and cost saving for cloud computing providers, we try to maximize performance and minimize energy cost as well. In this paper, we propose a distributed parallel genetic algorithm (DPGA) of placement strategy for virtual machines deployment on cloud platform. It executes the genetic algorithm parallelly and distributedly on several selected physical hosts in the first stage. Then it continues to execute the genetic algorithm of the second stage with solutions obtained from the first stage as the initial population. The solution calculated by the genetic algorithm of the second stage is the optimal one of the proposed approach. The experimental results show that the proposed placement strategy of VM deployment can ensure QoS for users and it is more effective and more energy efficient than other placement strategies on the cloud platform.
AstroCloud, a Cyber-Infrastructure for Astronomy Research: Architecture
NASA Astrophysics Data System (ADS)
Xiao, J.; Yu, C.; Cui, C.; He, B.; Li, C.; Fan, D.; Hong, Z.; Yin, S.; Wang, C.; Cao, Z.; Fan, Y.; Li, S.; Mi, L.; Wan, W.; Wang, J.; Zhang, H.
2015-09-01
AstroCloud is a cyber-Infrastructure for Astronomy Research initiated by Chinese Virtual Observatory (China-VO) under funding support from NDRC (National Development and Reform commission) and CAS (Chinese Academy of Sciences). The ultimate goal of this project is to provide a comprehensive end-to-end astronomy research environment where several independent systems seamlessly collaborate to support the full lifecycle of the modern observational astronomy based on big data, from proposal submission, to data archiving, data release, and to in-situ data analysis and processing. In this paper, the architecture and key designs of the AstroCloud platform are introduced, including data access middleware, access control and security framework, extendible proposal workflow, and system integration mechanism.
NASA Astrophysics Data System (ADS)
Capone, V.; Esposito, R.; Pardi, S.; Taurino, F.; Tortone, G.
2012-12-01
Over the last few years we have seen an increasing number of services and applications needed to manage and maintain cloud computing facilities. This is particularly true for computing in high energy physics, which often requires complex configurations and distributed infrastructures. In this scenario a cost effective rationalization and consolidation strategy is the key to success in terms of scalability and reliability. In this work we describe an IaaS (Infrastructure as a Service) cloud computing system, with high availability and redundancy features, which is currently in production at INFN-Naples and ATLAS Tier-2 data centre. The main goal we intended to achieve was a simplified method to manage our computing resources and deliver reliable user services, reusing existing hardware without incurring heavy costs. A combined usage of virtualization and clustering technologies allowed us to consolidate our services on a small number of physical machines, reducing electric power costs. As a result of our efforts we developed a complete solution for data and computing centres that can be easily replicated using commodity hardware. Our architecture consists of 2 main subsystems: a clustered storage solution, built on top of disk servers running GlusterFS file system, and a virtual machines execution environment. GlusterFS is a network file system able to perform parallel writes on multiple disk servers, providing this way live replication of data. High availability is also achieved via a network configuration using redundant switches and multiple paths between hypervisor hosts and disk servers. We also developed a set of management scripts to easily perform basic system administration tasks such as automatic deployment of new virtual machines, adaptive scheduling of virtual machines on hypervisor hosts, live migration and automated restart in case of hypervisor failures.
Integration of Cloud resources in the LHCb Distributed Computing
NASA Astrophysics Data System (ADS)
Úbeda García, Mario; Méndez Muñoz, Víctor; Stagni, Federico; Cabarrou, Baptiste; Rauschmayr, Nathalie; Charpentier, Philippe; Closier, Joel
2014-06-01
This contribution describes how Cloud resources have been integrated in the LHCb Distributed Computing. LHCb is using its specific Dirac extension (LHCbDirac) as an interware for its Distributed Computing. So far, it was seamlessly integrating Grid resources and Computer clusters. The cloud extension of DIRAC (VMDIRAC) allows the integration of Cloud computing infrastructures. It is able to interact with multiple types of infrastructures in commercial and institutional clouds, supported by multiple interfaces (Amazon EC2, OpenNebula, OpenStack and CloudStack) - instantiates, monitors and manages Virtual Machines running on this aggregation of Cloud resources. Moreover, specifications for institutional Cloud resources proposed by Worldwide LHC Computing Grid (WLCG), mainly by the High Energy Physics Unix Information Exchange (HEPiX) group, have been taken into account. Several initiatives and computing resource providers in the eScience environment have already deployed IaaS in production during 2013. Keeping this on mind, pros and cons of a cloud based infrasctructure have been studied in contrast with the current setup. As a result, this work addresses four different use cases which represent a major improvement on several levels of our infrastructure. We describe the solution implemented by LHCb for the contextualisation of the VMs based on the idea of Cloud Site. We report on operational experience of using in production several institutional Cloud resources that are thus becoming integral part of the LHCb Distributed Computing resources. Furthermore, we describe as well the gradual migration of our Service Infrastructure towards a fully distributed architecture following the Service as a Service (SaaS) model.
2011-01-01
Background Next-generation sequencing technologies have decentralized sequence acquisition, increasing the demand for new bioinformatics tools that are easy to use, portable across multiple platforms, and scalable for high-throughput applications. Cloud computing platforms provide on-demand access to computing infrastructure over the Internet and can be used in combination with custom built virtual machines to distribute pre-packaged with pre-configured software. Results We describe the Cloud Virtual Resource, CloVR, a new desktop application for push-button automated sequence analysis that can utilize cloud computing resources. CloVR is implemented as a single portable virtual machine (VM) that provides several automated analysis pipelines for microbial genomics, including 16S, whole genome and metagenome sequence analysis. The CloVR VM runs on a personal computer, utilizes local computer resources and requires minimal installation, addressing key challenges in deploying bioinformatics workflows. In addition CloVR supports use of remote cloud computing resources to improve performance for large-scale sequence processing. In a case study, we demonstrate the use of CloVR to automatically process next-generation sequencing data on multiple cloud computing platforms. Conclusion The CloVR VM and associated architecture lowers the barrier of entry for utilizing complex analysis protocols on both local single- and multi-core computers and cloud systems for high throughput data processing. PMID:21878105
Working with Planetary-Scale Data in the Cloud
NASA Astrophysics Data System (ADS)
Flasher, J.
2017-12-01
When data is shared on AWS, it can be analyzed using AWS on-demand computing resources quickly and efficiently. Users can work with any amount of data without needing to download it or store their own copies. When heavy data like imagery, genomics data, or volumes of sensor data are available in AWS's cloud, the time required to copy the data to a virtual server for analysis is virtually eliminated. AWS's global infrastructure allows data providers to make their data available worldwide and ensure quick access to critical data from anywhere. In this session, we will share lessons learned from our experience supporting a global community of entrepreneurs, students and researchers by making petabytes of data freely available for anyone to use in the cloud.
CFCC: A Covert Flows Confinement Mechanism for Virtual Machine Coalitions
NASA Astrophysics Data System (ADS)
Cheng, Ge; Jin, Hai; Zou, Deqing; Shi, Lei; Ohoussou, Alex K.
Normally, virtualization technology is adopted to construct the infrastructure of cloud computing environment. Resources are managed and organized dynamically through virtual machine (VM) coalitions in accordance with the requirements of applications. Enforcing mandatory access control (MAC) on the VM coalitions will greatly improve the security of VM-based cloud computing. However, the existing MAC models lack the mechanism to confine the covert flows and are hard to eliminate the convert channels. In this paper, we propose a covert flows confinement mechanism for virtual machine coalitions (CFCC), which introduces dynamic conflicts of interest based on the activity history of VMs, each of which is attached with a label. The proposed mechanism can be used to confine the covert flows between VMs in different coalitions. We implement a prototype system, evaluate its performance, and show that our mechanism is practical.
Virtualized Networks and Virtualized Optical Line Terminal (vOLT)
NASA Astrophysics Data System (ADS)
Ma, Jonathan; Israel, Stephen
2017-03-01
The success of the Internet and the proliferation of the Internet of Things (IoT) devices is forcing telecommunications carriers to re-architecture a central office as a datacenter (CORD) so as to bring the datacenter economics and cloud agility to a central office (CO). The Open Network Operating System (ONOS) is the first open-source software-defined network (SDN) operating system which is capable of managing and controlling network, computing, and storage resources to support CORD infrastructure and network virtualization. The virtualized Optical Line Termination (vOLT) is one of the key components in such virtualized networks.
The Cloud Area Padovana: from pilot to production
NASA Astrophysics Data System (ADS)
Andreetto, P.; Costa, F.; Crescente, A.; Dorigo, A.; Fantinel, S.; Fanzago, F.; Sgaravatto, M.; Traldi, S.; Verlato, M.; Zangrando, L.
2017-10-01
The Cloud Area Padovana has been running for almost two years. This is an OpenStack-based scientific cloud, spread across two different sites: the INFN Padova Unit and the INFN Legnaro National Labs. The hardware resources have been scaled horizontally and vertically, by upgrading some hypervisors and by adding new ones: currently it provides about 1100 cores. Some in-house developments were also integrated in the OpenStack dashboard, such as a tool for user and project registrations with direct support for the INFN-AAI Identity Provider as a new option for the user authentication. In collaboration with the EU-funded Indigo DataCloud project, the integration with Docker-based containers has been experimented with and will be available in production soon. This computing facility now satisfies the computational and storage demands of more than 70 users affiliated with about 20 research projects. We present here the architecture of this Cloud infrastructure, the tools and procedures used to operate it. We also focus on the lessons learnt in these two years, describing the problems that were found and the corrective actions that had to be applied. We also discuss about the chosen strategy for upgrades, which combines the need to promptly integrate the OpenStack new developments, the demand to reduce the downtimes of the infrastructure, and the need to limit the effort requested for such updates. We also discuss how this Cloud infrastructure is being used. In particular we focus on two big physics experiments which are intensively exploiting this computing facility: CMS and SPES. CMS deployed on the cloud a complex computational infrastructure, composed of several user interfaces for job submission in the Grid environment/local batch queues or for interactive processes; this is fully integrated with the local Tier-2 facility. To avoid a static allocation of the resources, an elastic cluster, based on cernVM, has been configured: it allows to automatically create and delete virtual machines according to the user needs. SPES, using a client-server system called TraceWin, exploits INFN’s virtual resources performing a very large number of simulations on about a thousand nodes elastically managed.
A Combination Therapy of JO-I and Chemotherapy in Ovarian Cancer Models
2013-10-01
which consists of a 3PAR storage backend and is sharing data via a highly available NetApp storage gateway and 2 high throughput commodity storage...Environment is configured as self- service Enterprise cloud and currently hosts more than 700 virtual machines. The network infrastructure consists of...technology infrastructure and information system applications designed to integrate, automate, and standardize operations. These systems fuse state of
AstroCloud, a Cyber-Infrastructure for Astronomy Research: Data Access and Interoperability
NASA Astrophysics Data System (ADS)
Fan, D.; He, B.; Xiao, J.; Li, S.; Li, C.; Cui, C.; Yu, C.; Hong, Z.; Yin, S.; Wang, C.; Cao, Z.; Fan, Y.; Mi, L.; Wan, W.; Wang, J.
2015-09-01
Data access and interoperability module connects the observation proposals, data, virtual machines and software. According to the unique identifier of PI (principal investigator), an email address or an internal ID, data can be collected by PI's proposals, or by the search interfaces, e.g. conesearch. Files associated with the searched results could be easily transported to cloud storages, including the storage with virtual machines, or several commercial platforms like Dropbox. Benefitted from the standards of IVOA (International Observatories Alliance), VOTable formatted searching result could be sent to kinds of VO software. Latter endeavor will try to integrate more data and connect archives and some other astronomical resources.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Garzoglio, Gabriele
The Fermilab Grid and Cloud Computing Department and the KISTI Global Science experimental Data hub Center are working on a multi-year Collaborative Research and Development Agreement.With the knowledge developed in the first year on how to provision and manage a federation of virtual machines through Cloud management systems. In this second year, we expanded the work on provisioning and federation, increasing both scale and diversity of solutions, and we started to build on-demand services on the established fabric, introducing the paradigm of Platform as a Service to assist with the execution of scientific workflows. We have enabled scientific workflows ofmore » stakeholders to run on multiple cloud resources at the scale of 1,000 concurrent machines. The demonstrations have been in the areas of (a) Virtual Infrastructure Automation and Provisioning, (b) Interoperability and Federation of Cloud Resources, and (c) On-demand Services for ScientificWorkflows.« less
Commissioning the CERN IT Agile Infrastructure with experiment workloads
NASA Astrophysics Data System (ADS)
Medrano Llamas, Ramón; Harald Barreiro Megino, Fernando; Kucharczyk, Katarzyna; Kamil Denis, Marek; Cinquilli, Mattia
2014-06-01
In order to ease the management of their infrastructure, most of the WLCG sites are adopting cloud based strategies. In the case of CERN, the Tier 0 of the WLCG, is completely restructuring the resource and configuration management of their computing center under the codename Agile Infrastructure. Its goal is to manage 15,000 Virtual Machines by means of an OpenStack middleware in order to unify all the resources in CERN's two datacenters: the one placed in Meyrin and the new on in Wigner, Hungary. During the commissioning of this infrastructure, CERN IT is offering an attractive amount of computing resources to the experiments (800 cores for ATLAS and CMS) through a private cloud interface. ATLAS and CMS have joined forces to exploit them by running stress tests and simulation workloads since November 2012. This work will describe the experience of the first deployments of the current experiment workloads on the CERN private cloud testbed. The paper is organized as follows: the first section will explain the integration of the experiment workload management systems (WMS) with the cloud resources. The second section will revisit the performance and stress testing performed with HammerCloud in order to evaluate and compare the suitability for the experiment workloads. The third section will go deeper into the dynamic provisioning techniques, such as the use of the cloud APIs directly by the WMS. The paper finishes with a review of the conclusions and the challenges ahead.
Cloud Computing for Mission Design and Operations
NASA Technical Reports Server (NTRS)
Arrieta, Juan; Attiyah, Amy; Beswick, Robert; Gerasimantos, Dimitrios
2012-01-01
The space mission design and operations community already recognizes the value of cloud computing and virtualization. However, natural and valid concerns, like security, privacy, up-time, and vendor lock-in, have prevented a more widespread and expedited adoption into official workflows. In the interest of alleviating these concerns, we propose a series of guidelines for internally deploying a resource-oriented hub of data and algorithms. These guidelines provide a roadmap for implementing an architecture inspired in the cloud computing model: associative, elastic, semantical, interconnected, and adaptive. The architecture can be summarized as exposing data and algorithms as resource-oriented Web services, coordinated via messaging, and running on virtual machines; it is simple, and based on widely adopted standards, protocols, and tools. The architecture may help reduce common sources of complexity intrinsic to data-driven, collaborative interactions and, most importantly, it may provide the means for teams and agencies to evaluate the cloud computing model in their specific context, with minimal infrastructure changes, and before committing to a specific cloud services provider.
International Symposium on Grids and Clouds (ISGC) 2016
NASA Astrophysics Data System (ADS)
The International Symposium on Grids and Clouds (ISGC) 2016 will be held at Academia Sinica in Taipei, Taiwan from 13-18 March 2016, with co-located events and workshops. The conference is hosted by the Academia Sinica Grid Computing Centre (ASGC). The theme of ISGC 2016 focuses on“Ubiquitous e-infrastructures and Applications”. Contemporary research is impossible without a strong IT component - researchers rely on the existence of stable and widely available e-infrastructures and their higher level functions and properties. As a result of these expectations, e-Infrastructures are becoming ubiquitous, providing an environment that supports large scale collaborations that deal with global challenges as well as smaller and temporal research communities focusing on particular scientific problems. To support those diversified communities and their needs, the e-Infrastructures themselves are becoming more layered and multifaceted, supporting larger groups of applications. Following the call for the last year conference, ISGC 2016 continues its aim to bring together users and application developers with those responsible for the development and operation of multi-purpose ubiquitous e-Infrastructures. Topics of discussion include Physics (including HEP) and Engineering Applications, Biomedicine & Life Sciences Applications, Earth & Environmental Sciences & Biodiversity Applications, Humanities, Arts, and Social Sciences (HASS) Applications, Virtual Research Environment (including Middleware, tools, services, workflow, etc.), Data Management, Big Data, Networking & Security, Infrastructure & Operations, Infrastructure Clouds and Virtualisation, Interoperability, Business Models & Sustainability, Highly Distributed Computing Systems, and High Performance & Technical Computing (HPTC), etc.
Cyber-Physical Multi-Core Optimization for Resource and Cache Effects (C2ORES)
2014-03-01
DoD-sponsored ATAACK mobile cloud testbed funded through the DURIP program, which is deployed at Virginia Tech and Vanderbilt University to conduct...0.9.2. Jug was configured to use a filesystem (network file system (nfs)) backend for locking and task synchronization. 4.1.7.2 Experiment 1...and performance-aware virtual machine placement technique that is realized as cloud infrastructure middleware. The key contributions of iPlace include
Liberating Virtual Machines from Physical Boundaries through Execution Knowledge
2015-12-01
trivial infrastructures such as VM distribution networks, clients need to wait for an extended period of time before launching a VM. In cloud settings...hardware support. MobiDesk [28] efficiently supports virtual desktops in mobile environments by decou- pling the user’s workload from host systems and...experiment set-up. VMs are migrated between a pair of source and destination hosts, which are connected through a backend 10 Gbps network for
"Tactic": Traffic Aware Cloud for Tiered Infrastructure Consolidation
ERIC Educational Resources Information Center
Sangpetch, Akkarit
2013-01-01
Large-scale enterprise applications are deployed as distributed applications. These applications consist of many inter-connected components with heterogeneous roles and complex dependencies. Each component typically consumes 5-15% of the server capacity. Deploying each component as a separate virtual machine (VM) allows us to consolidate the…
Simonyan, Vahan; Chumakov, Konstantin; Dingerdissen, Hayley; Faison, William; Goldweber, Scott; Golikov, Anton; Gulzar, Naila; Karagiannis, Konstantinos; Vinh Nguyen Lam, Phuc; Maudru, Thomas; Muravitskaja, Olesja; Osipova, Ekaterina; Pan, Yang; Pschenichnov, Alexey; Rostovtsev, Alexandre; Santana-Quintero, Luis; Smith, Krista; Thompson, Elaine E.; Tkachenko, Valery; Torcivia-Rodriguez, John; Wan, Quan; Wang, Jing; Wu, Tsung-Jung; Wilson, Carolyn; Mazumder, Raja
2016-01-01
The High-performance Integrated Virtual Environment (HIVE) is a distributed storage and compute environment designed primarily to handle next-generation sequencing (NGS) data. This multicomponent cloud infrastructure provides secure web access for authorized users to deposit, retrieve, annotate and compute on NGS data, and to analyse the outcomes using web interface visual environments appropriately built in collaboration with research and regulatory scientists and other end users. Unlike many massively parallel computing environments, HIVE uses a cloud control server which virtualizes services, not processes. It is both very robust and flexible due to the abstraction layer introduced between computational requests and operating system processes. The novel paradigm of moving computations to the data, instead of moving data to computational nodes, has proven to be significantly less taxing for both hardware and network infrastructure. The honeycomb data model developed for HIVE integrates metadata into an object-oriented model. Its distinction from other object-oriented databases is in the additional implementation of a unified application program interface to search, view and manipulate data of all types. This model simplifies the introduction of new data types, thereby minimizing the need for database restructuring and streamlining the development of new integrated information systems. The honeycomb model employs a highly secure hierarchical access control and permission system, allowing determination of data access privileges in a finely granular manner without flooding the security subsystem with a multiplicity of rules. HIVE infrastructure will allow engineers and scientists to perform NGS analysis in a manner that is both efficient and secure. HIVE is actively supported in public and private domains, and project collaborations are welcomed. Database URL: https://hive.biochemistry.gwu.edu PMID:26989153
Simonyan, Vahan; Chumakov, Konstantin; Dingerdissen, Hayley; Faison, William; Goldweber, Scott; Golikov, Anton; Gulzar, Naila; Karagiannis, Konstantinos; Vinh Nguyen Lam, Phuc; Maudru, Thomas; Muravitskaja, Olesja; Osipova, Ekaterina; Pan, Yang; Pschenichnov, Alexey; Rostovtsev, Alexandre; Santana-Quintero, Luis; Smith, Krista; Thompson, Elaine E; Tkachenko, Valery; Torcivia-Rodriguez, John; Voskanian, Alin; Wan, Quan; Wang, Jing; Wu, Tsung-Jung; Wilson, Carolyn; Mazumder, Raja
2016-01-01
The High-performance Integrated Virtual Environment (HIVE) is a distributed storage and compute environment designed primarily to handle next-generation sequencing (NGS) data. This multicomponent cloud infrastructure provides secure web access for authorized users to deposit, retrieve, annotate and compute on NGS data, and to analyse the outcomes using web interface visual environments appropriately built in collaboration with research and regulatory scientists and other end users. Unlike many massively parallel computing environments, HIVE uses a cloud control server which virtualizes services, not processes. It is both very robust and flexible due to the abstraction layer introduced between computational requests and operating system processes. The novel paradigm of moving computations to the data, instead of moving data to computational nodes, has proven to be significantly less taxing for both hardware and network infrastructure.The honeycomb data model developed for HIVE integrates metadata into an object-oriented model. Its distinction from other object-oriented databases is in the additional implementation of a unified application program interface to search, view and manipulate data of all types. This model simplifies the introduction of new data types, thereby minimizing the need for database restructuring and streamlining the development of new integrated information systems. The honeycomb model employs a highly secure hierarchical access control and permission system, allowing determination of data access privileges in a finely granular manner without flooding the security subsystem with a multiplicity of rules. HIVE infrastructure will allow engineers and scientists to perform NGS analysis in a manner that is both efficient and secure. HIVE is actively supported in public and private domains, and project collaborations are welcomed. Database URL: https://hive.biochemistry.gwu.edu. © The Author(s) 2016. Published by Oxford University Press.
A practical approach to virtualization in HEP
NASA Astrophysics Data System (ADS)
Buncic, P.; Aguado Sánchez, C.; Blomer, J.; Harutyunyan, A.; Mudrinic, M.
2011-01-01
In the attempt to solve the problem of processing data coming from LHC experiments at CERN at a rate of 15PB per year, for almost a decade the High Enery Physics (HEP) community has focused its efforts on the development of the Worldwide LHC Computing Grid. This generated large interest and expectations promising to revolutionize computing. Meanwhile, having initially taken part in the Grid standardization process, industry has moved in a different direction and started promoting the Cloud Computing paradigm which aims to solve problems on a similar scale and in equally seamless way as it was expected in the idealized Grid approach. A key enabling technology behind Cloud computing is server virtualization. In early 2008, an R&D project was established in the PH-SFT group at CERN to investigate how virtualization technology could be used to improve and simplify the daily interaction of physicists with experiment software frameworks and the Grid infrastructure. In this article we shall first briefly compare Grid and Cloud computing paradigms and then summarize the results of the R&D activity pointing out where and how virtualization technology could be effectively used in our field in order to maximize practical benefits whilst avoiding potential pitfalls.
Signal and image processing algorithm performance in a virtual and elastic computing environment
NASA Astrophysics Data System (ADS)
Bennett, Kelly W.; Robertson, James
2013-05-01
The U.S. Army Research Laboratory (ARL) supports the development of classification, detection, tracking, and localization algorithms using multiple sensing modalities including acoustic, seismic, E-field, magnetic field, PIR, and visual and IR imaging. Multimodal sensors collect large amounts of data in support of algorithm development. The resulting large amount of data, and their associated high-performance computing needs, increases and challenges existing computing infrastructures. Purchasing computer power as a commodity using a Cloud service offers low-cost, pay-as-you-go pricing models, scalability, and elasticity that may provide solutions to develop and optimize algorithms without having to procure additional hardware and resources. This paper provides a detailed look at using a commercial cloud service provider, such as Amazon Web Services (AWS), to develop and deploy simple signal and image processing algorithms in a cloud and run the algorithms on a large set of data archived in the ARL Multimodal Signatures Database (MMSDB). Analytical results will provide performance comparisons with existing infrastructure. A discussion on using cloud computing with government data will discuss best security practices that exist within cloud services, such as AWS.
Cloud Computing for radiologists.
Kharat, Amit T; Safvi, Amjad; Thind, Ss; Singh, Amarjit
2012-07-01
Cloud computing is a concept wherein a computer grid is created using the Internet with the sole purpose of utilizing shared resources such as computer software, hardware, on a pay-per-use model. Using Cloud computing, radiology users can efficiently manage multimodality imaging units by using the latest software and hardware without paying huge upfront costs. Cloud computing systems usually work on public, private, hybrid, or community models. Using the various components of a Cloud, such as applications, client, infrastructure, storage, services, and processing power, Cloud computing can help imaging units rapidly scale and descale operations and avoid huge spending on maintenance of costly applications and storage. Cloud computing allows flexibility in imaging. It sets free radiology from the confines of a hospital and creates a virtual mobile office. The downsides to Cloud computing involve security and privacy issues which need to be addressed to ensure the success of Cloud computing in the future.
Cloud Computing for radiologists
Kharat, Amit T; Safvi, Amjad; Thind, SS; Singh, Amarjit
2012-01-01
Cloud computing is a concept wherein a computer grid is created using the Internet with the sole purpose of utilizing shared resources such as computer software, hardware, on a pay-per-use model. Using Cloud computing, radiology users can efficiently manage multimodality imaging units by using the latest software and hardware without paying huge upfront costs. Cloud computing systems usually work on public, private, hybrid, or community models. Using the various components of a Cloud, such as applications, client, infrastructure, storage, services, and processing power, Cloud computing can help imaging units rapidly scale and descale operations and avoid huge spending on maintenance of costly applications and storage. Cloud computing allows flexibility in imaging. It sets free radiology from the confines of a hospital and creates a virtual mobile office. The downsides to Cloud computing involve security and privacy issues which need to be addressed to ensure the success of Cloud computing in the future. PMID:23599560
Calibration of LOFAR data on the cloud
NASA Astrophysics Data System (ADS)
Sabater, J.; Sánchez-Expósito, S.; Best, P.; Garrido, J.; Verdes-Montenegro, L.; Lezzi, D.
2017-04-01
New scientific instruments are starting to generate an unprecedented amount of data. The Low Frequency Array (LOFAR), one of the Square Kilometre Array (SKA) pathfinders, is already producing data on a petabyte scale. The calibration of these data presents a huge challenge for final users: (a) extensive storage and computing resources are required; (b) the installation and maintenance of the software required for the processing is not trivial; and (c) the requirements of calibration pipelines, which are experimental and under development, are quickly evolving. After encountering some limitations in classical infrastructures like dedicated clusters, we investigated the viability of cloud infrastructures as a solution. We found that the installation and operation of LOFAR data calibration pipelines is not only possible, but can also be efficient in cloud infrastructures. The main advantages were: (1) the ease of software installation and maintenance, and the availability of standard APIs and tools, widely used in the industry; this reduces the requirement for significant manual intervention, which can have a highly negative impact in some infrastructures; (2) the flexibility to adapt the infrastructure to the needs of the problem, especially as those demands change over time; (3) the on-demand consumption of (shared) resources. We found that a critical factor (also in other infrastructures) is the availability of scratch storage areas of an appropriate size. We found no significant impediments associated with the speed of data transfer, the use of virtualization, the use of external block storage, or the memory available (provided a minimum threshold is reached). Finally, we considered the cost-effectiveness of a commercial cloud like Amazon Web Services. While a cloud solution is more expensive than the operation of a large, fully-utilized cluster completely dedicated to LOFAR data reduction, we found that its costs are competitive if the number of datasets to be analysed is not high, or if the costs of maintaining a system capable of calibrating LOFAR data become high. Coupled with the advantages discussed above, this suggests that a cloud infrastructure may be favourable for many users.
A model of cloud application assignments in software-defined storages
NASA Astrophysics Data System (ADS)
Bolodurina, Irina P.; Parfenov, Denis I.; Polezhaev, Petr N.; E Shukhman, Alexander
2017-01-01
The aim of this study is to analyze the structure and mechanisms of interaction of typical cloud applications and to suggest the approaches to optimize their placement in storage systems. In this paper, we describe a generalized model of cloud applications including the three basic layers: a model of application, a model of service, and a model of resource. The distinctive feature of the model suggested implies analyzing cloud resources from the user point of view and from the point of view of a software-defined infrastructure of the virtual data center (DC). The innovation character of this model is in describing at the same time the application data placements, as well as the state of the virtual environment, taking into account the network topology. The model of software-defined storage has been developed as a submodel within the resource model. This model allows implementing the algorithm for control of cloud application assignments in software-defined storages. Experimental researches returned this algorithm decreases in cloud application response time and performance growth in user request processes. The use of software-defined data storages allows the decrease in the number of physical store devices, which demonstrates the efficiency of our algorithm.
Consolidation of cloud computing in ATLAS
NASA Astrophysics Data System (ADS)
Taylor, Ryan P.; Domingues Cordeiro, Cristovao Jose; Giordano, Domenico; Hover, John; Kouba, Tomas; Love, Peter; McNab, Andrew; Schovancova, Jaroslava; Sobie, Randall; ATLAS Collaboration
2017-10-01
Throughout the first half of LHC Run 2, ATLAS cloud computing has undergone a period of consolidation, characterized by building upon previously established systems, with the aim of reducing operational effort, improving robustness, and reaching higher scale. This paper describes the current state of ATLAS cloud computing. Cloud activities are converging on a common contextualization approach for virtual machines, and cloud resources are sharing monitoring and service discovery components. We describe the integration of Vacuum resources, streamlined usage of the Simulation at Point 1 cloud for offline processing, extreme scaling on Amazon compute resources, and procurement of commercial cloud capacity in Europe. Finally, building on the previously established monitoring infrastructure, we have deployed a real-time monitoring and alerting platform which coalesces data from multiple sources, provides flexible visualization via customizable dashboards, and issues alerts and carries out corrective actions in response to problems.
NASA Astrophysics Data System (ADS)
van Lew, Baldur; Botha, Charl P.; Milles, Julien R.; Vrooman, Henri A.; van de Giessen, Martijn; Lelieveldt, Boudewijn P. F.
2015-03-01
The cohort size required in epidemiological imaging genetics studies often mandates the pooling of data from multiple hospitals. Patient data, however, is subject to strict privacy protection regimes, and physical data storage may be legally restricted to a hospital network. To enable biomarker discovery, fast data access and interactive data exploration must be combined with high-performance computing resources, while respecting privacy regulations. We present a system using fast and inherently secure light-paths to access distributed data, thereby obviating the need for a central data repository. A secure private cloud computing framework facilitates interactive, computationally intensive exploration of this geographically distributed, privacy sensitive data. As a proof of concept, MRI brain imaging data hosted at two remote sites were processed in response to a user command at a third site. The system was able to automatically start virtual machines, run a selected processing pipeline and write results to a user accessible database, while keeping data locally stored in the hospitals. Individual tasks took approximately 50% longer compared to a locally hosted blade server but the cloud infrastructure reduced the total elapsed time by a factor of 40 using 70 virtual machines in the cloud. We demonstrated that the combination light-path and private cloud is a viable means of building an analysis infrastructure for secure data analysis. The system requires further work in the areas of error handling, load balancing and secure support of multiple users.
NGScloud: RNA-seq analysis of non-model species using cloud computing.
Mora-Márquez, Fernando; Vázquez-Poletti, José Luis; López de Heredia, Unai
2018-05-03
RNA-seq analysis usually requires large computing infrastructures. NGScloud is a bioinformatic system developed to analyze RNA-seq data using the cloud computing services of Amazon that permit the access to ad hoc computing infrastructure scaled according to the complexity of the experiment, so its costs and times can be optimized. The application provides a user-friendly front-end to operate Amazon's hardware resources, and to control a workflow of RNA-seq analysis oriented to non-model species, incorporating the cluster concept, which allows parallel runs of common RNA-seq analysis programs in several virtual machines for faster analysis. NGScloud is freely available at https://github.com/GGFHF/NGScloud/. A manual detailing installation and how-to-use instructions is available with the distribution. unai.lopezdeheredia@upm.es.
Cross-VM Side Channels and Their Use to Extract Private Keys
2012-10-16
clouds such as Amazon EC2 and Rackspace, but also by other Xen use cases. For ex- 4 ample, many virtual desktop infrastructure ( VDI ) solutions (e.g...whose bit length is, for example, 337, 403, or 457 when κ is 2048 , 3072, or 4096, respectively. We note that this deviates from standard ElGamal, in
GATECloud.net: a platform for large-scale, open-source text processing on the cloud.
Tablan, Valentin; Roberts, Ian; Cunningham, Hamish; Bontcheva, Kalina
2013-01-28
Cloud computing is increasingly being regarded as a key enabler of the 'democratization of science', because on-demand, highly scalable cloud computing facilities enable researchers anywhere to carry out data-intensive experiments. In the context of natural language processing (NLP), algorithms tend to be complex, which makes their parallelization and deployment on cloud platforms a non-trivial task. This study presents a new, unique, cloud-based platform for large-scale NLP research--GATECloud. net. It enables researchers to carry out data-intensive NLP experiments by harnessing the vast, on-demand compute power of the Amazon cloud. Important infrastructural issues are dealt with by the platform, completely transparently for the researcher: load balancing, efficient data upload and storage, deployment on the virtual machines, security and fault tolerance. We also include a cost-benefit analysis and usage evaluation.
Dynamic Collaboration Infrastructure for Hydrologic Science
NASA Astrophysics Data System (ADS)
Tarboton, D. G.; Idaszak, R.; Castillo, C.; Yi, H.; Jiang, F.; Jones, N.; Goodall, J. L.
2016-12-01
Data and modeling infrastructure is becoming increasingly accessible to water scientists. HydroShare is a collaborative environment that currently offers water scientists the ability to access modeling and data infrastructure in support of data intensive modeling and analysis. It supports the sharing of and collaboration around "resources" which are social objects defined to include both data and models in a structured standardized format. Users collaborate around these objects via comments, ratings, and groups. HydroShare also supports web services and cloud based computation for the execution of hydrologic models and analysis and visualization of hydrologic data. However, the quantity and variety of data and modeling infrastructure available that can be accessed from environments like HydroShare is increasing. Storage infrastructure can range from one's local PC to campus or organizational storage to storage in the cloud. Modeling or computing infrastructure can range from one's desktop to departmental clusters to national HPC resources to grid and cloud computing resources. How does one orchestrate this vast number of data and computing infrastructure without needing to correspondingly learn each new system? A common limitation across these systems is the lack of efficient integration between data transport mechanisms and the corresponding high-level services to support large distributed data and compute operations. A scientist running a hydrology model from their desktop may require processing a large collection of files across the aforementioned storage and compute resources and various national databases. To address these community challenges a proof-of-concept prototype was created integrating HydroShare with RADII (Resource Aware Data-centric collaboration Infrastructure) to provide software infrastructure to enable the comprehensive and rapid dynamic deployment of what we refer to as "collaborative infrastructure." In this presentation we discuss the results of this proof-of-concept prototype which enabled HydroShare users to readily instantiate virtual infrastructure marshaling arbitrary combinations, varieties, and quantities of distributed data and computing infrastructure in addressing big problems in hydrology.
NASA Astrophysics Data System (ADS)
Shamugam, Veeramani; Murray, I.; Leong, J. A.; Sidhu, Amandeep S.
2016-03-01
Cloud computing provides services on demand instantly, such as access to network infrastructure consisting of computing hardware, operating systems, network storage, database and applications. Network usage and demands are growing at a very fast rate and to meet the current requirements, there is a need for automatic infrastructure scaling. Traditional networks are difficult to automate because of the distributed nature of their decision making process for switching or routing which are collocated on the same device. Managing complex environments using traditional networks is time-consuming and expensive, especially in the case of generating virtual machines, migration and network configuration. To mitigate the challenges, network operations require efficient, flexible, agile and scalable software defined networks (SDN). This paper discuss various issues in SDN and suggests how to mitigate the network management related issues. A private cloud prototype test bed was setup to implement the SDN on the OpenStack platform to test and evaluate the various network performances provided by the various configurations.
Unidata's Vision for Transforming Geoscience by Moving Data Services and Software to the Cloud
NASA Astrophysics Data System (ADS)
Ramamurthy, Mohan; Fisher, Ward; Yoksas, Tom
2015-04-01
Universities are facing many challenges: shrinking budgets, rapidly evolving information technologies, exploding data volumes, multidisciplinary science requirements, and high expectations from students who have grown up with smartphones and tablets. These changes are upending traditional approaches to accessing and using data and software. Unidata recognizes that its products and services must evolve to support new approaches to research and education. After years of hype and ambiguity, cloud computing is maturing in usability in many areas of science and education, bringing the benefits of virtualized and elastic remote services to infrastructure, software, computation, and data. Cloud environments reduce the amount of time and money spent to procure, install, and maintain new hardware and software, and reduce costs through resource pooling and shared infrastructure. Cloud services aimed at providing any resource, at any time, from any place, using any device are increasingly being embraced by all types of organizations. Given this trend and the enormous potential of cloud-based services, Unidata is taking moving to augment its products, services, data delivery mechanisms and applications to align with the cloud-computing paradigm. Specifically, Unidata is working toward establishing a community-based development environment that supports the creation and use of software services to build end-to-end data workflows. The design encourages the creation of services that can be broken into small, independent chunks that provide simple capabilities. Chunks could be used individually to perform a task, or chained into simple or elaborate workflows. The services will also be portable in the form of downloadable Unidata-in-a-box virtual images, allowing their use in researchers' own cloud-based computing environments. In this talk, we present a vision for Unidata's future in a cloud-enabled data services and discuss our ongoing efforts to deploy a suite of Unidata data services and tools in the Amazon EC2 and Microsoft Azure cloud environments, including the transfer of real-time meteorological data into its cloud instances, product generation using those data, and the deployment of TDS, McIDAS ADDE and AWIPS II data servers and the Integrated Data Server visualization tool.
Calibration of radio-astronomical data on the cloud. LOFAR, the pathway to SKA
NASA Astrophysics Data System (ADS)
Sabater, J.; Sánchez-Expósito, S.; Garrido, J.; Ruiz, J. E.; Best, P. N.; Verdes-Montenegro, L.
2015-05-01
The radio interferometer LOFAR (LOw Frequency ARray) is fully operational now. This Square Kilometre Array (SKA) pathfinder allows the observation of the sky at frequencies between 10 and 240 MHz, a relatively unexplored region of the spectrum. LOFAR is a software defined telescope: the data is mainly processed using specialized software running in common computing facilities. That means that the capabilities of the telescope are virtually defined by software and mainly limited by the available computing power. However, the quantity of data produced can quickly reach huge volumes (several Petabytes per day). After the correlation and pre-processing of the data in a dedicated cluster, the final dataset is handled to the user (typically several Terabytes). The calibration of these data requires a powerful computing facility in which the specific state of the art software under heavy continuous development can be easily installed and updated. That makes this case a perfect candidate for a cloud infrastructure which adds the advantages of an on demand, flexible solution. We present our approach to the calibration of LOFAR data using Ibercloud, the cloud infrastructure provided by Ibergrid. With the calibration work-flow adapted to the cloud, we can explore calibration strategies for the SKA and show how private or commercial cloud infrastructures (Ibercloud, Amazon EC2, Google Compute Engine, etc.) can help to solve the problems with big datasets that will be prevalent in the future of astronomy.
Building a Cloud Infrastructure for a Virtual Environmental Observatory
NASA Astrophysics Data System (ADS)
El-khatib, Y.; Blair, G. S.; Gemmell, A. L.; Gurney, R. J.
2012-12-01
Environmental science is often fragmented: data is collected by different organizations using mismatched formats and conventions, and models are misaligned and run in isolation. Cloud computing offers a lot of potential in the way of resolving such issues by supporting data from different sources and at various scales, and integrating models to create more sophisticated and collaborative software services. The Environmental Virtual Observatory pilot (EVOp) project, funded by the UK Natural Environment Research Council, aims to demonstrate how cloud computing principles and technologies can be harnessed to develop more effective solutions to pressing environmental issues. The EVOp infrastructure is a tailored one constructed from resources in both private clouds (owned and managed by us) and public clouds (leased from third party providers). All system assets are accessible via a uniform web service interface in order to enable versatile and transparent resource management, and to support fundamental infrastructure properties such as reliability and elasticity. The abstraction that this 'everything as a service' principle brings also supports mashups, i.e. combining different web services (such as models) and data resources of different origins (in situ gauging stations, warehoused data stores, external sources, etc.). We adopt the RESTful style of web services in order to draw a clear line between client and server (i.e. cloud host) and also to keep the server completely stateless. This significantly improves the scalability of the infrastructure and enables easy infrastructure management. For instance, tasks such as load balancing and failure recovery are greatly simplified without the need for techniques such as advance resource reservation or shared block devices. Upon this infrastructure, we developed a web portal composed of a bespoke collection of web-based visualization tools to help bring out relationships or patterns within the data. The portal was designed for use without any programming prerequisites by stakeholders from different backgrounds such as scientists, policy makers, local communities, and the general public. The development of the portal was carried out using an iterative behaviour-driven approach. We have developed six distinct storyboards to determine the requirements of different users. From these, we identified two storyboards to implement during the pilot phase. The first explores flooding at a local catchment scale for farmers and the public. We simulate hydrological interactions to determine where saturated land-surface areas develop. Model parameter values resembling catchment characteristics could be specified either explicitly (for domain specialists) or indirectly using one of several predefined land use scenarios (for less familiar audiences). The second storyboard investigates the diffuse of agricultural pollution at a national level, with regulators as users. We study the flux of Nitrogen and Phosphorus from land to rivers and coastal regions at various scales of drainage and reporting units. This is particularly useful to uncover the impact of existing policy instruments or risk from future environmental changes on the levels of N and P flux.
NASA Astrophysics Data System (ADS)
McNab, A.
2017-10-01
This paper describes GridPP’s Vacuum Platform for managing virtual machines (VMs), which has been used to run production workloads for WLCG and other HEP experiments. The platform provides a uniform interface between VMs and the sites they run at, whether the site is organised as an Infrastructure-as-a-Service cloud system such as OpenStack, or an Infrastructure-as-a-Client system such as Vac. The paper describes our experience in using this platform, in developing and operating VM lifecycle managers Vac and Vcycle, and in interacting with VMs provided by LHCb, ATLAS, ALICE, CMS, and the GridPP DIRAC service to run production workloads.
Liu, Li; Chen, Weiping; Nie, Min; Zhang, Fengjuan; Wang, Yu; He, Ailing; Wang, Xiaonan; Yan, Gen
2016-11-01
To handle the emergence of the regional healthcare ecosystem, physicians and surgeons in various departments and healthcare institutions must process medical images securely, conveniently, and efficiently, and must integrate them with electronic medical records (EMRs). In this manuscript, we propose a software as a service (SaaS) cloud called the iMAGE cloud. A three-layer hybrid cloud was created to provide medical image processing services in the smart city of Wuxi, China, in April 2015. In the first step, medical images and EMR data were received and integrated via the hybrid regional healthcare network. Then, traditional and advanced image processing functions were proposed and computed in a unified manner in the high-performance cloud units. Finally, the image processing results were delivered to regional users using the virtual desktop infrastructure (VDI) technology. Security infrastructure was also taken into consideration. Integrated information query and many advanced medical image processing functions-such as coronary extraction, pulmonary reconstruction, vascular extraction, intelligent detection of pulmonary nodules, image fusion, and 3D printing-were available to local physicians and surgeons in various departments and healthcare institutions. Implementation results indicate that the iMAGE cloud can provide convenient, efficient, compatible, and secure medical image processing services in regional healthcare networks. The iMAGE cloud has been proven to be valuable in applications in the regional healthcare system, and it could have a promising future in the healthcare system worldwide.
Predictive Anomaly Management for Resilient Virtualized Computing Infrastructures
2015-05-27
PREC: Practical Root Exploit Containment for Android Devices, ACM Conference on Data and Application Security and Privacy (CODASPY) . 03-MAR-14...05-OCT-11, . : , Hiep Nguyen, Yongmin Tan, Xiaohui Gu. Propagation-aware Anomaly Localization for Cloud Hosted Distributed Applications , ACM...Workshop on Managing Large-Scale Systems via the Analysis of System Logs and the Application of Machine Learning Techniques (SLAML) in conjunction with SOSP
Cloud Computing and Virtual Desktop Infrastructures in Afloat Environments
2012-06-01
Institute of Standards and Technology NPS Naval Postgraduate School OCONUS Outside of the Continental United States ONE- NET OCONUS Navy Enterprise... framework of technology that allows all interested systems, inside and outside of an organization, to expose and access well-defined services, and...was established to manage the Navy’s three largest enterprise networks; the OCONUS Navy Enterprise 22 Network (ONE- NET ), the Navy-Marine Corps
NASA Astrophysics Data System (ADS)
Schaap, Dick M. A.; Fichaut, Michele
2017-04-01
SeaDataCloud marks the third phase of developing the pan-European SeaDataNet infrastructure for marine and ocean data management. The SeaDataCloud project is funded by EU and runs for 4 years from 1st November 2016. It succeeds the successful SeaDataNet II (2011 - 2015) and SeaDataNet (2006 - 2011) projects. SeaDataNet has set up and operates a pan-European infrastructure for managing marine and ocean data and is undertaken by National Oceanographic Data Centres (NODC's) and oceanographic data focal points from 34 coastal states in Europe. The infrastructure comprises a network of interconnected data centres and central SeaDataNet portal. The portal provides users a harmonised set of metadata directories and controlled access to the large collections of datasets, managed by the interconnected data centres. The population of directories has increased considerably in cooperation with and involvement in many associated EU projects and initiatives such as EMODnet. SeaDataNet at present gives overview and access to more than 1.9 million data sets for physical oceanography, chemistry, geology, geophysics, bathymetry and biology from more than 100 connected data centres from 34 countries riparian to European seas. SeaDataNet is also active in setting and governing marine data standards, and exploring and establishing interoperability solutions to connect to other e-infrastructures on the basis of standards of ISO (19115, 19139), and OGC (WMS, WFS, CS-W and SWE). Standards and associated SeaDataNet tools are made available at the SeaDataNet portal for wide uptake by data handling and managing organisations. SeaDataCloud aims at further developing standards, innovating services & products, adopting new technologies, and giving more attention to users. Moreover, it is about implementing a cooperation between the SeaDataNet consortium of marine data centres and the EUDAT consortium of e-infrastructure service providers. SeaDataCloud aims at considerably advancing services and increasing their usage by adopting cloud and High Performance Computing technology. SeaDataCloud will empower researchers with a packaged collection of services and tools, tailored to their specific needs, supporting research and enabling generation of added-value products from marine and ocean data. Substantial activities will be focused on developing added-value services, such as data subsetting, analysis, visualisation, and publishing workflows for users, both regular and advanced users, as part of a Virtual Research Environment (VRE). SeaDataCloud aims at a number of leading user communities that have new challenges for upgrading and expanding the SeaDataNet standards and services: Science, EMODnet, Copernicus Marine Environmental Monitoring Service (CMEMS) and EuroGOOS, and International scientific programmes. The presentation will give information on present services of the SeaDataNet infrastructure and services, and the new challenges in SeaDataCloud, and will highlight a number of key achievements in SeaDataCloud so far.
1001 Ways to run AutoDock Vina for virtual screening
NASA Astrophysics Data System (ADS)
Jaghoori, Mohammad Mahdi; Bleijlevens, Boris; Olabarriaga, Silvia D.
2016-03-01
Large-scale computing technologies have enabled high-throughput virtual screening involving thousands to millions of drug candidates. It is not trivial, however, for biochemical scientists to evaluate the technical alternatives and their implications for running such large experiments. Besides experience with the molecular docking tool itself, the scientist needs to learn how to run it on high-performance computing (HPC) infrastructures, and understand the impact of the choices made. Here, we review such considerations for a specific tool, AutoDock Vina, and use experimental data to illustrate the following points: (1) an additional level of parallelization increases virtual screening throughput on a multi-core machine; (2) capturing of the random seed is not enough (though necessary) for reproducibility on heterogeneous distributed computing systems; (3) the overall time spent on the screening of a ligand library can be improved by analysis of factors affecting execution time per ligand, including number of active torsions, heavy atoms and exhaustiveness. We also illustrate differences among four common HPC infrastructures: grid, Hadoop, small cluster and multi-core (virtual machine on the cloud). Our analysis shows that these platforms are suitable for screening experiments of different sizes. These considerations can guide scientists when choosing the best computing platform and set-up for their future large virtual screening experiments.
1001 Ways to run AutoDock Vina for virtual screening.
Jaghoori, Mohammad Mahdi; Bleijlevens, Boris; Olabarriaga, Silvia D
2016-03-01
Large-scale computing technologies have enabled high-throughput virtual screening involving thousands to millions of drug candidates. It is not trivial, however, for biochemical scientists to evaluate the technical alternatives and their implications for running such large experiments. Besides experience with the molecular docking tool itself, the scientist needs to learn how to run it on high-performance computing (HPC) infrastructures, and understand the impact of the choices made. Here, we review such considerations for a specific tool, AutoDock Vina, and use experimental data to illustrate the following points: (1) an additional level of parallelization increases virtual screening throughput on a multi-core machine; (2) capturing of the random seed is not enough (though necessary) for reproducibility on heterogeneous distributed computing systems; (3) the overall time spent on the screening of a ligand library can be improved by analysis of factors affecting execution time per ligand, including number of active torsions, heavy atoms and exhaustiveness. We also illustrate differences among four common HPC infrastructures: grid, Hadoop, small cluster and multi-core (virtual machine on the cloud). Our analysis shows that these platforms are suitable for screening experiments of different sizes. These considerations can guide scientists when choosing the best computing platform and set-up for their future large virtual screening experiments.
Unidata Cyberinfrastructure in the Cloud
NASA Astrophysics Data System (ADS)
Ramamurthy, M. K.; Young, J. W.
2016-12-01
Data services, software, and user support are critical components of geosciences cyber-infrastructure to help researchers to advance science. With the maturity of and significant advances in cloud computing, it has recently emerged as an alternative new paradigm for developing and delivering a broad array of services over the Internet. Cloud computing is now mature enough in usability in many areas of science and education, bringing the benefits of virtualized and elastic remote services to infrastructure, software, computation, and data. Cloud environments reduce the amount of time and money spent to procure, install, and maintain new hardware and software, and reduce costs through resource pooling and shared infrastructure. Given the enormous potential of cloud-based services, Unidata has been moving to augment its software, services, data delivery mechanisms to align with the cloud-computing paradigm. To realize the above vision, Unidata has worked toward: * Providing access to many types of data from a cloud (e.g., via the THREDDS Data Server, RAMADDA and EDEX servers); * Deploying data-proximate tools to easily process, analyze, and visualize those data in a cloud environment cloud for consumption by any one, by any device, from anywhere, at any time; * Developing and providing a range of pre-configured and well-integrated tools and services that can be deployed by any university in their own private or public cloud settings. Specifically, Unidata has developed Docker for "containerized applications", making them easy to deploy. Docker helps to create "disposable" installs and eliminates many configuration challenges. Containerized applications include tools for data transport, access, analysis, and visualization: THREDDS Data Server, Integrated Data Viewer, GEMPAK, Local Data Manager, RAMADDA Data Server, and Python tools; * Leveraging Jupyter as a central platform and hub with its powerful set of interlinking tools to connect interactively data servers, Python scientific libraries, scripts, and workflows; * Exploring end-to-end modeling and prediction capabilities in the cloud; * Partnering with NOAA and public cloud vendors (e.g., Amazon and OCC) on the NOAA Big Data Project to harness their capabilities and resources for the benefit of the academic community.
Supporting the scientific lifecycle through cloud services
NASA Astrophysics Data System (ADS)
Gensch, S.; Klump, J. F.; Bertelmann, R.; Braune, C.
2014-12-01
Cloud computing has made resources and applications available for numerous use cases ranging from business processes in the private sector to scientific applications. Developers have created tools for data management, collaborative writing, social networking, data access and visualization, project management and many more; either for free or as paid premium services with additional or extended features. Scientists have begun to incorporate tools that fit their needs into their daily work. To satisfy specialized needs, some cloud applications specifically address the needs of scientists for sharing research data, literature search, laboratory documentation, or data visualization. Cloud services may vary in extent, user coverage, and inter-service integration and are also at risk of being abandonend or changed by the service providers making changes to their business model, or leaving the field entirely.Within the project Academic Enterprise Cloud we examine cloud based services that support the research lifecycle, using feature models to describe key properties in the areas of infrastructure and service provision, compliance to legal regulations, and data curation. Emphasis is put on the term Enterprise as to establish an academic cloud service provider infrastructure that satisfies demands of the research community through continious provision across the whole cloud stack. This could enable the research community to be independent from service providers regarding changes to terms of service and ensuring full control of its extent and usage. This shift towards a self-empowered scientific cloud provider infrastructure and its community raises implications about feasability of provision and overall costs. Legal aspects and licensing issues have to be considered, when moving data into cloud services, especially when personal data is involved.Educating researchers about cloud based tools is important to help in the transition towards effective and safe use. Scientists can benefit from the provision of standard services, like weblog and website creation, virtual machine deployments, and groupware provision using cloud based app store-like portals. And, other than in an industrial environment, researchers will want to keep their existing user profile when moving from one institution to another.
Valdivieso Caraguay, Ángel Leonardo; García Villalba, Luis Javier
2017-01-01
This paper presents the Monitoring and Discovery Framework of the Self-Organized Network Management in Virtualized and Software Defined Networks SELFNET project. This design takes into account the scalability and flexibility requirements needed by 5G infrastructures. In this context, the present framework focuses on gathering and storing the information (low-level metrics) related to physical and virtual devices, cloud environments, flow metrics, SDN traffic and sensors. Similarly, it provides the monitoring data as a generic information source in order to allow the correlation and aggregation tasks. Our design enables the collection and storing of information provided by all the underlying SELFNET sublayers, including the dynamically onboarded and instantiated SDN/NFV Apps, also known as SELFNET sensors. PMID:28362346
An element search ant colony technique for solving virtual machine placement problem
NASA Astrophysics Data System (ADS)
Srija, J.; Rani John, Rose; Kanaga, Grace Mary, Dr.
2017-09-01
The data centres in the cloud environment play a key role in providing infrastructure for ubiquitous computing, pervasive computing, mobile computing etc. This computing technique tries to utilize the available resources in order to provide services. Hence maintaining the resource utilization without wastage of power consumption has become a challenging task for the researchers. In this paper we propose the direct guidance ant colony system for effective mapping of virtual machines to the physical machine with maximal resource utilization and minimal power consumption. The proposed algorithm has been compared with the existing ant colony approach which is involved in solving virtual machine placement problem and thus the proposed algorithm proves to provide better result than the existing technique.
Caraguay, Ángel Leonardo Valdivieso; Villalba, Luis Javier García
2017-03-31
This paper presents the Monitoring and Discovery Framework of the Self-Organized Network Management in Virtualized and Software Defined Networks SELFNET project. This design takes into account the scalability and flexibility requirements needed by 5G infrastructures. In this context, the present framework focuses on gathering and storing the information (low-level metrics) related to physical and virtual devices, cloud environments, flow metrics, SDN traffic and sensors. Similarly, it provides the monitoring data as a generic information source in order to allow the correlation and aggregation tasks. Our design enables the collection and storing of information provided by all the underlying SELFNET sublayers, including the dynamically onboarded and instantiated SDN/NFV Apps, also known as SELFNET sensors.
Yu, Si; Gui, Xiaolin; Lin, Jiancai; Tian, Feng; Zhao, Jianqiang; Dai, Min
2014-01-01
Cloud computing gets increasing attention for its capacity to leverage developers from infrastructure management tasks. However, recent works reveal that side channel attacks can lead to privacy leakage in the cloud. Enhancing isolation between users is an effective solution to eliminate the attack. In this paper, to eliminate side channel attacks, we investigate the isolation enhancement scheme from the aspect of virtual machine (VM) management. The security-awareness VMs management scheme (SVMS), a VMs isolation enhancement scheme to defend against side channel attacks, is proposed. First, we use the aggressive conflict of interest relation (ACIR) and aggressive in ally with relation (AIAR) to describe user constraint relations. Second, based on the Chinese wall policy, we put forward four isolation rules. Third, the VMs placement and migration algorithms are designed to enforce VMs isolation between the conflict users. Finally, based on the normal distribution, we conduct a series of experiments to evaluate SVMS. The experimental results show that SVMS is efficient in guaranteeing isolation between VMs owned by conflict users, while the resource utilization rate decreases but not by much.
Gui, Xiaolin; Lin, Jiancai; Tian, Feng; Zhao, Jianqiang; Dai, Min
2014-01-01
Cloud computing gets increasing attention for its capacity to leverage developers from infrastructure management tasks. However, recent works reveal that side channel attacks can lead to privacy leakage in the cloud. Enhancing isolation between users is an effective solution to eliminate the attack. In this paper, to eliminate side channel attacks, we investigate the isolation enhancement scheme from the aspect of virtual machine (VM) management. The security-awareness VMs management scheme (SVMS), a VMs isolation enhancement scheme to defend against side channel attacks, is proposed. First, we use the aggressive conflict of interest relation (ACIR) and aggressive in ally with relation (AIAR) to describe user constraint relations. Second, based on the Chinese wall policy, we put forward four isolation rules. Third, the VMs placement and migration algorithms are designed to enforce VMs isolation between the conflict users. Finally, based on the normal distribution, we conduct a series of experiments to evaluate SVMS. The experimental results show that SVMS is efficient in guaranteeing isolation between VMs owned by conflict users, while the resource utilization rate decreases but not by much. PMID:24688434
Physicists Get INSPIREd: INSPIRE Project and Grid Applications
NASA Astrophysics Data System (ADS)
Klem, Jukka; Iwaszkiewicz, Jan
2011-12-01
INSPIRE is the new high-energy physics scientific information system developed by CERN, DESY, Fermilab and SLAC. INSPIRE combines the curated and trusted contents of SPIRES database with Invenio digital library technology. INSPIRE contains the entire HEP literature with about one million records and in addition to becoming the reference HEP scientific information platform, it aims to provide new kinds of data mining services and metrics to assess the impact of articles and authors. Grid and cloud computing provide new opportunities to offer better services in areas that require large CPU and storage resources including document Optical Character Recognition (OCR) processing, full-text indexing of articles and improved metrics. D4Science-II is a European project that develops and operates an e-Infrastructure supporting Virtual Research Environments (VREs). It develops an enabling technology (gCube) which implements a mechanism for facilitating the interoperation of its e-Infrastructure with other autonomously running data e-Infrastructures. As a result, this creates the core of an e-Infrastructure ecosystem. INSPIRE is one of the e-Infrastructures participating in D4Science-II project. In the context of the D4Science-II project, the INSPIRE e-Infrastructure makes available some of its resources and services to other members of the resulting ecosystem. Moreover, it benefits from the ecosystem via a dedicated Virtual Organization giving access to an array of resources ranging from computing and storage resources of grid infrastructures to data and services.
Unidata cyberinfrastructure in the cloud: A progress report
NASA Astrophysics Data System (ADS)
Ramamurthy, Mohan
2016-04-01
Data services, software, and committed support are critical components of geosciences cyber-infrastructure that can help scientists address problems of unprecedented complexity, scale, and scope. Unidata is currently working on innovative ideas, new paradigms, and novel techniques to complement and extend its offerings. Our goal is to empower users so that they can tackle major, heretofore difficult problems. Unidata recognizes that its products and services must evolve to support new approaches to research and education. After years of hype and ambiguity, cloud computing is maturing in usability in many areas of science and education, bringing the benefits of virtualized and elastic remote services to infrastructure, software, computation, and data. Cloud environments reduce the amount of time and money spent to procure, install, and maintain new hardware and software, and reduce costs through resource pooling and shared infrastructure. Cloud services aimed at providing any resource, at any time, from any place, using any device are increasingly being embraced by all types of organizations. Given this trend and the enormous potential of cloud-based services, Unidata is moving to augment its products, services, data delivery mechanisms and applications to align with the cloud-computing paradigm. To realize the above vision, Unidata is working toward: * Providing access to many types of data from a cloud (e.g., TDS, RAMADDA and EDEX); * Deploying data-proximate tools to easily process, analyze and visualize those data in a cloud environment cloud for consumption by any one, by any device, from anywhere, at any time; * Developing and providing a range of pre-configured and well-integrated tools and services that can be deployed by any university in their own private or public cloud settings. Specifically, Unidata has developed Docker for "containerized applications", making them easy to deploy. Docker helps to create "disposable" installs and eliminates many configuration challenges. Containerized applications include tools for data transport, access, analysis, and visualization: THREDDS Data Server, Integrated Data Viewer, GEMPAK, Local Data Manager, RAMADDA Data Server, and Python tools; * Fostering partnerships with NOAA and public cloud vendors (e.g., Amazon) to harness their capabilities and resources for the benefit of the academic community.
T-dominance: Prioritized Defense Deployment for BYOD Security (Post Print)
2013-10-01
infrastructure. Employees’ demand/ satisfaction , decreased IT acquisition and support cost, and increased use of cloud/virtualization technologies in...example, a report [8] on hijacking hotel Wi-Fi hotspots for drive-by malware attacks on laptops comes close to what we have in mind; practical man-in...obtaining unwarranted privilege, are often ignored for convenience, or circumvented for customization by the users. Rootkits, like iOS Jailbreak5, are
Design for Run-Time Monitor on Cloud Computing
NASA Astrophysics Data System (ADS)
Kang, Mikyung; Kang, Dong-In; Yun, Mira; Park, Gyung-Leen; Lee, Junghoon
Cloud computing is a new information technology trend that moves computing and data away from desktops and portable PCs into large data centers. The basic principle of cloud computing is to deliver applications as services over the Internet as well as infrastructure. A cloud is the type of a parallel and distributed system consisting of a collection of inter-connected and virtualized computers that are dynamically provisioned and presented as one or more unified computing resources. The large-scale distributed applications on a cloud require adaptive service-based software, which has the capability of monitoring the system status change, analyzing the monitored information, and adapting its service configuration while considering tradeoffs among multiple QoS features simultaneously. In this paper, we design Run-Time Monitor (RTM) which is a system software to monitor the application behavior at run-time, analyze the collected information, and optimize resources on cloud computing. RTM monitors application software through library instrumentation as well as underlying hardware through performance counter optimizing its computing configuration based on the analyzed data.
NASA Astrophysics Data System (ADS)
Berzano, D.; Blomer, J.; Buncic, P.; Charalampidis, I.; Ganis, G.; Meusel, R.
2015-12-01
During the last years, several Grid computing centres chose virtualization as a better way to manage diverse use cases with self-consistent environments on the same bare infrastructure. The maturity of control interfaces (such as OpenNebula and OpenStack) opened the possibility to easily change the amount of resources assigned to each use case by simply turning on and off virtual machines. Some of those private clouds use, in production, copies of the Virtual Analysis Facility, a fully virtualized and self-contained batch analysis cluster capable of expanding and shrinking automatically upon need: however, resources starvation occurs frequently as expansion has to compete with other virtual machines running long-living batch jobs. Such batch nodes cannot relinquish their resources in a timely fashion: the more jobs they run, the longer it takes to drain them and shut off, and making one-job virtual machines introduces a non-negligible virtualization overhead. By improving several components of the Virtual Analysis Facility we have realized an experimental “Docked” Analysis Facility for ALICE, which leverages containers instead of virtual machines for providing performance and security isolation. We will present the techniques we have used to address practical problems, such as software provisioning through CVMFS, as well as our considerations on the maturity of containers for High Performance Computing. As the abstraction layer is thinner, our Docked Analysis Facilities may feature a more fine-grained sizing, down to single-job node containers: we will show how this approach will positively impact automatic cluster resizing by deploying lightweight pilot containers instead of replacing central queue polls.
NASA Astrophysics Data System (ADS)
Stagni, F.; McNab, A.; Luzzi, C.; Krzemien, W.; Consortium, DIRAC
2017-10-01
In the last few years, new types of computing models, such as IAAS (Infrastructure as a Service) and IAAC (Infrastructure as a Client), gained popularity. New resources may come as part of pledged resources, while others are in the form of opportunistic ones. Most but not all of these new infrastructures are based on virtualization techniques. In addition, some of them, present opportunities for multi-processor computing slots to the users. Virtual Organizations are therefore facing heterogeneity of the available resources and the use of an Interware software like DIRAC to provide the transparent, uniform interface has become essential. The transparent access to the underlying resources is realized by implementing the pilot model. DIRAC’s newest generation of generic pilots (the so-called Pilots 2.0) are the “pilots for all the skies”, and have been successfully released in production more than a year ago. They use a plugin mechanism that makes them easily adaptable. Pilots 2.0 have been used for fetching and running jobs on every type of resource, being it a Worker Node (WN) behind a CREAM/ARC/HTCondor/DIRAC Computing element, a Virtual Machine running on IaaC infrastructures like Vac or BOINC, on IaaS cloud resources managed by Vcycle, the LHCb High Level Trigger farm nodes, and any type of opportunistic computing resource. Make a machine a “Pilot Machine”, and all diversities between them will disappear. This contribution describes how pilots are made suitable for different resources, and the recent steps taken towards a fully unified framework, including monitoring. Also, the cases of multi-processor computing slots either on real or virtual machines, with the whole node or a partition of it, is discussed.
Considerations for Software Defined Networking (SDN): Approaches and use cases
NASA Astrophysics Data System (ADS)
Bakshi, K.
Software Defined Networking (SDN) is an evolutionary approach to network design and functionality based on the ability to programmatically modify the behavior of network devices. SDN uses user-customizable and configurable software that's independent of hardware to enable networked systems to expand data flow control. SDN is in large part about understanding and managing a network as a unified abstraction. It will make networks more flexible, dynamic, and cost-efficient, while greatly simplifying operational complexity. And this advanced solution provides several benefits including network and service customizability, configurability, improved operations, and increased performance. There are several approaches to SDN and its practical implementation. Among them, two have risen to prominence with differences in pedigree and implementation. This paper's main focus will be to define, review, and evaluate salient approaches and use cases of the OpenFlow and Virtual Network Overlay approaches to SDN. OpenFlow is a communication protocol that gives access to the forwarding plane of a network's switches and routers. The Virtual Network Overlay relies on a completely virtualized network infrastructure and services to abstract the underlying physical network, which allows the overlay to be mobile to other physical networks. This is an important requirement for cloud computing, where applications and associated network services are migrated to cloud service providers and remote data centers on the fly as resource demands dictate. The paper will discuss how and where SDN can be applied and implemented, including research and academia, virtual multitenant data center, and cloud computing applications. Specific attention will be given to the cloud computing use case, where automated provisioning and programmable overlay for scalable multi-tenancy is leveraged via the SDN approach.
NASA Astrophysics Data System (ADS)
The CHAIN-REDS Project is organising a workshop on "e-Infrastructures for e-Sciences" focusing on Cloud Computing and Data Repositories under the aegis of the European Commission and in co-location with the International Conference on e-Science 2013 (IEEE2013) that will be held in Beijing, P.R. of China on October 17-22, 2013. The core objective of the CHAIN-REDS project is to promote, coordinate and support the effort of a critical mass of non-European e-Infrastructures for Research and Education to collaborate with Europe addressing interoperability and interoperation of Grids and other Distributed Computing Infrastructures (DCI). From this perspective, CHAIN-REDS will optimise the interoperation of European infrastructures with those present in 6 other regions of the world, both from a development and use point of view, and catering to different communities. Overall, CHAIN-REDS will provide input for future strategies and decision-making regarding collaboration with other regions on e-Infrastructure deployment and availability of related data; it will raise the visibility of e-Infrastructures towards intercontinental audiences, covering most of the world and will provide support to establish globally connected and interoperable infrastructures, in particular between the EU and the developing regions. Organised by IHEP, INFN and Sigma Orionis with the support of all project partners, this workshop will aim at: - Presenting the state of the art of Cloud computing in Europe and in China and discussing the opportunities offered by having interoperable and federated e-Infrastructures; - Exploring the existing initiatives of Data Infrastructures in Europe and China, and highlighting the Data Repositories of interest for the Virtual Research Communities in several domains such as Health, Agriculture, Climate, etc.
Virtual Geophysics Laboratory: Exploiting the Cloud and Empowering Geophysicsts
NASA Astrophysics Data System (ADS)
Fraser, Ryan; Vote, Josh; Goh, Richard; Cox, Simon
2013-04-01
Over the last five decades geoscientists from Australian state and federal agencies have collected and assembled around 3 Petabytes of geoscience data sets under public funding. As a consequence of technological progress, data is now being acquired at exponential rates and in higher resolution than ever before. Effective use of these big data sets challenges the storage and computational infrastructure of most organizations. The Virtual Geophysics Laboratory (VGL) is a scientific workflow portal addresses some of the resulting issues by providing Australian geophysicists with access to a Web 2.0 or Rich Internet Application (RIA) based integrated environment that exploits eResearch tools and Cloud computing technology, and promotes collaboration between the user community. VGL simplifies and automates large portions of what were previously manually intensive scientific workflow processes, allowing scientists to focus on the natural science problems, rather than computer science and IT. A number of geophysical processing codes are incorporated to support multiple workflows. For example a gravity inversion can be performed by combining the Escript/Finley codes (from the University of Queensland) with the gravity data registered in VGL. Likewise, tectonic processes can also be modeled by combining the Underworld code (from Monash University) with one of the various 3D models available to VGL. Cloud services provide scalable and cost effective compute resources. VGL is built on top of mature standards-compliant information services, many deployed using the Spatial Information Services Stack (SISS), which provides direct access to geophysical data. A large number of data sets from Geoscience Australia assist users in data discovery. GeoNetwork provides a metadata catalog to store workflow results for future use, discovery and provenance tracking. VGL has been developed in collaboration with the research community using incremental software development practices and open source tools. While developed to provide the geophysics research community with a sustainable platform and scalable infrastructure; VGL has also developed a number of concepts, patterns and generic components of which have been reused for cases beyond geophysics, including natural hazards, satellite processing and other areas requiring spatial data discovery and processing. Future plans for VGL include a number of improvements in both functional and non-functional areas in response to its user community needs and advancement in information technologies. In particular, research is underway in the following areas (a) distributed and parallel workflow processing in the cloud, (b) seamless integration with various cloud providers, and (c) integration with virtual laboratories representing other science domains. Acknowledgements: VGL was developed by CSIRO in collaboration with Geoscience Australia, National Computational Infrastructure, Australia National University, Monash University and University of Queensland, and has been supported by the Australian Government's Education Investment Funds through NeCTAR.
SLA-based optimisation of virtualised resource for multi-tier web applications in cloud data centres
NASA Astrophysics Data System (ADS)
Bi, Jing; Yuan, Haitao; Tie, Ming; Tan, Wei
2015-10-01
Dynamic virtualised resource allocation is the key to quality of service assurance for multi-tier web application services in cloud data centre. In this paper, we develop a self-management architecture of cloud data centres with virtualisation mechanism for multi-tier web application services. Based on this architecture, we establish a flexible hybrid queueing model to determine the amount of virtual machines for each tier of virtualised application service environments. Besides, we propose a non-linear constrained optimisation problem with restrictions defined in service level agreement. Furthermore, we develop a heuristic mixed optimisation algorithm to maximise the profit of cloud infrastructure providers, and to meet performance requirements from different clients as well. Finally, we compare the effectiveness of our dynamic allocation strategy with two other allocation strategies. The simulation results show that the proposed resource allocation method is efficient in improving the overall performance and reducing the resource energy cost.
User-driven Cloud Implementation of environmental models and data for all
NASA Astrophysics Data System (ADS)
Gurney, R. J.; Percy, B. J.; Elkhatib, Y.; Blair, G. S.
2014-12-01
Environmental data and models come from disparate sources over a variety of geographical and temporal scales with different resolutions and data standards, often including terabytes of data and model simulations. Unfortunately, these data and models tend to remain solely within the custody of the private and public organisations which create the data, and the scientists who build models and generate results. Although many models and datasets are theoretically available to others, the lack of ease of access tends to keep them out of reach of many. We have developed an intuitive web-based tool that utilises environmental models and datasets located in a cloud to produce results that are appropriate to the user. Storyboards showing the interfaces and visualisations have been created for each of several exemplars. A library of virtual machine images has been prepared to serve these exemplars. Each virtual machine image has been tailored to run computer models appropriate to the end user. Two approaches have been used; first as RESTful web services conforming to the Open Geospatial Consortium (OGC) Web Processing Service (WPS) interface standard using the Python-based PyWPS; second, a MySQL database interrogated using PHP code. In all cases, the web client sends the server an HTTP GET request to execute the process with a number of parameter values and, once execution terminates, an XML or JSON response is sent back and parsed at the client side to extract the results. All web services are stateless, i.e. application state is not maintained by the server, reducing its operational overheads and simplifying infrastructure management tasks such as load balancing and failure recovery. A hybrid cloud solution has been used with models and data sited on both private and public clouds. The storyboards have been transformed into intuitive web interfaces at the client side using HTML, CSS and JavaScript, utilising plug-ins such as jQuery and Flot (for graphics), and Google Maps APIs. We have demonstrated that a cloud infrastructure can be used to assemble a virtual research environment that, coupled with a user-driven development approach, is able to cater to the needs of a wide range of user groups, from domain experts to concerned members of the general public.
BlueSky Cloud - rapid infrastructure capacity using Amazon's Cloud for wildfire emergency response
NASA Astrophysics Data System (ADS)
Haderman, M.; Larkin, N. K.; Beach, M.; Cavallaro, A. M.; Stilley, J. C.; DeWinter, J. L.; Craig, K. J.; Raffuse, S. M.
2013-12-01
During peak fire season in the United States, many large wildfires often burn simultaneously across the country. Smoke from these fires can produce air quality emergencies. It is vital that incident commanders, air quality agencies, and public health officials have smoke impact information at their fingertips for evaluating where fires and smoke are and where the smoke will go next. To address the need for this kind of information, the U.S. Forest Service AirFire Team created the BlueSky Framework, a modeling system that predicts concentrations of particle pollution from wildfires. During emergency response, decision makers use BlueSky predictions to make public outreach and evacuation decisions. The models used in BlueSky predictions are computationally intensive, and the peak fire season requires significantly more computer resources than off-peak times. Purchasing enough hardware to run the number of BlueSky Framework runs that are needed during fire season is expensive and leaves idle servers running the majority of the year. The AirFire Team and STI developed BlueSky Cloud to take advantage of Amazon's virtual servers hosted in the cloud. With BlueSky Cloud, as demand increases and decreases, servers can be easily spun up and spun down at a minimal cost. Moving standard BlueSky Framework runs into the Amazon Cloud made it possible for the AirFire Team to rapidly increase the number of BlueSky Framework instances that could be run simultaneously without the costs associated with purchasing and managing servers. In this presentation, we provide an overview of the features of BlueSky Cloud, describe how the system uses Amazon Cloud, and discuss the costs and benefits of moving from privately hosted servers to a cloud-based infrastructure.
Investigation of Storage Options for Scientific Computing on Grid and Cloud Facilities
NASA Astrophysics Data System (ADS)
Garzoglio, Gabriele
2012-12-01
In recent years, several new storage technologies, such as Lustre, Hadoop, OrangeFS, and BlueArc, have emerged. While several groups have run benchmarks to characterize them under a variety of configurations, more work is needed to evaluate these technologies for the use cases of scientific computing on Grid clusters and Cloud facilities. This paper discusses our evaluation of the technologies as deployed on a test bed at FermiCloud, one of the Fermilab infrastructure-as-a-service Cloud facilities. The test bed consists of 4 server-class nodes with 40 TB of disk space and up to 50 virtual machine clients, some running on the storage server nodes themselves. With this configuration, the evaluation compares the performance of some of these technologies when deployed on virtual machines and on “bare metal” nodes. In addition to running standard benchmarks such as IOZone to check the sanity of our installation, we have run I/O intensive tests using physics-analysis applications. This paper presents how the storage solutions perform in a variety of realistic use cases of scientific computing. One interesting difference among the storage systems tested is found in a decrease in total read throughput with increasing number of client processes, which occurs in some implementations but not others.
Investigation of storage options for scientific computing on Grid and Cloud facilities
DOE Office of Scientific and Technical Information (OSTI.GOV)
Garzoglio, Gabriele
In recent years, several new storage technologies, such as Lustre, Hadoop, OrangeFS, and BlueArc, have emerged. While several groups have run benchmarks to characterize them under a variety of configurations, more work is needed to evaluate these technologies for the use cases of scientific computing on Grid clusters and Cloud facilities. This paper discusses our evaluation of the technologies as deployed on a test bed at FermiCloud, one of the Fermilab infrastructure-as-a-service Cloud facilities. The test bed consists of 4 server-class nodes with 40 TB of disk space and up to 50 virtual machine clients, some running on the storagemore » server nodes themselves. With this configuration, the evaluation compares the performance of some of these technologies when deployed on virtual machines and on bare metal nodes. In addition to running standard benchmarks such as IOZone to check the sanity of our installation, we have run I/O intensive tests using physics-analysis applications. This paper presents how the storage solutions perform in a variety of realistic use cases of scientific computing. One interesting difference among the storage systems tested is found in a decrease in total read throughput with increasing number of client processes, which occurs in some implementations but not others.« less
NASA Astrophysics Data System (ADS)
Timm, S.; Cooper, G.; Fuess, S.; Garzoglio, G.; Holzman, B.; Kennedy, R.; Grassano, D.; Tiradani, A.; Krishnamurthy, R.; Vinayagam, S.; Raicu, I.; Wu, H.; Ren, S.; Noh, S.-Y.
2017-10-01
The Fermilab HEPCloud Facility Project has as its goal to extend the current Fermilab facility interface to provide transparent access to disparate resources including commercial and community clouds, grid federations, and HPC centers. This facility enables experiments to perform the full spectrum of computing tasks, including data-intensive simulation and reconstruction. We have evaluated the use of the commercial cloud to provide elasticity to respond to peaks of demand without overprovisioning local resources. Full scale data-intensive workflows have been successfully completed on Amazon Web Services for two High Energy Physics Experiments, CMS and NOνA, at the scale of 58000 simultaneous cores. This paper describes the significant improvements that were made to the virtual machine provisioning system, code caching system, and data movement system to accomplish this work. The virtual image provisioning and contextualization service was extended to multiple AWS regions, and to support experiment-specific data configurations. A prototype Decision Engine was written to determine the optimal availability zone and instance type to run on, minimizing cost and job interruptions. We have deployed a scalable on-demand caching service to deliver code and database information to jobs running on the commercial cloud. It uses the frontiersquid server and CERN VM File System (CVMFS) clients on EC2 instances and utilizes various services provided by AWS to build the infrastructure (stack). We discuss the architecture and load testing benchmarks on the squid servers. We also describe various approaches that were evaluated to transport experimental data to and from the cloud, and the optimal solutions that were used for the bulk of the data transport. Finally, we summarize lessons learned from this scale test, and our future plans to expand and improve the Fermilab HEP Cloud Facility.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Timm, S.; Cooper, G.; Fuess, S.
The Fermilab HEPCloud Facility Project has as its goal to extend the current Fermilab facility interface to provide transparent access to disparate resources including commercial and community clouds, grid federations, and HPC centers. This facility enables experiments to perform the full spectrum of computing tasks, including data-intensive simulation and reconstruction. We have evaluated the use of the commercial cloud to provide elasticity to respond to peaks of demand without overprovisioning local resources. Full scale data-intensive workflows have been successfully completed on Amazon Web Services for two High Energy Physics Experiments, CMS and NOνA, at the scale of 58000 simultaneous cores.more » This paper describes the significant improvements that were made to the virtual machine provisioning system, code caching system, and data movement system to accomplish this work. The virtual image provisioning and contextualization service was extended to multiple AWS regions, and to support experiment-specific data configurations. A prototype Decision Engine was written to determine the optimal availability zone and instance type to run on, minimizing cost and job interruptions. We have deployed a scalable on-demand caching service to deliver code and database information to jobs running on the commercial cloud. It uses the frontiersquid server and CERN VM File System (CVMFS) clients on EC2 instances and utilizes various services provided by AWS to build the infrastructure (stack). We discuss the architecture and load testing benchmarks on the squid servers. We also describe various approaches that were evaluated to transport experimental data to and from the cloud, and the optimal solutions that were used for the bulk of the data transport. Finally, we summarize lessons learned from this scale test, and our future plans to expand and improve the Fermilab HEP Cloud Facility.« less
The International Symposium on Grids and Clouds
NASA Astrophysics Data System (ADS)
The International Symposium on Grids and Clouds (ISGC) 2012 will be held at Academia Sinica in Taipei from 26 February to 2 March 2012, with co-located events and workshops. The conference is hosted by the Academia Sinica Grid Computing Centre (ASGC). 2012 is the decennium anniversary of the ISGC which over the last decade has tracked the convergence, collaboration and innovation of individual researchers across the Asia Pacific region to a coherent community. With the continuous support and dedication from the delegates, ISGC has provided the primary international distributed computing platform where distinguished researchers and collaboration partners from around the world share their knowledge and experiences. The last decade has seen the wide-scale emergence of e-Infrastructure as a critical asset for the modern e-Scientist. The emergence of large-scale research infrastructures and instruments that has produced a torrent of electronic data is forcing a generational change in the scientific process and the mechanisms used to analyse the resulting data deluge. No longer can the processing of these vast amounts of data and production of relevant scientific results be undertaken by a single scientist. Virtual Research Communities that span organisations around the world, through an integrated digital infrastructure that connects the trust and administrative domains of multiple resource providers, have become critical in supporting these analyses. Topics covered in ISGC 2012 include: High Energy Physics, Biomedicine & Life Sciences, Earth Science, Environmental Changes and Natural Disaster Mitigation, Humanities & Social Sciences, Operations & Management, Middleware & Interoperability, Security and Networking, Infrastructure Clouds & Virtualisation, Business Models & Sustainability, Data Management, Distributed Volunteer & Desktop Grid Computing, High Throughput Computing, and High Performance, Manycore & GPU Computing.
NASA Astrophysics Data System (ADS)
Evans, J. D.; Hao, W.; Chettri, S.
2013-12-01
The cloud is proving to be a uniquely promising platform for scientific computing. Our experience with processing satellite data using Amazon Web Services highlights several opportunities for enhanced performance, flexibility, and cost effectiveness in the cloud relative to traditional computing -- for example: - Direct readout from a polar-orbiting satellite such as the Suomi National Polar-Orbiting Partnership (S-NPP) requires bursts of processing a few times a day, separated by quiet periods when the satellite is out of receiving range. In the cloud, by starting and stopping virtual machines in minutes, we can marshal significant computing resources quickly when needed, but not pay for them when not needed. To take advantage of this capability, we are automating a data-driven approach to the management of cloud computing resources, in which new data availability triggers the creation of new virtual machines (of variable size and processing power) which last only until the processing workflow is complete. - 'Spot instances' are virtual machines that run as long as one's asking price is higher than the provider's variable spot price. Spot instances can greatly reduce the cost of computing -- for software systems that are engineered to withstand unpredictable interruptions in service (as occurs when a spot price exceeds the asking price). We are implementing an approach to workflow management that allows data processing workflows to resume with minimal delays after temporary spot price spikes. This will allow systems to take full advantage of variably-priced 'utility computing.' - Thanks to virtual machine images, we can easily launch multiple, identical machines differentiated only by 'user data' containing individualized instructions (e.g., to fetch particular datasets or to perform certain workflows or algorithms) This is particularly useful when (as is the case with S-NPP data) we need to launch many very similar machines to process an unpredictable number of data files concurrently. Our experience shows the viability and flexibility of this approach to workflow management for scientific data processing. - Finally, cloud computing is a promising platform for distributed volunteer ('interstitial') computing, via mechanisms such as the Berkeley Open Infrastructure for Network Computing (BOINC) popularized with the SETI@Home project and others such as ClimatePrediction.net and NASA's Climate@Home. Interstitial computing faces significant challenges as commodity computing shifts from (always on) desktop computers towards smartphones and tablets (untethered and running on scarce battery power); but cloud computing offers significant slack capacity. This capacity includes virtual machines with unused RAM or underused CPUs; virtual storage volumes allocated (& paid for) but not full; and virtual machines that are paid up for the current hour but whose work is complete. We are devising ways to facilitate the reuse of these resources (i.e., cloud-based interstitial computing) for satellite data processing and related analyses. We will present our findings and research directions on these and related topics.
A Novel Cost Based Model for Energy Consumption in Cloud Computing
Horri, A.; Dastghaibyfard, Gh.
2015-01-01
Cloud data centers consume enormous amounts of electrical energy. To support green cloud computing, providers also need to minimize cloud infrastructure energy consumption while conducting the QoS. In this study, for cloud environments an energy consumption model is proposed for time-shared policy in virtualization layer. The cost and energy usage of time-shared policy were modeled in the CloudSim simulator based upon the results obtained from the real system and then proposed model was evaluated by different scenarios. In the proposed model, the cache interference costs were considered. These costs were based upon the size of data. The proposed model was implemented in the CloudSim simulator and the related simulation results indicate that the energy consumption may be considerable and that it can vary with different parameters such as the quantum parameter, data size, and the number of VMs on a host. Measured results validate the model and demonstrate that there is a tradeoff between energy consumption and QoS in the cloud environment. Also, measured results validate the model and demonstrate that there is a tradeoff between energy consumption and QoS in the cloud environment. PMID:25705716
A novel cost based model for energy consumption in cloud computing.
Horri, A; Dastghaibyfard, Gh
2015-01-01
Cloud data centers consume enormous amounts of electrical energy. To support green cloud computing, providers also need to minimize cloud infrastructure energy consumption while conducting the QoS. In this study, for cloud environments an energy consumption model is proposed for time-shared policy in virtualization layer. The cost and energy usage of time-shared policy were modeled in the CloudSim simulator based upon the results obtained from the real system and then proposed model was evaluated by different scenarios. In the proposed model, the cache interference costs were considered. These costs were based upon the size of data. The proposed model was implemented in the CloudSim simulator and the related simulation results indicate that the energy consumption may be considerable and that it can vary with different parameters such as the quantum parameter, data size, and the number of VMs on a host. Measured results validate the model and demonstrate that there is a tradeoff between energy consumption and QoS in the cloud environment. Also, measured results validate the model and demonstrate that there is a tradeoff between energy consumption and QoS in the cloud environment.
EVER-EST: a virtual research environment for Earth Sciences
NASA Astrophysics Data System (ADS)
Marelli, Fulvio; Albani, Mirko; Glaves, Helen
2016-04-01
There is an increasing requirement for researchers to work collaboratively using common resources whilst being geographically dispersed. By creating a virtual research environment (VRE) using a service oriented architecture (SOA) tailored to the needs of Earth Science (ES) communities, the EVEREST project will provide a range of both generic and domain specific data management services to support a dynamic approach to collaborative research. EVER-EST will provide the means to overcome existing barriers to sharing of Earth Science data and information allowing research teams to discover, access, share and process heterogeneous data, algorithms, results and experiences within and across their communities, including those domains beyond Earth Science. Researchers will be able to seamlessly manage both the data involved in their computationally intensive disciplines and the scientific methods applied in their observations and modelling, which lead to the specific results that need to be attributable, validated and shared both within the community and more widely e.g. in the form of scholarly communications. Central to the EVEREST approach is the concept of the Research Object (RO) , which provides a semantically rich mechanism to aggregate related resources about a scientific investigation so that they can be shared together using a single unique identifier. Although several e-laboratories are incorporating the research object concept in their infrastructure, the EVER-EST VRE will be the first infrastructure to leverage the concept of Research Objects and their application in observational rather than experimental disciplines. Development of the EVEREST VRE will leverage the results of several previous projects which have produced state-of-the-art technologies for scientific data management and curation as well those which have developed models, techniques and tools for the preservation of scientific methods and their implementation in computational forms such as scientific workflows. The EVER-EST data processing infrastructure will be based on a Cloud Computing approach, in which new applications can be integrated using "virtual machines" that have their own specifications (disk size, processor speed, operating system etc.) and run on shared private (physical deployment over local hardware) or commercial Cloud infrastructures. The EVER-EST e-infrastructure will be validated by four virtual research communities (VRC) covering different multidisciplinary Earth Science domains including: ocean monitoring, natural hazards, land monitoring and risk management (volcanoes and seismicity). Each VRC will use the virtual research environment according to its own specific requirements for data, software, best practice and community engagement. This user-centric approach will allow an assessment to be made of the capability for the proposed solution to satisfy the heterogeneous needs of a variety of Earth Science communities for more effective collaboration, and higher efficiency and creativity in research. EVER-EST is funded by the European Commission's H2020 for three years starting in October 2015. The project is led by the European Space Agency (ESA), involves some of the major European Earth Science data providers/users including NERC, DLR, INGV, CNR and SatCEN.
Halligan, Brian D.; Geiger, Joey F.; Vallejos, Andrew K.; Greene, Andrew S.; Twigger, Simon N.
2009-01-01
One of the major difficulties for many laboratories setting up proteomics programs has been obtaining and maintaining the computational infrastructure required for the analysis of the large flow of proteomics data. We describe a system that combines distributed cloud computing and open source software to allow laboratories to set up scalable virtual proteomics analysis clusters without the investment in computational hardware or software licensing fees. Additionally, the pricing structure of distributed computing providers, such as Amazon Web Services, allows laboratories or even individuals to have large-scale computational resources at their disposal at a very low cost per run. We provide detailed step by step instructions on how to implement the virtual proteomics analysis clusters as well as a list of current available preconfigured Amazon machine images containing the OMSSA and X!Tandem search algorithms and sequence databases on the Medical College of Wisconsin Proteomics Center website (http://proteomics.mcw.edu/vipdac). PMID:19358578
Halligan, Brian D; Geiger, Joey F; Vallejos, Andrew K; Greene, Andrew S; Twigger, Simon N
2009-06-01
One of the major difficulties for many laboratories setting up proteomics programs has been obtaining and maintaining the computational infrastructure required for the analysis of the large flow of proteomics data. We describe a system that combines distributed cloud computing and open source software to allow laboratories to set up scalable virtual proteomics analysis clusters without the investment in computational hardware or software licensing fees. Additionally, the pricing structure of distributed computing providers, such as Amazon Web Services, allows laboratories or even individuals to have large-scale computational resources at their disposal at a very low cost per run. We provide detailed step-by-step instructions on how to implement the virtual proteomics analysis clusters as well as a list of current available preconfigured Amazon machine images containing the OMSSA and X!Tandem search algorithms and sequence databases on the Medical College of Wisconsin Proteomics Center Web site ( http://proteomics.mcw.edu/vipdac ).
Using Cloud Computing infrastructure with CloudBioLinux, CloudMan and Galaxy
Afgan, Enis; Chapman, Brad; Jadan, Margita; Franke, Vedran; Taylor, James
2012-01-01
Cloud computing has revolutionized availability and access to computing and storage resources; making it possible to provision a large computational infrastructure with only a few clicks in a web browser. However, those resources are typically provided in the form of low-level infrastructure components that need to be procured and configured before use. In this protocol, we demonstrate how to utilize cloud computing resources to perform open-ended bioinformatics analyses, with fully automated management of the underlying cloud infrastructure. By combining three projects, CloudBioLinux, CloudMan, and Galaxy into a cohesive unit, we have enabled researchers to gain access to more than 100 preconfigured bioinformatics tools and gigabytes of reference genomes on top of the flexible cloud computing infrastructure. The protocol demonstrates how to setup the available infrastructure and how to use the tools via a graphical desktop interface, a parallel command line interface, and the web-based Galaxy interface. PMID:22700313
Using cloud computing infrastructure with CloudBioLinux, CloudMan, and Galaxy.
Afgan, Enis; Chapman, Brad; Jadan, Margita; Franke, Vedran; Taylor, James
2012-06-01
Cloud computing has revolutionized availability and access to computing and storage resources, making it possible to provision a large computational infrastructure with only a few clicks in a Web browser. However, those resources are typically provided in the form of low-level infrastructure components that need to be procured and configured before use. In this unit, we demonstrate how to utilize cloud computing resources to perform open-ended bioinformatic analyses, with fully automated management of the underlying cloud infrastructure. By combining three projects, CloudBioLinux, CloudMan, and Galaxy, into a cohesive unit, we have enabled researchers to gain access to more than 100 preconfigured bioinformatics tools and gigabytes of reference genomes on top of the flexible cloud computing infrastructure. The protocol demonstrates how to set up the available infrastructure and how to use the tools via a graphical desktop interface, a parallel command-line interface, and the Web-based Galaxy interface.
Dynamic electronic institutions in agent oriented cloud robotic systems.
Nagrath, Vineet; Morel, Olivier; Malik, Aamir; Saad, Naufal; Meriaudeau, Fabrice
2015-01-01
The dot-com bubble bursted in the year 2000 followed by a swift movement towards resource virtualization and cloud computing business model. Cloud computing emerged not as new form of computing or network technology but a mere remoulding of existing technologies to suit a new business model. Cloud robotics is understood as adaptation of cloud computing ideas for robotic applications. Current efforts in cloud robotics stress upon developing robots that utilize computing and service infrastructure of the cloud, without debating on the underlying business model. HTM5 is an OMG's MDA based Meta-model for agent oriented development of cloud robotic systems. The trade-view of HTM5 promotes peer-to-peer trade amongst software agents. HTM5 agents represent various cloud entities and implement their business logic on cloud interactions. Trade in a peer-to-peer cloud robotic system is based on relationships and contracts amongst several agent subsets. Electronic Institutions are associations of heterogeneous intelligent agents which interact with each other following predefined norms. In Dynamic Electronic Institutions, the process of formation, reformation and dissolution of institutions is automated leading to run time adaptations in groups of agents. DEIs in agent oriented cloud robotic ecosystems bring order and group intellect. This article presents DEI implementations through HTM5 methodology.
Design and Development of a Run-Time Monitor for Multi-Core Architectures in Cloud Computing
Kang, Mikyung; Kang, Dong-In; Crago, Stephen P.; Park, Gyung-Leen; Lee, Junghoon
2011-01-01
Cloud computing is a new information technology trend that moves computing and data away from desktops and portable PCs into large data centers. The basic principle of cloud computing is to deliver applications as services over the Internet as well as infrastructure. A cloud is a type of parallel and distributed system consisting of a collection of inter-connected and virtualized computers that are dynamically provisioned and presented as one or more unified computing resources. The large-scale distributed applications on a cloud require adaptive service-based software, which has the capability of monitoring system status changes, analyzing the monitored information, and adapting its service configuration while considering tradeoffs among multiple QoS features simultaneously. In this paper, we design and develop a Run-Time Monitor (RTM) which is a system software to monitor the application behavior at run-time, analyze the collected information, and optimize cloud computing resources for multi-core architectures. RTM monitors application software through library instrumentation as well as underlying hardware through a performance counter optimizing its computing configuration based on the analyzed data. PMID:22163811
Design and development of a run-time monitor for multi-core architectures in cloud computing.
Kang, Mikyung; Kang, Dong-In; Crago, Stephen P; Park, Gyung-Leen; Lee, Junghoon
2011-01-01
Cloud computing is a new information technology trend that moves computing and data away from desktops and portable PCs into large data centers. The basic principle of cloud computing is to deliver applications as services over the Internet as well as infrastructure. A cloud is a type of parallel and distributed system consisting of a collection of inter-connected and virtualized computers that are dynamically provisioned and presented as one or more unified computing resources. The large-scale distributed applications on a cloud require adaptive service-based software, which has the capability of monitoring system status changes, analyzing the monitored information, and adapting its service configuration while considering tradeoffs among multiple QoS features simultaneously. In this paper, we design and develop a Run-Time Monitor (RTM) which is a system software to monitor the application behavior at run-time, analyze the collected information, and optimize cloud computing resources for multi-core architectures. RTM monitors application software through library instrumentation as well as underlying hardware through a performance counter optimizing its computing configuration based on the analyzed data.
Controlling Infrastructure Costs: Right-Sizing the Mission Control Facility
NASA Technical Reports Server (NTRS)
Martin, Keith; Sen-Roy, Michael; Heiman, Jennifer
2009-01-01
Johnson Space Center's Mission Control Center is a space vehicle, space program agnostic facility. The current operational design is essentially identical to the original facility architecture that was developed and deployed in the mid-90's. In an effort to streamline the support costs of the mission critical facility, the Mission Operations Division (MOD) of Johnson Space Center (JSC) has sponsored an exploratory project to evaluate and inject current state-of-the-practice Information Technology (IT) tools, processes and technology into legacy operations. The general push in the IT industry has been trending towards a data-centric computer infrastructure for the past several years. Organizations facing challenges with facility operations costs are turning to creative solutions combining hardware consolidation, virtualization and remote access to meet and exceed performance, security, and availability requirements. The Operations Technology Facility (OTF) organization at the Johnson Space Center has been chartered to build and evaluate a parallel Mission Control infrastructure, replacing the existing, thick-client distributed computing model and network architecture with a data center model utilizing virtualization to provide the MCC Infrastructure as a Service. The OTF will design a replacement architecture for the Mission Control Facility, leveraging hardware consolidation through the use of blade servers, increasing utilization rates for compute platforms through virtualization while expanding connectivity options through the deployment of secure remote access. The architecture demonstrates the maturity of the technologies generally available in industry today and the ability to successfully abstract the tightly coupled relationship between thick-client software and legacy hardware into a hardware agnostic "Infrastructure as a Service" capability that can scale to meet future requirements of new space programs and spacecraft. This paper discusses the benefits and difficulties that a migration to cloud-based computing philosophies has uncovered when compared to the legacy Mission Control Center architecture. The team consists of system and software engineers with extensive experience with the MCC infrastructure and software currently used to support the International Space Station (ISS) and Space Shuttle program (SSP).
Integration of XRootD into the cloud infrastructure for ALICE data analysis
NASA Astrophysics Data System (ADS)
Kompaniets, Mikhail; Shadura, Oksana; Svirin, Pavlo; Yurchenko, Volodymyr; Zarochentsev, Andrey
2015-12-01
Cloud technologies allow easy load balancing between different tasks and projects. From the viewpoint of the data analysis in the ALICE experiment, cloud allows to deploy software using Cern Virtual Machine (CernVM) and CernVM File System (CVMFS), to run different (including outdated) versions of software for long term data preservation and to dynamically allocate resources for different computing activities, e.g. grid site, ALICE Analysis Facility (AAF) and possible usage for local projects or other LHC experiments. We present a cloud solution for Tier-3 sites based on OpenStack and Ceph distributed storage with an integrated XRootD based storage element (SE). One of the key features of the solution is based on idea that Ceph has been used as a backend for Cinder Block Storage service for OpenStack, and in the same time as a storage backend for XRootD, with redundancy and availability of data preserved by Ceph settings. For faster and easier OpenStack deployment was applied the Packstack solution, which is based on the Puppet configuration management system. Ceph installation and configuration operations are structured and converted to Puppet manifests describing node configurations and integrated into Packstack. This solution can be easily deployed, maintained and used even in small groups with limited computing resources and small organizations, which usually have lack of IT support. The proposed infrastructure has been tested on two different clouds (SPbSU & BITP) and integrates successfully with the ALICE data analysis model.
Toward a Proof of Concept Cloud Framework for Physics Applications on Blue Gene Supercomputers
NASA Astrophysics Data System (ADS)
Dreher, Patrick; Scullin, William; Vouk, Mladen
2015-09-01
Traditional high performance supercomputers are capable of delivering large sustained state-of-the-art computational resources to physics applications over extended periods of time using batch processing mode operating environments. However, today there is an increasing demand for more complex workflows that involve large fluctuations in the levels of HPC physics computational requirements during the simulations. Some of the workflow components may also require a richer set of operating system features and schedulers than normally found in a batch oriented HPC environment. This paper reports on progress toward a proof of concept design that implements a cloud framework onto BG/P and BG/Q platforms at the Argonne Leadership Computing Facility. The BG/P implementation utilizes the Kittyhawk utility and the BG/Q platform uses an experimental heterogeneous FusedOS operating system environment. Both platforms use the Virtual Computing Laboratory as the cloud computing system embedded within the supercomputer. This proof of concept design allows a cloud to be configured so that it can capitalize on the specialized infrastructure capabilities of a supercomputer and the flexible cloud configurations without resorting to virtualization. Initial testing of the proof of concept system is done using the lattice QCD MILC code. These types of user reconfigurable environments have the potential to deliver experimental schedulers and operating systems within a working HPC environment for physics computations that may be different from the native OS and schedulers on production HPC supercomputers.
International Symposium on Grids and Clouds (ISGC) 2014
NASA Astrophysics Data System (ADS)
The International Symposium on Grids and Clouds (ISGC) 2014 will be held at Academia Sinica in Taipei, Taiwan from 23-28 March 2014, with co-located events and workshops. The conference is hosted by the Academia Sinica Grid Computing Centre (ASGC).“Bringing the data scientist to global e-Infrastructures” is the theme of ISGC 2014. The last decade has seen the phenomenal growth in the production of data in all forms by all research communities to produce a deluge of data from which information and knowledge need to be extracted. Key to this success will be the data scientist - educated to use advanced algorithms, applications and infrastructures - collaborating internationally to tackle society’s challenges. ISGC 2014 will bring together researchers working in all aspects of data science from different disciplines around the world to collaborate and educate themselves in the latest achievements and techniques being used to tackle the data deluge. In addition to the regular workshops, technical presentations and plenary keynotes, ISGC this year will focus on how to grow the data science community by considering the educational foundation needed for tomorrow’s data scientist. Topics of discussion include Physics (including HEP) and Engineering Applications, Biomedicine & Life Sciences Applications, Earth & Environmental Sciences & Biodiversity Applications, Humanities & Social Sciences Application, Virtual Research Environment (including Middleware, tools, services, workflow, ... etc.), Data Management, Big Data, Infrastructure & Operations Management, Infrastructure Clouds and Virtualisation, Interoperability, Business Models & Sustainability, Highly Distributed Computing Systems, and High Performance & Technical Computing (HPTC).
A European Federated Cloud: Innovative distributed computing solutions by EGI
NASA Astrophysics Data System (ADS)
Sipos, Gergely; Turilli, Matteo; Newhouse, Steven; Kacsuk, Peter
2013-04-01
The European Grid Infrastructure (EGI) is the result of pioneering work that has, over the last decade, built a collaborative production infrastructure of uniform services through the federation of national resource providers that supports multi-disciplinary science across Europe and around the world. This presentation will provide an overview of the recently established 'federated cloud computing services' that the National Grid Initiatives (NGIs), operators of EGI, offer to scientific communities. The presentation will explain the technical capabilities of the 'EGI Federated Cloud' and the processes whereby earth and space science researchers can engage with it. EGI's resource centres have been providing services for collaborative, compute- and data-intensive applications for over a decade. Besides the well-established 'grid services', several NGIs already offer privately run cloud services to their national researchers. Many of these researchers recently expressed the need to share these cloud capabilities within their international research collaborations - a model similar to the way the grid emerged through the federation of institutional batch computing and file storage servers. To facilitate the setup of a pan-European cloud service from the NGIs' resources, the EGI-InSPIRE project established a Federated Cloud Task Force in September 2011. The Task Force has a mandate to identify and test technologies for a multinational federated cloud that could be provisioned within EGI by the NGIs. A guiding principle for the EGI Federated Cloud is to remain technology neutral and flexible for both resource providers and users: • Resource providers are allowed to use any cloud hypervisor and management technology to join virtualised resources into the EGI Federated Cloud as long as the site is subscribed to the user-facing interfaces selected by the EGI community. • Users can integrate high level services - such as brokers, portals and customised Virtual Research Environments - with the EGI Federated Cloud as long as these services access cloud resources through the user-facing interfaces selected by the EGI community. The Task Force will be closed in May 2013. It already • Identified key enabling technologies by which a multinational, federated 'Infrastructure as a Service' (IaaS) type cloud can be built from the NGIs' resources; • Deployed a test bed to evaluate the integration of virtualised resources within EGI and to engage with early adopter use cases from different scientific domains; • Integrated cloud resources into the EGI production infrastructure through cloud specific bindings of the EGI information system, monitoring system, authentication system, etc.; • Collected and catalogued requirements concerning the federated cloud services from the feedback of early adopter use cases; • Provided feedback and requirements to relevant technology providers on their implementations and worked with these providers to address those requirements; • Identified issues that need to be addressed by other areas of EGI (such as portal solutions, resource allocation policies, marketing and user support) to reach a production system. The Task Force will publish a blueprint in April 2013. The blueprint will drive the establishment of a production level EGI Federated Cloud service after May 2013.
Cloud flexibility using DIRAC interware
NASA Astrophysics Data System (ADS)
Fernandez Albor, Víctor; Seco Miguelez, Marcos; Fernandez Pena, Tomas; Mendez Muñoz, Victor; Saborido Silva, Juan Jose; Graciani Diaz, Ricardo
2014-06-01
Communities of different locations are running their computing jobs on dedicated infrastructures without the need to worry about software, hardware or even the site where their programs are going to be executed. Nevertheless, this usually implies that they are restricted to use certain types or versions of an Operating System because either their software needs an definite version of a system library or a specific platform is required by the collaboration to which they belong. On this scenario, if a data center wants to service software to incompatible communities, it has to split its physical resources among those communities. This splitting will inevitably lead to an underuse of resources because the data centers are bound to have periods where one or more of its subclusters are idle. It is, in this situation, where Cloud Computing provides the flexibility and reduction in computational cost that data centers are searching for. This paper describes a set of realistic tests that we ran on one of such implementations. The test comprise software from three different HEP communities (Auger, LHCb and QCD phenomelogists) and the Parsec Benchmark Suite running on one or more of three Linux flavors (SL5, Ubuntu 10.04 and Fedora 13). The implemented infrastructure has, at the cloud level, CloudStack that manages the virtual machines (VM) and the hosts on which they run, and, at the user level, the DIRAC framework along with a VM extension that will submit, monitorize and keep track of the user jobs and also requests CloudStack to start or stop the necessary VM's. In this infrastructure, the community software is distributed via the CernVM-FS, which has been proven to be a reliable and scalable software distribution system. With the resulting infrastructure, users are allowed to send their jobs transparently to the Data Center. The main purpose of this system is the creation of flexible cluster, multiplatform with an scalable method for software distribution for several VOs. Users from different communities do not need to care about the installation of the standard software that is available at the nodes, nor the operating system of the host machine, which is transparent to the user.
Adventures in Private Cloud: Balancing Cost and Capability at the CloudSat Data Processing Center
NASA Astrophysics Data System (ADS)
Partain, P.; Finley, S.; Fluke, J.; Haynes, J. M.; Cronk, H. Q.; Miller, S. D.
2016-12-01
Since the beginning of the CloudSat Mission in 2006, The CloudSat Data Processing Center (DPC) at the Cooperative Institute for Research in the Atmosphere (CIRA) has been ingesting data from the satellite and other A-Train sensors, producing data products, and distributing them to researchers around the world. The computing infrastructure was specifically designed to fulfill the requirements as specified at the beginning of what nominally was a two-year mission. The environment consisted of servers dedicated to specific processing tasks in a rigid workflow to generate the required products. To the benefit of science and with credit to the mission engineers, CloudSat has lasted well beyond its planned lifetime and is still collecting data ten years later. Over that period requirements of the data processing system have greatly expanded and opportunities for providing value-added services have presented themselves. But while demands on the system have increased, the initial design allowed for very little expansion in terms of scalability and flexibility. The design did change to include virtual machine processing nodes and distributed workflows but infrastructure management was still a time consuming task when system modification was required to run new tests or implement new processes. To address the scalability, flexibility, and manageability of the system Cloud computing methods and technologies are now being employed. The use of a public cloud like Amazon Elastic Compute Cloud or Google Compute Engine was considered but, among other issues, data transfer and storage cost becomes a problem especially when demand fluctuates as a result of reprocessing and the introduction of new products and services. Instead, the existing system was converted to an on premises private Cloud using the OpenStack computing platform and Ceph software defined storage to reap the benefits of the Cloud computing paradigm. This work details the decisions that were made, the benefits that have been realized, the difficulties that were encountered and issues that still exist.
Unidata's Vision for Transforming Geoscience by Moving Data Services and Software to the Cloud
NASA Astrophysics Data System (ADS)
Ramamurthy, M. K.; Fisher, W.; Yoksas, T.
2014-12-01
Universities are facing many challenges: shrinking budgets, rapidly evolving information technologies, exploding data volumes, multidisciplinary science requirements, and high student expectations. These changes are upending traditional approaches to accessing and using data and software. It is clear that Unidata's products and services must evolve to support new approaches to research and education. After years of hype and ambiguity, cloud computing is maturing in usability in many areas of science and education, bringing the benefits of virtualized and elastic remote services to infrastructure, software, computation, and data. Cloud environments reduce the amount of time and money spent to procure, install, and maintain new hardware and software, and reduce costs through resource pooling and shared infrastructure. Cloud services aimed at providing any resource, at any time, from any place, using any device are increasingly being embraced by all types of organizations. Given this trend and the enormous potential of cloud-based services, Unidata is taking moving to augment its products, services, data delivery mechanisms and applications to align with the cloud-computing paradigm. Specifically, Unidata is working toward establishing a community-based development environment that supports the creation and use of software services to build end-to-end data workflows. The design encourages the creation of services that can be broken into small, independent chunks that provide simple capabilities. Chunks could be used individually to perform a task, or chained into simple or elaborate workflows. The services will also be portable, allowing their use in researchers' own cloud-based computing environments. In this talk, we present a vision for Unidata's future in a cloud-enabled data services and discuss our initial efforts to deploy a subset of Unidata data services and tools in the Amazon EC2 and Microsoft Azure cloud environments, including the transfer of real-time meteorological data into its cloud instances, product generation using those data, and the deployment of TDS, McIDAS ADDE and AWIPS II data servers and the Integrated Data Server visualization tool.
NASA Astrophysics Data System (ADS)
Bagnasco, S.; Berzano, D.; Guarise, A.; Lusso, S.; Masera, M.; Vallero, S.
2015-12-01
The INFN computing centre in Torino hosts a private Cloud, which is managed with the OpenNebula cloud controller. The infrastructure offers Infrastructure-as-a-Service (IaaS) and Platform-as-a-Service (PaaS) services to different scientific computing applications. The main stakeholders of the facility are a grid Tier-2 site for the ALICE collaboration at LHC, an interactive analysis facility for the same experiment and a grid Tier-2 site for the BESIII collaboration, plus an increasing number of other small tenants. The dynamic allocation of resources to tenants is partially automated. This feature requires detailed monitoring and accounting of the resource usage. We set up a monitoring framework to inspect the site activities both in terms of IaaS and applications running on the hosted virtual instances. For this purpose we used the ElasticSearch, Logstash and Kibana (ELK) stack. The infrastructure relies on a MySQL database back-end for data preservation and to ensure flexibility to choose a different monitoring solution if needed. The heterogeneous accounting information is transferred from the database to the ElasticSearch engine via a custom Logstash plugin. Each use-case is indexed separately in ElasticSearch and we setup a set of Kibana dashboards with pre-defined queries in order to monitor the relevant information in each case. For the IaaS metering, we developed sensors for the OpenNebula API. The IaaS level information gathered through the API is sent to the MySQL database through an ad-hoc developed RESTful web service. Moreover, we have developed a billing system for our private Cloud, which relies on the RabbitMQ message queue for asynchronous communication to the database and on the ELK stack for its graphical interface. The Italian Grid accounting framework is also migrating to a similar set-up. Concerning the application level, we used the Root plugin TProofMonSenderSQL to collect accounting data from the interactive analysis facility. The BESIII virtual instances used to be monitored with Zabbix, as a proof of concept we also retrieve the information contained in the Zabbix database. In this way we have achieved a uniform monitoring interface for both the IaaS and the scientific applications, mostly leveraging off-the-shelf tools. At present, we are working to define a model for monitoring-as-a-service, based on the tools described above, which the Cloud tenants can easily configure to suit their specific needs.
Brokered virtual hubs for facilitating access and use of geospatial Open Data
NASA Astrophysics Data System (ADS)
Mazzetti, Paolo; Latre, Miguel; Kamali, Nargess; Brumana, Raffaella; Braumann, Stefan; Nativi, Stefano
2016-04-01
Open Data is a major trend in current information technology scenario and it is often publicised as one of the pillars of the information society in the near future. In particular, geospatial Open Data have a huge potential also for Earth Sciences, through the enablement of innovative applications and services integrating heterogeneous information. However, open does not mean usable. As it was recognized at the very beginning of the Web revolution, many different degrees of openness exist: from simple sharing in a proprietary format to advanced sharing in standard formats and including semantic information. Therefore, to fully unleash the potential of geospatial Open Data, advanced infrastructures are needed to increase the data openness degree, enhancing their usability. In October 2014, the ENERGIC OD (European NEtwork for Redistributing Geospatial Information to user Communities - Open Data) project, funded by the European Union under the Competitiveness and Innovation framework Programme (CIP), has started. In response to the EU call, the general objective of the project is to "facilitate the use of open (freely available) geographic data from different sources for the creation of innovative applications and services through the creation of Virtual Hubs". The ENERGIC OD Virtual Hubs aim to facilitate the use of geospatial Open Data by lowering and possibly removing the main barriers which hampers geo-information (GI) usage by end-users and application developers. Data and services heterogeneity is recognized as one of the major barriers to Open Data (re-)use. It imposes end-users and developers to spend a lot of effort in accessing different infrastructures and harmonizing datasets. Such heterogeneity cannot be completely removed through the adoption of standard specifications for service interfaces, metadata and data models, since different infrastructures adopt different standards to answer to specific challenges and to address specific use-cases. Thus, beyond a certain extent, heterogeneity is irreducible especially in interdisciplinary contexts. ENERGIC OD Virtual Hubs address heterogeneity adopting a mediation and brokering approach: specific components (brokers) are dedicated to harmonize service interfaces, metadata and data models, enabling seamless discovery and access to heterogeneous infrastructures and datasets. As an innovation project, ENERGIC OD integrates several existing technologies to implement Virtual Hubs as single points of access to geospatial datasets provided by new or existing platforms and infrastructures, including INSPIRE-compliant systems and Copernicus services. A first version of the ENERGIC OD brokers has been implemented based on the GI-Suite Brokering Framework developed by CNR-IIA, and complemented with other tools under integration and development. It already enables mediated discovery and harmonized access to different geospatial Open Data sources. It is accessible by users as Software-as-a-Service through a browser. Moreover, open APIs and a Javascript library are available for application developers. Six ENERGIC OD Virtual Hubs have been currently deployed: one at regional level (Berlin metropolitan area) and five at national-level (in France, Germany, Italy, Poland and Spain). Each Virtual Hub manager decided the deployment strategy (local infrastructure or commercial Infrastructure-as-a-Service cloud), and the list of connected Open Data sources. The ENERGIC OD Virtual Hubs are under test and validation through the development of ten different mobile and Web applications.
On localization attacks against cloud infrastructure
NASA Astrophysics Data System (ADS)
Ge, Linqiang; Yu, Wei; Sistani, Mohammad Ali
2013-05-01
One of the key characteristics of cloud computing is the device and location independence that enables the user to access systems regardless of their location. Because cloud computing is heavily based on sharing resource, it is vulnerable to cyber attacks. In this paper, we investigate a localization attack that enables the adversary to leverage central processing unit (CPU) resources to localize the physical location of server used by victims. By increasing and reducing CPU usage through the malicious virtual machine (VM), the response time from the victim VM will increase and decrease correspondingly. In this way, by embedding the probing signal into the CPU usage and correlating the same pattern in the response time from the victim VM, the adversary can find the location of victim VM. To determine attack accuracy, we investigate features in both the time and frequency domains. We conduct both theoretical and experimental study to demonstrate the effectiveness of such an attack.
Survey on Security Issues in Cloud Computing and Associated Mitigation Techniques
NASA Astrophysics Data System (ADS)
Bhadauria, Rohit; Sanyal, Sugata
2012-06-01
Cloud Computing holds the potential to eliminate the requirements for setting up of high-cost computing infrastructure for IT-based solutions and services that the industry uses. It promises to provide a flexible IT architecture, accessible through internet for lightweight portable devices. This would allow multi-fold increase in the capacity or capabilities of the existing and new software. In a cloud computing environment, the entire data reside over a set of networked resources, enabling the data to be accessed through virtual machines. Since these data-centers may lie in any corner of the world beyond the reach and control of users, there are multifarious security and privacy challenges that need to be understood and taken care of. Also, one can never deny the possibility of a server breakdown that has been witnessed, rather quite often in the recent times. There are various issues that need to be dealt with respect to security and privacy in a cloud computing scenario. This extensive survey paper aims to elaborate and analyze the numerous unresolved issues threatening the cloud computing adoption and diffusion affecting the various stake-holders linked to it.
Integration of High-Performance Computing into Cloud Computing Services
NASA Astrophysics Data System (ADS)
Vouk, Mladen A.; Sills, Eric; Dreher, Patrick
High-Performance Computing (HPC) projects span a spectrum of computer hardware implementations ranging from peta-flop supercomputers, high-end tera-flop facilities running a variety of operating systems and applications, to mid-range and smaller computational clusters used for HPC application development, pilot runs and prototype staging clusters. What they all have in common is that they operate as a stand-alone system rather than a scalable and shared user re-configurable resource. The advent of cloud computing has changed the traditional HPC implementation. In this article, we will discuss a very successful production-level architecture and policy framework for supporting HPC services within a more general cloud computing infrastructure. This integrated environment, called Virtual Computing Lab (VCL), has been operating at NC State since fall 2004. Nearly 8,500,000 HPC CPU-Hrs were delivered by this environment to NC State faculty and students during 2009. In addition, we present and discuss operational data that show that integration of HPC and non-HPC (or general VCL) services in a cloud can substantially reduce the cost of delivering cloud services (down to cents per CPU hour).
Energy Consumption Management of Virtual Cloud Computing Platform
NASA Astrophysics Data System (ADS)
Li, Lin
2017-11-01
For energy consumption management research on virtual cloud computing platforms, energy consumption management of virtual computers and cloud computing platform should be understood deeper. Only in this way can problems faced by energy consumption management be solved. In solving problems, the key to solutions points to data centers with high energy consumption, so people are in great need to use a new scientific technique. Virtualization technology and cloud computing have become powerful tools in people’s real life, work and production because they have strong strength and many advantages. Virtualization technology and cloud computing now is in a rapid developing trend. It has very high resource utilization rate. In this way, the presence of virtualization and cloud computing technologies is very necessary in the constantly developing information age. This paper has summarized, explained and further analyzed energy consumption management questions of the virtual cloud computing platform. It eventually gives people a clearer understanding of energy consumption management of virtual cloud computing platform and brings more help to various aspects of people’s live, work and son on.
Raising Virtual Laboratories in Australia onto global platforms
NASA Astrophysics Data System (ADS)
Wyborn, L. A.; Barker, M.; Fraser, R.; Evans, B. J. K.; Moloney, G.; Proctor, R.; Moise, A. F.; Hamish, H.
2016-12-01
Across the globe, Virtual Laboratories (VLs), Science Gateways (SGs), and Virtual Research Environments (VREs) are being developed that enable users who are not co-located to actively work together at various scales to share data, models, tools, software, workflows, best practices, etc. Outcomes range from enabling `long tail' researchers to more easily access specific data collections, to facilitating complex workflows on powerful supercomputers. In Australia, government funding has facilitated the development of a range of VLs through the National eResearch Collaborative Tools and Resources (NeCTAR) program. The VLs provide highly collaborative, research-domain oriented, integrated software infrastructures that meet user community needs. Twelve VLs have been funded since 2012, including the Virtual Geophysics Laboratory (VGL); Virtual Hazards, Impact and Risk Laboratory (VHIRL); Climate and Weather Science Laboratory (CWSLab); Marine Virtual Laboratory (MarVL); and Biodiversity and Climate Change Virtual Laboratory (BCCVL). These VLs share similar technical challenges, with common issues emerging on integration of tools, applications and access data collections via both cloud-based environments and other distributed resources. While each VL began with a focus on a specific research domain, communities of practice have now formed across the VLs around common issues, and facilitate identification of best practice case studies, and new standards. As a result, tools are now being shared where the VLs access data via data services using international standards such as ISO, OGC, W3C. The sharing of these approaches is starting to facilitate re-usability of infrastructure and is a step towards supporting interdisciplinary research. Whilst the focus of the VLs are Australia-centric, by using standards, these environments are able to be extended to analysis on other international datasets. Many VL datasets are subsets of global datasets and so extension to global is a small (and often requested) step. Similarly, most of the tools, software, and other technologies could be shared across infrastructures globally. Therefore, it is now time to better connect the Australian VLs with similar initiatives elsewhere to create international platforms that can contribute to global research challenges.
Virtual Astronomy: The Legacy of the Virtual Astronomical Observatory
NASA Astrophysics Data System (ADS)
Hanisch, Robert J.; Berriman, G. B.; Lazio, J.; Szalay, A. S.; Fabbiano, G.; Plante, R. L.; McGlynn, T. A.; Evans, J.; Emery Bunn, S.; Claro, M.; VAO Project Team
2014-01-01
Over the past ten years, the Virtual Astronomical Observatory (VAO, http://usvao.org) and its predecessor, the National Virtual Observatory (NVO), have developed and operated a software infrastructure consisting of standards and protocols for data and science software applications. The Virtual Observatory (VO) makes it possible to develop robust software for the discovery, access, and analysis of astronomical data. Every major publicly funded research organization in the US and worldwide has deployed at least some components of the VO infrastructure; tens of thousands of VO-enabled queries for data are invoked daily against catalog, image, and spectral data collections; and groups within the community have developed tools and applications building upon the VO infrastructure. Further, NVO and VAO have helped ensure access to data internationally by co-founding the International Virtual Observatory Alliance (IVOA, http://ivoa.net). The products of the VAO are being archived in a publicly accessible repository. Several science tools developed by the VAO will continue to be supported by the organizations that developed them: the Iris spectral energy distribution package (SAO), the Data Discovery Tool (STScI/MAST, HEASARC), and the scalable cross-comparison service (IPAC). The final year of VAO is focused on development of the data access protocol for data cubes, creation of Python language bindings to VO services, and deployment of a cloud-like data storage service that links to VO data discovery tools (SciDrive). We encourage the community to make use of these tools and services, to extend and improve them, and to carry on with the vision for virtual astronomy: astronomical research enabled by easy access to distributed data and computational resources. Funding for VAO development and operations has been provided jointly by NSF and NASA since May 2010. NSF funding will end in September 2014, though with the possibility of competitive solicitations for VO-based tool development. NASA intends to maintain core VO services such as the resource registry (the index of VO-accessible data collections), monitoring services, and a website as part of the remit of HEASARC, IPAC (IRSA, NED), and MAST.
Enhancing Security by System-Level Virtualization in Cloud Computing Environments
NASA Astrophysics Data System (ADS)
Sun, Dawei; Chang, Guiran; Tan, Chunguang; Wang, Xingwei
Many trends are opening up the era of cloud computing, which will reshape the IT industry. Virtualization techniques have become an indispensable ingredient for almost all cloud computing system. By the virtual environments, cloud provider is able to run varieties of operating systems as needed by each cloud user. Virtualization can improve reliability, security, and availability of applications by using consolidation, isolation, and fault tolerance. In addition, it is possible to balance the workloads by using live migration techniques. In this paper, the definition of cloud computing is given; and then the service and deployment models are introduced. An analysis of security issues and challenges in implementation of cloud computing is identified. Moreover, a system-level virtualization case is established to enhance the security of cloud computing environments.
NASA Astrophysics Data System (ADS)
Montalto, F. A.; Yu, Z.; Soldner, K.; Israel, A.; Fritch, M.; Kim, Y.; White, S.
2017-12-01
Urban stormwater utilities are increasingly using decentralized "green" infrastructure (GI) systems to capture stormwater and achieve compliance with regulations. Because environmental conditions, and design varies by GSI facility, monitoring of GSI systems under a range of conditions is essential. Conventional monitoring efforts can be costly because in-field data logging requires intense data transmission rates. The Internet of Things (IoT) can be used to more cost-effectively collect, store, and publish GSI monitoring data. Using 3G mobile networks, a cloud-based database was built on an Amazon Web Services (AWS) EC2 virtual machine to store and publish data collected with environmental sensors deployed in the field. This database can store multi-dimensional time series data, as well as photos and other observations logged by citizen scientists through a public engagement mobile app through a new Application Programming Interface (API). Also on the AWS EC2 virtual machine, a real-time QAQC flagging algorithm was developed to validate the sensor data streams.
High-Performance Integrated Virtual Environment (HIVE) Tools and Applications for Big Data Analysis.
Simonyan, Vahan; Mazumder, Raja
2014-09-30
The High-performance Integrated Virtual Environment (HIVE) is a high-throughput cloud-based infrastructure developed for the storage and analysis of genomic and associated biological data. HIVE consists of a web-accessible interface for authorized users to deposit, retrieve, share, annotate, compute and visualize Next-generation Sequencing (NGS) data in a scalable and highly efficient fashion. The platform contains a distributed storage library and a distributed computational powerhouse linked seamlessly. Resources available through the interface include algorithms, tools and applications developed exclusively for the HIVE platform, as well as commonly used external tools adapted to operate within the parallel architecture of the system. HIVE is composed of a flexible infrastructure, which allows for simple implementation of new algorithms and tools. Currently, available HIVE tools include sequence alignment and nucleotide variation profiling tools, metagenomic analyzers, phylogenetic tree-building tools using NGS data, clone discovery algorithms, and recombination analysis algorithms. In addition to tools, HIVE also provides knowledgebases that can be used in conjunction with the tools for NGS sequence and metadata analysis.
High-Performance Integrated Virtual Environment (HIVE) Tools and Applications for Big Data Analysis
Simonyan, Vahan; Mazumder, Raja
2014-01-01
The High-performance Integrated Virtual Environment (HIVE) is a high-throughput cloud-based infrastructure developed for the storage and analysis of genomic and associated biological data. HIVE consists of a web-accessible interface for authorized users to deposit, retrieve, share, annotate, compute and visualize Next-generation Sequencing (NGS) data in a scalable and highly efficient fashion. The platform contains a distributed storage library and a distributed computational powerhouse linked seamlessly. Resources available through the interface include algorithms, tools and applications developed exclusively for the HIVE platform, as well as commonly used external tools adapted to operate within the parallel architecture of the system. HIVE is composed of a flexible infrastructure, which allows for simple implementation of new algorithms and tools. Currently, available HIVE tools include sequence alignment and nucleotide variation profiling tools, metagenomic analyzers, phylogenetic tree-building tools using NGS data, clone discovery algorithms, and recombination analysis algorithms. In addition to tools, HIVE also provides knowledgebases that can be used in conjunction with the tools for NGS sequence and metadata analysis. PMID:25271953
A New Cloud Architecture of Virtual Trusted Platform Modules
NASA Astrophysics Data System (ADS)
Liu, Dongxi; Lee, Jack; Jang, Julian; Nepal, Surya; Zic, John
We propose and implement a cloud architecture of virtual Trusted Platform Modules (TPMs) to improve the usability of TPMs. In this architecture, virtual TPMs can be obtained from the TPM cloud on demand. Hence, the TPM functionality is available for applications that do not have physical TPMs in their local platforms. Moreover, the TPM cloud allows users to access their keys and data in the same virtual TPM even if they move to untrusted platforms. The TPM cloud is easy to access for applications in different languages since cloud computing delivers services in standard protocols. The functionality of the TPM cloud is demonstrated by applying it to implement the Needham-Schroeder public-key protocol for web authentications, such that the strong security provided by TPMs is integrated into high level applications. The chain of trust based on the TPM cloud is discussed and the security properties of the virtual TPMs in the cloud is analyzed.
Virtualization and cloud computing in dentistry.
Chow, Frank; Muftu, Ali; Shorter, Richard
2014-01-01
The use of virtualization and cloud computing has changed the way we use computers. Virtualization is a method of placing software called a hypervisor on the hardware of a computer or a host operating system. It allows a guest operating system to run on top of the physical computer with a virtual machine (i.e., virtual computer). Virtualization allows multiple virtual computers to run on top of one physical computer and to share its hardware resources, such as printers, scanners, and modems. This increases the efficient use of the computer by decreasing costs (e.g., hardware, electricity administration, and management) since only one physical computer is needed and running. This virtualization platform is the basis for cloud computing. It has expanded into areas of server and storage virtualization. One of the commonly used dental storage systems is cloud storage. Patient information is encrypted as required by the Health Insurance Portability and Accountability Act (HIPAA) and stored on off-site private cloud services for a monthly service fee. As computer costs continue to increase, so too will the need for more storage and processing power. Virtual and cloud computing will be a method for dentists to minimize costs and maximize computer efficiency in the near future. This article will provide some useful information on current uses of cloud computing.
Hybrid Cloud Computing Environment for EarthCube and Geoscience Community
NASA Astrophysics Data System (ADS)
Yang, C. P.; Qin, H.
2016-12-01
The NSF EarthCube Integration and Test Environment (ECITE) has built a hybrid cloud computing environment to provides cloud resources from private cloud environments by using cloud system software - OpenStack and Eucalyptus, and also manages public cloud - Amazon Web Service that allow resource synchronizing and bursting between private and public cloud. On ECITE hybrid cloud platform, EarthCube and geoscience community can deploy and manage the applications by using base virtual machine images or customized virtual machines, analyze big datasets by using virtual clusters, and real-time monitor the virtual resource usage on the cloud. Currently, a number of EarthCube projects have deployed or started migrating their projects to this platform, such as CHORDS, BCube, CINERGI, OntoSoft, and some other EarthCube building blocks. To accomplish the deployment or migration, administrator of ECITE hybrid cloud platform prepares the specific needs (e.g. images, port numbers, usable cloud capacity, etc.) of each project in advance base on the communications between ECITE and participant projects, and then the scientists or IT technicians in those projects launch one or multiple virtual machines, access the virtual machine(s) to set up computing environment if need be, and migrate their codes, documents or data without caring about the heterogeneity in structure and operations among different cloud platforms.
Maintaining Traceability in an Evolving Distributed Computing Environment
NASA Astrophysics Data System (ADS)
Collier, I.; Wartel, R.
2015-12-01
The management of risk is fundamental to the operation of any distributed computing infrastructure. Identifying the cause of incidents is essential to prevent them from re-occurring. In addition, it is a goal to contain the impact of an incident while keeping services operational. For response to incidents to be acceptable this needs to be commensurate with the scale of the problem. The minimum level of traceability for distributed computing infrastructure usage is to be able to identify the source of all actions (executables, file transfers, pilot jobs, portal jobs, etc.) and the individual who initiated them. In addition, sufficiently fine-grained controls, such as blocking the originating user and monitoring to detect abnormal behaviour, are necessary for keeping services operational. It is essential to be able to understand the cause and to fix any problems before re-enabling access for the user. The aim is to be able to answer the basic questions who, what, where, and when concerning any incident. This requires retaining all relevant information, including timestamps and the digital identity of the user, sufficient to identify, for each service instance, and for every security event including at least the following: connect, authenticate, authorize (including identity changes) and disconnect. In traditional grid infrastructures (WLCG, EGI, OSG etc.) best practices and procedures for gathering and maintaining the information required to maintain traceability are well established. In particular, sites collect and store information required to ensure traceability of events at their sites. With the increased use of virtualisation and private and public clouds for HEP workloads established procedures, which are unable to see 'inside' running virtual machines no longer capture all the information required. Maintaining traceability will at least involve a shift of responsibility from sites to Virtual Organisations (VOs) bringing with it new requirements for their logging infrastructures. VOs indeed need to fulfil a new operational role and become fully active participants in the incident response process. We present an analysis of the changing requirements to maintain traceability for virtualised and cloud based workflows with particular reference to the work of the WLCG Traceability Working Group.
A modular (almost) automatic set-up for elastic multi-tenants cloud (micro)infrastructures
NASA Astrophysics Data System (ADS)
Amoroso, A.; Astorino, F.; Bagnasco, S.; Balashov, N. A.; Bianchi, F.; Destefanis, M.; Lusso, S.; Maggiora, M.; Pellegrino, J.; Yan, L.; Yan, T.; Zhang, X.; Zhao, X.
2017-10-01
An auto-installing tool on an usb drive can allow for a quick and easy automatic deployment of OpenNebula-based cloud infrastructures remotely managed by a central VMDIRAC instance. A single team, in the main site of an HEP Collaboration or elsewhere, can manage and run a relatively large network of federated (micro-)cloud infrastructures, making an highly dynamic and elastic use of computing resources. Exploiting such an approach can lead to modular systems of cloud-bursting infrastructures addressing complex real-life scenarios.
Celesti, Antonio; Fazio, Maria; Romano, Agata; Bramanti, Alessia; Bramanti, Placido; Villari, Massimo
2018-05-01
The Open Archive Information System (OAIS) is a reference model for organizing people and resources in a system, and it is already adopted in care centers and medical systems to efficiently manage clinical data, medical personnel, and patients. Archival storage systems are typically implemented using traditional relational database systems, but the relation-oriented technology strongly limits the efficiency in the management of huge amount of patients' clinical data, especially in emerging cloud-based, that are distributed. In this paper, we present an OAIS healthcare architecture useful to manage a huge amount of HL7 clinical documents in a scalable way. Specifically, it is based on a NoSQL column-oriented Data Base Management System deployed in the cloud, thus to benefit from a big tables and wide rows available over a virtual distributed infrastructure. We developed a prototype of the proposed architecture at the IRCCS, and we evaluated its efficiency in a real case of study.
Flexible services for the support of research.
Turilli, Matteo; Wallom, David; Williams, Chris; Gough, Steve; Curran, Neal; Tarrant, Richard; Bretherton, Dan; Powell, Andy; Johnson, Matt; Harmer, Terry; Wright, Peter; Gordon, John
2013-01-28
Cloud computing has been increasingly adopted by users and providers to promote a flexible, scalable and tailored access to computing resources. Nonetheless, the consolidation of this paradigm has uncovered some of its limitations. Initially devised by corporations with direct control over large amounts of computational resources, cloud computing is now being endorsed by organizations with limited resources or with a more articulated, less direct control over these resources. The challenge for these organizations is to leverage the benefits of cloud computing while dealing with limited and often widely distributed computing resources. This study focuses on the adoption of cloud computing by higher education institutions and addresses two main issues: flexible and on-demand access to a large amount of storage resources, and scalability across a heterogeneous set of cloud infrastructures. The proposed solutions leverage a federated approach to cloud resources in which users access multiple and largely independent cloud infrastructures through a highly customizable broker layer. This approach allows for a uniform authentication and authorization infrastructure, a fine-grained policy specification and the aggregation of accounting and monitoring. Within a loosely coupled federation of cloud infrastructures, users can access vast amount of data without copying them across cloud infrastructures and can scale their resource provisions when the local cloud resources become insufficient.
Virtual Labs (Science Gateways) as platforms for Free and Open Source Science
NASA Astrophysics Data System (ADS)
Lescinsky, David; Car, Nicholas; Fraser, Ryan; Friedrich, Carsten; Kemp, Carina; Squire, Geoffrey
2016-04-01
The Free and Open Source Software (FOSS) movement promotes community engagement in software development, as well as provides access to a range of sophisticated technologies that would be prohibitively expensive if obtained commercially. However, as geoinformatics and eResearch tools and services become more dispersed, it becomes more complicated to identify and interface between the many required components. Virtual Laboratories (VLs, also known as Science Gateways) simplify the management and coordination of these components by providing a platform linking many, if not all, of the steps in particular scientific processes. These enable scientists to focus on their science, rather than the underlying supporting technologies. We describe a modular, open source, VL infrastructure that can be reconfigured to create VLs for a wide range of disciplines. Development of this infrastructure has been led by CSIRO in collaboration with Geoscience Australia and the National Computational Infrastructure (NCI) with support from the National eResearch Collaboration Tools and Resources (NeCTAR) and the Australian National Data Service (ANDS). Initially, the infrastructure was developed to support the Virtual Geophysical Laboratory (VGL), and has subsequently been repurposed to create the Virtual Hazards Impact and Risk Laboratory (VHIRL) and the reconfigured Australian National Virtual Geophysics Laboratory (ANVGL). During each step of development, new capabilities and services have been added and/or enhanced. We plan on continuing to follow this model using a shared, community code base. The VL platform facilitates transparent and reproducible science by providing access to both the data and methodologies used during scientific investigations. This is further enhanced by the ability to set up and run investigations using computational resources accessed through the VL. Data is accessed using registries pointing to catalogues within public data repositories (notably including the NCI National Environmental Research Data Interoperability Platform), or by uploading data directly from user supplied addresses or files. Similarly, scientific software is accessed through registries pointing to software repositories (e.g., GitHub). Runs are configured by using or modifying default templates designed by subject matter experts. After the appropriate computational resources are identified by the user, Virtual Machines (VMs) are spun up and jobs are submitted to service providers (currently the NeCTAR public cloud or Amazon Web Services). Following completion of the jobs the results can be reviewed and downloaded if desired. By providing a unified platform for science, the VL infrastructure enables sophisticated provenance capture and management. The source of input data (including both collection and queries), user information, software information (version and configuration details) and output information are all captured and managed as a VL resource which can be linked to output data sets. This provenance resource provides a mechanism for publication and citation for Free and Open Source Science.
The EPOS Vision for the Open Science Cloud
NASA Astrophysics Data System (ADS)
Jeffery, Keith; Harrison, Matt; Cocco, Massimo
2016-04-01
Cloud computing offers dynamic elastic scalability for data processing on demand. For much research activity, demand for computing is uneven over time and so CLOUD computing offers both cost-effectiveness and capacity advantages. However, as reported repeatedly by the EC Cloud Expert Group, there are barriers to the uptake of Cloud Computing: (1) security and privacy; (2) interoperability (avoidance of lock-in); (3) lack of appropriate systems development environments for application programmers to characterise their applications to allow CLOUD middleware to optimize their deployment and execution. From CERN, the Helix-Nebula group has proposed the architecture for the European Open Science Cloud. They are discussing with other e-Infrastructure groups such as EGI (GRIDs), EUDAT (data curation), AARC (network authentication and authorisation) and also with the EIROFORUM group of 'international treaty' RIs (Research Infrastructures) and the ESFRI (European Strategic Forum for Research Infrastructures) RIs including EPOS. Many of these RIs are either e-RIs (electronic-RIs) or have an e-RI interface for access and use. The EPOS architecture is centred on a portal: ICS (Integrated Core Services). The architectural design already allows for access to e-RIs (which may include any or all of data, software, users and resources such as computers or instruments). Those within any one domain (subject area) of EPOS are considered within the TCS (Thematic Core Services). Those outside, or available across multiple domains of EPOS, are ICS-d (Integrated Core Services-Distributed) since the intention is that they will be used by any or all of the TCS via the ICS. Another such service type is CES (Computational Earth Science); effectively an ICS-d specializing in high performance computation, analytics, simulation or visualization offered by a TCS for others to use. Already discussions are underway between EPOS and EGI, EUDAT, AARC and Helix-Nebula for those offerings to be considered as ICS-ds by EPOS.. Provision of access to ICS-Ds from ICS-C concerns several aspects: (a) Technical : it may be more or less difficult to connect and pass from ICS-C to the ICS-d/ CES the 'package' (probably a virtual machine) of data and software; (b) Security/privacy : including passing personal information e.g. related to AAAI (Authentication, authorization, accounting Infrastructure); (c) financial and legal : such as payment, licence conditions; Appropriate interfaces from ICS-C to ICS-d are being designed to accommodate these aspects. The Open Science Cloud is timely because it provides a framework to discuss governance and sustainability for computational resource provision as well as an effective interpretation of federated approach to HPC(High Performance Computing) -HTC (High Throughput Computing). It will be a unique opportunity to share and adopt procurement policies to provide access to computational resources for RIs. The current state of discussions and expected roadmap for the EPOS-Open Science Cloud relationship are presented.
ACTRIS Aerosol, Clouds and Trace Gases Research Infrastructure
NASA Astrophysics Data System (ADS)
Pappalardo, Gelsomina
2018-04-01
The Aerosols, Clouds and Trace gases Research Infrastructure (ACTRIS) is a distributed infrastructure dedicated to high-quality observation of aerosols, clouds, trace gases and exploration of their interactions. It will deliver precision data, services and procedures regarding the 4D variability of clouds, short-lived atmospheric species and the physical, optical and chemical properties of aerosols to improve the current capacity to analyse, understand and predict past, current and future evolution of the atmospheric environment.
A prototype Infrastructure for Cloud-based distributed services in High Availability over WAN
NASA Astrophysics Data System (ADS)
Bulfon, C.; Carlino, G.; De Salvo, A.; Doria, A.; Graziosi, C.; Pardi, S.; Sanchez, A.; Carboni, M.; Bolletta, P.; Puccio, L.; Capone, V.; Merola, L.
2015-12-01
In this work we present the architectural and performance studies concerning a prototype of a distributed Tier2 infrastructure for HEP, instantiated between the two Italian sites of INFN-Romal and INFN-Napoli. The network infrastructure is based on a Layer-2 geographical link, provided by the Italian NREN (GARR), directly connecting the two remote LANs of the named sites. By exploiting the possibilities offered by the new distributed file systems, a shared storage area with synchronous copy has been set up. The computing infrastructure, based on an OpenStack facility, is using a set of distributed Hypervisors installed in both sites. The main parameter to be taken into account when managing two remote sites with a single framework is the effect of the latency, due to the distance and the end-to-end service overhead. In order to understand the capabilities and limits of our setup, the impact of latency has been investigated by means of a set of stress tests, including data I/O throughput, metadata access performance evaluation and network occupancy, during the life cycle of a Virtual Machine. A set of resilience tests has also been performed, in order to verify the stability of the system on the event of hardware or software faults. The results of this work show that the reliability and robustness of the chosen architecture are effective enough to build a production system and to provide common services. This prototype can also be extended to multiple sites with small changes of the network topology, thus creating a National Network of Cloud-based distributed services, in HA over WAN.
Volunteer Clouds and Citizen Cyberscience for LHC Physics
NASA Astrophysics Data System (ADS)
Aguado Sanchez, Carlos; Blomer, Jakob; Buncic, Predrag; Chen, Gang; Ellis, John; Garcia Quintas, David; Harutyunyan, Artem; Grey, Francois; Lombrana Gonzalez, Daniel; Marquina, Miguel; Mato, Pere; Rantala, Jarno; Schulz, Holger; Segal, Ben; Sharma, Archana; Skands, Peter; Weir, David; Wu, Jie; Wu, Wenjing; Yadav, Rohit
2011-12-01
Computing for the LHC, and for HEP more generally, is traditionally viewed as requiring specialized infrastructure and software environments, and therefore not compatible with the recent trend in "volunteer computing", where volunteers supply free processing time on ordinary PCs and laptops via standard Internet connections. In this paper, we demonstrate that with the use of virtual machine technology, at least some standard LHC computing tasks can be tackled with volunteer computing resources. Specifically, by presenting volunteer computing resources to HEP scientists as a "volunteer cloud", essentially identical to a Grid or dedicated cluster from a job submission perspective, LHC simulations can be processed effectively. This article outlines both the technical steps required for such a solution and the implications for LHC computing as well as for LHC public outreach and for participation by scientists from developing regions in LHC research.
NASA Astrophysics Data System (ADS)
Parfenov, D. I.; Bolodurina, I. P.
2018-05-01
The article presents the results of developing an approach to detecting and protecting against network attacks on the corporate infrastructure deployed on the multi-cloud platform. The proposed approach is based on the combination of two technologies: a softwareconfigurable network and virtualization of network functions. The approach for searching for anomalous traffic is to use a hybrid neural network consisting of a self-organizing Kohonen network and a multilayer perceptron. The study of the work of the prototype of the system for detecting attacks, the method of forming a learning sample, and the course of experiments are described. The study showed that using the proposed approach makes it possible to increase the effectiveness of the obfuscation of various types of attacks and at the same time does not reduce the performance of the network
Virtual Hubs for facilitating access to Open Data
NASA Astrophysics Data System (ADS)
Mazzetti, Paolo; Latre, Miguel Á.; Ernst, Julia; Brumana, Raffaella; Brauman, Stefan; Nativi, Stefano
2015-04-01
In October 2014 the ENERGIC-OD (European NEtwork for Redistributing Geospatial Information to user Communities - Open Data) project, funded by the European Union under the Competitiveness and Innovation framework Programme (CIP), has started. In response to the EU call, the general objective of the project is to "facilitate the use of open (freely available) geographic data from different sources for the creation of innovative applications and services through the creation of Virtual Hubs". In ENERGIC-OD, Virtual Hubs are conceived as information systems supporting the full life cycle of Open Data: publishing, discovery and access. They facilitate the use of Open Data by lowering and possibly removing the main barriers which hampers geo-information (GI) usage by end-users and application developers. Data and data services heterogeneity is recognized as one of the major barriers to Open Data (re-)use. It imposes end-users and developers to spend a lot of effort in accessing different infrastructures and harmonizing datasets. Such heterogeneity cannot be completely removed through the adoption of standard specifications for service interfaces, metadata and data models, since different infrastructures adopt different standards to answer to specific challenges and to address specific use-cases. Thus, beyond a certain extent, heterogeneity is irreducible especially in interdisciplinary contexts. ENERGIC-OD Virtual Hubs address heterogeneity adopting a mediation and brokering approach: specific components (brokers) are dedicated to harmonize service interfaces, metadata and data models, enabling seamless discovery and access to heterogeneous infrastructures and datasets. As an innovation project, ENERGIC-OD will integrate several existing technologies to implement Virtual Hubs as single points of access to geospatial datasets provided by new or existing platforms and infrastructures, including INSPIRE-compliant systems and Copernicus services. ENERGIC OD will deploy a set of five Virtual Hubs (VHs) at national level in France, Germany, Italy, Poland, Spain and an additional one at the European level. VHs will be provided according to the cloud Software-as-a-Services model. The main expected impact of VHs is the creation of new business opportunities opening up access to Research Data and Public Sector Information. Therefore, ENERGIC-OD addresses not only end-users, who will have the opportunity to access the VH through a geo-portal, but also application developers who will be able to access VH functionalities through simple Application Programming Interfaces (API). ENERGIC-OD Consortium will develop ten different applications on top of the deployed VHs. They aim to demonstrate how VHs facilitate the development of new and multidisciplinary applications based on the full exploitation of (open) GI, hence stimulating innovation and business activities.
Yoo, Sooyoung; Kim, Seok; Kim, Taegi; Kim, Jon Soo; Baek, Rong-Min; Suh, Chang Suk; Chung, Chin Youb; Hwang, Hee
2012-12-01
The cloud computing-based virtual desktop infrastructure (VDI) allows access to computing environments with no limitations in terms of time or place such that it can permit the rapid establishment of a mobile hospital environment. The objective of this study was to investigate the empirical issues to be considered when establishing a virtual mobile environment using VDI technology in a hospital setting and to examine the utility of the technology with an Apple iPad during a physician's rounds as a case study. Empirical implementation issues were derived from a 910-bed tertiary national university hospital that recently launched a VDI system. During the physicians' rounds, we surveyed patient satisfaction levels with the VDI-based mobile consultation service with the iPad and the relationship between these levels of satisfaction and hospital revisits, hospital recommendations, and the hospital brand image. Thirty-five inpatients (including their next-of-kin) and seven physicians participated in the survey. Implementation issues pertaining to the VDI system arose with regard to the highly availability system architecture, wireless network infrastructure, and screen resolution of the system. Other issues were related to privacy and security, mobile device management, and user education. When the system was used in rounds, patients and their next-of-kin expressed high satisfaction levels, and a positive relationship was noted as regards patients' decisions to revisit the hospital and whether the use of the VDI system improved the brand image of the hospital. Mobile hospital environments have the potential to benefit both physicians and patients. The issues related to the implementation of VDI system discussed here should be examined in advance for its successful adoption and implementation.
Yoo, Sooyoung; Kim, Seok; Kim, Taegi; Kim, Jon Soo; Baek, Rong-Min; Suh, Chang Suk; Chung, Chin Youb
2012-01-01
Objectives The cloud computing-based virtual desktop infrastructure (VDI) allows access to computing environments with no limitations in terms of time or place such that it can permit the rapid establishment of a mobile hospital environment. The objective of this study was to investigate the empirical issues to be considered when establishing a virtual mobile environment using VDI technology in a hospital setting and to examine the utility of the technology with an Apple iPad during a physician's rounds as a case study. Methods Empirical implementation issues were derived from a 910-bed tertiary national university hospital that recently launched a VDI system. During the physicians' rounds, we surveyed patient satisfaction levels with the VDI-based mobile consultation service with the iPad and the relationship between these levels of satisfaction and hospital revisits, hospital recommendations, and the hospital brand image. Thirty-five inpatients (including their next-of-kin) and seven physicians participated in the survey. Results Implementation issues pertaining to the VDI system arose with regard to the highly availability system architecture, wireless network infrastructure, and screen resolution of the system. Other issues were related to privacy and security, mobile device management, and user education. When the system was used in rounds, patients and their next-of-kin expressed high satisfaction levels, and a positive relationship was noted as regards patients' decisions to revisit the hospital and whether the use of the VDI system improved the brand image of the hospital. Conclusions Mobile hospital environments have the potential to benefit both physicians and patients. The issues related to the implementation of VDI system discussed here should be examined in advance for its successful adoption and implementation. PMID:23346476
Cloud Computing Applications in Support of Earth Science Activities at Marshall Space Flight Center
NASA Technical Reports Server (NTRS)
Molthan, Andrew L.; Limaye, Ashutosh S.; Srikishen, Jayanthi
2011-01-01
Currently, the NASA Nebula Cloud Computing Platform is available to Agency personnel in a pre-release status as the system undergoes a formal operational readiness review. Over the past year, two projects within the Earth Science Office at NASA Marshall Space Flight Center have been investigating the performance and value of Nebula s "Infrastructure as a Service", or "IaaS" concept and applying cloud computing concepts to advance their respective mission goals. The Short-term Prediction Research and Transition (SPoRT) Center focuses on the transition of unique NASA satellite observations and weather forecasting capabilities for use within the operational forecasting community through partnerships with NOAA s National Weather Service (NWS). SPoRT has evaluated the performance of the Weather Research and Forecasting (WRF) model on virtual machines deployed within Nebula and used Nebula instances to simulate local forecasts in support of regional forecast studies of interest to select NWS forecast offices. In addition to weather forecasting applications, rapidly deployable Nebula virtual machines have supported the processing of high resolution NASA satellite imagery to support disaster assessment following the historic severe weather and tornado outbreak of April 27, 2011. Other modeling and satellite analysis activities are underway in support of NASA s SERVIR program, which integrates satellite observations, ground-based data and forecast models to monitor environmental change and improve disaster response in Central America, the Caribbean, Africa, and the Himalayas. Leveraging SPoRT s experience, SERVIR is working to establish a real-time weather forecasting model for Central America. Other modeling efforts include hydrologic forecasts for Kenya, driven by NASA satellite observations and reanalysis data sets provided by the broader meteorological community. Forecast modeling efforts are supplemented by short-term forecasts of convective initiation, determined by geostationary satellite observations processed on virtual machines powered by Nebula.
NASA Astrophysics Data System (ADS)
Maffioletti, Sergio; Dawes, Nicholas; Bavay, Mathias; Sarni, Sofiane; Lehning, Michael
2013-04-01
The Swiss Experiment platform (SwissEx: http://www.swiss-experiment.ch) provides a distributed storage and processing infrastructure for environmental research experiments. The aim of the second phase project (the Open Support Platform for Environmental Research, OSPER, 2012-2015) is to develop the existing infrastructure to provide scientists with an improved workflow. This improved workflow will include pre-defined, documented and connected processing routines. A large-scale computing and data facility is required to provide reliable and scalable access to data for analysis, and it is desirable that such an infrastructure should be free of traditional data handling methods. Such an infrastructure has been developed using the cloud-based part of the Swiss national infrastructure SMSCG (http://www.smscg.ch) and Academic Cloud. The infrastructure under construction supports two main usage models: 1) Ad-hoc data analysis scripts: These scripts are simple processing scripts, written by the environmental researchers themselves, which can be applied to large data sets via the high power infrastructure. Examples of this type of script are spatial statistical analysis scripts (R-based scripts), mostly computed on raw meteorological and/or soil moisture data. These provide processed output in the form of a grid, a plot, or a kml. 2) Complex models: A more intense data analysis pipeline centered (initially) around the physical process model, Alpine3D, and the MeteoIO plugin; depending on the data set, this may require a tightly coupled infrastructure. SMSCG already supports Alpine3D executions as both regular grid jobs and as virtual software appliances. A dedicated appliance with the Alpine3D specific libraries has been created and made available through the SMSCG infrastructure. The analysis pipelines are activated and supervised by simple control scripts that, depending on the data fetched from the meteorological stations, launch new instances of the Alpine3D appliance, execute location-based subroutines at each grid point and store the results back into the central repository for post-processing. An optional extension of this infrastructure will be to provide a 'ring buffer'-type database infrastructure, such that model results (e.g. test runs made to check parameter dependency or for development) can be visualised and downloaded after completion without submitting them to a permanent storage infrastructure. Data organization Data collected from sensors are archived and classified in distributed sites connected with an open-source software middleware, GSN. Publicly available data are available through common web services and via a cloud storage server (based on Swift). Collocation of the data and processing in the cloud would eventually eliminate data transfer requirements. Execution control logic Execution of the data analysis pipelines (for both the R-based analysis and the Alpine3D simulations) has been implemented using the GC3Pie framework developed by UZH. (https://code.google.com/p/gc3pie/). This allows large-scale, fault-tolerant execution of the pipelines to be described in terms of software appliances. GC3Pie also allows supervision of the execution of large campaigns of appliances as a single simulation. This poster will present the fundamental architectural components of the data analysis pipelines together with initial experimental results.
FDA's Activities Supporting Regulatory Application of "Next Gen" Sequencing Technologies.
Wilson, Carolyn A; Simonyan, Vahan
2014-01-01
Applications of next-generation sequencing (NGS) technologies require availability and access to an information technology (IT) infrastructure and bioinformatics tools for large amounts of data storage and analyses. The U.S. Food and Drug Administration (FDA) anticipates that the use of NGS data to support regulatory submissions will continue to increase as the scientific and clinical communities become more familiar with the technologies and identify more ways to apply these advanced methods to support development and evaluation of new biomedical products. FDA laboratories are conducting research on different NGS platforms and developing the IT infrastructure and bioinformatics tools needed to enable regulatory evaluation of the technologies and the data sponsors will submit. A High-performance Integrated Virtual Environment, or HIVE, has been launched, and development and refinement continues as a collaborative effort between the FDA and George Washington University to provide the tools to support these needs. The use of a highly parallelized environment facilitated by use of distributed cloud storage and computation has resulted in a platform that is both rapid and responsive to changing scientific needs. The FDA plans to further develop in-house capacity in this area, while also supporting engagement by the external community, by sponsoring an open, public workshop to discuss NGS technologies and data formats standardization, and to promote the adoption of interoperability protocols in September 2014. Next-generation sequencing (NGS) technologies are enabling breakthroughs in how the biomedical community is developing and evaluating medical products. One example is the potential application of this method to the detection and identification of microbial contaminants in biologic products. In order for the U.S. Food and Drug Administration (FDA) to be able to evaluate the utility of this technology, we need to have the information technology infrastructure and bioinformatics tools to be able to store and analyze large amounts of data. To address this need, we have developed the High-performance Integrated Virtual Environment, or HIVE. HIVE uses a combination of distributed cloud storage and distributed cloud computations to provide a platform that is both rapid and responsive to support the growing and increasingly diverse scientific and regulatory needs of FDA scientists in their evaluation of NGS in research and ultimately for evaluation of NGS data in regulatory submissions. © PDA, Inc. 2014.
Cloud access to interoperable IVOA-compliant VOSpace storage
NASA Astrophysics Data System (ADS)
Bertocco, S.; Dowler, P.; Gaudet, S.; Major, B.; Pasian, F.; Taffoni, G.
2018-07-01
Handling, processing and archiving the huge amount of data produced by the new generation of experiments and instruments in Astronomy and Astrophysics are among the more exciting challenges to address in designing the future data management infrastructures and computing services. We investigated the feasibility of a data management and computation infrastructure, available world-wide, with the aim of merging the FAIR data management provided by IVOA standards with the efficiency and reliability of a cloud approach. Our work involved the Canadian Advanced Network for Astronomy Research (CANFAR) infrastructure and the European EGI federated cloud (EFC). We designed and deployed a pilot data management and computation infrastructure that provides IVOA-compliant VOSpace storage resources and wide access to interoperable federated clouds. In this paper, we detail the main user requirements covered, the technical choices and the implemented solutions and we describe the resulting Hybrid cloud Worldwide infrastructure, its benefits and limitations.
NASA Astrophysics Data System (ADS)
Marcus, Kelvin
2014-06-01
The U.S Army Research Laboratory (ARL) has built a "Network Science Research Lab" to support research that aims to improve their ability to analyze, predict, design, and govern complex systems that interweave the social/cognitive, information, and communication network genres. Researchers at ARL and the Network Science Collaborative Technology Alliance (NS-CTA), a collaborative research alliance funded by ARL, conducted experimentation to determine if automated network monitoring tools and task-aware agents deployed within an emulated tactical wireless network could potentially increase the retrieval of relevant data from heterogeneous distributed information nodes. ARL and NS-CTA required the capability to perform this experimentation over clusters of heterogeneous nodes with emulated wireless tactical networks where each node could contain different operating systems, application sets, and physical hardware attributes. Researchers utilized the Dynamically Allocated Virtual Clustering Management System (DAVC) to address each of the infrastructure support requirements necessary in conducting their experimentation. The DAVC is an experimentation infrastructure that provides the means to dynamically create, deploy, and manage virtual clusters of heterogeneous nodes within a cloud computing environment based upon resource utilization such as CPU load, available RAM and hard disk space. The DAVC uses 802.1Q Virtual LANs (VLANs) to prevent experimentation crosstalk and to allow for complex private networks. Clusters created by the DAVC system can be utilized for software development, experimentation, and integration with existing hardware and software. The goal of this paper is to explore how ARL and the NS-CTA leveraged the DAVC to create, deploy and manage multiple experimentation clusters to support their experimentation goals.
NASA Astrophysics Data System (ADS)
Wyborn, L. A.; Fraser, R.; Evans, B. J. K.; Friedrich, C.; Klump, J. F.; Lescinsky, D. T.
2017-12-01
Virtual Research Environments (VREs) are now part of academic infrastructures. Online research workflows can be orchestrated whereby data can be accessed from multiple external repositories with processing taking place on public or private clouds, and centralised supercomputers using a mixture of user codes, and well-used community software and libraries. VREs enable distributed members of research teams to actively work together to share data, models, tools, software, workflows, best practices, infrastructures, etc. These environments and their components are increasingly able to support the needs of undergraduate teaching. External to the research sector, they can also be reused by citizen scientists, and be repurposed for industry users to help accelerate the diffusion and hence enable the translation of research innovations. The Virtual Geophysics Laboratory (VGL) in Australia was started in 2012, built using a collaboration between CSIRO, the National Computational Infrastructure (NCI) and Geoscience Australia, with support funding from the Australian Government Department of Education. VGL comprises three main modules that provide an interface to enable users to first select their required data; to choose a tool to process that data; and then access compute infrastructure for execution. VGL was initially built to enable a specific set of researchers in government agencies access to specific data sets and a limited number of tools. Over the years it has evolved into a multi-purpose Earth science platform with access to an increased variety of data (e.g., Natural Hazards, Geochemistry), a broader range of software packages, and an increasing diversity of compute infrastructures. This expansion has been possible because of the approach to loosely couple data, tools and compute resources via interfaces that are built on international standards and accessed as network-enabled services wherever possible. Built originally for researchers that were not fussy about general usability, increasing emphasis on User Interfaces (UIs) and stability will lead to increased uptake in the education and industry sectors. Simultaneously, improvements are being added to facilitate access to data and tools by experienced researchers who want direct access to both data and flexible workflows.
Cloud Computing Applications in Support of Earth Science Activities at Marshall Space Flight Center
NASA Astrophysics Data System (ADS)
Molthan, A.; Limaye, A. S.
2011-12-01
Currently, the NASA Nebula Cloud Computing Platform is available to Agency personnel in a pre-release status as the system undergoes a formal operational readiness review. Over the past year, two projects within the Earth Science Office at NASA Marshall Space Flight Center have been investigating the performance and value of Nebula's "Infrastructure as a Service", or "IaaS" concept and applying cloud computing concepts to advance their respective mission goals. The Short-term Prediction Research and Transition (SPoRT) Center focuses on the transition of unique NASA satellite observations and weather forecasting capabilities for use within the operational forecasting community through partnerships with NOAA's National Weather Service (NWS). SPoRT has evaluated the performance of the Weather Research and Forecasting (WRF) model on virtual machines deployed within Nebula and used Nebula instances to simulate local forecasts in support of regional forecast studies of interest to select NWS forecast offices. In addition to weather forecasting applications, rapidly deployable Nebula virtual machines have supported the processing of high resolution NASA satellite imagery to support disaster assessment following the historic severe weather and tornado outbreak of April 27, 2011. Other modeling and satellite analysis activities are underway in support of NASA's SERVIR program, which integrates satellite observations, ground-based data and forecast models to monitor environmental change and improve disaster response in Central America, the Caribbean, Africa, and the Himalayas. Leveraging SPoRT's experience, SERVIR is working to establish a real-time weather forecasting model for Central America. Other modeling efforts include hydrologic forecasts for Kenya, driven by NASA satellite observations and reanalysis data sets provided by the broader meteorological community. Forecast modeling efforts are supplemented by short-term forecasts of convective initiation, determined by geostationary satellite observations processed on virtual machines powered by Nebula. This presentation will provide an overview of these activities from a scientific and cloud computing applications perspective, identifying the strengths and weaknesses for deploying each project within an IaaS environment, and ways to collaborate with the Nebula or other cloud-user communities to collaborate on projects as they go forward.
Facilitating NASA Earth Science Data Processing Using Nebula Cloud Computing
NASA Astrophysics Data System (ADS)
Chen, A.; Pham, L.; Kempler, S.; Theobald, M.; Esfandiari, A.; Campino, J.; Vollmer, B.; Lynnes, C.
2011-12-01
Cloud Computing technology has been used to offer high-performance and low-cost computing and storage resources for both scientific problems and business services. Several cloud computing services have been implemented in the commercial arena, e.g. Amazon's EC2 & S3, Microsoft's Azure, and Google App Engine. There are also some research and application programs being launched in academia and governments to utilize Cloud Computing. NASA launched the Nebula Cloud Computing platform in 2008, which is an Infrastructure as a Service (IaaS) to deliver on-demand distributed virtual computers. Nebula users can receive required computing resources as a fully outsourced service. NASA Goddard Earth Science Data and Information Service Center (GES DISC) migrated several GES DISC's applications to the Nebula as a proof of concept, including: a) The Simple, Scalable, Script-based Science Processor for Measurements (S4PM) for processing scientific data; b) the Atmospheric Infrared Sounder (AIRS) data process workflow for processing AIRS raw data; and c) the GES-DISC Interactive Online Visualization ANd aNalysis Infrastructure (GIOVANNI) for online access to, analysis, and visualization of Earth science data. This work aims to evaluate the practicability and adaptability of the Nebula. The initial work focused on the AIRS data process workflow to evaluate the Nebula. The AIRS data process workflow consists of a series of algorithms being used to process raw AIRS level 0 data and output AIRS level 2 geophysical retrievals. Migrating the entire workflow to the Nebula platform is challenging, but practicable. After installing several supporting libraries and the processing code itself, the workflow is able to process AIRS data in a similar fashion to its current (non-cloud) configuration. We compared the performance of processing 2 days of AIRS level 0 data through level 2 using a Nebula virtual computer and a local Linux computer. The result shows that Nebula has significantly better performance than the local machine. Much of the difference was due to newer equipment in the Nebula than the legacy computer, which is suggestive of a potential economic advantage beyond elastic power, i.e., access to up-to-date hardware vs. legacy hardware that must be maintained past its prime to amortize the cost. In addition to a trade study of advantages and challenges of porting complex processing to the cloud, a tutorial was developed to enable further progress in utilizing the Nebula for Earth Science applications and understanding better the potential for Cloud Computing in further data- and computing-intensive Earth Science research. In particular, highly bursty computing such as that experienced in the user-demand-driven Giovanni system may become more tractable in a Cloud environment. Our future work will continue to focus on migrating more GES DISC's applications/instances, e.g. Giovanni instances, to the Nebula platform and making matured migrated applications to be in operation on the Nebula.
Modeling the Cloud to Enhance Capabilities for Crises and Catastrophe Management
2016-11-16
order for cloud computing infrastructures to be successfully deployed in real world scenarios as tools for crisis and catastrophe management, where...Statement of the Problem Studied As cloud computing becomes the dominant computational infrastructure[1] and cloud technologies make a transition to hosting...1. Formulate rigorous mathematical models representing technological capabilities and resources in cloud computing for performance modeling and
An Effective Mechanism for Virtual Machine Placement using Aco in IAAS Cloud
NASA Astrophysics Data System (ADS)
Shenbaga Moorthy, Rajalakshmi; Fareentaj, U.; Divya, T. K.
2017-08-01
Cloud computing provides an effective way to dynamically provide numerous resources to meet customer demands. A major challenging problem for cloud providers is designing efficient mechanisms for optimal virtual machine Placement (OVMP). Such mechanisms enable the cloud providers to effectively utilize their available resources and obtain higher profits. In order to provide appropriate resources to the clients an optimal virtual machine placement algorithm is proposed. Virtual machine placement is NP-Hard problem. Such NP-Hard problem can be solved using heuristic algorithm. In this paper, Ant Colony Optimization based virtual machine placement is proposed. Our proposed system focuses on minimizing the cost spending in each plan for hosting virtual machines in a multiple cloud provider environment and the response time of each cloud provider is monitored periodically, in such a way to minimize delay in providing the resources to the users. The performance of the proposed algorithm is compared with greedy mechanism. The proposed algorithm is simulated in Eclipse IDE. The results clearly show that the proposed algorithm minimizes the cost, response time and also number of migrations.
2017-03-30
experimental evaluations for hosting DDDAS-like applications in public cloud infrastructures . Finally, we report on ongoing work towards using the DDDAS...developed and their experimental evaluations for hosting DDDAS-like applications in public cloud infrastructures . Finally, we report on ongoing work towards...Dynamic resource management, model learning, simulation-based optimizations, cloud infrastructures for DDDAS applications. I. INTRODUCTION Critical cyber
Managing virtual machines with Vac and Vcycle
NASA Astrophysics Data System (ADS)
McNab, A.; Love, P.; MacMahon, E.
2015-12-01
We compare the Vac and Vcycle virtual machine lifecycle managers and our experiences in providing production job execution services for ATLAS, CMS, LHCb, and the GridPP VO at sites in the UK, France and at CERN. In both the Vac and Vcycle systems, the virtual machines are created outside of the experiment's job submission and pilot framework. In the case of Vac, a daemon runs on each physical host which manages a pool of virtual machines on that host, and a peer-to-peer UDP protocol is used to achieve the desired target shares between experiments across the site. In the case of Vcycle, a daemon manages a pool of virtual machines on an Infrastructure-as-a-Service cloud system such as OpenStack, and has within itself enough information to create the types of virtual machines to achieve the desired target shares. Both systems allow unused shares for one experiment to temporarily taken up by other experiements with work to be done. The virtual machine lifecycle is managed with a minimum of information, gathered from the virtual machine creation mechanism (such as libvirt or OpenStack) and using the proposed Machine/Job Features API from WLCG. We demonstrate that the same virtual machine designs can be used to run production jobs on Vac and Vcycle/OpenStack sites for ATLAS, CMS, LHCb, and GridPP, and that these technologies allow sites to be operated in a reliable and robust way.
Security model for VM in cloud
NASA Astrophysics Data System (ADS)
Kanaparti, Venkataramana; Naveen K., R.; Rajani, S.; Padmvathamma, M.; Anitha, C.
2013-03-01
Cloud computing is a new approach emerged to meet ever-increasing demand for computing resources and to reduce operational costs and Capital Expenditure for IT services. As this new way of computation allows data and applications to be stored away from own corporate server, it brings more issues in security such as virtualization security, distributed computing, application security, identity management, access control and authentication. Even though Virtualization forms the basis for cloud computing it poses many threats in securing cloud. As most of Security threats lies at Virtualization layer in cloud we proposed this new Security Model for Virtual Machine in Cloud (SMVC) in which every process is authenticated by Trusted-Agent (TA) in Hypervisor as well as in VM. Our proposed model is designed to with-stand attacks by unauthorized process that pose threat to applications related to Data Mining, OLAP systems, Image processing which requires huge resources in cloud deployed on one or more VM's.
NASA Astrophysics Data System (ADS)
Murata, K. T.
2014-12-01
Data-intensive or data-centric science is 4th paradigm after observational and/or experimental science (1st paradigm), theoretical science (2nd paradigm) and numerical science (3rd paradigm). Science cloud is an infrastructure for 4th science methodology. The NICT science cloud is designed for big data sciences of Earth, space and other sciences based on modern informatics and information technologies [1]. Data flow on the cloud is through the following three techniques; (1) data crawling and transfer, (2) data preservation and stewardship, and (3) data processing and visualization. Original tools and applications of these techniques have been designed and implemented. We mash up these tools and applications on the NICT Science Cloud to build up customized systems for each project. In this paper, we discuss science data processing through these three steps. For big data science, data file deployment on a distributed storage system should be well designed in order to save storage cost and transfer time. We developed a high-bandwidth virtual remote storage system (HbVRS) and data crawling tool, NICTY/DLA and Wide-area Observation Network Monitoring (WONM) system, respectively. Data files are saved on the cloud storage system according to both data preservation policy and data processing plan. The storage system is developed via distributed file system middle-ware (Gfarm: GRID datafarm). It is effective since disaster recovery (DR) and parallel data processing are carried out simultaneously without moving these big data from storage to storage. Data files are managed on our Web application, WSDBank (World Science Data Bank). The big-data on the cloud are processed via Pwrake, which is a workflow tool with high-bandwidth of I/O. There are several visualization tools on the cloud; VirtualAurora for magnetosphere and ionosphere, VDVGE for google Earth, STICKER for urban environment data and STARStouch for multi-disciplinary data. There are 30 projects running on the NICT Science Cloud for Earth and space science. In 2003 56 refereed papers were published. At the end, we introduce a couple of successful results of Earth and space sciences using these three techniques carried out on the NICT Sciences Cloud. [1] http://sc-web.nict.go.jp
NASA Astrophysics Data System (ADS)
Maskey, Manil; Ramachandran, Rahul; Kuo, Kwo-Sen
2015-04-01
The Collaborative WorkBench (CWB) has been successfully developed to support collaborative science algorithm development. It incorporates many features that enable and enhance science collaboration, including the support for both asynchronous and synchronous modes of interactions in collaborations. With the former, members in a team can share a full range of research artifacts, e.g. data, code, visualizations, and even virtual machine images. With the latter, they can engage in dynamic interactions such as notification, instant messaging, file exchange, and, most notably, collaborative programming. CWB also implements behind-the-scene provenance capture as well as version control to relieve scientists of these chores. Furthermore, it has achieved a seamless integration between researchers' local compute environments and those of the Cloud. CWB has also been successfully extended to support instrument verification and validation. Adopted by almost every researcher, the current practice of downloading data to local compute resources for analysis results in much duplication and inefficiency. CWB leverages Cloud infrastructure to provide a central location for data used by an entire science team, thereby eliminating much of this duplication and waste. Furthermore, use of CWB in concert with this same Cloud infrastructure enables co-located analysis with data where opportunities of data-parallelism can be better exploited, thereby further improving efficiency. With its collaboration-enabling features apposite to steps throughout the scientific process, we expect CWB to fundamentally transform research collaboration and realize maximum science productivity.
[Porting Radiotherapy Software of Varian to Cloud Platform].
Zou, Lian; Zhang, Weisha; Liu, Xiangxiang; Xie, Zhao; Xie, Yaoqin
2017-09-30
To develop a low-cost private cloud platform of radiotherapy software. First, a private cloud platform which was based on OpenStack and the virtual GPU hardware was builded. Then on the private cloud platform, all the Varian radiotherapy software modules were installed to the virtual machine, and the corresponding function configuration was completed. Finally the software on the cloud was able to be accessed by virtual desktop client. The function test results of the cloud workstation show that a cloud workstation is equivalent to an isolated physical workstation, and any clients on the LAN can use the cloud workstation smoothly. The cloud platform transplantation in this study is economical and practical. The project not only improves the utilization rates of radiotherapy software, but also makes it possible that the cloud computing technology can expand its applications to the field of radiation oncology.
Bigdata Driven Cloud Security: A Survey
NASA Astrophysics Data System (ADS)
Raja, K.; Hanifa, Sabibullah Mohamed
2017-08-01
Cloud Computing (CC) is a fast-growing technology to perform massive-scale and complex computing. It eliminates the need to maintain expensive computing hardware, dedicated space, and software. Recently, it has been observed that massive growth in the scale of data or big data generated through cloud computing. CC consists of a front-end, includes the users’ computers and software required to access the cloud network, and back-end consists of various computers, servers and database systems that create the cloud. In SaaS (Software as-a-Service - end users to utilize outsourced software), PaaS (Platform as-a-Service-platform is provided) and IaaS (Infrastructure as-a-Service-physical environment is outsourced), and DaaS (Database as-a-Service-data can be housed within a cloud), where leading / traditional cloud ecosystem delivers the cloud services become a powerful and popular architecture. Many challenges and issues are in security or threats, most vital barrier for cloud computing environment. The main barrier to the adoption of CC in health care relates to Data security. When placing and transmitting data using public networks, cyber attacks in any form are anticipated in CC. Hence, cloud service users need to understand the risk of data breaches and adoption of service delivery model during deployment. This survey deeply covers the CC security issues (covering Data Security in Health care) so as to researchers can develop the robust security application models using Big Data (BD) on CC (can be created / deployed easily). Since, BD evaluation is driven by fast-growing cloud-based applications developed using virtualized technologies. In this purview, MapReduce [12] is a good example of big data processing in a cloud environment, and a model for Cloud providers.
NASA Astrophysics Data System (ADS)
Johnes, P.; Greene, S.; Freer, J. E.; Bloomfield, J.; Macleod, K.; Reaney, S. M.; Odoni, N. A.
2012-12-01
The best outcomes from watershed management arise where policy and mitigation efforts are underpinned by strong science evidence, but there are major resourcing problems associated with the scale of monitoring needed to effectively characterise the sources rates and impacts of nutrient enrichment nationally. The challenge is to increase national capability in predictive modelling of nutrient flux to waters, securing an effective mechanism for transferring knowledge and management tools from data-rich to data-poor regions. The inadequacy of existing tools and approaches to address these challenges provided the motivation for the Environmental Virtual Observatory programme (EVOp), an innovation from the UK Natural Environment Research Council (NERC). EVOp is exploring the use of a cloud-based infrastructure in catchment science, developing an exemplar to explore N and P fluxes to inland and coastal waters in the UK from grid to catchment and national scale. EVOp is bringing together for the first time national data sets, models and uncertainty analysis into cloud computing environments to explore and benchmark current predictive capability for national scale biogeochemical modelling. The objective is to develop national biogeochemical modelling capability, capitalising on extensive national investment in the development of science understanding and modelling tools to support integrated catchment management, and supporting knowledge transfer from data rich to data poor regions, The AERC export coefficient model (Johnes et al., 2007) has been adapted to function within the EVOp cloud environment, and on a geoclimatic basis, using a range of high resolution, geo-referenced digital datasets as an initial demonstration of the enhanced national capacity for N and P flux modelling using cloud computing infrastructure. Geoclimatic regions are landscape units displaying homogenous or quasi-homogenous functional behaviour in terms of process controls on N and P cycling, underpin this approach (Johnes & Butterfield, 2002). Ten regions have been defined across the UK using GIS manipulation of spatial data describing hydrogeology, runoff, topographical slope and soil parent material. The export coefficient model operates within this regional modelling framework, providing mapped, tabulated and statistical outputs at scales from 1km2 grid scale to river catchment, WFD river basin district, major coastal drainage units to the North Sea, North Atlantic and English Channel, to the international reporting units defined under OSPAR, the International Convention for the protection of the marine environment of the North-East Atlantic. Here the geoclimatic modelling framework is presented together with modelled fluxes for N and P for each scale of reporting unit, together with scenario analysis applied at regional scale and mapped at national scale. The ways in which the results can be used to further explore the primary drivers for spatial variation and identify waterbodies at risk, especially in unmonitored and data-poor catchments are discussed, and the technical and computational support of a cloud-based infrastructure is evaluated as a mechanism to explore potential water quality impacts of future mitigation strategies applied at catchment to national scale.
Software Engineering Infrastructure in a Large Virtual Campus
ERIC Educational Resources Information Center
Cristobal, Jesus; Merino, Jorge; Navarro, Antonio; Peralta, Miguel; Roldan, Yolanda; Silveira, Rosa Maria
2011-01-01
Purpose: The design, construction and deployment of a large virtual campus are a complex issue. Present virtual campuses are made of several software applications that complement e-learning platforms. In order to develop and maintain such virtual campuses, a complex software engineering infrastructure is needed. This paper aims to analyse the…
Distributed Processing of Sentinel-2 Products using the BIGEARTH Platform
NASA Astrophysics Data System (ADS)
Bacu, Victor; Stefanut, Teodor; Nandra, Constantin; Mihon, Danut; Gorgan, Dorian
2017-04-01
The constellation of observational satellites orbiting around Earth is constantly increasing, providing more data that need to be processed in order to extract meaningful information and knowledge from it. Sentinel-2 satellites, part of the Copernicus Earth Observation program, aim to be used in agriculture, forestry and many other land management applications. ESA's SNAP toolbox can be used to process data gathered by Sentinel-2 satellites but is limited to the resources provided by a stand-alone computer. In this paper we present a cloud based software platform that makes use of this toolbox together with other remote sensing software applications to process Sentinel-2 products. The BIGEARTH software platform [1] offers an integrated solution for processing Earth Observation data coming from different sources (such as satellites or on-site sensors). The flow of processing is defined as a chain of tasks based on the WorDeL description language [2]. Each task could rely on a different software technology (such as Grass GIS and ESA's SNAP) in order to process the input data. One important feature of the BIGEARTH platform comes from this possibility of interconnection and integration, throughout the same flow of processing, of the various well known software technologies. All this integration is transparent from the user perspective. The proposed platform extends the SNAP capabilities by enabling specialists to easily scale the processing over distributed architectures, according to their specific needs and resources. The software platform [3] can be used in multiple configurations. In the basic one the software platform runs as a standalone application inside a virtual machine. Obviously in this case the computational resources are limited but it will give an overview of the functionalities of the software platform, and also the possibility to define the flow of processing and later on to execute it on a more complex infrastructure. The most complex and robust configuration is based on cloud computing and allows the installation on a private or public cloud infrastructure. In this configuration, the processing resources can be dynamically allocated and the execution time can be considerably improved by the available virtual resources and the number of parallelizable sequences in the processing flow. The presentation highlights the benefits and issues of the proposed solution by analyzing some significant experimental use cases. Main references for further information: [1] BigEarth project, http://cgis.utcluj.ro/projects/bigearth [2] Constantin Nandra, Dorian Gorgan: "Defining Earth data batch processing tasks by means of a flexible workflow description language", ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci., III-4, 59-66, (2016). [3] Victor Bacu, Teodor Stefanut, Dorian Gorgan, "Adaptive Processing of Earth Observation Data on Cloud Infrastructures Based on Workflow Description", Proceedings of the Intelligent Computer Communication and Processing (ICCP), IEEE-Press, pp.444-454, (2015).
On-demand Simulation of Atmospheric Transport Processes on the AlpEnDAC Cloud
NASA Astrophysics Data System (ADS)
Hachinger, S.; Harsch, C.; Meyer-Arnek, J.; Frank, A.; Heller, H.; Giemsa, E.
2016-12-01
The "Alpine Environmental Data Analysis Centre" (AlpEnDAC) develops a data-analysis platform for high-altitude research facilities within the "Virtual Alpine Observatory" project (VAO). This platform, with its web portal, will support use cases going much beyond data management: On user request, the data are augmented with "on-demand" simulation results, such as air-parcel trajectories for tracing down the source of pollutants when they appear in high concentration. The respective back-end mechanism uses the Compute Cloud of the Leibniz Supercomputing Centre (LRZ) to transparently calculate results requested by the user, as far as they have not yet been stored in AlpEnDAC. The queuing-system operation model common in supercomputing is replaced by a model in which Virtual Machines (VMs) on the cloud are automatically created/destroyed, providing the necessary computing power immediately on demand. From a security point of view, this allows to perform simulations in a sandbox defined by the VM configuration, without direct access to a computing cluster. Within few minutes, the user receives conveniently visualized results. The AlpEnDAC infrastructure is distributed among two participating institutes [front-end at German Aerospace Centre (DLR), simulation back-end at LRZ], requiring an efficient mechanism for synchronization of measured and augmented data. We discuss our iRODS-based solution for these data-management tasks as well as the general AlpEnDAC framework. Our cloud-based offerings aim at making scientific computing for our users much more convenient and flexible than it has been, and to allow scientists without a broad background in scientific computing to benefit from complex numerical simulations.
The diverse use of clouds by CMS
Andronis, Anastasios; Bauer, Daniela; Chaze, Olivier; ...
2015-12-23
The resources CMS is using are increasingly being offered as clouds. In Run 2 of the LHC the majority of CMS CERN resources, both in Meyrin and at the Wigner Computing Centre, will be presented as cloud resources on which CMS will have to build its own infrastructure. This infrastructure will need to run all of the CMS workflows including: Tier 0, production and user analysis. In addition, the CMS High Level Trigger will provide a compute resource comparable in scale to the total offered by the CMS Tier 1 sites, when it is not running as part of themore » trigger system. During these periods a cloud infrastructure will be overlaid on this resource, making it accessible for general CMS use. Finally, CMS is starting to utilise cloud resources being offered by individual institutes and is gaining experience to facilitate the use of opportunistically available cloud resources. Lastly, we present a snap shot of this infrastructure and its operation at the time of the CHEP2015 conference.« less
Cloud Environment Automation: from infrastructure deployment to application monitoring
NASA Astrophysics Data System (ADS)
Aiftimiei, C.; Costantini, A.; Bucchi, R.; Italiano, A.; Michelotto, D.; Panella, M.; Pergolesi, M.; Saletta, M.; Traldi, S.; Vistoli, C.; Zizzi, G.; Salomoni, D.
2017-10-01
The potential offered by the cloud paradigm is often limited by technical issues, rules and regulations. In particular, the activities related to the design and deployment of the Infrastructure as a Service (IaaS) cloud layer can be difficult to apply and time-consuming for the infrastructure maintainers. In this paper the research activity, carried out during the Open City Platform (OCP) research project [1], aimed at designing and developing an automatic tool for cloud-based IaaS deployment is presented. Open City Platform is an industrial research project funded by the Italian Ministry of University and Research (MIUR), started in 2014. It intends to research, develop and test new technological solutions open, interoperable and usable on-demand in the field of Cloud Computing, along with new sustainable organizational models that can be deployed for and adopted by the Public Administrations (PA). The presented work and the related outcomes are aimed at simplifying the deployment and maintenance of a complete IaaS cloud-based infrastructure.
2011-10-01
Fortunately, some products offer centralized management and deployment tools for local desktop implementation . Figure 5 illustrates the... implementation of a secure desktop infrastructure based on virtualization. It includes an overview of desktop virtualization, including an in-depth...environment in the data centre, whereas LHVD places it on the endpoint itself. Desktop virtualization implementation considerations and potential
An overview of the DII-HEP OpenStack based CMS data analysis
NASA Astrophysics Data System (ADS)
Osmani, L.; Tarkoma, S.; Eerola, P.; Komu, M.; Kortelainen, M. J.; Kraemer, O.; Lindén, T.; Toor, S.; White, J.
2015-05-01
An OpenStack based private cloud with the Cluster File System has been built and used with both CMS analysis and Monte Carlo simulation jobs in the Datacenter Indirection Infrastructure for Secure High Energy Physics (DII-HEP) project. On the cloud we run the ARC middleware that allows running CMS applications without changes on the job submission side. Our test results indicate that the adopted approach provides a scalable and resilient solution for managing resources without compromising on performance and high availability. To manage the virtual machines (VM) dynamically in an elastic fasion, we are testing the EMI authorization service (Argus) and the Execution Environment Service (Argus-EES). An OpenStackplugin has been developed for Argus-EES. The Host Identity Protocol (HIP) has been designed for mobile networks and it provides a secure method for IP multihoming. HIP separates the end-point identifier and locator role for IP address which increases the network availability for the applications. Our solution leverages HIP for traffic management. This presentation gives an update on the status of the work and our lessons learned in creating an OpenStackbased cloud for HEP.
Design and Development of ChemInfoCloud: An Integrated Cloud Enabled Platform for Virtual Screening.
Karthikeyan, Muthukumarasamy; Pandit, Deepak; Bhavasar, Arvind; Vyas, Renu
2015-01-01
The power of cloud computing and distributed computing has been harnessed to handle vast and heterogeneous data required to be processed in any virtual screening protocol. A cloud computing platorm ChemInfoCloud was built and integrated with several chemoinformatics and bioinformatics tools. The robust engine performs the core chemoinformatics tasks of lead generation, lead optimisation and property prediction in a fast and efficient manner. It has also been provided with some of the bioinformatics functionalities including sequence alignment, active site pose prediction and protein ligand docking. Text mining, NMR chemical shift (1H, 13C) prediction and reaction fingerprint generation modules for efficient lead discovery are also implemented in this platform. We have developed an integrated problem solving cloud environment for virtual screening studies that also provides workflow management, better usability and interaction with end users using container based virtualization, OpenVz.
Elastic Cloud Computing Infrastructures in the Open Cirrus Testbed Implemented via Eucalyptus
NASA Astrophysics Data System (ADS)
Baun, Christian; Kunze, Marcel
Cloud computing realizes the advantages and overcomes some restrictionsof the grid computing paradigm. Elastic infrastructures can easily be createdand managed by cloud users. In order to accelerate the research ondata center management and cloud services the OpenCirrusTM researchtestbed has been started by HP, Intel and Yahoo!. Although commercialcloud offerings are proprietary, Open Source solutions exist in the field ofIaaS with Eucalyptus, PaaS with AppScale and at the applications layerwith Hadoop MapReduce. This paper examines the I/O performance ofcloud computing infrastructures implemented with Eucalyptus in contrastto Amazon S3.
Cloudweaver: Adaptive and Data-Driven Workload Manager for Generic Clouds
NASA Astrophysics Data System (ADS)
Li, Rui; Chen, Lei; Li, Wen-Syan
Cloud computing denotes the latest trend in application development for parallel computing on massive data volumes. It relies on clouds of servers to handle tasks that used to be managed by an individual server. With cloud computing, software vendors can provide business intelligence and data analytic services for internet scale data sets. Many open source projects, such as Hadoop, offer various software components that are essential for building a cloud infrastructure. Current Hadoop (and many others) requires users to configure cloud infrastructures via programs and APIs and such configuration is fixed during the runtime. In this chapter, we propose a workload manager (WLM), called CloudWeaver, which provides automated configuration of a cloud infrastructure for runtime execution. The workload management is data-driven and can adapt to dynamic nature of operator throughput during different execution phases. CloudWeaver works for a single job and a workload consisting of multiple jobs running concurrently, which aims at maximum throughput using a minimum set of processors.
Cost-effective cloud computing: a case study using the comparative genomics tool, roundup.
Kudtarkar, Parul; Deluca, Todd F; Fusaro, Vincent A; Tonellato, Peter J; Wall, Dennis P
2010-12-22
Comparative genomics resources, such as ortholog detection tools and repositories are rapidly increasing in scale and complexity. Cloud computing is an emerging technological paradigm that enables researchers to dynamically build a dedicated virtual cluster and may represent a valuable alternative for large computational tools in bioinformatics. In the present manuscript, we optimize the computation of a large-scale comparative genomics resource-Roundup-using cloud computing, describe the proper operating principles required to achieve computational efficiency on the cloud, and detail important procedures for improving cost-effectiveness to ensure maximal computation at minimal costs. Utilizing the comparative genomics tool, Roundup, as a case study, we computed orthologs among 902 fully sequenced genomes on Amazon's Elastic Compute Cloud. For managing the ortholog processes, we designed a strategy to deploy the web service, Elastic MapReduce, and maximize the use of the cloud while simultaneously minimizing costs. Specifically, we created a model to estimate cloud runtime based on the size and complexity of the genomes being compared that determines in advance the optimal order of the jobs to be submitted. We computed orthologous relationships for 245,323 genome-to-genome comparisons on Amazon's computing cloud, a computation that required just over 200 hours and cost $8,000 USD, at least 40% less than expected under a strategy in which genome comparisons were submitted to the cloud randomly with respect to runtime. Our cost savings projections were based on a model that not only demonstrates the optimal strategy for deploying RSD to the cloud, but also finds the optimal cluster size to minimize waste and maximize usage. Our cost-reduction model is readily adaptable for other comparative genomics tools and potentially of significant benefit to labs seeking to take advantage of the cloud as an alternative to local computing infrastructure.
RAPPORT: running scientific high-performance computing applications on the cloud.
Cohen, Jeremy; Filippis, Ioannis; Woodbridge, Mark; Bauer, Daniela; Hong, Neil Chue; Jackson, Mike; Butcher, Sarah; Colling, David; Darlington, John; Fuchs, Brian; Harvey, Matt
2013-01-28
Cloud computing infrastructure is now widely used in many domains, but one area where there has been more limited adoption is research computing, in particular for running scientific high-performance computing (HPC) software. The Robust Application Porting for HPC in the Cloud (RAPPORT) project took advantage of existing links between computing researchers and application scientists in the fields of bioinformatics, high-energy physics (HEP) and digital humanities, to investigate running a set of scientific HPC applications from these domains on cloud infrastructure. In this paper, we focus on the bioinformatics and HEP domains, describing the applications and target cloud platforms. We conclude that, while there are many factors that need consideration, there is no fundamental impediment to the use of cloud infrastructure for running many types of HPC applications and, in some cases, there is potential for researchers to benefit significantly from the flexibility offered by cloud platforms.
European grid services for global earth science
NASA Astrophysics Data System (ADS)
Brewer, S.; Sipos, G.
2012-04-01
This presentation will provide an overview of the distributed computing services that the European Grid Infrastructure (EGI) offers to the Earth Sciences community and also explain the processes whereby Earth Science users can engage with the infrastructure. One of the main overarching goals for EGI over the coming year is to diversify its user-base. EGI therefore - through the National Grid Initiatives (NGIs) that provide the bulk of resources that make up the infrastructure - offers a number of routes whereby users, either individually or as communities, can make use of its services. At one level there are two approaches to working with EGI: either users can make use of existing resources and contribute to their evolution and configuration; or alternatively they can work with EGI, and hence the NGIs, to incorporate their own resources into the infrastructure to take advantage of EGI's monitoring, networking and managing services. Adopting this approach does not imply a loss of ownership of the resources. Both of these approaches are entirely applicable to the Earth Sciences community. The former because researchers within this field have been involved with EGI (and previously EGEE) as a Heavy User Community and the latter because they have very specific needs, such as incorporating HPC services into their workflows, and these will require multi-skilled interventions to fully provide such services. In addition to the technical support services that EGI has been offering for the last year or so - the applications database, the training marketplace and the Virtual Organisation services - there now exists a dynamic short-term project framework that can be utilised to establish and operate services for Earth Science users. During this talk we will present a summary of various on-going projects that will be of interest to Earth Science users with the intention that suggestions for future projects will emerge from the subsequent discussions: • The Federated Cloud Task Force is already providing a cloud infrastructure through a few committed NGIs. This is being made available to research communities participating in the Task Force and the long-term aim is to integrate these national clouds into a pan-European infrastructure for scientific communities. • The MPI group provides support for application developers to port and scale up parallel applications to the global European Grid Infrastructure. • A lively portal developer and provider community that is able to setup and operate custom, application and/or community specific portals for members of the Earth Science community to interact with EGI. • A project to assess the possibilities for federated identity management in EGI and the readiness of EGI member states for federated authentication and authorisation mechanisms. • Operating resources and user support services to process data with new types of services and infrastructures, such as desktop grids, map-reduce frameworks, GPU clusters.
Future Naval Use of COTS Networking Infrastructure
2009-07-01
user to benefit from Google’s vast databases and computational resources. Obviously, the ability to harness the full power of the Cloud could be... Computing Impact Findings Action Items Take-Aways Appendices: Pages 54-68 A. Terms of Reference Document B. Sample Definitions of Cloud ...and definition of Cloud Computing . While Cloud Computing is developing in many variations – including Infrastructure as a Service (IaaS), Platform as
NASA Astrophysics Data System (ADS)
Meertens, C. M.; Boler, F. M.; Ertz, D. J.; Mencin, D.; Phillips, D.; Baker, S.
2017-12-01
UNAVCO, in its role as a NSF facility for geodetic infrastructure and data, has succeeded for over two decades using on-premises infrastructure, and while the promise of cloud-based infrastructure is well-established, significant questions about suitability of such infrastructure for facility-scale services remain. Primarily through the GeoSciCloud award from NSF EarthCube, UNAVCO is investigating the costs, advantages, and disadvantages of providing its geodetic data and services in the cloud versus using UNAVCO's on-premises infrastructure. (IRIS is a collaborator on the project and is performing its own suite of investigations). In contrast to the 2-3 year time scale for the research cycle, the time scale of operation and planning for NSF facilities is for a minimum of five years and for some services extends to a decade or more. Planning for on-premises infrastructure is deliberate, and migrations typically take months to years to fully implement. Migrations to a cloud environment can only go forward with similar deliberate planning and understanding of all costs and benefits. The EarthCube GeoSciCloud project is intended to address the uncertainties of facility-level operations in the cloud. Investigations are being performed in a commercial cloud environment (Amazon AWS) during the first year of the project and in a private cloud environment (NSF XSEDE resource at the Texas Advanced Computing Center) during the second year. These investigations are expected to illuminate the potential as well as the limitations of running facility scale production services in the cloud. The work includes running parallel equivalent cloud-based services to on premises services and includes: data serving via ftp from a large data store, operation of a metadata database, production scale processing of multiple months of geodetic data, web services delivery of quality checked data and products, large-scale compute services for event post-processing, and serving real time data from a network of 700-plus GPS stations. The evaluation is based on a suite of metrics that we have developed to elucidate the effectiveness of cloud-based services in price, performance, and management. Services are currently running in AWS and evaluation is underway.
A scalable infrastructure for CMS data analysis based on OpenStack Cloud and Gluster file system
NASA Astrophysics Data System (ADS)
Toor, S.; Osmani, L.; Eerola, P.; Kraemer, O.; Lindén, T.; Tarkoma, S.; White, J.
2014-06-01
The challenge of providing a resilient and scalable computational and data management solution for massive scale research environments requires continuous exploration of new technologies and techniques. In this project the aim has been to design a scalable and resilient infrastructure for CERN HEP data analysis. The infrastructure is based on OpenStack components for structuring a private Cloud with the Gluster File System. We integrate the state-of-the-art Cloud technologies with the traditional Grid middleware infrastructure. Our test results show that the adopted approach provides a scalable and resilient solution for managing resources without compromising on performance and high availability.
NASA Astrophysics Data System (ADS)
Angius, S.; Bisegni, C.; Ciuffetti, P.; Di Pirro, G.; Foggetta, L. G.; Galletti, F.; Gargana, R.; Gioscio, E.; Maselli, D.; Mazzitelli, G.; Michelotti, A.; Orrù, R.; Pistoni, M.; Spagnoli, F.; Spigone, D.; Stecchi, A.; Tonto, T.; Tota, M. A.; Catani, L.; Di Giulio, C.; Salina, G.; Buzzi, P.; Checcucci, B.; Lubrano, P.; Piccini, M.; Fattibene, E.; Michelotto, M.; Cavallaro, S. R.; Diana, B. F.; Enrico, F.; Pulvirenti, S.
2016-01-01
The paper is aimed to present the !CHAOS open source project aimed to develop a prototype of a national private Cloud Computing infrastructure, devoted to accelerator control systems and large experiments of High Energy Physics (HEP). The !CHAOS project has been financed by MIUR (Italian Ministry of Research and Education) and aims to develop a new concept of control system and data acquisition framework by providing, with a high level of aaabstraction, all the services needed for controlling and managing a large scientific, or non-scientific, infrastructure. A beta version of the !CHAOS infrastructure will be released at the end of December 2015 and will run on private Cloud infrastructures based on OpenStack.
ERIC Educational Resources Information Center
Conn, Samuel S.; Reichgelt, Han
2013-01-01
Cloud computing represents an architecture and paradigm of computing designed to deliver infrastructure, platforms, and software as constructible computing resources on demand to networked users. As campuses are challenged to better accommodate academic needs for applications and computing environments, cloud computing can provide an accommodating…
Teaching Cybersecurity Using the Cloud
ERIC Educational Resources Information Center
Salah, Khaled; Hammoud, Mohammad; Zeadally, Sherali
2015-01-01
Cloud computing platforms can be highly attractive to conduct course assignments and empower students with valuable and indispensable hands-on experience. In particular, the cloud can offer teaching staff and students (whether local or remote) on-demand, elastic, dedicated, isolated, (virtually) unlimited, and easily configurable virtual machines.…
Virtual Business Operating Environment in the Cloud: Conceptual Architecture and Challenges
NASA Astrophysics Data System (ADS)
Nezhad, Hamid R. Motahari; Stephenson, Bryan; Singhal, Sharad; Castellanos, Malu
Advances in service oriented architecture (SOA) have brought us close to the once imaginary vision of establishing and running a virtual business, a business in which most or all of its business functions are outsourced to online services. Cloud computing offers a realization of SOA in which IT resources are offered as services that are more affordable, flexible and attractive to businesses. In this paper, we briefly study advances in cloud computing, and discuss the benefits of using cloud services for businesses and trade-offs that they have to consider. We then present 1) a layered architecture for the virtual business, and 2) a conceptual architecture for a virtual business operating environment. We discuss the opportunities and research challenges that are ahead of us in realizing the technical components of this conceptual architecture. We conclude by giving the outlook and impact of cloud services on both large and small businesses.
SIMPLEX: Cloud-Enabled Pipeline for the Comprehensive Analysis of Exome Sequencing Data
Fischer, Maria; Snajder, Rene; Pabinger, Stephan; Dander, Andreas; Schossig, Anna; Zschocke, Johannes; Trajanoski, Zlatko; Stocker, Gernot
2012-01-01
In recent studies, exome sequencing has proven to be a successful screening tool for the identification of candidate genes causing rare genetic diseases. Although underlying targeted sequencing methods are well established, necessary data handling and focused, structured analysis still remain demanding tasks. Here, we present a cloud-enabled autonomous analysis pipeline, which comprises the complete exome analysis workflow. The pipeline combines several in-house developed and published applications to perform the following steps: (a) initial quality control, (b) intelligent data filtering and pre-processing, (c) sequence alignment to a reference genome, (d) SNP and DIP detection, (e) functional annotation of variants using different approaches, and (f) detailed report generation during various stages of the workflow. The pipeline connects the selected analysis steps, exposes all available parameters for customized usage, performs required data handling, and distributes computationally expensive tasks either on a dedicated high-performance computing infrastructure or on the Amazon cloud environment (EC2). The presented application has already been used in several research projects including studies to elucidate the role of rare genetic diseases. The pipeline is continuously tested and is publicly available under the GPL as a VirtualBox or Cloud image at http://simplex.i-med.ac.at; additional supplementary data is provided at http://www.icbi.at/exome. PMID:22870267
Cloud-based hospital information system as a service for grassroots healthcare institutions.
Yao, Qin; Han, Xiong; Ma, Xi-Kun; Xue, Yi-Feng; Chen, Yi-Jun; Li, Jing-Song
2014-09-01
Grassroots healthcare institutions (GHIs) are the smallest administrative levels of medical institutions, where most patients access health services. The latest report from the National Bureau of Statistics of China showed that 96.04 % of 950,297 medical institutions in China were at the grassroots level in 2012, including county-level hospitals, township central hospitals, community health service centers, and rural clinics. In developing countries, these institutions are facing challenges involving a shortage of funds and talent, inconsistent medical standards, inefficient information sharing, and difficulties in management during the adoption of health information technologies (HIT). Because of the necessity and gravity for GHIs, our aim is to provide hospital information services for GHIs using Cloud computing technologies and service modes. In this medical scenario, the computing resources are pooled by means of a Cloud-based Virtual Desktop Infrastructure (VDI) to serve multiple GHIs, with different hospital information systems dynamically assigned and reassigned according to demand. This paper is concerned with establishing a Cloud-based Hospital Information Service Center to provide hospital information software as a service (HI-SaaS) with the aim of providing GHIs with an attractive and high-performance medical information service. Compared with individually establishing all hospital information systems, this approach is more cost-effective and affordable for GHIs and does not compromise HIT performance.
Abstracting application deployment on Cloud infrastructures
NASA Astrophysics Data System (ADS)
Aiftimiei, D. C.; Fattibene, E.; Gargana, R.; Panella, M.; Salomoni, D.
2017-10-01
Deploying a complex application on a Cloud-based infrastructure can be a challenging task. In this contribution we present an approach for Cloud-based deployment of applications and its present or future implementation in the framework of several projects, such as “!CHAOS: a cloud of controls” [1], a project funded by MIUR (Italian Ministry of Research and Education) to create a Cloud-based deployment of a control system and data acquisition framework, “INDIGO-DataCloud” [2], an EC H2020 project targeting among other things high-level deployment of applications on hybrid Clouds, and “Open City Platform”[3], an Italian project aiming to provide open Cloud solutions for Italian Public Administrations. We considered to use an orchestration service to hide the complex deployment of the application components, and to build an abstraction layer on top of the orchestration one. Through Heat [4] orchestration service, we prototyped a dynamic, on-demand, scalable platform of software components, based on OpenStack infrastructures. On top of the orchestration service we developed a prototype of a web interface exploiting the Heat APIs. The user can start an instance of the application without having knowledge about the underlying Cloud infrastructure and services. Moreover, the platform instance can be customized by choosing parameters related to the application such as the size of a File System or the number of instances of a NoSQL DB cluster. As soon as the desired platform is running, the web interface offers the possibility to scale some infrastructure components. In this contribution we describe the solution design and implementation, based on the application requirements, the details of the development of both the Heat templates and of the web interface, together with possible exploitation strategies of this work in Cloud data centers.
Economic analysis of cloud-based desktop virtualization implementation at a hospital
2012-01-01
Background Cloud-based desktop virtualization infrastructure (VDI) is known as providing simplified management of application and desktop, efficient management of physical resources, and rapid service deployment, as well as connection to the computer environment at anytime, anywhere with anydevice. However, the economic validity of investing in the adoption of the system at a hospital has not been established. Methods This study computed the actual investment cost of the hospital-wide VDI implementation at the 910-bed Seoul National University Bundang Hospital in Korea and the resulting effects (i.e., reductions in PC errors and difficulties, application and operating system update time, and account management time). Return on investment (ROI), net present value (NPV), and internal rate of return (IRR) indexes used for corporate investment decision-making were used for the economic analysis of VDI implementation. Results The results of five-year cost-benefit analysis given for 400 Virtual Machines (VMs; i.e., 1,100 users in the case of SNUBH) showed that the break-even point was reached in the fourth year of the investment. At that point, the ROI was 122.6%, the NPV was approximately US$192,000, and the IRR showed an investment validity of 10.8%. From our sensitivity analysis to changing the number of VMs (in terms of number of users), the greater the number of adopted VMs was the more investable the system was. Conclusions This study confirms that the emerging VDI can have an economic impact on hospital information system (HIS) operation and utilization in a tertiary hospital setting. PMID:23110661
Economic analysis of cloud-based desktop virtualization implementation at a hospital.
Yoo, Sooyoung; Kim, Seok; Kim, Taeki; Baek, Rong-Min; Suh, Chang Suk; Chung, Chin Youb; Hwang, Hee
2012-10-30
Cloud-based desktop virtualization infrastructure (VDI) is known as providing simplified management of application and desktop, efficient management of physical resources, and rapid service deployment, as well as connection to the computer environment at anytime, anywhere with any device. However, the economic validity of investing in the adoption of the system at a hospital has not been established. This study computed the actual investment cost of the hospital-wide VDI implementation at the 910-bed Seoul National University Bundang Hospital in Korea and the resulting effects (i.e., reductions in PC errors and difficulties, application and operating system update time, and account management time). Return on investment (ROI), net present value (NPV), and internal rate of return (IRR) indexes used for corporate investment decision-making were used for the economic analysis of VDI implementation. The results of five-year cost-benefit analysis given for 400 Virtual Machines (VMs; i.e., 1,100 users in the case of SNUBH) showed that the break-even point was reached in the fourth year of the investment. At that point, the ROI was 122.6%, the NPV was approximately US$192,000, and the IRR showed an investment validity of 10.8%. From our sensitivity analysis to changing the number of VMs (in terms of number of users), the greater the number of adopted VMs was the more investable the system was. This study confirms that the emerging VDI can have an economic impact on hospital information system (HIS) operation and utilization in a tertiary hospital setting.
Evaluating open-source cloud computing solutions for geosciences
NASA Astrophysics Data System (ADS)
Huang, Qunying; Yang, Chaowei; Liu, Kai; Xia, Jizhe; Xu, Chen; Li, Jing; Gui, Zhipeng; Sun, Min; Li, Zhenglong
2013-09-01
Many organizations start to adopt cloud computing for better utilizing computing resources by taking advantage of its scalability, cost reduction, and easy to access characteristics. Many private or community cloud computing platforms are being built using open-source cloud solutions. However, little has been done to systematically compare and evaluate the features and performance of open-source solutions in supporting Geosciences. This paper provides a comprehensive study of three open-source cloud solutions, including OpenNebula, Eucalyptus, and CloudStack. We compared a variety of features, capabilities, technologies and performances including: (1) general features and supported services for cloud resource creation and management, (2) advanced capabilities for networking and security, and (3) the performance of the cloud solutions in provisioning and operating the cloud resources as well as the performance of virtual machines initiated and managed by the cloud solutions in supporting selected geoscience applications. Our study found that: (1) no significant performance differences in central processing unit (CPU), memory and I/O of virtual machines created and managed by different solutions, (2) OpenNebula has the fastest internal network while both Eucalyptus and CloudStack have better virtual machine isolation and security strategies, (3) Cloudstack has the fastest operations in handling virtual machines, images, snapshots, volumes and networking, followed by OpenNebula, and (4) the selected cloud computing solutions are capable for supporting concurrent intensive web applications, computing intensive applications, and small-scale model simulations without intensive data communication.
NASA Astrophysics Data System (ADS)
Yang, Wei; Hall, Trevor
2012-12-01
The Internet is entering an era of cloud computing to provide more cost effective, eco-friendly and reliable services to consumer and business users and the nature of the Internet traffic will undertake a fundamental transformation. Consequently, the current Internet will no longer suffice for serving cloud traffic in metro areas. This work proposes an infrastructure with a unified control plane that integrates simple packet aggregation technology with optical express through the interoperation between IP routers and electrical traffic controllers in optical metro networks. The proposed infrastructure provides flexible, intelligent, and eco-friendly bandwidth on demand for cloud computing in metro areas.
Space Science Cloud: a Virtual Space Science Research Platform Based on Cloud Model
NASA Astrophysics Data System (ADS)
Hu, Xiaoyan; Tong, Jizhou; Zou, Ziming
Through independent and co-operational science missions, Strategic Pioneer Program (SPP) on Space Science, the new initiative of space science program in China which was approved by CAS and implemented by National Space Science Center (NSSC), dedicates to seek new discoveries and new breakthroughs in space science, thus deepen the understanding of universe and planet earth. In the framework of this program, in order to support the operations of space science missions and satisfy the demand of related research activities for e-Science, NSSC is developing a virtual space science research platform based on cloud model, namely the Space Science Cloud (SSC). In order to support mission demonstration, SSC integrates interactive satellite orbit design tool, satellite structure and payloads layout design tool, payload observation coverage analysis tool, etc., to help scientists analyze and verify space science mission designs. Another important function of SSC is supporting the mission operations, which runs through the space satellite data pipelines. Mission operators can acquire and process observation data, then distribute the data products to other systems or issue the data and archives with the services of SSC. In addition, SSC provides useful data, tools and models for space researchers. Several databases in the field of space science are integrated and an efficient retrieve system is developing. Common tools for data visualization, deep processing (e.g., smoothing and filtering tools), analysis (e.g., FFT analysis tool and minimum variance analysis tool) and mining (e.g., proton event correlation analysis tool) are also integrated to help the researchers to better utilize the data. The space weather models on SSC include magnetic storm forecast model, multi-station middle and upper atmospheric climate model, solar energetic particle propagation model and so on. All the services above-mentioned are based on the e-Science infrastructures of CAS e.g. cloud storage and cloud computing. SSC provides its users with self-service storage and computing resources at the same time.At present, the prototyping of SSC is underway and the platform is expected to be put into trial operation in August 2014. We hope that as SSC develops, our vision of Digital Space may come true someday.
NASA Astrophysics Data System (ADS)
Bada, Adedayo; Wang, Qi; Alcaraz-Calero, Jose M.; Grecos, Christos
2016-04-01
This paper proposes a new approach to improving the application of 3D video rendering and streaming by jointly exploring and optimizing both cloud-based virtualization and web-based delivery. The proposed web service architecture firstly establishes a software virtualization layer based on QEMU (Quick Emulator), an open-source virtualization software that has been able to virtualize system components except for 3D rendering, which is still in its infancy. The architecture then explores the cloud environment to boost the speed of the rendering at the QEMU software virtualization layer. The capabilities and inherent limitations of Virgil 3D, which is one of the most advanced 3D virtual Graphics Processing Unit (GPU) available, are analyzed through benchmarking experiments and integrated into the architecture to further speed up the rendering. Experimental results are reported and analyzed to demonstrate the benefits of the proposed approach.
Use of containerisation as an alternative to full virtualisation in grid environments.
NASA Astrophysics Data System (ADS)
Long, Robin
2015-12-01
Virtualisation is a key tool on the grid. It can be used to provide varying work environments or as part of a cloud infrastructure. Virtualisation itself carries certain overheads that decrease the performance of the system through requiring extra resources to virtualise the software and hardware stack, and CPU-cycles wasted instantiating or destroying virtual machines for each job. With the rise and improvements in containerisation, where only the software stack is kept separate and no hardware or kernel virtualisation is used, there is scope for speed improvements and efficiency increases over standard virtualisation. We compare containerisation and virtualisation, including a comparison against bare-metal machines as a benchmark.
NASA Technical Reports Server (NTRS)
Ido, Haisam; Burns, Rich
2015-01-01
The NASA Goddard Space Science Mission Operations project (SSMO) is performing a technical cost-benefit analysis for centralizing and consolidating operations of a diverse set of missions into a unified and integrated technical infrastructure. The presentation will focus on the notion of normalizing spacecraft operations processes, workflows, and tools. It will also show the processes of creating a standardized open architecture, creating common security models and implementations, interfaces, services, automations, notifications, alerts, logging, publish, subscribe and middleware capabilities. The presentation will also discuss how to leverage traditional capabilities, along with virtualization, cloud computing services, control groups and containers, and possibly Big Data concepts.
Cloud computing can simplify HIT infrastructure management.
Glaser, John
2011-08-01
Software as a Service (SaaS), built on cloud computing technology, is emerging as the forerunner in IT infrastructure because it helps healthcare providers reduce capital investments. Cloud computing leads to predictable, monthly, fixed operating expenses for hospital IT staff. Outsourced cloud computing facilities are state-of-the-art data centers boasting some of the most sophisticated networking equipment on the market. The SaaS model helps hospitals safeguard against technology obsolescence, minimizes maintenance requirements, and simplifies management.
NASA Astrophysics Data System (ADS)
Klump, Jens; Fraser, Ryan; Wyborn, Lesley; Friedrich, Carsten; Squire, Geoffrey; Barker, Michelle; Moloney, Glenn
2017-04-01
The researcher of today is likely to be part of a team distributed over multiple sites that will access data from an external repository and then process the data on a public or private cloud or even on a large centralised supercomputer. They are increasingly likely to use a mixture of their own code, third party software and libraries, or even access global community codes. These components will be connected into a Virtual Research Environments (VREs) that will enable members of the research team who are not co-located to actively work together at various scales to share data, models, tools, software, workflows, best practices, infrastructures, etc. Many VRE's are built in isolation: designed to meet a specific research program with components tightly coupled and not capable of being repurposed for other use cases - they are becoming 'stovepipes'. The limited number of users of some VREs also means that the cost of maintenance per researcher can be unacceptably high. The alternative is to develop service-oriented Science Platforms that enable multiple communities to develop specialised solutions for specific research programs. The platforms can offer access to data, software tools and processing infrastructures (cloud, supercomputers) through globally distributed, interconnected modules. In Australia, the Virtual Geophysics Laboratory (VGL) was initially built to enable a specific set of researchers in government agencies access to specific data sets and a limited number of tools, that is now rapidly evolving into a multi-purpose Earth science platform with access to an increased variety of data, a broader range of tools, users from more sectors and a diversity of computational infrastructures. The expansion has been relatively easy, because of the architecture whereby data, tools and compute resources are loosely coupled via interfaces that are built on international standards and accessed as services wherever possible. In recent years, investments in discoverability and accessibility of data via online services in Australia mean that data resources can be easily added to the virtual environments as and when required. Another key to increasing to reusability and uptake of the VRE is the capability to capturing workflows so that they can be reused and repurposed both within and beyond the community that that defined the original use case. Unfortunately, Software-as-a-Service in the research sector is not yet mature. In response, we developed a Scientific Software solutions Center (SSSC) that enables researchers to discover, deploy and then share computational codes, code snippets or processes both in a human and machine-readable manner. Growth has come not only from within the Earth science community but from the Australian Virtual Laboratory community which is building VREs for a diversity of communities such as astronomy, genomics, environment, humanities, climate etc. Components such as access control, provenance, visualisation, accounting etc. are common to all scientific domains and sharing of these across multiple domains reduces costs, but more importantly increases the ability to undertake interdisciplinary science. These efforts are transitioning VREs to more sustainable Service-oriented Science Platforms that can be delivered in an agile, adaptable manner for broader community interests.
Virtualization for the LHCb Online system
NASA Astrophysics Data System (ADS)
Bonaccorsi, Enrico; Brarda, Loic; Moine, Gary; Neufeld, Niko
2011-12-01
Virtualization has long been advertised by the IT-industry as a way to cut down cost, optimise resource usage and manage the complexity in large data-centers. The great number and the huge heterogeneity of hardware, both industrial and custom-made, has up to now led to reluctance in the adoption of virtualization in the IT infrastructure of large experiment installations. Our experience in the LHCb experiment has shown that virtualization improves the availability and the manageability of the whole system. We have done an evaluation of available hypervisors / virtualization solutions and find that the Microsoft HV technology provides a high level of maturity and flexibility for our purpose. We present the results of these comparison tests, describing in detail, the architecture of our virtualization infrastructure with a special emphasis on the security for services visible to the outside world. Security is achieved by a sophisticated combination of VLANs, firewalls and virtual routing - the cost and benefits of this solution are analysed. We have adapted our cluster management tools, notably Quattor, for the needs of virtual machines and this allows us to migrate smoothly services on physical machines to the virtualized infrastructure. The procedures for migration will also be described. In the final part of the document we describe our recent R&D activities aiming to replacing the SAN-backend for the virtualization by a cheaper iSCSI solution - this will allow to move all servers and related services to the virtualized infrastructure, excepting the ones doing hardware control via non-commodity PCI plugin cards.
Grid accounting service: state and future development
NASA Astrophysics Data System (ADS)
Levshina, T.; Sehgal, C.; Bockelman, B.; Weitzel, D.; Guru, A.
2014-06-01
During the last decade, large-scale federated distributed infrastructures have been continually developed and expanded. One of the crucial components of a cyber-infrastructure is an accounting service that collects data related to resource utilization and identity of users using resources. The accounting service is important for verifying pledged resource allocation per particular groups and users, providing reports for funding agencies and resource providers, and understanding hardware provisioning requirements. It can also be used for end-to-end troubleshooting as well as billing purposes. In this work we describe Gratia, a federated accounting service jointly developed at Fermilab and Holland Computing Center at University of Nebraska-Lincoln. The Open Science Grid, Fermilab, HCC, and several other institutions have used Gratia in production for several years. The current development activities include expanding Virtual Machines provisioning information, XSEDE allocation usage accounting, and Campus Grids resource utilization. We also identify the direction of future work: improvement and expansion of Cloud accounting, persistent and elastic storage space allocation, and the incorporation of WAN and LAN network metrics.
Data Mining as a Service (DMaaS)
NASA Astrophysics Data System (ADS)
Tejedor, E.; Piparo, D.; Mascetti, L.; Moscicki, J.; Lamanna, M.; Mato, P.
2016-10-01
Data Mining as a Service (DMaaS) is a software and computing infrastructure that allows interactive mining of scientific data in the cloud. It allows users to run advanced data analyses by leveraging the widely adopted Jupyter notebook interface. Furthermore, the system makes it easier to share results and scientific code, access scientific software, produce tutorials and demonstrations as well as preserve the analyses of scientists. This paper describes how a first pilot of the DMaaS service is being deployed at CERN, starting from the notebook interface that has been fully integrated with the ROOT analysis framework, in order to provide all the tools for scientists to run their analyses. Additionally, we characterise the service backend, which combines a set of IT services such as user authentication, virtual computing infrastructure, mass storage, file synchronisation, development portals or batch systems. The added value acquired by the combination of the aforementioned categories of services is discussed, focusing on the opportunities offered by the CERNBox synchronisation service and its massive storage backend, EOS.
García-Peñalvo, Francisco J.; Pérez-Blanco, Jonás Samuel; Martín-Suárez, Ana
2014-01-01
This paper discusses how cloud-based architectures can extend and enhance the functionality of the training environments based on virtual worlds and how, from this cloud perspective, we can provide support to analysis of training processes in the area of health, specifically in the field of training processes in quality assurance for pharmaceutical laboratories, presenting a tool for data retrieval and analysis that allows facing the knowledge discovery in the happenings inside the virtual worlds. PMID:24778593
A Big Data Platform for Storing, Accessing, Mining and Learning Geospatial Data
NASA Astrophysics Data System (ADS)
Yang, C. P.; Bambacus, M.; Duffy, D.; Little, M. M.
2017-12-01
Big Data is becoming a norm in geoscience domains. A platform that is capable to effiently manage, access, analyze, mine, and learn the big data for new information and knowledge is desired. This paper introduces our latest effort on developing such a platform based on our past years' experiences on cloud and high performance computing, analyzing big data, comparing big data containers, and mining big geospatial data for new information. The platform includes four layers: a) the bottom layer includes a computing infrastructure with proper network, computer, and storage systems; b) the 2nd layer is a cloud computing layer based on virtualization to provide on demand computing services for upper layers; c) the 3rd layer is big data containers that are customized for dealing with different types of data and functionalities; d) the 4th layer is a big data presentation layer that supports the effient management, access, analyses, mining and learning of big geospatial data.
Chen, Zhijia; Zhu, Yuanchang; Di, Yanqiang; Feng, Shaochong
2015-01-01
In IaaS (infrastructure as a service) cloud environment, users are provisioned with virtual machines (VMs). To allocate resources for users dynamically and effectively, accurate resource demands predicting is essential. For this purpose, this paper proposes a self-adaptive prediction method using ensemble model and subtractive-fuzzy clustering based fuzzy neural network (ESFCFNN). We analyze the characters of user preferences and demands. Then the architecture of the prediction model is constructed. We adopt some base predictors to compose the ensemble model. Then the structure and learning algorithm of fuzzy neural network is researched. To obtain the number of fuzzy rules and the initial value of the premise and consequent parameters, this paper proposes the fuzzy c-means combined with subtractive clustering algorithm, that is, the subtractive-fuzzy clustering. Finally, we adopt different criteria to evaluate the proposed method. The experiment results show that the method is accurate and effective in predicting the resource demands. PMID:25691896
Galaxy CloudMan: delivering cloud compute clusters.
Afgan, Enis; Baker, Dannon; Coraor, Nate; Chapman, Brad; Nekrutenko, Anton; Taylor, James
2010-12-21
Widespread adoption of high-throughput sequencing has greatly increased the scale and sophistication of computational infrastructure needed to perform genomic research. An alternative to building and maintaining local infrastructure is "cloud computing", which, in principle, offers on demand access to flexible computational infrastructure. However, cloud computing resources are not yet suitable for immediate "as is" use by experimental biologists. We present a cloud resource management system that makes it possible for individual researchers to compose and control an arbitrarily sized compute cluster on Amazon's EC2 cloud infrastructure without any informatics requirements. Within this system, an entire suite of biological tools packaged by the NERC Bio-Linux team (http://nebc.nerc.ac.uk/tools/bio-linux) is available for immediate consumption. The provided solution makes it possible, using only a web browser, to create a completely configured compute cluster ready to perform analysis in less than five minutes. Moreover, we provide an automated method for building custom deployments of cloud resources. This approach promotes reproducibility of results and, if desired, allows individuals and labs to add or customize an otherwise available cloud system to better meet their needs. The expected knowledge and associated effort with deploying a compute cluster in the Amazon EC2 cloud is not trivial. The solution presented in this paper eliminates these barriers, making it possible for researchers to deploy exactly the amount of computing power they need, combined with a wealth of existing analysis software, to handle the ongoing data deluge.
Galaxy CloudMan: delivering cloud compute clusters
2010-01-01
Background Widespread adoption of high-throughput sequencing has greatly increased the scale and sophistication of computational infrastructure needed to perform genomic research. An alternative to building and maintaining local infrastructure is “cloud computing”, which, in principle, offers on demand access to flexible computational infrastructure. However, cloud computing resources are not yet suitable for immediate “as is” use by experimental biologists. Results We present a cloud resource management system that makes it possible for individual researchers to compose and control an arbitrarily sized compute cluster on Amazon’s EC2 cloud infrastructure without any informatics requirements. Within this system, an entire suite of biological tools packaged by the NERC Bio-Linux team (http://nebc.nerc.ac.uk/tools/bio-linux) is available for immediate consumption. The provided solution makes it possible, using only a web browser, to create a completely configured compute cluster ready to perform analysis in less than five minutes. Moreover, we provide an automated method for building custom deployments of cloud resources. This approach promotes reproducibility of results and, if desired, allows individuals and labs to add or customize an otherwise available cloud system to better meet their needs. Conclusions The expected knowledge and associated effort with deploying a compute cluster in the Amazon EC2 cloud is not trivial. The solution presented in this paper eliminates these barriers, making it possible for researchers to deploy exactly the amount of computing power they need, combined with a wealth of existing analysis software, to handle the ongoing data deluge. PMID:21210983
LEMON - LHC Era Monitoring for Large-Scale Infrastructures
NASA Astrophysics Data System (ADS)
Marian, Babik; Ivan, Fedorko; Nicholas, Hook; Hector, Lansdale Thomas; Daniel, Lenkes; Miroslav, Siket; Denis, Waldron
2011-12-01
At the present time computer centres are facing a massive rise in virtualization and cloud computing as these solutions bring advantages to service providers and consolidate the computer centre resources. However, as a result the monitoring complexity is increasing. Computer centre management requires not only to monitor servers, network equipment and associated software but also to collect additional environment and facilities data (e.g. temperature, power consumption, cooling efficiency, etc.) to have also a good overview of the infrastructure performance. The LHC Era Monitoring (Lemon) system is addressing these requirements for a very large scale infrastructure. The Lemon agent that collects data on every client and forwards the samples to the central measurement repository provides a flexible interface that allows rapid development of new sensors. The system allows also to report on behalf of remote devices such as switches and power supplies. Online and historical data can be visualized via a web-based interface or retrieved via command-line tools. The Lemon Alarm System component can be used for notifying the operator about error situations. In this article, an overview of the Lemon monitoring is provided together with a description of the CERN LEMON production instance. No direct comparison is made with other monitoring tool.
Exploring the Strategies for a Community College Transition into a Cloud-Computing Environment
ERIC Educational Resources Information Center
DeBary, Narges
2017-01-01
The use of the Internet has resulted in the birth of an innovative virtualization technology called cloud computing. Virtualization can tremendously improve the instructional and operational systems of a community college. Although the incidental adoption of the cloud solutions in the community colleges of higher education has been increased,…
Maestro: an orchestration framework for large-scale WSN simulations.
Riliskis, Laurynas; Osipov, Evgeny
2014-03-18
Contemporary wireless sensor networks (WSNs) have evolved into large and complex systems and are one of the main technologies used in cyber-physical systems and the Internet of Things. Extensive research on WSNs has led to the development of diverse solutions at all levels of software architecture, including protocol stacks for communications. This multitude of solutions is due to the limited computational power and restrictions on energy consumption that must be accounted for when designing typical WSN systems. It is therefore challenging to develop, test and validate even small WSN applications, and this process can easily consume significant resources. Simulations are inexpensive tools for testing, verifying and generally experimenting with new technologies in a repeatable fashion. Consequently, as the size of the systems to be tested increases, so does the need for large-scale simulations. This article describes a tool called Maestro for the automation of large-scale simulation and investigates the feasibility of using cloud computing facilities for such task. Using tools that are built into Maestro, we demonstrate a feasible approach for benchmarking cloud infrastructure in order to identify cloud Virtual Machine (VM)instances that provide an optimal balance of performance and cost for a given simulation.
Machine learning patterns for neuroimaging-genetic studies in the cloud.
Da Mota, Benoit; Tudoran, Radu; Costan, Alexandru; Varoquaux, Gaël; Brasche, Goetz; Conrod, Patricia; Lemaitre, Herve; Paus, Tomas; Rietschel, Marcella; Frouin, Vincent; Poline, Jean-Baptiste; Antoniu, Gabriel; Thirion, Bertrand
2014-01-01
Brain imaging is a natural intermediate phenotype to understand the link between genetic information and behavior or brain pathologies risk factors. Massive efforts have been made in the last few years to acquire high-dimensional neuroimaging and genetic data on large cohorts of subjects. The statistical analysis of such data is carried out with increasingly sophisticated techniques and represents a great computational challenge. Fortunately, increasing computational power in distributed architectures can be harnessed, if new neuroinformatics infrastructures are designed and training to use these new tools is provided. Combining a MapReduce framework (TomusBLOB) with machine learning algorithms (Scikit-learn library), we design a scalable analysis tool that can deal with non-parametric statistics on high-dimensional data. End-users describe the statistical procedure to perform and can then test the model on their own computers before running the very same code in the cloud at a larger scale. We illustrate the potential of our approach on real data with an experiment showing how the functional signal in subcortical brain regions can be significantly fit with genome-wide genotypes. This experiment demonstrates the scalability and the reliability of our framework in the cloud with a 2 weeks deployment on hundreds of virtual machines.
Maestro: An Orchestration Framework for Large-Scale WSN Simulations
Riliskis, Laurynas; Osipov, Evgeny
2014-01-01
Contemporary wireless sensor networks (WSNs) have evolved into large and complex systems and are one of the main technologies used in cyber-physical systems and the Internet of Things. Extensive research on WSNs has led to the development of diverse solutions at all levels of software architecture, including protocol stacks for communications. This multitude of solutions is due to the limited computational power and restrictions on energy consumption that must be accounted for when designing typical WSN systems. It is therefore challenging to develop, test and validate even small WSN applications, and this process can easily consume significant resources. Simulations are inexpensive tools for testing, verifying and generally experimenting with new technologies in a repeatable fashion. Consequently, as the size of the systems to be tested increases, so does the need for large-scale simulations. This article describes a tool called Maestro for the automation of large-scale simulation and investigates the feasibility of using cloud computing facilities for such task. Using tools that are built into Maestro, we demonstrate a feasible approach for benchmarking cloud infrastructure in order to identify cloud Virtual Machine (VM)instances that provide an optimal balance of performance and cost for a given simulation. PMID:24647123
Use of Docker for deployment and testing of astronomy software
NASA Astrophysics Data System (ADS)
Morris, D.; Voutsinas, S.; Hambly, N. C.; Mann, R. G.
2017-07-01
We describe preliminary investigations of using Docker for the deployment and testing of astronomy software. Docker is a relatively new containerization technology that is developing rapidly and being adopted across a range of domains. It is based upon virtualization at operating system level, which presents many advantages in comparison to the more traditional hardware virtualization that underpins most cloud computing infrastructure today. A particular strength of Docker is its simple format for describing and managing software containers, which has benefits for software developers, system administrators and end users. We report on our experiences from two projects - a simple activity to demonstrate how Docker works, and a more elaborate set of services that demonstrates more of its capabilities and what they can achieve within an astronomical context - and include an account of how we solved problems through interaction with Docker's very active open source development community, which is currently the key to the most effective use of this rapidly-changing technology.
NASA Astrophysics Data System (ADS)
Bolton, Richard W.; Dewey, Allen; Horstmann, Paul W.; Laurentiev, John
1997-01-01
This paper examines the role virtual enterprises will have in supporting future business engagements and resulting technology requirements. Two representative end-user scenarios are proposed that define the requirements for 'plug-and-play' information infrastructure frameworks and architectures necessary to enable 'virtual enterprises' in US manufacturing industries. The scenarios provide a high- level 'needs analysis' for identifying key technologies, defining a reference architecture, and developing compliant reference implementations. Virtual enterprises are short- term consortia or alliances of companies formed to address fast-changing opportunities. Members of a virtual enterprise carry out their tasks as if they all worked for a single organization under 'one roof', using 'plug-and-play' information infrastructure frameworks and architectures to access and manage all information needed to support the product cycle. 'Plug-and-play' information infrastructure frameworks and architectures are required to enhance collaboration between companies corking together on different aspects of a manufacturing process. This new form of collaborative computing will decrease cycle-time and increase responsiveness to change.
Kononowicz, Andrzej A; Berman, Anne H; Stathakarou, Natalia; McGrath, Cormac; Bartyński, Tomasz; Nowakowski, Piotr; Malawski, Maciej; Zary, Nabil
2015-09-10
Massive open online courses (MOOCs) have been criticized for focusing on presentation of short video clip lectures and asking theoretical multiple-choice questions. A potential way of vitalizing these educational activities in the health sciences is to introduce virtual patients. Experiences from such extensions in MOOCs have not previously been reported in the literature. This study analyzes technical challenges and solutions for offering virtual patients in health-related MOOCs and describes patterns of virtual patient use in one such course. Our aims are to reduce the technical uncertainty related to these extensions, point to aspects that could be optimized for a better learner experience, and raise prospective research questions by describing indicators of virtual patient use on a massive scale. The Behavioral Medicine MOOC was offered by Karolinska Institutet, a medical university, on the EdX platform in the autumn of 2014. Course content was enhanced by two virtual patient scenarios presented in the OpenLabyrinth system and hosted on the VPH-Share cloud infrastructure. We analyzed web server and session logs and a participant satisfaction survey. Navigation pathways were summarized using a visual analytics tool developed for the purpose of this study. The number of course enrollments reached 19,236. At the official closing date, 2317 participants (12.1% of total enrollment) had declared completing the first virtual patient assignment and 1640 (8.5%) participants confirmed completion of the second virtual patient assignment. Peak activity involved 359 user sessions per day. The OpenLabyrinth system, deployed on four virtual servers, coped well with the workload. Participant survey respondents (n=479) regarded the activity as a helpful exercise in the course (83.1%). Technical challenges reported involved poor or restricted access to videos in certain areas of the world and occasional problems with lost sessions. The visual analyses of user pathways display the parts of virtual patient scenarios that elicited less interest and may have been perceived as nonchallenging options. Analyzing the user navigation pathways allowed us to detect indications of both surface and deep approaches to the content material among the MOOC participants. This study reported on first inclusion of virtual patients in a MOOC. It adds to the body of knowledge by demonstrating how a biomedical cloud provider service can ensure technical capacity and flexible design of a virtual patient platform on a massive scale. The study also presents a new way of analyzing the use of branched virtual patients by visualization of user navigation pathways. Suggestions are offered on improvements to the design of virtual patients in MOOCs.
Berman, Anne H; Stathakarou, Natalia; McGrath, Cormac; Bartyński, Tomasz; Nowakowski, Piotr; Malawski, Maciej; Zary, Nabil
2015-01-01
Background Massive open online courses (MOOCs) have been criticized for focusing on presentation of short video clip lectures and asking theoretical multiple-choice questions. A potential way of vitalizing these educational activities in the health sciences is to introduce virtual patients. Experiences from such extensions in MOOCs have not previously been reported in the literature. Objective This study analyzes technical challenges and solutions for offering virtual patients in health-related MOOCs and describes patterns of virtual patient use in one such course. Our aims are to reduce the technical uncertainty related to these extensions, point to aspects that could be optimized for a better learner experience, and raise prospective research questions by describing indicators of virtual patient use on a massive scale. Methods The Behavioral Medicine MOOC was offered by Karolinska Institutet, a medical university, on the EdX platform in the autumn of 2014. Course content was enhanced by two virtual patient scenarios presented in the OpenLabyrinth system and hosted on the VPH-Share cloud infrastructure. We analyzed web server and session logs and a participant satisfaction survey. Navigation pathways were summarized using a visual analytics tool developed for the purpose of this study. Results The number of course enrollments reached 19,236. At the official closing date, 2317 participants (12.1% of total enrollment) had declared completing the first virtual patient assignment and 1640 (8.5%) participants confirmed completion of the second virtual patient assignment. Peak activity involved 359 user sessions per day. The OpenLabyrinth system, deployed on four virtual servers, coped well with the workload. Participant survey respondents (n=479) regarded the activity as a helpful exercise in the course (83.1%). Technical challenges reported involved poor or restricted access to videos in certain areas of the world and occasional problems with lost sessions. The visual analyses of user pathways display the parts of virtual patient scenarios that elicited less interest and may have been perceived as nonchallenging options. Analyzing the user navigation pathways allowed us to detect indications of both surface and deep approaches to the content material among the MOOC participants. Conclusions This study reported on first inclusion of virtual patients in a MOOC. It adds to the body of knowledge by demonstrating how a biomedical cloud provider service can ensure technical capacity and flexible design of a virtual patient platform on a massive scale. The study also presents a new way of analyzing the use of branched virtual patients by visualization of user navigation pathways. Suggestions are offered on improvements to the design of virtual patients in MOOCs. PMID:27731844
EGI-EUDAT integration activity - Pair data and high-throughput computing resources together
NASA Astrophysics Data System (ADS)
Scardaci, Diego; Viljoen, Matthew; Vitlacil, Dejan; Fiameni, Giuseppe; Chen, Yin; sipos, Gergely; Ferrari, Tiziana
2016-04-01
EGI (www.egi.eu) is a publicly funded e-infrastructure put together to give scientists access to more than 530,000 logical CPUs, 200 PB of disk capacity and 300 PB of tape storage to drive research and innovation in Europe. The infrastructure provides both high throughput computing and cloud compute/storage capabilities. Resources are provided by about 350 resource centres which are distributed across 56 countries in Europe, the Asia-Pacific region, Canada and Latin America. EUDAT (www.eudat.eu) is a collaborative Pan-European infrastructure providing research data services, training and consultancy for researchers, research communities, research infrastructures and data centres. EUDAT's vision is to enable European researchers and practitioners from any research discipline to preserve, find, access, and process data in a trusted environment, as part of a Collaborative Data Infrastructure (CDI) conceived as a network of collaborating, cooperating centres, combining the richness of numerous community-specific data repositories with the permanence and persistence of some of Europe's largest scientific data centres. EGI and EUDAT, in the context of their flagship projects, EGI-Engage and EUDAT2020, started in March 2015 a collaboration to harmonise the two infrastructures, including technical interoperability, authentication, authorisation and identity management, policy and operations. The main objective of this work is to provide end-users with a seamless access to an integrated infrastructure offering both EGI and EUDAT services and, then, pairing data and high-throughput computing resources together. To define the roadmap of this collaboration, EGI and EUDAT selected a set of relevant user communities, already collaborating with both infrastructures, which could bring requirements and help to assign the right priorities to each of them. In this way, from the beginning, this activity has been really driven by the end users. The identified user communities are relevant European Research infrastructure in the field of Earth Science (EPOS and ICOS), Bioinformatics (BBMRI and ELIXIR) and Space Physics (EISCAT-3D). The first outcome of this activity has been the definition of a generic use case that captures the typical user scenario with respect the integrated use of the EGI and EUDAT infrastructures. This generic use case allows a user to instantiate a set of Virtual Machine images on the EGI Federated Cloud to perform computational jobs that analyse data previously stored on EUDAT long-term storage systems. The results of such analysis can be staged back to EUDAT storages, and if needed, allocated with Permanent identifyers (PIDs) for future use. The implementation of this generic use case requires the following integration activities between EGI and EUDAT: (1) harmonisation of the user authentication and authorisation models, (2) implementing interface connectors between the relevant EGI and EUDAT services, particularly EGI Cloud compute facilities and EUDAT long-term storage and PID systems. In the presentation, the collected user requirements and the implementation status of the universal use case will be showed. Furthermore, how the universal use case is currently applied to satisfy EPOS and ICOS needs will be described.
NASA Astrophysics Data System (ADS)
Aneri, Parikh; Sumathy, S.
2017-11-01
Cloud computing provides services over the internet and provides application resources and data to the users based on their demand. Base of the Cloud Computing is consumer provider model. Cloud provider provides resources which consumer can access using cloud computing model in order to build their application based on their demand. Cloud data center is a bulk of resources on shared pool architecture for cloud user to access. Virtualization is the heart of the Cloud computing model, it provides virtual machine as per application specific configuration and those applications are free to choose their own configuration. On one hand, there is huge number of resources and on other hand it has to serve huge number of requests effectively. Therefore, resource allocation policy and scheduling policy play very important role in allocation and managing resources in this cloud computing model. This paper proposes the load balancing policy using Hungarian algorithm. Hungarian Algorithm provides dynamic load balancing policy with a monitor component. Monitor component helps to increase cloud resource utilization by managing the Hungarian algorithm by monitoring its state and altering its state based on artificial intelligent. CloudSim used in this proposal is an extensible toolkit and it simulates cloud computing environment.
PACE: Proactively Secure Accumulo with Cryptographic Enforcement
2017-05-27
Abstract—Cloud-hosted databases have many compelling ben- efits, including high availability , flexible resource allocation, and resiliency to attack...infrastructure to the cloud. This move is motivated by the cloud’s increased availability , flexibility, and resilience [1]. Most importantly, the cloud enables...a level of availability and performance that would be impossible for many companies to achieve using their own infrastructure. For example, using a
On-demand provisioning of HEP compute resources on cloud sites and shared HPC centers
NASA Astrophysics Data System (ADS)
Erli, G.; Fischer, F.; Fleig, G.; Giffels, M.; Hauth, T.; Quast, G.; Schnepf, M.; Heese, J.; Leppert, K.; Arnaez de Pedro, J.; Sträter, R.
2017-10-01
This contribution reports on solutions, experiences and recent developments with the dynamic, on-demand provisioning of remote computing resources for analysis and simulation workflows. Local resources of a physics institute are extended by private and commercial cloud sites, ranging from the inclusion of desktop clusters over institute clusters to HPC centers. Rather than relying on dedicated HEP computing centers, it is nowadays more reasonable and flexible to utilize remote computing capacity via virtualization techniques or container concepts. We report on recent experience from incorporating a remote HPC center (NEMO Cluster, Freiburg University) and resources dynamically requested from the commercial provider 1&1 Internet SE into our intitute’s computing infrastructure. The Freiburg HPC resources are requested via the standard batch system, allowing HPC and HEP applications to be executed simultaneously, such that regular batch jobs run side by side to virtual machines managed via OpenStack [1]. For the inclusion of the 1&1 commercial resources, a Python API and SDK as well as the possibility to upload images were available. Large scale tests prove the capability to serve the scientific use case in the European 1&1 datacenters. The described environment at the Institute of Experimental Nuclear Physics (IEKP) at KIT serves the needs of researchers participating in the CMS and Belle II experiments. In total, resources exceeding half a million CPU hours have been provided by remote sites.
Toward a web-based real-time radiation treatment planning system in a cloud computing environment.
Na, Yong Hum; Suh, Tae-Suk; Kapp, Daniel S; Xing, Lei
2013-09-21
To exploit the potential dosimetric advantages of intensity modulated radiation therapy (IMRT) and volumetric modulated arc therapy (VMAT), an in-depth approach is required to provide efficient computing methods. This needs to incorporate clinically related organ specific constraints, Monte Carlo (MC) dose calculations, and large-scale plan optimization. This paper describes our first steps toward a web-based real-time radiation treatment planning system in a cloud computing environment (CCE). The Amazon Elastic Compute Cloud (EC2) with a master node (named m2.xlarge containing 17.1 GB of memory, two virtual cores with 3.25 EC2 Compute Units each, 420 GB of instance storage, 64-bit platform) is used as the backbone of cloud computing for dose calculation and plan optimization. The master node is able to scale the workers on an 'on-demand' basis. MC dose calculation is employed to generate accurate beamlet dose kernels by parallel tasks. The intensity modulation optimization uses total-variation regularization (TVR) and generates piecewise constant fluence maps for each initial beam direction in a distributed manner over the CCE. The optimized fluence maps are segmented into deliverable apertures. The shape of each aperture is iteratively rectified to be a sequence of arcs using the manufacture's constraints. The output plan file from the EC2 is sent to the simple storage service. Three de-identified clinical cancer treatment plans have been studied for evaluating the performance of the new planning platform with 6 MV flattening filter free beams (40 × 40 cm(2)) from the Varian TrueBeam(TM) STx linear accelerator. A CCE leads to speed-ups of up to 14-fold for both dose kernel calculations and plan optimizations in the head and neck, lung, and prostate cancer cases considered in this study. The proposed system relies on a CCE that is able to provide an infrastructure for parallel and distributed computing. The resultant plans from the cloud computing are identical to PC-based IMRT and VMAT plans, confirming the reliability of the cloud computing platform. This cloud computing infrastructure has been established for a radiation treatment planning. It substantially improves the speed of inverse planning and makes future on-treatment adaptive re-planning possible.
Toward a web-based real-time radiation treatment planning system in a cloud computing environment
NASA Astrophysics Data System (ADS)
Hum Na, Yong; Suh, Tae-Suk; Kapp, Daniel S.; Xing, Lei
2013-09-01
To exploit the potential dosimetric advantages of intensity modulated radiation therapy (IMRT) and volumetric modulated arc therapy (VMAT), an in-depth approach is required to provide efficient computing methods. This needs to incorporate clinically related organ specific constraints, Monte Carlo (MC) dose calculations, and large-scale plan optimization. This paper describes our first steps toward a web-based real-time radiation treatment planning system in a cloud computing environment (CCE). The Amazon Elastic Compute Cloud (EC2) with a master node (named m2.xlarge containing 17.1 GB of memory, two virtual cores with 3.25 EC2 Compute Units each, 420 GB of instance storage, 64-bit platform) is used as the backbone of cloud computing for dose calculation and plan optimization. The master node is able to scale the workers on an ‘on-demand’ basis. MC dose calculation is employed to generate accurate beamlet dose kernels by parallel tasks. The intensity modulation optimization uses total-variation regularization (TVR) and generates piecewise constant fluence maps for each initial beam direction in a distributed manner over the CCE. The optimized fluence maps are segmented into deliverable apertures. The shape of each aperture is iteratively rectified to be a sequence of arcs using the manufacture’s constraints. The output plan file from the EC2 is sent to the simple storage service. Three de-identified clinical cancer treatment plans have been studied for evaluating the performance of the new planning platform with 6 MV flattening filter free beams (40 × 40 cm2) from the Varian TrueBeamTM STx linear accelerator. A CCE leads to speed-ups of up to 14-fold for both dose kernel calculations and plan optimizations in the head and neck, lung, and prostate cancer cases considered in this study. The proposed system relies on a CCE that is able to provide an infrastructure for parallel and distributed computing. The resultant plans from the cloud computing are identical to PC-based IMRT and VMAT plans, confirming the reliability of the cloud computing platform. This cloud computing infrastructure has been established for a radiation treatment planning. It substantially improves the speed of inverse planning and makes future on-treatment adaptive re-planning possible.
ERIC Educational Resources Information Center
Liao, Yuan
2011-01-01
The virtualization of computing resources, as represented by the sustained growth of cloud computing, continues to thrive. Information Technology departments are building their private clouds due to the perception of significant cost savings by managing all physical computing resources from a single point and assigning them to applications or…
NASA Astrophysics Data System (ADS)
Hu, X.; Zou, Z.
2017-12-01
For the next decades, comprehensive big data application environment is the dominant direction of cyberinfrastructure development on space science. To make the concept of such BIG cyberinfrastructure (e.g. Digital Space) a reality, these aspects of capability should be focused on and integrated, which includes science data system, digital space engine, big data application (tools and models) and the IT infrastructure. In the past few years, CAS Chinese Space Science Data Center (CSSDC) has made a helpful attempt in this direction. A cloud-enabled virtual research platform on space science, called Solar-Terrestrial and Astronomical Research Network (STAR-Network), has been developed to serve the full lifecycle of space science missions and research activities. It integrated a wide range of disciplinary and interdisciplinary resources, to provide science-problem-oriented data retrieval and query service, collaborative mission demonstration service, mission operation supporting service, space weather computing and Analysis service and other self-help service. This platform is supported by persistent infrastructure, including cloud storage, cloud computing, supercomputing and so on. Different variety of resource are interconnected: the science data can be displayed on the browser by visualization tools, the data analysis tools and physical models can be drived by the applicable science data, the computing results can be saved on the cloud, for example. So far, STAR-Network has served a series of space science mission in China, involving Strategic Pioneer Program on Space Science (this program has invested some space science satellite as DAMPE, HXMT, QUESS, and more satellite will be launched around 2020) and Meridian Space Weather Monitor Project. Scientists have obtained some new findings by using the science data from these missions with STAR-Network's contribution. We are confident that STAR-Network is an exciting practice of new cyberinfrastructure architecture on space science.
Open Source Dataturbine (OSDT) Android Sensorpod in Environmental Observing Systems
NASA Astrophysics Data System (ADS)
Fountain, T. R.; Shin, P.; Tilak, S.; Trinh, T.; Smith, J.; Kram, S.
2014-12-01
The OSDT Android SensorPod is a custom-designed mobile computing platform for assembling wireless sensor networks for environmental monitoring applications. Funded by an award from the Gordon and Betty Moore Foundation, the OSDT SensorPod represents a significant technological advance in the application of mobile and cloud computing technologies to near-real-time applications in environmental science, natural resources management, and disaster response and recovery. It provides a modular architecture based on open standards and open-source software that allows system developers to align their projects with industry best practices and technology trends, while avoiding commercial vendor lock-in to expensive proprietary software and hardware systems. The integration of mobile and cloud-computing infrastructure represents a disruptive technology in the field of environmental science, since basic assumptions about technology requirements are now open to revision, e.g., the roles of special purpose data loggers and dedicated site infrastructure. The OSDT Android SensorPod was designed with these considerations in mind, and the resulting system exhibits the following characteristics - it is flexible, efficient and robust. The system was developed and tested in the three science applications: 1) a fresh water limnology deployment in Wisconsin, 2) a near coastal marine science deployment at the UCSD Scripps Pier, and 3) a terrestrial ecological deployment in the mountains of Taiwan. As part of a public education and outreach effort, a Facebook page with daily ocean pH measurements from the UCSD Scripps pier was developed. Wireless sensor networks and the virtualization of data and network services is the future of environmental science infrastructure. The OSDT Android SensorPod was designed and developed to harness these new technology developments for environmental monitoring applications.
Global Software Development with Cloud Platforms
NASA Astrophysics Data System (ADS)
Yara, Pavan; Ramachandran, Ramaseshan; Balasubramanian, Gayathri; Muthuswamy, Karthik; Chandrasekar, Divya
Offshore and outsourced distributed software development models and processes are facing challenges, previously unknown, with respect to computing capacity, bandwidth, storage, security, complexity, reliability, and business uncertainty. Clouds promise to address these challenges by adopting recent advances in virtualization, parallel and distributed systems, utility computing, and software services. In this paper, we envision a cloud-based platform that addresses some of these core problems. We outline a generic cloud architecture, its design and our first implementation results for three cloud forms - a compute cloud, a storage cloud and a cloud-based software service- in the context of global distributed software development (GSD). Our ”compute cloud” provides computational services such as continuous code integration and a compile server farm, ”storage cloud” offers storage (block or file-based) services with an on-line virtual storage service, whereas the on-line virtual labs represent a useful cloud service. We note some of the use cases for clouds in GSD, the lessons learned with our prototypes and identify challenges that must be conquered before realizing the full business benefits. We believe that in the future, software practitioners will focus more on these cloud computing platforms and see clouds as a means to supporting a ecosystem of clients, developers and other key stakeholders.
Cost-Effective Cloud Computing: A Case Study Using the Comparative Genomics Tool, Roundup
Kudtarkar, Parul; DeLuca, Todd F.; Fusaro, Vincent A.; Tonellato, Peter J.; Wall, Dennis P.
2010-01-01
Background Comparative genomics resources, such as ortholog detection tools and repositories are rapidly increasing in scale and complexity. Cloud computing is an emerging technological paradigm that enables researchers to dynamically build a dedicated virtual cluster and may represent a valuable alternative for large computational tools in bioinformatics. In the present manuscript, we optimize the computation of a large-scale comparative genomics resource—Roundup—using cloud computing, describe the proper operating principles required to achieve computational efficiency on the cloud, and detail important procedures for improving cost-effectiveness to ensure maximal computation at minimal costs. Methods Utilizing the comparative genomics tool, Roundup, as a case study, we computed orthologs among 902 fully sequenced genomes on Amazon’s Elastic Compute Cloud. For managing the ortholog processes, we designed a strategy to deploy the web service, Elastic MapReduce, and maximize the use of the cloud while simultaneously minimizing costs. Specifically, we created a model to estimate cloud runtime based on the size and complexity of the genomes being compared that determines in advance the optimal order of the jobs to be submitted. Results We computed orthologous relationships for 245,323 genome-to-genome comparisons on Amazon’s computing cloud, a computation that required just over 200 hours and cost $8,000 USD, at least 40% less than expected under a strategy in which genome comparisons were submitted to the cloud randomly with respect to runtime. Our cost savings projections were based on a model that not only demonstrates the optimal strategy for deploying RSD to the cloud, but also finds the optimal cluster size to minimize waste and maximize usage. Our cost-reduction model is readily adaptable for other comparative genomics tools and potentially of significant benefit to labs seeking to take advantage of the cloud as an alternative to local computing infrastructure. PMID:21258651
Identification of Program Signatures from Cloud Computing System Telemetry Data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nichols, Nicole M.; Greaves, Mark T.; Smith, William P.
Malicious cloud computing activity can take many forms, including running unauthorized programs in a virtual environment. Detection of these malicious activities while preserving the privacy of the user is an important research challenge. Prior work has shown the potential viability of using cloud service billing metrics as a mechanism for proxy identification of malicious programs. Previously this novel detection method has been evaluated in a synthetic and isolated computational environment. In this paper we demonstrate the ability of billing metrics to identify programs, in an active cloud computing environment, including multiple virtual machines running on the same hypervisor. The openmore » source cloud computing platform OpenStack, is used for private cloud management at Pacific Northwest National Laboratory. OpenStack provides a billing tool (Ceilometer) to collect system telemetry measurements. We identify four different programs running on four virtual machines under the same cloud user account. Programs were identified with up to 95% accuracy. This accuracy is dependent on the distinctiveness of telemetry measurements for the specific programs we tested. Future work will examine the scalability of this approach for a larger selection of programs to better understand the uniqueness needed to identify a program. Additionally, future work should address the separation of signatures when multiple programs are running on the same virtual machine.« less
The HEPiX Virtualisation Working Group: Towards a Grid of Clouds
NASA Astrophysics Data System (ADS)
Cass, Tony
2012-12-01
The use of virtual machine images, as for example with Cloud services such as Amazon's Elastic Compute Cloud, is attractive for users as they have a guaranteed execution environment, something that cannot today be provided across sites participating in computing grids such as the Worldwide LHC Computing Grid. However, Grid sites often operate within computer security frameworks which preclude the use of remotely generated images. The HEPiX Virtualisation Working Group was setup with the objective to enable use of remotely generated virtual machine images at Grid sites and, to this end, has introduced the idea of trusted virtual machine images which are guaranteed to be secure and configurable by sites such that security policy commitments can be met. This paper describes the requirements and details of these trusted virtual machine images and presents a model for their use to facilitate the integration of Grid- and Cloud-based computing environments for High Energy Physics.
Enabling Research Network Connectivity to Clouds with Virtual Router Technology
NASA Astrophysics Data System (ADS)
Seuster, R.; Casteels, K.; Leavett-Brown, CR; Paterson, M.; Sobie, RJ
2017-10-01
The use of opportunistic cloud resources by HEP experiments has significantly increased over the past few years. Clouds that are owned or managed by the HEP community are connected to the LHCONE network or the research network with global access to HEP computing resources. Private clouds, such as those supported by non-HEP research funds are generally connected to the international research network; however, commercial clouds are either not connected to the research network or only connect to research sites within their national boundaries. Since research network connectivity is a requirement for HEP applications, we need to find a solution that provides a high-speed connection. We are studying a solution with a virtual router that will address the use case when a commercial cloud has research network connectivity in a limited region. In this situation, we host a virtual router in our HEP site and require that all traffic from the commercial site transit through the virtual router. Although this may increase the network path and also the load on the HEP site, it is a workable solution that would enable the use of the remote cloud for low I/O applications. We are exploring some simple open-source solutions. In this paper, we present the results of our studies and how it will benefit our use of private and public clouds for HEP computing.
NASA Astrophysics Data System (ADS)
Lengert, Wolfgang; Farres, Jordi; Lanari, Riccardo; Casu, Francesco; Manunta, Michele; Lassalle-Balier, Gerard
2014-05-01
Helix Nebula has established a growing public private partnership of more than 30 commercial cloud providers, SMEs, and publicly funded research organisations and e-infrastructures. The Helix Nebula strategy is to establish a federated cloud service across Europe. Three high-profile flagships, sponsored by CERN (high energy physics), EMBL (life sciences) and ESA/DLR/CNES/CNR (earth science), have been deployed and extensively tested within this federated environment. The commitments behind these initial flagships have created a critical mass that attracts suppliers and users to the initiative, to work together towards an "Information as a Service" market place. Significant progress in implementing the following 4 programmatic goals (as outlined in the strategic Plan Ref.1) has been achieved: × Goal #1 Establish a Cloud Computing Infrastructure for the European Research Area (ERA) serving as a platform for innovation and evolution of the overall infrastructure. × Goal #2 Identify and adopt suitable policies for trust, security and privacy on a European-level can be provided by the European Cloud Computing framework and infrastructure. × Goal #3 Create a light-weight governance structure for the future European Cloud Computing Infrastructure that involves all the stakeholders and can evolve over time as the infrastructure, services and user-base grows. × Goal #4 Define a funding scheme involving the three stake-holder groups (service suppliers, users, EC and national funding agencies) into a Public-Private-Partnership model to implement a Cloud Computing Infrastructure that delivers a sustainable business environment adhering to European level policies. Now in 2014 a first version of this generic cross-domain e-infrastructure is ready to go into operations building on federation of European industry and contributors (data, tools, knowledge, ...). This presentation describes how Helix Nebula is being used in the domain of earth science focusing on geohazards. The so called "Supersite Exploitation Platform" (SSEP) provides scientists an overarching federated e-infrastructure with a very fast access to (i) large volume of data (EO/non-space data), (ii) computing resources (e.g. hybrid cloud/grid), (iii) processing software (e.g. toolboxes, RTMs, retrieval baselines, visualization routines), and (iv) general platform capabilities (e.g. user management and access control, accounting, information portal, collaborative tools, social networks etc.). In this federation each data provider remains in full control of the implementation of its data policy. This presentation outlines the Architecture (technical and services) supporting very heterogeneous science domains as well as the procedures for new-comers to join the Helix Nebula Market Place. Ref.1 http://cds.cern.ch/record/1374172/files/CERN-OPEN-2011-036.pdf
Thundercloud: Domain specific information security training for the smart grid
NASA Astrophysics Data System (ADS)
Stites, Joseph
In this paper, we describe a cloud-based virtual smart grid test bed: ThunderCloud, which is intended to be used for domain-specific security training applicable to the smart grid environment. The test bed consists of virtual machines connected using a virtual internal network. ThunderCloud is remotely accessible, allowing students to undergo educational exercises online. We also describe a series of practical exercises that we have developed for providing the domain-specific training using ThunderCloud. The training exercises and attacks are designed to be realistic and to reflect known vulnerabilities and attacks reported in the smart grid environment. We were able to use ThunderCloud to offer practical domain-specific security training for smart grid environment to computer science students at little or no cost to the department and no risk to any real networks or systems.
Design, Results, Evolution and Status of the ATLAS Simulation at Point1 Project
NASA Astrophysics Data System (ADS)
Ballestrero, S.; Batraneanu, S. M.; Brasolin, F.; Contescu, C.; Fazio, D.; Di Girolamo, A.; Lee, C. J.; Pozo Astigarraga, M. E.; Scannicchio, D. A.; Sedov, A.; Twomey, M. S.; Wang, F.; Zaytsev, A.
2015-12-01
During the LHC Long Shutdown 1 (LSI) period, that started in 2013, the Simulation at Point1 (Sim@P1) project takes advantage, in an opportunistic way, of the TDAQ (Trigger and Data Acquisition) HLT (High-Level Trigger) farm of the ATLAS experiment. This farm provides more than 1300 compute nodes, which are particularly suited for running event generation and Monte Carlo production jobs that are mostly CPU and not I/O bound. It is capable of running up to 2700 Virtual Machines (VMs) each with 8 CPU cores, for a total of up to 22000 parallel jobs. This contribution gives a review of the design, the results, and the evolution of the Sim@P1 project, operating a large scale OpenStack based virtualized platform deployed on top of the ATLAS TDAQ HLT farm computing resources. During LS1, Sim@P1 was one of the most productive ATLAS sites: it delivered more than 33 million CPU-hours and it generated more than 1.1 billion Monte Carlo events. The design aspects are presented: the virtualization platform exploited by Sim@P1 avoids interferences with TDAQ operations and it guarantees the security and the usability of the ATLAS private network. The cloud mechanism allows the separation of the needed support on both infrastructural (hardware, virtualization layer) and logical (Grid site support) levels. This paper focuses on the operational aspects of such a large system during the upcoming LHC Run 2 period: simple, reliable, and efficient tools are needed to quickly switch from Sim@P1 to TDAQ mode and back, to exploit the resources when they are not used for the data acquisition, even for short periods. The evolution of the central OpenStack infrastructure is described, as it was upgraded from Folsom to the Icehouse release, including the scalability issues addressed.
Software architecture and design of the web services facilitating climate model diagnostic analysis
NASA Astrophysics Data System (ADS)
Pan, L.; Lee, S.; Zhang, J.; Tang, B.; Zhai, C.; Jiang, J. H.; Wang, W.; Bao, Q.; Qi, M.; Kubar, T. L.; Teixeira, J.
2015-12-01
Climate model diagnostic analysis is a computationally- and data-intensive task because it involves multiple numerical model outputs and satellite observation data that can both be high resolution. We have built an online tool that facilitates this process. The tool is called Climate Model Diagnostic Analyzer (CMDA). It employs the web service technology and provides a web-based user interface. The benefits of these choices include: (1) No installation of any software other than a browser, hence it is platform compatable; (2) Co-location of computation and big data on the server side, and small results and plots to be downloaded on the client side, hence high data efficiency; (3) multi-threaded implementation to achieve parallel performance on multi-core servers; and (4) cloud deployment so each user has a dedicated virtual machine. In this presentation, we will focus on the computer science aspects of this tool, namely the architectural design, the infrastructure of the web services, the implementation of the web-based user interface, the mechanism of provenance collection, the approach to virtualization, and the Amazon Cloud deployment. As an example, We will describe our methodology to transform an existing science application code into a web service using a Python wrapper interface and Python web service frameworks (i.e., Flask, Gunicorn, and Tornado). Another example is the use of Docker, a light-weight virtualization container, to distribute and deploy CMDA onto an Amazon EC2 instance. Our tool of CMDA has been successfully used in the 2014 Summer School hosted by the JPL Center for Climate Science. Students had positive feedbacks in general and we will report their comments. An enhanced version of CMDA with several new features, some requested by the 2014 students, will be used in the 2015 Summer School soon.
Infrastructure Systems for Advanced Computing in E-science applications
NASA Astrophysics Data System (ADS)
Terzo, Olivier
2013-04-01
In the e-science field are growing needs for having computing infrastructure more dynamic and customizable with a model of use "on demand" that follow the exact request in term of resources and storage capacities. The integration of grid and cloud infrastructure solutions allows us to offer services that can adapt the availability in terms of up scaling and downscaling resources. The main challenges for e-sciences domains will on implement infrastructure solutions for scientific computing that allow to adapt dynamically the demands of computing resources with a strong emphasis on optimizing the use of computing resources for reducing costs of investments. Instrumentation, data volumes, algorithms, analysis contribute to increase the complexity for applications who require high processing power and storage for a limited time and often exceeds the computational resources that equip the majority of laboratories, research Unit in an organization. Very often it is necessary to adapt or even tweak rethink tools, algorithms, and consolidate existing applications through a phase of reverse engineering in order to adapt them to a deployment on Cloud infrastructure. For example, in areas such as rainfall monitoring, meteorological analysis, Hydrometeorology, Climatology Bioinformatics Next Generation Sequencing, Computational Electromagnetic, Radio occultation, the complexity of the analysis raises several issues such as the processing time, the scheduling of tasks of processing, storage of results, a multi users environment. For these reasons, it is necessary to rethink the writing model of E-Science applications in order to be already adapted to exploit the potentiality of cloud computing services through the uses of IaaS, PaaS and SaaS layer. An other important focus is on create/use hybrid infrastructure typically a federation between Private and public cloud, in fact in this way when all resources owned by the organization are all used it will be easy with a federate cloud infrastructure to add some additional resources form the Public cloud for following the needs in term of computational and storage resources and release them where process are finished. Following the hybrid model, the scheduling approach is important for managing both cloud models. Thanks to this model infrastructure every time resources are available for additional request in term of IT capacities that can used "on demand" for a limited time without having to proceed to purchase additional servers.
Toward real-time Monte Carlo simulation using a commercial cloud computing infrastructure.
Wang, Henry; Ma, Yunzhi; Pratx, Guillem; Xing, Lei
2011-09-07
Monte Carlo (MC) methods are the gold standard for modeling photon and electron transport in a heterogeneous medium; however, their computational cost prohibits their routine use in the clinic. Cloud computing, wherein computing resources are allocated on-demand from a third party, is a new approach for high performance computing and is implemented to perform ultra-fast MC calculation in radiation therapy. We deployed the EGS5 MC package in a commercial cloud environment. Launched from a single local computer with Internet access, a Python script allocates a remote virtual cluster. A handshaking protocol designates master and worker nodes. The EGS5 binaries and the simulation data are initially loaded onto the master node. The simulation is then distributed among independent worker nodes via the message passing interface, and the results aggregated on the local computer for display and data analysis. The described approach is evaluated for pencil beams and broad beams of high-energy electrons and photons. The output of cloud-based MC simulation is identical to that produced by single-threaded implementation. For 1 million electrons, a simulation that takes 2.58 h on a local computer can be executed in 3.3 min on the cloud with 100 nodes, a 47× speed-up. Simulation time scales inversely with the number of parallel nodes. The parallelization overhead is also negligible for large simulations. Cloud computing represents one of the most important recent advances in supercomputing technology and provides a promising platform for substantially improved MC simulation. In addition to the significant speed up, cloud computing builds a layer of abstraction for high performance parallel computing, which may change the way dose calculations are performed and radiation treatment plans are completed.
2016-04-01
infrastructure . The work is motivated by the fact that today’s clouds are very static, uniform, and predictable, allowing attackers who identify a...vulnerability in one of the services or infrastructure components to spread their effect to other, mission-critical services. Our goal is to integrate into...clouds by elevating continuous change, evolution, and misinformation as first-rate design principles of the cloud’s infrastructure . Our work is
NASA Astrophysics Data System (ADS)
Bolodurina, I. P.; Parfenov, D. I.
2017-10-01
The goal of our investigation is optimization of network work in virtual data center. The advantage of modern infrastructure virtualization lies in the possibility to use software-defined networks. However, the existing optimization of algorithmic solutions does not take into account specific features working with multiple classes of virtual network functions. The current paper describes models characterizing the basic structures of object of virtual data center. They including: a level distribution model of software-defined infrastructure virtual data center, a generalized model of a virtual network function, a neural network model of the identification of virtual network functions. We also developed an efficient algorithm for the optimization technology of containerization of virtual network functions in virtual data center. We propose an efficient algorithm for placing virtual network functions. In our investigation we also generalize the well renowned heuristic and deterministic algorithms of Karmakar-Karp.
The JASMIN Cloud: specialised and hybrid to meet the needs of the Environmental Sciences Community
NASA Astrophysics Data System (ADS)
Kershaw, Philip; Lawrence, Bryan; Churchill, Jonathan; Pritchard, Matt
2014-05-01
Cloud computing provides enormous opportunities for the research community. The large public cloud providers provide near-limitless scaling capability. However, adapting Cloud to scientific workloads is not without its problems. The commodity nature of the public cloud infrastructure can be at odds with the specialist requirements of the research community. Issues such as trust, ownership of data, WAN bandwidth and costing models make additional barriers to more widespread adoption. Alongside the application of public cloud for scientific applications, a number of private cloud initiatives are underway in the research community of which the JASMIN Cloud is one example. Here, cloud service models are being effectively super-imposed over more established services such as data centres, compute cluster facilities and Grids. These have the potential to deliver the specialist infrastructure needed for the science community coupled with the benefits of a Cloud service model. The JASMIN facility based at the Rutherford Appleton Laboratory was established in 2012 to support the data analysis requirements of the climate and Earth Observation community. In its first year of operation, the 5PB of available storage capacity was filled and the hosted compute capability used extensively. JASMIN has modelled the concept of a centralised large-volume data analysis facility. Key characteristics have enabled success: peta-scale fast disk connected via low latency networks to compute resources and the use of virtualisation for effective management of the resources for a range of users. A second phase is now underway funded through NERC's (Natural Environment Research Council) Big Data initiative. This will see significant expansion to the resources available with a doubling of disk-based storage to 12PB and an increase of compute capacity by a factor of ten to over 3000 processing cores. This expansion is accompanied by a broadening in the scope for JASMIN, as a service available to the entire UK environmental science community. Experience with the first phase demonstrated the range of user needs. A trade-off is needed between access privileges to resources, flexibility of use and security. This has influenced the form and types of service under development for the new phase. JASMIN will deploy a specialised private cloud organised into "Managed" and "Unmanaged" components. In the Managed Cloud, users have direct access to the storage and compute resources for optimal performance but for reasons of security, via a more restrictive PaaS (Platform-as-a-Service) interface. The Unmanaged Cloud is deployed in an isolated part of the network but co-located with the rest of the infrastructure. This enables greater liberty to tenants - full IaaS (Infrastructure-as-a-Service) capability to provision customised infrastructure - whilst at the same time protecting more sensitive parts of the system from direct access using these elevated privileges. The private cloud will be augmented with cloud-bursting capability so that it can exploit the resources available from public clouds, making it effectively a hybrid solution. A single interface will overlay the functionality of both the private cloud and external interfaces to public cloud providers giving users the flexibility to migrate resources between infrastructures as requirements dictate.
Evolutionary Multiobjective Query Workload Optimization of Cloud Data Warehouses
Dokeroglu, Tansel; Sert, Seyyit Alper; Cinar, Muhammet Serkan
2014-01-01
With the advent of Cloud databases, query optimizers need to find paretooptimal solutions in terms of response time and monetary cost. Our novel approach minimizes both objectives by deploying alternative virtual resources and query plans making use of the virtual resource elasticity of the Cloud. We propose an exact multiobjective branch-and-bound and a robust multiobjective genetic algorithm for the optimization of distributed data warehouse query workloads on the Cloud. In order to investigate the effectiveness of our approach, we incorporate the devised algorithms into a prototype system. Finally, through several experiments that we have conducted with different workloads and virtual resource configurations, we conclude remarkable findings of alternative deployments as well as the advantages and disadvantages of the multiobjective algorithms we propose. PMID:24892048
NASA Technical Reports Server (NTRS)
Teng, William; Lynnes, Christopher
2014-01-01
Once a research or application problem has been identified, one logical next step is to search for available relevant data products. Thus, an early component of a potential GEOSS-AI system, in the continuum between observations and end point research, applications, and decision making, would be one that enables transparent data discovery and access by users. Such a component might be effected via the systems data agents. Presumably, some kind of data cataloging has already been implemented, e.g., in the GEOSS Common Infrastructure (GCI). Both the agents and cataloging could also leverage existing resources external to the system. The system would have some means to accept and integrate user-contributed agents. The need or desirability for some data format internal to the system should be evaluated. Another early component would be one that facilitates browsing visualization of the data, as well as some basic analyses.Three ongoing projects at the NASA Goddard Earth Sciences Data and Information Services Center (GES DISC) provide possible proto-examples of potential data access and visualization components of a cloud-based GEOSS-AI system. 1. Reorganizing data archived as time-step arrays to point-time series (data rods), as well as leveraging the NASA Simple Subset Wizard (SSW), to significantly increase the number of data products available, at multiple NASA data centers, for production as on-the-fly (virtual) data rods. SSWs data discovery is based on OpenSearch. Both pre-generated and virtual data rods are accessible via Web services. 2. Developing Web Feature Services to publish the metadata, and expose the locations, of pre-generated and virtual data rods in the GEOSS Portal and enable direct access of the data via Web services. SSW is also leveraged to increase the availability of both NASA and non-NASA data.3.Federating NASA Giovanni (Geospatial Interactive Online Visualization and Analysis Interface), for multi-sensor data exploration, that would allow each cooperating data center, currently the NASA Distributed Active Archive Centers (DAACs), to configure its own Giovanni deployment, while also allowing all the deployments to incorporate each others data. A federated Giovanni comprises Giovanni Virtual Machines, which can be run on local servers or in the cloud.
The Virtual Watershed Observatory: Cyberinfrastructure for Model-Data Integration and Access
NASA Astrophysics Data System (ADS)
Duffy, C.; Leonard, L. N.; Giles, L.; Bhatt, G.; Yu, X.
2011-12-01
The Virtual Watershed Observatory (VWO) is a concept where scientists, water managers, educators and the general public can create a virtual observatory from integrated hydrologic model results, national databases and historical or real-time observations via web services. In this paper, we propose a prototype for automated and virtualized web services software using national data products for climate reanalysis, soils, geology, terrain and land cover. The VWO has the broad purpose of making accessible water resource simulations, real-time data assimilation, calibration and archival at the scale of HUC 12 watersheds (Hydrologic Unit Code) anywhere in the continental US. Our prototype for model-data integration focuses on creating tools for fast data storage from selected national databases, as well as the computational resources necessary for a dynamic, distributed watershed simulation. The paper will describe cyberinfrastructure tools and workflow that attempts to resolve the problem of model-data accessibility and scalability such that individuals, research teams, managers and educators can create a WVO in a desired context. Examples are given for the NSF-funded Shale Hills Critical Zone Observatory and the European Critical Zone Observatories within the SoilTrEC project. In the future implementation of WVO services will benefit from the development of a cloud cyber infrastructure as the prototype evolves to data and model intensive computation for continental scale water resource predictions.
NASA Astrophysics Data System (ADS)
Delipetrev, Blagoj
2016-04-01
Presently, most of the existing software is desktop-based, designed to work on a single computer, which represents a major limitation in many ways, starting from limited computer processing, storage power, accessibility, availability, etc. The only feasible solution lies in the web and cloud. This abstract presents research and development of a cloud computing geospatial application for water resources based on free and open source software and open standards using hybrid deployment model of public - private cloud, running on two separate virtual machines (VMs). The first one (VM1) is running on Amazon web services (AWS) and the second one (VM2) is running on a Xen cloud platform. The presented cloud application is developed using free and open source software, open standards and prototype code. The cloud application presents a framework how to develop specialized cloud geospatial application that needs only a web browser to be used. This cloud application is the ultimate collaboration geospatial platform because multiple users across the globe with internet connection and browser can jointly model geospatial objects, enter attribute data and information, execute algorithms, and visualize results. The presented cloud application is: available all the time, accessible from everywhere, it is scalable, works in a distributed computer environment, it creates a real-time multiuser collaboration platform, the programing languages code and components are interoperable, and it is flexible in including additional components. The cloud geospatial application is implemented as a specialized water resources application with three web services for 1) data infrastructure (DI), 2) support for water resources modelling (WRM), 3) user management. The web services are running on two VMs that are communicating over the internet providing services to users. The application was tested on the Zletovica river basin case study with concurrent multiple users. The application is a state-of-the-art cloud geospatial collaboration platform. The presented solution is a prototype and can be used as a foundation for developing of any specialized cloud geospatial applications. Further research will be focused on distributing the cloud application on additional VMs, testing the scalability and availability of services.
Job Scheduling with Efficient Resource Monitoring in Cloud Datacenter
Loganathan, Shyamala; Mukherjee, Saswati
2015-01-01
Cloud computing is an on-demand computing model, which uses virtualization technology to provide cloud resources to users in the form of virtual machines through internet. Being an adaptable technology, cloud computing is an excellent alternative for organizations for forming their own private cloud. Since the resources are limited in these private clouds maximizing the utilization of resources and giving the guaranteed service for the user are the ultimate goal. For that, efficient scheduling is needed. This research reports on an efficient data structure for resource management and resource scheduling technique in a private cloud environment and discusses a cloud model. The proposed scheduling algorithm considers the types of jobs and the resource availability in its scheduling decision. Finally, we conducted simulations using CloudSim and compared our algorithm with other existing methods, like V-MCT and priority scheduling algorithms. PMID:26473166
Job Scheduling with Efficient Resource Monitoring in Cloud Datacenter.
Loganathan, Shyamala; Mukherjee, Saswati
2015-01-01
Cloud computing is an on-demand computing model, which uses virtualization technology to provide cloud resources to users in the form of virtual machines through internet. Being an adaptable technology, cloud computing is an excellent alternative for organizations for forming their own private cloud. Since the resources are limited in these private clouds maximizing the utilization of resources and giving the guaranteed service for the user are the ultimate goal. For that, efficient scheduling is needed. This research reports on an efficient data structure for resource management and resource scheduling technique in a private cloud environment and discusses a cloud model. The proposed scheduling algorithm considers the types of jobs and the resource availability in its scheduling decision. Finally, we conducted simulations using CloudSim and compared our algorithm with other existing methods, like V-MCT and priority scheduling algorithms.
Smith, Andy; Southgate, Joel; Poplawski, Radoslaw; Bull, Matthew J.; Richardson, Emily; Ismail, Matthew; Thompson, Simon Elwood-; Kitchen, Christine; Guest, Martyn; Bakke, Marius
2016-01-01
The increasing availability and decreasing cost of high-throughput sequencing has transformed academic medical microbiology, delivering an explosion in available genomes while also driving advances in bioinformatics. However, many microbiologists are unable to exploit the resulting large genomics datasets because they do not have access to relevant computational resources and to an appropriate bioinformatics infrastructure. Here, we present the Cloud Infrastructure for Microbial Bioinformatics (CLIMB) facility, a shared computing infrastructure that has been designed from the ground up to provide an environment where microbiologists can share and reuse methods and data. PMID:28785418
Connor, Thomas R; Loman, Nicholas J; Thompson, Simon; Smith, Andy; Southgate, Joel; Poplawski, Radoslaw; Bull, Matthew J; Richardson, Emily; Ismail, Matthew; Thompson, Simon Elwood-; Kitchen, Christine; Guest, Martyn; Bakke, Marius; Sheppard, Samuel K; Pallen, Mark J
2016-09-01
The increasing availability and decreasing cost of high-throughput sequencing has transformed academic medical microbiology, delivering an explosion in available genomes while also driving advances in bioinformatics. However, many microbiologists are unable to exploit the resulting large genomics datasets because they do not have access to relevant computational resources and to an appropriate bioinformatics infrastructure. Here, we present the Cloud Infrastructure for Microbial Bioinformatics (CLIMB) facility, a shared computing infrastructure that has been designed from the ground up to provide an environment where microbiologists can share and reuse methods and data.
Investigating the Use of Cloudbursts for High-Throughput Medical Image Registration
Kim, Hyunjoo; Parashar, Manish; Foran, David J.; Yang, Lin
2010-01-01
This paper investigates the use of clouds and autonomic cloudbursting to support a medical image registration. The goal is to enable a virtual computational cloud that integrates local computational environments and public cloud services on-the-fly, and support image registration requests from different distributed researcher groups with varied computational requirements and QoS constraints. The virtual cloud essentially implements shared and coordinated task-spaces, which coordinates the scheduling of jobs submitted by a dynamic set of research groups to their local job queues. A policy-driven scheduling agent uses the QoS constraints along with performance history and the state of the resources to determine the appropriate size and mix of the public and private cloud resource that should be allocated to a specific request. The virtual computational cloud and the medical image registration service have been developed using the CometCloud engine and have been deployed on a combination of private clouds at Rutgers University and the Cancer Institute of New Jersey and Amazon EC2. An experimental evaluation is presented and demonstrates the effectiveness of autonomic cloudbursts and policy-based autonomic scheduling for this application. PMID:20640235
Three-Dimensional Space to Assess Cloud Interoperability
2013-03-01
12 1. Portability and Mobility ...collection of network-enabled services that guarantees to provide a scalable, easy accessible, reliable, and personalized computing infrastructure , based on...are used in research to describe cloud models, such as SaaS (Software as a Service), PaaS (Platform as a service), IaaS ( Infrastructure as a Service
Towards a Multi-Mission, Airborne Science Data System Environment
NASA Astrophysics Data System (ADS)
Crichton, D. J.; Hardman, S.; Law, E.; Freeborn, D.; Kay-Im, E.; Lau, G.; Oswald, J.
2011-12-01
NASA earth science instruments are increasingly relying on airborne missions. However, traditionally, there has been limited common infrastructure support available to principal investigators in the area of science data systems. As a result, each investigator has been required to develop their own computing infrastructures for the science data system. Typically there is little software reuse and many projects lack sufficient resources to provide a robust infrastructure to capture, process, distribute and archive the observations acquired from airborne flights. At NASA's Jet Propulsion Laboratory (JPL), we have been developing a multi-mission data system infrastructure for airborne instruments called the Airborne Cloud Computing Environment (ACCE). ACCE encompasses the end-to-end lifecycle covering planning, provisioning of data system capabilities, and support for scientific analysis in order to improve the quality, cost effectiveness, and capabilities to enable new scientific discovery and research in earth observation. This includes improving data system interoperability across each instrument. A principal characteristic is being able to provide an agile infrastructure that is architected to allow for a variety of configurations of the infrastructure from locally installed compute and storage services to provisioning those services via the "cloud" from cloud computer vendors such as Amazon.com. Investigators often have different needs that require a flexible configuration. The data system infrastructure is built on the Apache's Object Oriented Data Technology (OODT) suite of components which has been used for a number of spaceborne missions and provides a rich set of open source software components and services for constructing science processing and data management systems. In 2010, a partnership was formed between the ACCE team and the Carbon in Arctic Reservoirs Vulnerability Experiment (CARVE) mission to support the data processing and data management needs. A principal goal is to provide support for the Fourier Transform Spectrometer (FTS) instrument which will produce over 700,000 soundings over the life of their three-year mission. The cost to purchase and operate a cluster-based system in order to generate Level 2 Full Physics products from this data was prohibitive. Through an evaluation of cloud computing solutions, Amazon's Elastic Compute Cloud (EC2) was selected for the CARVE deployment. As the ACCE infrastructure is developed and extended to form an infrastructure for airborne missions, the experience of working with CARVE has provided a number of lessons learned and has proven to be important in reinforcing the unique aspects of airborne missions and the importance of the ACCE infrastructure in developing a cost effective, flexible multi-mission capability that leverages emerging capabilities in cloud computing, workflow management, and distributed computing.
JINR cloud infrastructure evolution
NASA Astrophysics Data System (ADS)
Baranov, A. V.; Balashov, N. A.; Kutovskiy, N. A.; Semenov, R. N.
2016-09-01
To fulfil JINR commitments in different national and international projects related to the use of modern information technologies such as cloud and grid computing as well as to provide a modern tool for JINR users for their scientific research a cloud infrastructure was deployed at Laboratory of Information Technologies of Joint Institute for Nuclear Research. OpenNebula software was chosen as a cloud platform. Initially it was set up in simple configuration with single front-end host and a few cloud nodes. Some custom development was done to tune JINR cloud installation to fit local needs: web form in the cloud web-interface for resources request, a menu item with cloud utilization statistics, user authentication via Kerberos, custom driver for OpenVZ containers. Because of high demand in that cloud service and its resources over-utilization it was re-designed to cover increasing users' needs in capacity, availability and reliability. Recently a new cloud instance has been deployed in high-availability configuration with distributed network file system and additional computing power.
ERIC Educational Resources Information Center
Miseviciene, Regina; Ambraziene, Danute; Tuminauskas, Raimundas; Pažereckas, Nerijus
2012-01-01
Many factors influence education nowadays. Educational institutions are faced with budget cuttings, outdated IT, data security management and the willingness to integrate remote learning at home. Virtualization technologies provide innovative solutions to the problems. The paper presents an original educational infrastructure using virtualization…
Grid accounting service: state and future development
DOE Office of Scientific and Technical Information (OSTI.GOV)
Levshina, T.; Sehgal, C.; Bockelman, B.
2014-01-01
During the last decade, large-scale federated distributed infrastructures have been continually developed and expanded. One of the crucial components of a cyber-infrastructure is an accounting service that collects data related to resource utilization and identity of users using resources. The accounting service is important for verifying pledged resource allocation per particular groups and users, providing reports for funding agencies and resource providers, and understanding hardware provisioning requirements. It can also be used for end-to-end troubleshooting as well as billing purposes. In this work we describe Gratia, a federated accounting service jointly developed at Fermilab and Holland Computing Center at Universitymore » of Nebraska-Lincoln. The Open Science Grid, Fermilab, HCC, and several other institutions have used Gratia in production for several years. The current development activities include expanding Virtual Machines provisioning information, XSEDE allocation usage accounting, and Campus Grids resource utilization. We also identify the direction of future work: improvement and expansion of Cloud accounting, persistent and elastic storage space allocation, and the incorporation of WAN and LAN network metrics.« less
A secure and efficiently searchable health information architecture.
Yasnoff, William A
2016-06-01
Patient-centric repositories of health records are an important component of health information infrastructure. However, patient information in a single repository is potentially vulnerable to loss of the entire dataset from a single unauthorized intrusion. A new health record storage architecture, the personal grid, eliminates this risk by separately storing and encrypting each person's record. The tradeoff for this improved security is that a personal grid repository must be sequentially searched since each record must be individually accessed and decrypted. To allow reasonable search times for large numbers of records, parallel processing with hundreds (or even thousands) of on-demand virtual servers (now available in cloud computing environments) is used. Estimated search times for a 10 million record personal grid using 500 servers vary from 7 to 33min depending on the complexity of the query. Since extremely rapid searching is not a critical requirement of health information infrastructure, the personal grid may provide a practical and useful alternative architecture that eliminates the large-scale security vulnerabilities of traditional databases by sacrificing unnecessary searching speed. Copyright © 2016 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Ramachandran, R.; Murphy, K. J.; Baynes, K.; Lynnes, C.
2016-12-01
With the volume of Earth observation data expanding rapidly, cloud computing is quickly changing the way Earth observation data is processed, analyzed, and visualized. The cloud infrastructure provides the flexibility to scale up to large volumes of data and handle high velocity data streams efficiently. Having freely available Earth observation data collocated on a cloud infrastructure creates opportunities for innovation and value-added data re-use in ways unforeseen by the original data provider. These innovations spur new industries and applications and spawn new scientific pathways that were previously limited due to data volume and computational infrastructure issues. NASA, in collaboration with Amazon, Google, and Microsoft, have jointly developed a set of recommendations to enable efficient transfer of Earth observation data from existing data systems to a cloud computing infrastructure. The purpose of these recommendations is to provide guidelines against which all data providers can evaluate existing data systems and be used to improve any issues uncovered to enable efficient search, access, and use of large volumes of data. Additionally, these guidelines ensure that all cloud providers utilize a common methodology for bulk-downloading data from data providers thus preventing the data providers from building custom capabilities to meet the needs of individual cloud providers. The intent is to share these recommendations with other Federal agencies and organizations that serve Earth observation to enable efficient search, access, and use of large volumes of data. Additionally, the adoption of these recommendations will benefit data users interested in moving large volumes of data from data systems to any other location. These data users include the cloud providers, cloud users such as scientists, and other users working in a high performance computing environment who need to move large volumes of data.
The performance of low-cost commercial cloud computing as an alternative in computational chemistry.
Thackston, Russell; Fortenberry, Ryan C
2015-05-05
The growth of commercial cloud computing (CCC) as a viable means of computational infrastructure is largely unexplored for the purposes of quantum chemistry. In this work, the PSI4 suite of computational chemistry programs is installed on five different types of Amazon World Services CCC platforms. The performance for a set of electronically excited state single-point energies is compared between these CCC platforms and typical, "in-house" physical machines. Further considerations are made for the number of cores or virtual CPUs (vCPUs, for the CCC platforms), but no considerations are made for full parallelization of the program (even though parallelization of the BLAS library is implemented), complete high-performance computing cluster utilization, or steal time. Even with this most pessimistic view of the computations, CCC resources are shown to be more cost effective for significant numbers of typical quantum chemistry computations. Large numbers of large computations are still best utilized by more traditional means, but smaller-scale research may be more effectively undertaken through CCC services. © 2015 Wiley Periodicals, Inc.
NASA Astrophysics Data System (ADS)
Robinson, Niall; Tomlinson, Jacob; Prudden, Rachel; Hilson, Alex; Arribas, Alberto
2017-04-01
The Met Office Informatics Lab is a small multidisciplinary team which sits between science, technology and design. Our mission is simply "to make Met Office data useful" - a deliberately broad objective. Our prototypes often trial cutting edge technologies, and so far have included projects such as virtual reality data visualisation in the web browser, bots and natural language interfaces, and artificially intelligent weather warnings. In this talk we focus on our latest project, Jade, a big data analysis platform in the cloud. It is a powerful, flexible and simple to use implementation which makes extensive use of technologies such as Jupyter, Dask, containerisation, Infrastructure as Code, and auto-scaling. Crucially, Jade is flexible enough to be used for a diverse set of applications: it can present weather forecast information to meteorologists and allow climate scientists to analyse big data sets, but it is also effective for analysing non-geospatial data. As well as making data useful, the Informatics Lab also trials new working practises. In this presentation, we will talk about our experience of making a group like the Lab successful.
Cloud-Based Virtual Laboratory for Network Security Education
ERIC Educational Resources Information Center
Xu, Le; Huang, Dijiang; Tsai, Wei-Tek
2014-01-01
Hands-on experiments are essential for computer network security education. Existing laboratory solutions usually require significant effort to build, configure, and maintain and often do not support reconfigurability, flexibility, and scalability. This paper presents a cloud-based virtual laboratory education platform called V-Lab that provides a…
Introducing the Cloud in an Introductory IT Course
ERIC Educational Resources Information Center
Woods, David M.
2018-01-01
Cloud computing is a rapidly emerging topic, but should it be included in an introductory IT course? The magnitude of cloud computing use, especially cloud infrastructure, along with students' limited knowledge of the topic support adding cloud content to the IT curriculum. There are several arguments that support including cloud computing in an…
NASA Astrophysics Data System (ADS)
Bolodurina, I. P.; Parfenov, D. I.
2018-01-01
We have elaborated a neural network model of virtual network flow identification based on the statistical properties of flows circulating in the network of the data center and characteristics that describe the content of packets transmitted through network objects. This enabled us to establish the optimal set of attributes to identify virtual network functions. We have established an algorithm for optimizing the placement of virtual data functions using the data obtained in our research. Our approach uses a hybrid method of visualization using virtual machines and containers, which enables to reduce the infrastructure load and the response time in the network of the virtual data center. The algorithmic solution is based on neural networks, which enables to scale it at any number of the network function copies.
Enabling a Scientific Cloud Marketplace: VGL (Invited)
NASA Astrophysics Data System (ADS)
Fraser, R.; Woodcock, R.; Wyborn, L. A.; Vote, J.; Rankine, T.; Cox, S. J.
2013-12-01
The Virtual Geophysics Laboratory (VGL) provides a flexible, web based environment where researchers can browse data and use a variety of scientific software packaged into tool kits that run in the Cloud. Both data and tool kits are published by multiple researchers and registered with the VGL infrastructure forming a data and application marketplace. The VGL provides the basic work flow of Discovery and Access to the disparate data sources and a Library for tool kits and scripting to drive the scientific codes. Computation is then performed on the Research or Commercial Clouds. Provenance information is collected throughout the work flow and can be published alongside the results allowing for experiment comparison and sharing with other researchers. VGL's "mix and match" approach to data, computational resources and scientific codes, enables a dynamic approach to scientific collaboration. VGL allows scientists to publish their specific contribution, be it data, code, compute or work flow, knowing the VGL framework will provide other components needed for a complete application. Other scientists can choose the pieces that suit them best to assemble an experiment. The coarse grain workflow of the VGL framework combined with the flexibility of the scripting library and computational toolkits allows for significant customisation and sharing amongst the community. The VGL utilises the cloud computational and storage resources from the Australian academic research cloud provided by the NeCTAR initiative and a large variety of data accessible from national and state agencies via the Spatial Information Services Stack (SISS - http://siss.auscope.org). VGL v1.2 screenshot - http://vgl.auscope.org
Multiplexing Low and High QoS Workloads in Virtual Environments
NASA Astrophysics Data System (ADS)
Verboven, Sam; Vanmechelen, Kurt; Broeckhove, Jan
Virtualization technology has introduced new ways for managing IT infrastructure. The flexible deployment of applications through self-contained virtual machine images has removed the barriers for multiplexing, suspending and migrating applications with their entire execution environment, allowing for a more efficient use of the infrastructure. These developments have given rise to an important challenge regarding the optimal scheduling of virtual machine workloads. In this paper, we specifically address the VM scheduling problem in which workloads that require guaranteed levels of CPU performance are mixed with workloads that do not require such guarantees. We introduce a framework to analyze this scheduling problem and evaluate to what extent such mixed service delivery is beneficial for a provider of virtualized IT infrastructure. Traditionally providers offer IT resources under a guaranteed and fixed performance profile, which can lead to underutilization. The findings of our simulation study show that through proper tuning of a limited set of parameters, the proposed scheduling algorithm allows for a significant increase in utilization without sacrificing on performance dependability.
ATLAS user analysis on private cloud resources at GoeGrid
NASA Astrophysics Data System (ADS)
Glaser, F.; Nadal Serrano, J.; Grabowski, J.; Quadt, A.
2015-12-01
User analysis job demands can exceed available computing resources, especially before major conferences. ATLAS physics results can potentially be slowed down due to the lack of resources. For these reasons, cloud research and development activities are now included in the skeleton of the ATLAS computing model, which has been extended by using resources from commercial and private cloud providers to satisfy the demands. However, most of these activities are focused on Monte-Carlo production jobs, extending the resources at Tier-2. To evaluate the suitability of the cloud-computing model for user analysis jobs, we developed a framework to launch an ATLAS user analysis cluster in a cloud infrastructure on demand and evaluated two solutions. The first solution is entirely integrated in the Grid infrastructure by using the same mechanism, which is already in use at Tier-2: A designated Panda-Queue is monitored and additional worker nodes are launched in a cloud environment and assigned to a corresponding HTCondor queue according to the demand. Thereby, the use of cloud resources is completely transparent to the user. However, using this approach, submitted user analysis jobs can still suffer from a certain delay introduced by waiting time in the queue and the deployed infrastructure lacks customizability. Therefore, our second solution offers the possibility to easily deploy a totally private, customizable analysis cluster on private cloud resources belonging to the university.
NASA Astrophysics Data System (ADS)
Bada, Adedayo; Alcaraz-Calero, Jose M.; Wang, Qi; Grecos, Christos
2014-05-01
This paper describes a comprehensive empirical performance evaluation of 3D video processing employing the physical/virtual architecture implemented in a cloud environment. Different virtualization technologies, virtual video cards and various 3D benchmarks tools have been utilized in order to analyse the optimal performance in the context of 3D online gaming applications. This study highlights 3D video rendering performance under each type of hypervisors, and other factors including network I/O, disk I/O and memory usage. Comparisons of these factors under well-known virtual display technologies such as VNC, Spice and Virtual 3D adaptors reveal the strengths and weaknesses of the various hypervisors with respect to 3D video rendering and streaming.
CADC and CANFAR: Extending the role of the data centre
NASA Astrophysics Data System (ADS)
Gaudet, Severin
2015-12-01
Over the past six years, the CADC has moved beyond the astronomy archive data centre to a multi-service system for the community. This evolution is based on two major initiatives. The first is the adoption of International Virtual Observatory Alliance (IVOA) standards in both the system and data architecture of the CADC, including a common characterization data model. The second is the Canadian Advanced Network for Astronomical Research (CANFAR), a digital infrastructure combining the Canadian national research network (CANARIE), cloud processing and storage resources (Compute Canada) and a data centre (Canadian Astronomy Data Centre) into a unified ecosystem for storage and processing for the astronomy community. This talk will describe the architecture and integration of IVOA and CANFAR services into CADC operations, the operational experiences, the lessons learned and future directions
Distributed Trust Management for Validating SLA Choreographies
NASA Astrophysics Data System (ADS)
Haq, Irfan Ul; Alnemr, Rehab; Paschke, Adrian; Schikuta, Erich; Boley, Harold; Meinel, Christoph
For business workflow automation in a service-enriched environment such as a grid or a cloud, services scattered across heterogeneous Virtual Organizations (VOs) can be aggregated in a producer-consumer manner, building hierarchical structures of added value. In order to preserve the supply chain, the Service Level Agreements (SLAs) corresponding to the underlying choreography of services should also be incrementally aggregated. This cross-VO hierarchical SLA aggregation requires validation, for which a distributed trust system becomes a prerequisite. Elaborating our previous work on rule-based SLA validation, we propose a hybrid distributed trust model. This new model is based on Public Key Infrastructure (PKI) and reputation-based trust systems. It helps preventing SLA violations by identifying violation-prone services at service selection stage and actively contributes in breach management at the time of penalty enforcement.
HEPCloud, a New Paradigm for HEP Facilities: CMS Amazon Web Services Investigation
Holzman, Burt; Bauerdick, Lothar A. T.; Bockelman, Brian; ...
2017-09-29
Historically, high energy physics computing has been performed on large purpose-built computing systems. These began as single-site compute facilities, but have evolved into the distributed computing grids used today. Recently, there has been an exponential increase in the capacity and capability of commercial clouds. Cloud resources are highly virtualized and intended to be able to be flexibly deployed for a variety of computing tasks. There is a growing interest among the cloud providers to demonstrate the capability to perform large-scale scientific computing. In this paper, we discuss results from the CMS experiment using the Fermilab HEPCloud facility, which utilized bothmore » local Fermilab resources and virtual machines in the Amazon Web Services Elastic Compute Cloud. We discuss the planning, technical challenges, and lessons learned involved in performing physics workflows on a large-scale set of virtualized resources. Additionally, we will discuss the economics and operational efficiencies when executing workflows both in the cloud and on dedicated resources.« less
HEPCloud, a New Paradigm for HEP Facilities: CMS Amazon Web Services Investigation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Holzman, Burt; Bauerdick, Lothar A. T.; Bockelman, Brian
Historically, high energy physics computing has been performed on large purpose-built computing systems. These began as single-site compute facilities, but have evolved into the distributed computing grids used today. Recently, there has been an exponential increase in the capacity and capability of commercial clouds. Cloud resources are highly virtualized and intended to be able to be flexibly deployed for a variety of computing tasks. There is a growing interest among the cloud providers to demonstrate the capability to perform large-scale scientific computing. In this paper, we discuss results from the CMS experiment using the Fermilab HEPCloud facility, which utilized bothmore » local Fermilab resources and virtual machines in the Amazon Web Services Elastic Compute Cloud. We discuss the planning, technical challenges, and lessons learned involved in performing physics workflows on a large-scale set of virtualized resources. Additionally, we will discuss the economics and operational efficiencies when executing workflows both in the cloud and on dedicated resources.« less
Cloud computing applications for biomedical science: A perspective.
Navale, Vivek; Bourne, Philip E
2018-06-01
Biomedical research has become a digital data-intensive endeavor, relying on secure and scalable computing, storage, and network infrastructure, which has traditionally been purchased, supported, and maintained locally. For certain types of biomedical applications, cloud computing has emerged as an alternative to locally maintained traditional computing approaches. Cloud computing offers users pay-as-you-go access to services such as hardware infrastructure, platforms, and software for solving common biomedical computational problems. Cloud computing services offer secure on-demand storage and analysis and are differentiated from traditional high-performance computing by their rapid availability and scalability of services. As such, cloud services are engineered to address big data problems and enhance the likelihood of data and analytics sharing, reproducibility, and reuse. Here, we provide an introductory perspective on cloud computing to help the reader determine its value to their own research.
Cloud computing applications for biomedical science: A perspective
2018-01-01
Biomedical research has become a digital data–intensive endeavor, relying on secure and scalable computing, storage, and network infrastructure, which has traditionally been purchased, supported, and maintained locally. For certain types of biomedical applications, cloud computing has emerged as an alternative to locally maintained traditional computing approaches. Cloud computing offers users pay-as-you-go access to services such as hardware infrastructure, platforms, and software for solving common biomedical computational problems. Cloud computing services offer secure on-demand storage and analysis and are differentiated from traditional high-performance computing by their rapid availability and scalability of services. As such, cloud services are engineered to address big data problems and enhance the likelihood of data and analytics sharing, reproducibility, and reuse. Here, we provide an introductory perspective on cloud computing to help the reader determine its value to their own research. PMID:29902176
Cloud Computing and Its Applications in GIS
NASA Astrophysics Data System (ADS)
Kang, Cao
2011-12-01
Cloud computing is a novel computing paradigm that offers highly scalable and highly available distributed computing services. The objectives of this research are to: 1. analyze and understand cloud computing and its potential for GIS; 2. discover the feasibilities of migrating truly spatial GIS algorithms to distributed computing infrastructures; 3. explore a solution to host and serve large volumes of raster GIS data efficiently and speedily. These objectives thus form the basis for three professional articles. The first article is entitled "Cloud Computing and Its Applications in GIS". This paper introduces the concept, structure, and features of cloud computing. Features of cloud computing such as scalability, parallelization, and high availability make it a very capable computing paradigm. Unlike High Performance Computing (HPC), cloud computing uses inexpensive commodity computers. The uniform administration systems in cloud computing make it easier to use than GRID computing. Potential advantages of cloud-based GIS systems such as lower barrier to entry are consequently presented. Three cloud-based GIS system architectures are proposed: public cloud- based GIS systems, private cloud-based GIS systems and hybrid cloud-based GIS systems. Public cloud-based GIS systems provide the lowest entry barriers for users among these three architectures, but their advantages are offset by data security and privacy related issues. Private cloud-based GIS systems provide the best data protection, though they have the highest entry barriers. Hybrid cloud-based GIS systems provide a compromise between these extremes. The second article is entitled "A cloud computing algorithm for the calculation of Euclidian distance for raster GIS". Euclidean distance is a truly spatial GIS algorithm. Classical algorithms such as the pushbroom and growth ring techniques require computational propagation through the entire raster image, which makes it incompatible with the distributed nature of cloud computing. This paper presents a parallel Euclidean distance algorithm that works seamlessly with the distributed nature of cloud computing infrastructures. The mechanism of this algorithm is to subdivide a raster image into sub-images and wrap them with a one pixel deep edge layer of individually computed distance information. Each sub-image is then processed by a separate node, after which the resulting sub-images are reassembled into the final output. It is shown that while any rectangular sub-image shape can be used, those approximating squares are computationally optimal. This study also serves as a demonstration of this subdivide and layer-wrap strategy, which would enable the migration of many truly spatial GIS algorithms to cloud computing infrastructures. However, this research also indicates that certain spatial GIS algorithms such as cost distance cannot be migrated by adopting this mechanism, which presents significant challenges for the development of cloud-based GIS systems. The third article is entitled "A Distributed Storage Schema for Cloud Computing based Raster GIS Systems". This paper proposes a NoSQL Database Management System (NDDBMS) based raster GIS data storage schema. NDDBMS has good scalability and is able to use distributed commodity computers, which make it superior to Relational Database Management Systems (RDBMS) in a cloud computing environment. In order to provide optimized data service performance, the proposed storage schema analyzes the nature of commonly used raster GIS data sets. It discriminates two categories of commonly used data sets, and then designs corresponding data storage models for both categories. As a result, the proposed storage schema is capable of hosting and serving enormous volumes of raster GIS data speedily and efficiently on cloud computing infrastructures. In addition, the scheme also takes advantage of the data compression characteristics of Quadtrees, thus promoting efficient data storage. Through this assessment of cloud computing technology, the exploration of the challenges and solutions to the migration of GIS algorithms to cloud computing infrastructures, and the examination of strategies for serving large amounts of GIS data in a cloud computing infrastructure, this dissertation lends support to the feasibility of building a cloud-based GIS system. However, there are still challenges that need to be addressed before a full-scale functional cloud-based GIS system can be successfully implemented. (Abstract shortened by UMI.)
NASA Astrophysics Data System (ADS)
Berzano, D.; Blomer, J.; Buncic, P.; Charalampidis, I.; Ganis, G.; Meusel, R.
2015-12-01
Cloud resources nowadays contribute an essential share of resources for computing in high-energy physics. Such resources can be either provided by private or public IaaS clouds (e.g. OpenStack, Amazon EC2, Google Compute Engine) or by volunteers computers (e.g. LHC@Home 2.0). In any case, experiments need to prepare a virtual machine image that provides the execution environment for the physics application at hand. The CernVM virtual machine since version 3 is a minimal and versatile virtual machine image capable of booting different operating systems. The virtual machine image is less than 20 megabyte in size. The actual operating system is delivered on demand by the CernVM File System. CernVM 3 has matured from a prototype to a production environment. It is used, for instance, to run LHC applications in the cloud, to tune event generators using a network of volunteer computers, and as a container for the historic Scientific Linux 5 and Scientific Linux 4 based software environments in the course of long-term data preservation efforts of the ALICE, CMS, and ALEPH experiments. We present experience and lessons learned from the use of CernVM at scale. We also provide an outlook on the upcoming developments. These developments include adding support for Scientific Linux 7, the use of container virtualization, such as provided by Docker, and the streamlining of virtual machine contextualization towards the cloud-init industry standard.
Cloud Based Earth Observation Data Exploitation Platforms
NASA Astrophysics Data System (ADS)
Romeo, A.; Pinto, S.; Loekken, S.; Marin, A.
2017-12-01
In the last few years data produced daily by several private and public Earth Observation (EO) satellites reached the order of tens of Terabytes, representing for scientists and commercial application developers both a big opportunity for their exploitation and a challenge for their management. New IT technologies, such as Big Data and cloud computing, enable the creation of web-accessible data exploitation platforms, which offer to scientists and application developers the means to access and use EO data in a quick and cost effective way. RHEA Group is particularly active in this sector, supporting the European Space Agency (ESA) in the Exploitation Platforms (EP) initiative, developing technology to build multi cloud platforms for the processing and analysis of Earth Observation data, and collaborating with larger European initiatives such as the European Plate Observing System (EPOS) and the European Open Science Cloud (EOSC). An EP is a virtual workspace, providing a user community with access to (i) large volume of data, (ii) algorithm development and integration environment, (iii) processing software and services (e.g. toolboxes, visualization routines), (iv) computing resources, (v) collaboration tools (e.g. forums, wiki, etc.). When an EP is dedicated to a specific Theme, it becomes a Thematic Exploitation Platform (TEP). Currently, ESA has seven TEPs in a pre-operational phase dedicated to geo-hazards monitoring and prevention, costal zones, forestry areas, hydrology, polar regions, urban areas and food security. On the technology development side, solutions like the multi cloud EO data processing platform provides the technology to integrate ICT resources and EO data from different vendors in a single platform. In particular it offers (i) Multi-cloud data discovery, (ii) Multi-cloud data management and access and (iii) Multi-cloud application deployment. This platform has been demonstrated with the EGI Federated Cloud, Innovation Platform Testbed Poland and the Amazon Web Services cloud. This work will present an overview of the TEPs and the multi-cloud EO data processing platform, and discuss their main achievements and their impacts in the context of distributed Research Infrastructures such as EPOS and EOSC.
Cloud Infrastructures for In Silico Drug Discovery: Economic and Practical Aspects
Clematis, Andrea; Quarati, Alfonso; Cesini, Daniele; Milanesi, Luciano; Merelli, Ivan
2013-01-01
Cloud computing opens new perspectives for small-medium biotechnology laboratories that need to perform bioinformatics analysis in a flexible and effective way. This seems particularly true for hybrid clouds that couple the scalability offered by general-purpose public clouds with the greater control and ad hoc customizations supplied by the private ones. A hybrid cloud broker, acting as an intermediary between users and public providers, can support customers in the selection of the most suitable offers, optionally adding the provisioning of dedicated services with higher levels of quality. This paper analyses some economic and practical aspects of exploiting cloud computing in a real research scenario for the in silico drug discovery in terms of requirements, costs, and computational load based on the number of expected users. In particular, our work is aimed at supporting both the researchers and the cloud broker delivering an IaaS cloud infrastructure for biotechnology laboratories exposing different levels of nonfunctional requirements. PMID:24106693
Dynamic VM Provisioning for TORQUE in a Cloud Environment
NASA Astrophysics Data System (ADS)
Zhang, S.; Boland, L.; Coddington, P.; Sevior, M.
2014-06-01
Cloud computing, also known as an Infrastructure-as-a-Service (IaaS), is attracting more interest from the commercial and educational sectors as a way to provide cost-effective computational infrastructure. It is an ideal platform for researchers who must share common resources but need to be able to scale up to massive computational requirements for specific periods of time. This paper presents the tools and techniques developed to allow the open source TORQUE distributed resource manager and Maui cluster scheduler to dynamically integrate OpenStack cloud resources into existing high throughput computing clusters.
ERIC Educational Resources Information Center
Brooks, Tyson T.
2013-01-01
This thesis identifies three essays which contribute to the foundational understanding of the vulnerabilities and risk towards potentially implementing wireless grid Edgeware technology in a virtualized cloud environment. Since communication networks and devices are subject to becoming the target of exploitation by hackers (e.g. individuals who…
Cloud services for the Fermilab scientific stakeholders
Timm, S.; Garzoglio, G.; Mhashilkar, P.; ...
2015-12-23
As part of the Fermilab/KISTI cooperative research project, Fermilab has successfully run an experimental simulation workflow at scale on a federation of Amazon Web Services (AWS), FermiCloud, and local FermiGrid resources. We used the CernVM-FS (CVMFS) file system to deliver the application software. We established Squid caching servers in AWS as well, using the Shoal system to let each individual virtual machine find the closest squid server. We also developed an automatic virtual machine conversion system so that we could transition virtual machines made on FermiCloud to Amazon Web Services. We used this system to successfully run a cosmic raymore » simulation of the NOvA detector at Fermilab, making use of both AWS spot pricing and network bandwidth discounts to minimize the cost. On FermiCloud we also were able to run the workflow at the scale of 1000 virtual machines, using a private network routable inside of Fermilab. As a result, we present in detail the technological improvements that were used to make this work a reality.« less
Cloud services for the Fermilab scientific stakeholders
DOE Office of Scientific and Technical Information (OSTI.GOV)
Timm, S.; Garzoglio, G.; Mhashilkar, P.
As part of the Fermilab/KISTI cooperative research project, Fermilab has successfully run an experimental simulation workflow at scale on a federation of Amazon Web Services (AWS), FermiCloud, and local FermiGrid resources. We used the CernVM-FS (CVMFS) file system to deliver the application software. We established Squid caching servers in AWS as well, using the Shoal system to let each individual virtual machine find the closest squid server. We also developed an automatic virtual machine conversion system so that we could transition virtual machines made on FermiCloud to Amazon Web Services. We used this system to successfully run a cosmic raymore » simulation of the NOvA detector at Fermilab, making use of both AWS spot pricing and network bandwidth discounts to minimize the cost. On FermiCloud we also were able to run the workflow at the scale of 1000 virtual machines, using a private network routable inside of Fermilab. As a result, we present in detail the technological improvements that were used to make this work a reality.« less
Network testbed creation and validation
Thai, Tan Q.; Urias, Vincent; Van Leeuwen, Brian P.; Watts, Kristopher K.; Sweeney, Andrew John
2017-03-21
Embodiments of network testbed creation and validation processes are described herein. A "network testbed" is a replicated environment used to validate a target network or an aspect of its design. Embodiments describe a network testbed that comprises virtual testbed nodes executed via a plurality of physical infrastructure nodes. The virtual testbed nodes utilize these hardware resources as a network "fabric," thereby enabling rapid configuration and reconfiguration of the virtual testbed nodes without requiring reconfiguration of the physical infrastructure nodes. Thus, in contrast to prior art solutions which require a tester manually build an emulated environment of physically connected network devices, embodiments receive or derive a target network description and build out a replica of this description using virtual testbed nodes executed via the physical infrastructure nodes. This process allows for the creation of very large (e.g., tens of thousands of network elements) and/or very topologically complex test networks.
Network testbed creation and validation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Thai, Tan Q.; Urias, Vincent; Van Leeuwen, Brian P.
Embodiments of network testbed creation and validation processes are described herein. A "network testbed" is a replicated environment used to validate a target network or an aspect of its design. Embodiments describe a network testbed that comprises virtual testbed nodes executed via a plurality of physical infrastructure nodes. The virtual testbed nodes utilize these hardware resources as a network "fabric," thereby enabling rapid configuration and reconfiguration of the virtual testbed nodes without requiring reconfiguration of the physical infrastructure nodes. Thus, in contrast to prior art solutions which require a tester manually build an emulated environment of physically connected network devices,more » embodiments receive or derive a target network description and build out a replica of this description using virtual testbed nodes executed via the physical infrastructure nodes. This process allows for the creation of very large (e.g., tens of thousands of network elements) and/or very topologically complex test networks.« less
NASA Astrophysics Data System (ADS)
Mhashilkar, Parag; Tiradani, Anthony; Holzman, Burt; Larson, Krista; Sfiligoi, Igor; Rynge, Mats
2014-06-01
Scientific communities have been in the forefront of adopting new technologies and methodologies in the computing. Scientific computing has influenced how science is done today, achieving breakthroughs that were impossible to achieve several decades ago. For the past decade several such communities in the Open Science Grid (OSG) and the European Grid Infrastructure (EGI) have been using GlideinWMS to run complex application workflows to effectively share computational resources over the grid. GlideinWMS is a pilot-based workload management system (WMS) that creates on demand, a dynamically sized overlay HTCondor batch system on grid resources. At present, the computational resources shared over the grid are just adequate to sustain the computing needs. We envision that the complexity of the science driven by "Big Data" will further push the need for computational resources. To fulfill their increasing demands and/or to run specialized workflows, some of the big communities like CMS are investigating the use of cloud computing as Infrastructure-As-A-Service (IAAS) with GlideinWMS as a potential alternative to fill the void. Similarly, communities with no previous access to computing resources can use GlideinWMS to setup up a batch system on the cloud infrastructure. To enable this, the architecture of GlideinWMS has been extended to enable support for interfacing GlideinWMS with different Scientific and commercial cloud providers like HLT, FutureGrid, FermiCloud and Amazon EC2. In this paper, we describe a solution for cloud bursting with GlideinWMS. The paper describes the approach, architectural changes and lessons learned while enabling support for cloud infrastructures in GlideinWMS.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mhashilkar, Parag; Tiradani, Anthony; Holzman, Burt
Scientific communities have been in the forefront of adopting new technologies and methodologies in the computing. Scientific computing has influenced how science is done today, achieving breakthroughs that were impossible to achieve several decades ago. For the past decade several such communities in the Open Science Grid (OSG) and the European Grid Infrastructure (EGI) have been using GlideinWMS to run complex application workflows to effectively share computational resources over the grid. GlideinWMS is a pilot-based workload management system (WMS) that creates on demand, a dynamically sized overlay HTCondor batch system on grid resources. At present, the computational resources shared overmore » the grid are just adequate to sustain the computing needs. We envision that the complexity of the science driven by 'Big Data' will further push the need for computational resources. To fulfill their increasing demands and/or to run specialized workflows, some of the big communities like CMS are investigating the use of cloud computing as Infrastructure-As-A-Service (IAAS) with GlideinWMS as a potential alternative to fill the void. Similarly, communities with no previous access to computing resources can use GlideinWMS to setup up a batch system on the cloud infrastructure. To enable this, the architecture of GlideinWMS has been extended to enable support for interfacing GlideinWMS with different Scientific and commercial cloud providers like HLT, FutureGrid, FermiCloud and Amazon EC2. In this paper, we describe a solution for cloud bursting with GlideinWMS. The paper describes the approach, architectural changes and lessons learned while enabling support for cloud infrastructures in GlideinWMS.« less
EduCloud: PaaS versus IaaS Cloud Usage for an Advanced Computer Science Course
ERIC Educational Resources Information Center
Vaquero, L. M.
2011-01-01
The cloud has become a widely used term in academia and the industry. Education has not remained unaware of this trend, and several educational solutions based on cloud technologies are already in place, especially for software as a service cloud. However, an evaluation of the educational potential of infrastructure and platform clouds has not…
Bioinformatics clouds for big data manipulation.
Dai, Lin; Gao, Xin; Guo, Yan; Xiao, Jingfa; Zhang, Zhang
2012-11-28
As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics. This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor.
Optical fibre multi-parameter sensing with secure cloud based signal capture and processing
NASA Astrophysics Data System (ADS)
Newe, Thomas; O'Connell, Eoin; Meere, Damien; Yuan, Hongwei; Leen, Gabriel; O'Keeffe, Sinead; Lewis, Elfed
2016-05-01
Recent advancements in cloud computing technologies in the context of optical and optical fibre based systems are reported. The proliferation of real time and multi-channel based sensor systems represents significant growth in data volume. This coupled with a growing need for security presents many challenges and presents a huge opportunity for an evolutionary step in the widespread application of these sensing technologies. A tiered infrastructural system approach is adopted that is designed to facilitate the delivery of Optical Fibre-based "SENsing as a Service- SENaaS". Within this infrastructure, novel optical sensing platforms, deployed within different environments, are interfaced with a Cloud-based backbone infrastructure which facilitates the secure collection, storage and analysis of real-time data. Feedback systems, which harness this data to affect a change within the monitored location/environment/condition, are also discussed. The cloud based system presented here can also be used with chemical and physical sensors that require real-time data analysis, processing and feedback.
NASA Astrophysics Data System (ADS)
Druken, K. A.; Trenham, C. E.; Steer, A.; Evans, B. J. K.; Richards, C. J.; Smillie, J.; Allen, C.; Pringle, S.; Wang, J.; Wyborn, L. A.
2016-12-01
The Australian National Computational Infrastructure (NCI) provides access to petascale data in climate, weather, Earth observations, and genomics, and terascale data in astronomy, geophysics, ecology and land use, as well as social sciences. The data is centralized in a closely integrated High Performance Computing (HPC), High Performance Data (HPD) and cloud facility. Despite this, there remain significant barriers for many users to find and access the data: simply hosting a large volume of data is not helpful if researchers are unable to find, access, and use the data for their particular need. Use cases demonstrate we need to support a diverse range of users who are increasingly crossing traditional research discipline boundaries. To support their varying experience, access needs and research workflows, NCI has implemented an integrated data platform providing a range of services that enable users to interact with our data holdings. These services include: - A GeoNetwork catalog built on standardized Data Management Plans to search collection metadata, and find relevant datasets; - Web data services to download or remotely access data via OPeNDAP, WMS, WCS and other protocols; - Virtual Desktop Infrastructure (VDI) built on a highly integrated on-site cloud with access to both the HPC peak machine and research data collections. The VDI is a fully featured environment allowing visualization, code development and analysis to take place in an interactive desktop environment; and - A Learning Management System (LMS) containing User Guides, Use Case examples and Jupyter Notebooks structured into courses, so that users can self-teach how to use these facilities with examples from our system across a range of disciplines. We will briefly present these components, and discuss how we engage with data custodians and consumers to develop standardized data structures and services that support the range of needs. We will also highlight some key developments that have improved user experience in utilizing the services, particularly enabling transdisciplinary science. This work combines with other developments at NCI to increase the confidence of scientists from any field to undertake research and analysis on these important data collections regardless of their preferred work environment or level of skill.
A PACS archive architecture supported on cloud services.
Silva, Luís A Bastião; Costa, Carlos; Oliveira, José Luis
2012-05-01
Diagnostic imaging procedures have continuously increased over the last decade and this trend may continue in coming years, creating a great impact on storage and retrieval capabilities of current PACS. Moreover, many smaller centers do not have financial resources or requirements that justify the acquisition of a traditional infrastructure. Alternative solutions, such as cloud computing, may help address this emerging need. A tremendous amount of ubiquitous computational power, such as that provided by Google and Amazon, are used every day as a normal commodity. Taking advantage of this new paradigm, an architecture for a Cloud-based PACS archive that provides data privacy, integrity, and availability is proposed. The solution is independent from the cloud provider and the core modules were successfully instantiated in examples of two cloud computing providers. Operational metrics for several medical imaging modalities were tabulated and compared for Google Storage, Amazon S3, and LAN PACS. A PACS-as-a-Service archive that provides storage of medical studies using the Cloud was developed. The results show that the solution is robust and that it is possible to store, query, and retrieve all desired studies in a similar way as in a local PACS approach. Cloud computing is an emerging solution that promises high scalability of infrastructures, software, and applications, according to a "pay-as-you-go" business model. The presented architecture uses the cloud to setup medical data repositories and can have a significant impact on healthcare institutions by reducing IT infrastructures.
Deploying Crowd-Sourced Formal Verification Systems in a DoD Network
2013-09-01
INTENTIONALLY LEFT BLANK 1 I. INTRODUCTION A. INTRODUCTION In 2014 cyber attacks on critical infrastructure are expected to increase...CSFV systems on the Internet‒‒possibly using cloud infrastructure (Dean, 2013). By using Amazon Compute Cloud (EC2) systems, DARPA will use ordinary...through standard access methods. Those clients could be mobile phones, laptops, netbooks, tablet computers or personal digital assistants (PDAs) (Smoot
NASA Astrophysics Data System (ADS)
Niggemann, F.; Appel, F.; Bach, H.; de la Mar, J.; Schirpke, B.; Dutting, K.; Rucker, G.; Leimbach, D.
2015-04-01
To address the challenges of effective data handling faced by Small and Medium Sized Enterprises (SMEs) a cloud-based infrastructure for accessing and processing of Earth Observation(EO)-data has been developed within the project APPS4GMES(www.apps4gmes.de). To gain homogenous multi mission data access an Input Data Portal (IDP) been implemented on this infrastructure. The IDP consists of an Open Geospatial Consortium (OGC) conformant catalogue, a consolidation module for format conversion and an OGC-conformant ordering framework. Metadata of various EO-sources and with different standards is harvested and transferred to an OGC conformant Earth Observation Product standard and inserted into the catalogue by a Metadata Harvester. The IDP can be accessed for search and ordering of the harvested datasets by the services implemented on the cloud infrastructure. Different land-surface services have been realised by the project partners, using the implemented IDP and cloud infrastructure. Results of these are customer ready products, as well as pre-products (e.g. atmospheric corrected EO data), serving as a basis for other services. Within the IDP an automated access to ESA's Sentinel-1 Scientific Data Hub has been implemented. Searching and downloading of the SAR data can be performed in an automated way. With the implementation of the Sentinel-1 Toolbox and own software, for processing of the datasets for further use, for example for Vista's snow monitoring, delivering input for the flood forecast services, can also be performed in an automated way. For performance tests of the cloud environment a sophisticated model based atmospheric correction and pre-classification service has been implemented. Tests conducted an automated synchronised processing of one entire Landsat 8 (LS-8) coverage for Germany and performance comparisons to standard desktop systems. Results of these tests, showing a performance improvement by the factor of six, proved the high flexibility and computing power of the cloud environment. To make full use of the cloud capabilities a possibility for automated upscaling of the hardware resources has been implemented. Together with the IDP infrastructure fast and automated processing of various satellite sources to deliver market ready products can be realised, thus increasing customer needs and numbers can be satisfied without loss of accuracy and quality.
Open Science in the Cloud: Towards a Universal Platform for Scientific and Statistical Computing
NASA Astrophysics Data System (ADS)
Chine, Karim
The UK, through the e-Science program, the US through the NSF-funded cyber infrastructure and the European Union through the ICT Calls aimed to provide "the technological solution to the problem of efficiently connecting data, computers, and people with the goal of enabling derivation of novel scientific theories and knowledge".1 The Grid (Foster, 2002; Foster; Kesselman, Nick, & Tuecke, 2002), foreseen as a major accelerator of discovery, didn't meet the expectations it had excited at its beginnings and was not adopted by the broad population of research professionals. The Grid is a good tool for particle physicists and it has allowed them to tackle the tremendous computational challenges inherent to their field. However, as a technology and paradigm for delivering computing on demand, it doesn't work and it can't be fixed. On one hand, "the abstractions that Grids expose - to the end-user, to the deployers and to application developers - are inappropriate and they need to be higher level" (Jha, Merzky, & Fox), and on the other hand, academic Grids are inherently economically unsustainable. They can't compete with a service outsourced to the Industry whose quality and price would be driven by market forces. The virtualization technologies and their corollary, the Infrastructure-as-a-Service (IaaS) style cloud, hold the promise to enable what the Grid failed to deliver: a sustainable environment for computational sciences that would lower the barriers for accessing federated computational resources, software tools and data; enable collaboration and resources sharing and provide the building blocks of a ubiquitous platform for traceable and reproducible computational research.
Solar Resource Assessment with Sky Imagery and a Virtual Testbed for Sky Imager Solar Forecasting
NASA Astrophysics Data System (ADS)
Kurtz, Benjamin Bernard
In recent years, ground-based sky imagers have emerged as a promising tool for forecasting solar energy on short time scales (0 to 30 minutes ahead). Following the development of sky imager hardware and algorithms at UC San Diego, we present three new or improved algorithms for sky imager forecasting and forecast evaluation. First, we present an algorithm for measuring irradiance with a sky imager. Sky imager forecasts are often used in conjunction with other instruments for measuring irradiance, so this has the potential to decrease instrumentation costs and logistical complexity. In particular, the forecast algorithm itself often relies on knowledge of the current irradiance which can now be provided directly from the sky images. Irradiance measurements are accurate to within about 10%. Second, we demonstrate a virtual sky imager testbed that can be used for validating and enhancing the forecast algorithm. The testbed uses high-quality (but slow) simulations to produce virtual clouds and sky images. Because virtual cloud locations are known, much more advanced validation procedures are possible with the virtual testbed than with measured data. In this way, we are able to determine that camera geometry and non-uniform evolution of the cloud field are the two largest sources of forecast error. Finally, with the assistance of the virtual sky imager testbed, we develop improvements to the cloud advection model used for forecasting. The new advection schemes are 10-20% better at short time horizons.
Toward real-time Monte Carlo simulation using a commercial cloud computing infrastructure
NASA Astrophysics Data System (ADS)
Wang, Henry; Ma, Yunzhi; Pratx, Guillem; Xing, Lei
2011-09-01
Monte Carlo (MC) methods are the gold standard for modeling photon and electron transport in a heterogeneous medium; however, their computational cost prohibits their routine use in the clinic. Cloud computing, wherein computing resources are allocated on-demand from a third party, is a new approach for high performance computing and is implemented to perform ultra-fast MC calculation in radiation therapy. We deployed the EGS5 MC package in a commercial cloud environment. Launched from a single local computer with Internet access, a Python script allocates a remote virtual cluster. A handshaking protocol designates master and worker nodes. The EGS5 binaries and the simulation data are initially loaded onto the master node. The simulation is then distributed among independent worker nodes via the message passing interface, and the results aggregated on the local computer for display and data analysis. The described approach is evaluated for pencil beams and broad beams of high-energy electrons and photons. The output of cloud-based MC simulation is identical to that produced by single-threaded implementation. For 1 million electrons, a simulation that takes 2.58 h on a local computer can be executed in 3.3 min on the cloud with 100 nodes, a 47× speed-up. Simulation time scales inversely with the number of parallel nodes. The parallelization overhead is also negligible for large simulations. Cloud computing represents one of the most important recent advances in supercomputing technology and provides a promising platform for substantially improved MC simulation. In addition to the significant speed up, cloud computing builds a layer of abstraction for high performance parallel computing, which may change the way dose calculations are performed and radiation treatment plans are completed. This work was presented in part at the 2010 Annual Meeting of the American Association of Physicists in Medicine (AAPM), Philadelphia, PA.
Costs of cloud computing for a biometry department. A case study.
Knaus, J; Hieke, S; Binder, H; Schwarzer, G
2013-01-01
"Cloud" computing providers, such as the Amazon Web Services (AWS), offer stable and scalable computational resources based on hardware virtualization, with short, usually hourly, billing periods. The idea of pay-as-you-use seems appealing for biometry research units which have only limited access to university or corporate data center resources or grids. This case study compares the costs of an existing heterogeneous on-site hardware pool in a Medical Biometry and Statistics department to a comparable AWS offer. The "total cost of ownership", including all direct costs, is determined for the on-site hardware, and hourly prices are derived, based on actual system utilization during the year 2011. Indirect costs, which are difficult to quantify are not included in this comparison, but nevertheless some rough guidance from our experience is given. To indicate the scale of costs for a methodological research project, a simulation study of a permutation-based statistical approach is performed using AWS and on-site hardware. In the presented case, with a system utilization of 25-30 percent and 3-5-year amortization, on-site hardware can result in smaller costs, compared to hourly rental in the cloud dependent on the instance chosen. Renting cloud instances with sufficient main memory is a deciding factor in this comparison. Costs for on-site hardware may vary, depending on the specific infrastructure at a research unit, but have only moderate impact on the overall comparison and subsequent decision for obtaining affordable scientific computing resources. Overall utilization has a much stronger impact as it determines the actual computing hours needed per year. Taking this into ac count, cloud computing might still be a viable option for projects with limited maturity, or as a supplement for short peaks in demand.
IBM Cloud Computing Powering a Smarter Planet
NASA Astrophysics Data System (ADS)
Zhu, Jinzy; Fang, Xing; Guo, Zhe; Niu, Meng Hua; Cao, Fan; Yue, Shuang; Liu, Qin Yu
With increasing need for intelligent systems supporting the world's businesses, Cloud Computing has emerged as a dominant trend to provide a dynamic infrastructure to make such intelligence possible. The article introduced how to build a smarter planet with cloud computing technology. First, it introduced why we need cloud, and the evolution of cloud technology. Secondly, it analyzed the value of cloud computing and how to apply cloud technology. Finally, it predicted the future of cloud in the smarter planet.
Data Center Consolidation: A Step towards Infrastructure Clouds
NASA Astrophysics Data System (ADS)
Winter, Markus
Application service providers face enormous challenges and rising costs in managing and operating a growing number of heterogeneous system and computing landscapes. Limitations of traditional computing environments force IT decision-makers to reorganize computing resources within the data center, as continuous growth leads to an inefficient utilization of the underlying hardware infrastructure. This paper discusses a way for infrastructure providers to improve data center operations based on the findings of a case study on resource utilization of very large business applications and presents an outlook beyond server consolidation endeavors, transforming corporate data centers into compute clouds.
GEO Supersites Data Exploitation Platform
NASA Astrophysics Data System (ADS)
Lengert, W.; Popp, H.-J.; Gleyzes, J.-P.
2012-04-01
In the framework of the GEO Geohazard Supersite initiative, an international partnership of organizations and scientists involved in the monitoring and assessment of geohazards has been established. The mission is to advance the scientific understanding of geohazards by improving geohazard monitoring through the combination of in-situ and space-based data, and by facilitating the access to data relevant for geohazard research. The stakeholders are: (1) governmental organizations or research institutions responsible for the ground-based monitoring of earthquake and volcanic areas, (2) space agencies and satellite operators providing satellite data, (3) the global geohazard scientific community. The 10.000's of ESA's SAR products are accessible, since beginning 2008, using ESA's "Virtual Archive", a Cloud Computing assets, allowing the global community an utmost downloading performance of these high volume data sets for mass-market costs. In the GEO collaborative context, the management of ESA's "Virtual Archive" and the ordering of these large data sets is being performed by UNAVCO, who is also coordinating the data demand for the several hundreds of co-PIs. ESA is envisaging to provide scientists and developers access to a highly elastic operational e-infrastructure, providing interdisciplinary data on a large scale as well as tools ensuring innovation and a permanent evolution of the products. Consequently, this science environment will help in defining and testing new applications and technologies fostering innovation and new science findings. In Europe, the collaboration between EPOS, "European Plate Observatory System" lead by INGV, and ESA with support of DLR, ASI, and CNES are the main institutional stakeholders for the GEO Supersites contributing also to a unifying e-infrastructure. The overarching objective of the Geohazard Supersites is: "To implement a sustainable Global Earthquake Observation System and a Global Volcano Observation System as part of the Global Earth Observation System of Systems (GEOSS)." This presentation will outline the overall concept, objectives, and examples of the e-infrastructure, which is currently being set up for the GEO Supersite initiative helping to advance science.
Local storage federation through XRootD architecture for interactive distributed analysis
NASA Astrophysics Data System (ADS)
Colamaria, F.; Colella, D.; Donvito, G.; Elia, D.; Franco, A.; Luparello, G.; Maggi, G.; Miniello, G.; Vallero, S.; Vino, G.
2015-12-01
A cloud-based Virtual Analysis Facility (VAF) for the ALICE experiment at the LHC has been deployed in Bari. Similar facilities are currently running in other Italian sites with the aim to create a federation of interoperating farms able to provide their computing resources for interactive distributed analysis. The use of cloud technology, along with elastic provisioning of computing resources as an alternative to the grid for running data intensive analyses, is the main challenge of these facilities. One of the crucial aspects of the user-driven analysis execution is the data access. A local storage facility has the disadvantage that the stored data can be accessed only locally, i.e. from within the single VAF. To overcome such a limitation a federated infrastructure, which provides full access to all the data belonging to the federation independently from the site where they are stored, has been set up. The federation architecture exploits both cloud computing and XRootD technologies, in order to provide a dynamic, easy-to-use and well performing solution for data handling. It should allow the users to store the files and efficiently retrieve the data, since it implements a dynamic distributed cache among many datacenters in Italy connected to one another through the high-bandwidth national network. Details on the preliminary architecture implementation and performance studies are discussed.
NASA Technical Reports Server (NTRS)
Gill, Roger; Schnase, John L.
2012-01-01
The Invasive Species Forecasting System (ISFS) is an online decision support system that allows users to load point occurrence field sample data for a plant species of interest and quickly generate habitat suitability maps for geographic regions of interest, such as a national park, monument, forest, or refuge. Target customers for ISFS are natural resource managers and decision makers who have a need for scientifically valid, model- based predictions of the habitat suitability of plant species of management concern. In a joint project involving NASA and the Maryland Department of Natural Resources, ISFS has been used to model the potential distribution of Wavyleaf Basketgrass in Maryland's Chesapeake Bay Watershed. Maximum entropy techniques are used to generate predictive maps using predictor datasets derived from remotely sensed data and climate simulation outputs. The workflow to run a model is implemented in an iRODS microservice using a custom ISFS file driver that clips and re-projects data to geographic regions of interest, then shells out to perform MaxEnt processing on the input data. When the model completes, all output files and maps from the model run are registered in iRODS and made accessible to the user. The ISFS user interface is a web browser that uses the iRODS PHP client to interact with the ISFS/iRODS- server. ISFS is designed to reside in a VMware virtual machine running SLES 11 and iRODS 3.0. The ISFS virtual machine is hosted in a VMware vSphere private cloud infrastructure to deliver the online service.
cOSPREY: A Cloud-Based Distributed Algorithm for Large-Scale Computational Protein Design
Pan, Yuchao; Dong, Yuxi; Zhou, Jingtian; Hallen, Mark; Donald, Bruce R.; Xu, Wei
2016-01-01
Abstract Finding the global minimum energy conformation (GMEC) of a huge combinatorial search space is the key challenge in computational protein design (CPD) problems. Traditional algorithms lack a scalable and efficient distributed design scheme, preventing researchers from taking full advantage of current cloud infrastructures. We design cloud OSPREY (cOSPREY), an extension to a widely used protein design software OSPREY, to allow the original design framework to scale to the commercial cloud infrastructures. We propose several novel designs to integrate both algorithm and system optimizations, such as GMEC-specific pruning, state search partitioning, asynchronous algorithm state sharing, and fault tolerance. We evaluate cOSPREY on three different cloud platforms using different technologies and show that it can solve a number of large-scale protein design problems that have not been possible with previous approaches. PMID:27154509
Bioinformatics clouds for big data manipulation
2012-01-01
Abstract As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics. Reviewers This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor. PMID:23190475
An imperialist competitive algorithm for virtual machine placement in cloud computing
NASA Astrophysics Data System (ADS)
Jamali, Shahram; Malektaji, Sepideh; Analoui, Morteza
2017-05-01
Cloud computing, the recently emerged revolution in IT industry, is empowered by virtualisation technology. In this paradigm, the user's applications run over some virtual machines (VMs). The process of selecting proper physical machines to host these virtual machines is called virtual machine placement. It plays an important role on resource utilisation and power efficiency of cloud computing environment. In this paper, we propose an imperialist competitive-based algorithm for the virtual machine placement problem called ICA-VMPLC. The base optimisation algorithm is chosen to be ICA because of its ease in neighbourhood movement, good convergence rate and suitable terminology. The proposed algorithm investigates search space in a unique manner to efficiently obtain optimal placement solution that simultaneously minimises power consumption and total resource wastage. Its final solution performance is compared with several existing methods such as grouping genetic and ant colony-based algorithms as well as bin packing heuristic. The simulation results show that the proposed method is superior to other tested algorithms in terms of power consumption, resource wastage, CPU usage efficiency and memory usage efficiency.
Angiuoli, Samuel V; White, James R; Matalka, Malcolm; White, Owen; Fricke, W Florian
2011-01-01
The widespread popularity of genomic applications is threatened by the "bioinformatics bottleneck" resulting from uncertainty about the cost and infrastructure needed to meet increasing demands for next-generation sequence analysis. Cloud computing services have been discussed as potential new bioinformatics support systems but have not been evaluated thoroughly. We present benchmark costs and runtimes for common microbial genomics applications, including 16S rRNA analysis, microbial whole-genome shotgun (WGS) sequence assembly and annotation, WGS metagenomics and large-scale BLAST. Sequence dataset types and sizes were selected to correspond to outputs typically generated by small- to midsize facilities equipped with 454 and Illumina platforms, except for WGS metagenomics where sampling of Illumina data was used. Automated analysis pipelines, as implemented in the CloVR virtual machine, were used in order to guarantee transparency, reproducibility and portability across different operating systems, including the commercial Amazon Elastic Compute Cloud (EC2), which was used to attach real dollar costs to each analysis type. We found considerable differences in computational requirements, runtimes and costs associated with different microbial genomics applications. While all 16S analyses completed on a single-CPU desktop in under three hours, microbial genome and metagenome analyses utilized multi-CPU support of up to 120 CPUs on Amazon EC2, where each analysis completed in under 24 hours for less than $60. Representative datasets were used to estimate maximum data throughput on different cluster sizes and to compare costs between EC2 and comparable local grid servers. Although bioinformatics requirements for microbial genomics depend on dataset characteristics and the analysis protocols applied, our results suggests that smaller sequencing facilities (up to three Roche/454 or one Illumina GAIIx sequencer) invested in 16S rRNA amplicon sequencing, microbial single-genome and metagenomics WGS projects can achieve cost-efficient bioinformatics support using CloVR in combination with Amazon EC2 as an alternative to local computing centers.
Angiuoli, Samuel V.; White, James R.; Matalka, Malcolm; White, Owen; Fricke, W. Florian
2011-01-01
Background The widespread popularity of genomic applications is threatened by the “bioinformatics bottleneck” resulting from uncertainty about the cost and infrastructure needed to meet increasing demands for next-generation sequence analysis. Cloud computing services have been discussed as potential new bioinformatics support systems but have not been evaluated thoroughly. Results We present benchmark costs and runtimes for common microbial genomics applications, including 16S rRNA analysis, microbial whole-genome shotgun (WGS) sequence assembly and annotation, WGS metagenomics and large-scale BLAST. Sequence dataset types and sizes were selected to correspond to outputs typically generated by small- to midsize facilities equipped with 454 and Illumina platforms, except for WGS metagenomics where sampling of Illumina data was used. Automated analysis pipelines, as implemented in the CloVR virtual machine, were used in order to guarantee transparency, reproducibility and portability across different operating systems, including the commercial Amazon Elastic Compute Cloud (EC2), which was used to attach real dollar costs to each analysis type. We found considerable differences in computational requirements, runtimes and costs associated with different microbial genomics applications. While all 16S analyses completed on a single-CPU desktop in under three hours, microbial genome and metagenome analyses utilized multi-CPU support of up to 120 CPUs on Amazon EC2, where each analysis completed in under 24 hours for less than $60. Representative datasets were used to estimate maximum data throughput on different cluster sizes and to compare costs between EC2 and comparable local grid servers. Conclusions Although bioinformatics requirements for microbial genomics depend on dataset characteristics and the analysis protocols applied, our results suggests that smaller sequencing facilities (up to three Roche/454 or one Illumina GAIIx sequencer) invested in 16S rRNA amplicon sequencing, microbial single-genome and metagenomics WGS projects can achieve cost-efficient bioinformatics support using CloVR in combination with Amazon EC2 as an alternative to local computing centers. PMID:22028928
Model-as-a-service (MaaS) using the cloud service innovation platform (CSIP)
USDA-ARS?s Scientific Manuscript database
Cloud infrastructures for modelling activities such as data processing, performing environmental simulations, or conducting model calibrations/optimizations provide a cost effective alternative to traditional high performance computing approaches. Cloud-based modelling examples emerged into the more...
Testing as a Service with HammerCloud
NASA Astrophysics Data System (ADS)
Medrano Llamas, Ramón; Barrand, Quentin; Elmsheuser, Johannes; Legger, Federica; Sciacca, Gianfranco; Sciabà, Andrea; van der Ster, Daniel
2014-06-01
HammerCloud was designed and born under the needs of the grid community to test the resources and automate operations from a user perspective. The recent developments in the IT space propose a shift to the software defined data centres, in which every layer of the infrastructure can be offered as a service. Testing and monitoring is an integral part of the development, validation and operations of big systems, like the grid. This area is not escaping the paradigm shift and we are starting to perceive as natural the Testing as a Service (TaaS) offerings, which allow testing any infrastructure service, such as the Infrastructure as a Service (IaaS) platforms being deployed in many grid sites, both from the functional and stressing perspectives. This work will review the recent developments in HammerCloud and its evolution to a TaaS conception, in particular its deployment on the Agile Infrastructure platform at CERN and the testing of many IaaS providers across Europe in the context of experiment requirements. The first section will review the architectural changes that a service running in the cloud needs, such an orchestration service or new storage requirements in order to provide functional and stress testing. The second section will review the first tests of infrastructure providers on the perspective of the challenges discovered from the architectural point of view. Finally, the third section will evaluate future requirements of scalability and features to increase testing productivity.
INDIGO: Building a DataCloud Framework to support Open Science
NASA Astrophysics Data System (ADS)
Chen, Yin; de Lucas, Jesus Marco; Aguilar, Fenando; Fiore, Sandro; Rossi, Massimiliano; Ferrari, Tiziana
2016-04-01
New solutions are required to support Data Intensive Science in the emerging panorama of e-infrastructures, including Grid, Cloud and HPC services. The architecture proposed by the INDIGO-DataCloud (INtegrating Distributed data Infrastructures for Global ExplOitation) (https://www.indigo-datacloud.eu/) H2020 project, provides the path to integrate IaaS resources and PaaS platforms to provide SaaS solutions, while satisfying the requirements posed by different Research Communities, including several in Earth Science. This contribution introduces the INDIGO DataCloud architecture, describes the methodology followed to assure the integration of the requirements from different research communities, including examples like ENES, LifeWatch or EMSO, and how they will build their solutions using different INDIGO components.
Flexible Description and Adaptive Processing of Earth Observation Data through the BigEarth Platform
NASA Astrophysics Data System (ADS)
Gorgan, Dorian; Bacu, Victor; Stefanut, Teodor; Nandra, Cosmin; Mihon, Danut
2016-04-01
The Earth Observation data repositories extending periodically by several terabytes become a critical issue for organizations. The management of the storage capacity of such big datasets, accessing policy, data protection, searching, and complex processing require high costs that impose efficient solutions to balance the cost and value of data. Data can create value only when it is used, and the data protection has to be oriented toward allowing innovation that sometimes depends on creative people, which achieve unexpected valuable results through a flexible and adaptive manner. The users need to describe and experiment themselves different complex algorithms through analytics in order to valorize data. The analytics uses descriptive and predictive models to gain valuable knowledge and information from data analysis. Possible solutions for advanced processing of big Earth Observation data are given by the HPC platforms such as cloud. With platforms becoming more complex and heterogeneous, the developing of applications is even harder and the efficient mapping of these applications to a suitable and optimum platform, working on huge distributed data repositories, is challenging and complex as well, even by using specialized software services. From the user point of view, an optimum environment gives acceptable execution times, offers a high level of usability by hiding the complexity of computing infrastructure, and supports an open accessibility and control to application entities and functionality. The BigEarth platform [1] supports the entire flow of flexible description of processing by basic operators and adaptive execution over cloud infrastructure [2]. The basic modules of the pipeline such as the KEOPS [3] set of basic operators, the WorDeL language [4], the Planner for sequential and parallel processing, and the Executor through virtual machines, are detailed as the main components of the BigEarth platform [5]. The presentation exemplifies the development of some Earth Observation oriented applications based on flexible description of processing, and adaptive and portable execution over Cloud infrastructure. Main references for further information: [1] BigEarth project, http://cgis.utcluj.ro/projects/bigearth [2] Gorgan, D., "Flexible and Adaptive Processing of Earth Observation Data over High Performance Computation Architectures", International Conference and Exhibition Satellite 2015, August 17-19, Houston, Texas, USA. [3] Mihon, D., Bacu, V., Colceriu, V., Gorgan, D., "Modeling of Earth Observation Use Cases through the KEOPS System", Proceedings of the Intelligent Computer Communication and Processing (ICCP), IEEE-Press, pp. 455-460, (2015). [4] Nandra, C., Gorgan, D., "Workflow Description Language for Defining Big Earth Data Processing Tasks", Proceedings of the Intelligent Computer Communication and Processing (ICCP), IEEE-Press, pp. 461-468, (2015). [5] Bacu, V., Stefan, T., Gorgan, D., "Adaptive Processing of Earth Observation Data on Cloud Infrastructures Based on Workflow Description", Proceedings of the Intelligent Computer Communication and Processing (ICCP), IEEE-Press, pp.444-454, (2015).
Using Cloud-based Storage Technologies for Earth Science Data
NASA Astrophysics Data System (ADS)
Michaelis, A.; Readey, J.; Votava, P.
2016-12-01
Cloud based infrastructure may offer several key benefits of scalability, built in redundancy and reduced total cost of ownership as compared with a traditional data center approach. However, most of the tools and software systems developed for NASA data repositories were not developed with a cloud based infrastructure in mind and do not fully take advantage of commonly available cloud-based technologies. Object storage services are provided through all the leading public (Amazon Web Service, Microsoft Azure, Google Cloud, etc.) and private (Open Stack) clouds, and may provide a more cost-effective means of storing large data collections online. We describe a system that utilizes object storage rather than traditional file system based storage to vend earth science data. The system described is not only cost effective, but shows superior performance for running many different analytics tasks in the cloud. To enable compatibility with existing tools and applications, we outline client libraries that are API compatible with existing libraries for HDF5 and NetCDF4. Performance of the system is demonstrated using clouds services running on Amazon Web Services.
Large-scale virtual screening on public cloud resources with Apache Spark.
Capuccini, Marco; Ahmed, Laeeq; Schaal, Wesley; Laure, Erwin; Spjuth, Ola
2017-01-01
Structure-based virtual screening is an in-silico method to screen a target receptor against a virtual molecular library. Applying docking-based screening to large molecular libraries can be computationally expensive, however it constitutes a trivially parallelizable task. Most of the available parallel implementations are based on message passing interface, relying on low failure rate hardware and fast network connection. Google's MapReduce revolutionized large-scale analysis, enabling the processing of massive datasets on commodity hardware and cloud resources, providing transparent scalability and fault tolerance at the software level. Open source implementations of MapReduce include Apache Hadoop and the more recent Apache Spark. We developed a method to run existing docking-based screening software on distributed cloud resources, utilizing the MapReduce approach. We benchmarked our method, which is implemented in Apache Spark, docking a publicly available target receptor against [Formula: see text]2.2 M compounds. The performance experiments show a good parallel efficiency (87%) when running in a public cloud environment. Our method enables parallel Structure-based virtual screening on public cloud resources or commodity computer clusters. The degree of scalability that we achieve allows for trying out our method on relatively small libraries first and then to scale to larger libraries. Our implementation is named Spark-VS and it is freely available as open source from GitHub (https://github.com/mcapuccini/spark-vs).Graphical abstract.
The ALICE Software Release Validation cluster
NASA Astrophysics Data System (ADS)
Berzano, D.; Krzewicki, M.
2015-12-01
One of the most important steps of software lifecycle is Quality Assurance: this process comprehends both automatic tests and manual reviews, and all of them must pass successfully before the software is approved for production. Some tests, such as source code static analysis, are executed on a single dedicated service: in High Energy Physics, a full simulation and reconstruction chain on a distributed computing environment, backed with a sample “golden” dataset, is also necessary for the quality sign off. The ALICE experiment uses dedicated and virtualized computing infrastructures for the Release Validation in order not to taint the production environment (i.e. CVMFS and the Grid) with non-validated software and validation jobs: the ALICE Release Validation cluster is a disposable virtual cluster appliance based on CernVM and the Virtual Analysis Facility, capable of deploying on demand, and with a single command, a dedicated virtual HTCondor cluster with an automatically scalable number of virtual workers on any cloud supporting the standard EC2 interface. Input and output data are externally stored on EOS, and a dedicated CVMFS service is used to provide the software to be validated. We will show how the Release Validation Cluster deployment and disposal are completely transparent for the Release Manager, who simply triggers the validation from the ALICE build system's web interface. CernVM 3, based entirely on CVMFS, permits to boot any snapshot of the operating system in time: we will show how this allows us to certify each ALICE software release for an exact CernVM snapshot, addressing the problem of Long Term Data Preservation by ensuring a consistent environment for software execution and data reprocessing in the future.
An Overview of Cloud Computing in Distributed Systems
NASA Astrophysics Data System (ADS)
Divakarla, Usha; Kumari, Geetha
2010-11-01
Cloud computing is the emerging trend in the field of distributed computing. Cloud computing evolved from grid computing and distributed computing. Cloud plays an important role in huge organizations in maintaining huge data with limited resources. Cloud also helps in resource sharing through some specific virtual machines provided by the cloud service provider. This paper gives an overview of the cloud organization and some of the basic security issues pertaining to the cloud.
The Experiment Factory: Standardizing Behavioral Experiments.
Sochat, Vanessa V; Eisenberg, Ian W; Enkavi, A Zeynep; Li, Jamie; Bissett, Patrick G; Poldrack, Russell A
2016-01-01
The administration of behavioral and experimental paradigms for psychology research is hindered by lack of a coordinated effort to develop and deploy standardized paradigms. While several frameworks (Mason and Suri, 2011; McDonnell et al., 2012; de Leeuw, 2015; Lange et al., 2015) have provided infrastructure and methods for individual research groups to develop paradigms, missing is a coordinated effort to develop paradigms linked with a system to easily deploy them. This disorganization leads to redundancy in development, divergent implementations of conceptually identical tasks, disorganized and error-prone code lacking documentation, and difficulty in replication. The ongoing reproducibility crisis in psychology and neuroscience research (Baker, 2015; Open Science Collaboration, 2015) highlights the urgency of this challenge: reproducible research in behavioral psychology is conditional on deployment of equivalent experiments. A large, accessible repository of experiments for researchers to develop collaboratively is most efficiently accomplished through an open source framework. Here we present the Experiment Factory, an open source framework for the development and deployment of web-based experiments. The modular infrastructure includes experiments, virtual machines for local or cloud deployment, and an application to drive these components and provide developers with functions and tools for further extension. We release this infrastructure with a deployment (http://www.expfactory.org) that researchers are currently using to run a set of over 80 standardized web-based experiments on Amazon Mechanical Turk. By providing open source tools for both deployment and development, this novel infrastructure holds promise to bring reproducibility to the administration of experiments, and accelerate scientific progress by providing a shared community resource of psychological paradigms.
The Experiment Factory: Standardizing Behavioral Experiments
Sochat, Vanessa V.; Eisenberg, Ian W.; Enkavi, A. Zeynep; Li, Jamie; Bissett, Patrick G.; Poldrack, Russell A.
2016-01-01
The administration of behavioral and experimental paradigms for psychology research is hindered by lack of a coordinated effort to develop and deploy standardized paradigms. While several frameworks (Mason and Suri, 2011; McDonnell et al., 2012; de Leeuw, 2015; Lange et al., 2015) have provided infrastructure and methods for individual research groups to develop paradigms, missing is a coordinated effort to develop paradigms linked with a system to easily deploy them. This disorganization leads to redundancy in development, divergent implementations of conceptually identical tasks, disorganized and error-prone code lacking documentation, and difficulty in replication. The ongoing reproducibility crisis in psychology and neuroscience research (Baker, 2015; Open Science Collaboration, 2015) highlights the urgency of this challenge: reproducible research in behavioral psychology is conditional on deployment of equivalent experiments. A large, accessible repository of experiments for researchers to develop collaboratively is most efficiently accomplished through an open source framework. Here we present the Experiment Factory, an open source framework for the development and deployment of web-based experiments. The modular infrastructure includes experiments, virtual machines for local or cloud deployment, and an application to drive these components and provide developers with functions and tools for further extension. We release this infrastructure with a deployment (http://www.expfactory.org) that researchers are currently using to run a set of over 80 standardized web-based experiments on Amazon Mechanical Turk. By providing open source tools for both deployment and development, this novel infrastructure holds promise to bring reproducibility to the administration of experiments, and accelerate scientific progress by providing a shared community resource of psychological paradigms. PMID:27199843
A Strategic Approach to Network Defense: Framing the Cloud
2011-03-10
accepted network defensive principles, to reduce risks associated with emerging virtualization capabilities and scalability of cloud computing . This expanded...defensive framework can assist enterprise networking and cloud computing architects to better design more secure systems.
Using IKAROS as a data transfer and management utility within the KM3NeT computing model
NASA Astrophysics Data System (ADS)
Filippidis, Christos; Cotronis, Yiannis; Markou, Christos
2016-04-01
KM3NeT is a future European deep-sea research infrastructure hosting a new generation neutrino detectors that - located at the bottom of the Mediterranean Sea - will open a new window on the universe and answer fundamental questions both in particle physics and astrophysics. IKAROS is a framework that enables creating scalable storage formations on-demand and helps addressing several limitations that the current file systems face when dealing with very large scale infrastructures. It enables creating ad-hoc nearby storage formations and can use a huge number of I/O nodes in order to increase the available bandwidth (I/O and network). IKAROS unifies remote and local access in the overall data flow, by permitting direct access to each I/O node. In this way we can handle the overall data flow at the network layer, limiting the interaction with the operating system. This approach allows virtually connecting, at the users level, the several different computing facilities used (Grids, Clouds, HPCs, Data Centers, Local computing Clusters and personal storage devices), on-demand, based on the needs, by using well known standards and protocols, like HTTP.
High-performance scientific computing in the cloud
NASA Astrophysics Data System (ADS)
Jorissen, Kevin; Vila, Fernando; Rehr, John
2011-03-01
Cloud computing has the potential to open up high-performance computational science to a much broader class of researchers, owing to its ability to provide on-demand, virtualized computational resources. However, before such approaches can become commonplace, user-friendly tools must be developed that hide the unfamiliar cloud environment and streamline the management of cloud resources for many scientific applications. We have recently shown that high-performance cloud computing is feasible for parallelized x-ray spectroscopy calculations. We now present benchmark results for a wider selection of scientific applications focusing on electronic structure and spectroscopic simulation software in condensed matter physics. These applications are driven by an improved portable interface that can manage virtual clusters and run various applications in the cloud. We also describe a next generation of cluster tools, aimed at improved performance and a more robust cluster deployment. Supported by NSF grant OCI-1048052.
Modeling the Virtual Machine Launching Overhead under Fermicloud
DOE Office of Scientific and Technical Information (OSTI.GOV)
Garzoglio, Gabriele; Wu, Hao; Ren, Shangping
FermiCloud is a private cloud developed by the Fermi National Accelerator Laboratory for scientific workflows. The Cloud Bursting module of the FermiCloud enables the FermiCloud, when more computational resources are needed, to automatically launch virtual machines to available resources such as public clouds. One of the main challenges in developing the cloud bursting module is to decide when and where to launch a VM so that all resources are most effectively and efficiently utilized and the system performance is optimized. However, based on FermiCloud’s system operational data, the VM launching overhead is not a constant. It varies with physical resourcemore » (CPU, memory, I/O device) utilization at the time when a VM is launched. Hence, to make judicious decisions as to when and where a VM should be launched, a VM launch overhead reference model is needed. The paper is to develop a VM launch overhead reference model based on operational data we have obtained on FermiCloud and uses the reference model to guide the cloud bursting process.« less
Bionimbus: a cloud for managing, analyzing and sharing large genomics datasets
Heath, Allison P; Greenway, Matthew; Powell, Raymond; Spring, Jonathan; Suarez, Rafael; Hanley, David; Bandlamudi, Chai; McNerney, Megan E; White, Kevin P; Grossman, Robert L
2014-01-01
Background As large genomics and phenotypic datasets are becoming more common, it is increasingly difficult for most researchers to access, manage, and analyze them. One possible approach is to provide the research community with several petabyte-scale cloud-based computing platforms containing these data, along with tools and resources to analyze it. Methods Bionimbus is an open source cloud-computing platform that is based primarily upon OpenStack, which manages on-demand virtual machines that provide the required computational resources, and GlusterFS, which is a high-performance clustered file system. Bionimbus also includes Tukey, which is a portal, and associated middleware that provides a single entry point and a single sign on for the various Bionimbus resources; and Yates, which automates the installation, configuration, and maintenance of the software infrastructure required. Results Bionimbus is used by a variety of projects to process genomics and phenotypic data. For example, it is used by an acute myeloid leukemia resequencing project at the University of Chicago. The project requires several computational pipelines, including pipelines for quality control, alignment, variant calling, and annotation. For each sample, the alignment step requires eight CPUs for about 12 h. BAM file sizes ranged from 5 GB to 10 GB for each sample. Conclusions Most members of the research community have difficulty downloading large genomics datasets and obtaining sufficient storage and computer resources to manage and analyze the data. Cloud computing platforms, such as Bionimbus, with data commons that contain large genomics datasets, are one choice for broadening access to research data in genomics. PMID:24464852
Towards Cloud-based Asynchronous Elasticity for Iterative HPC Applications
NASA Astrophysics Data System (ADS)
da Rosa Righi, Rodrigo; Facco Rodrigues, Vinicius; André da Costa, Cristiano; Kreutz, Diego; Heiss, Hans-Ulrich
2015-10-01
Elasticity is one of the key features of cloud computing. It allows applications to dynamically scale computing and storage resources, avoiding over- and under-provisioning. In high performance computing (HPC), initiatives are normally modeled to handle bag-of-tasks or key-value applications through a load balancer and a loosely-coupled set of virtual machine (VM) instances. In the joint-field of Message Passing Interface (MPI) and tightly-coupled HPC applications, we observe the need of rewriting source codes, previous knowledge of the application and/or stop-reconfigure-and-go approaches to address cloud elasticity. Besides, there are problems related to how profit this new feature in the HPC scope, since in MPI 2.0 applications the programmers need to handle communicators by themselves, and a sudden consolidation of a VM, together with a process, can compromise the entire execution. To address these issues, we propose a PaaS-based elasticity model, named AutoElastic. It acts as a middleware that allows iterative HPC applications to take advantage of dynamic resource provisioning of cloud infrastructures without any major modification. AutoElastic provides a new concept denoted here as asynchronous elasticity, i.e., it provides a framework to allow applications to either increase or decrease their computing resources without blocking the current execution. The feasibility of AutoElastic is demonstrated through a prototype that runs a CPU-bound numerical integration application on top of the OpenNebula middleware. The results showed the saving of about 3 min at each scaling out operations, emphasizing the contribution of the new concept on contexts where seconds are precious.
eduCRATE--a Virtual Hospital architecture.
Stoicu-Tivadar, Lăcrimioara; Stoicu-Tivadar, Vasile; Berian, Dorin; Drăgan, Simona; Serban, Alexandru; Serban, Corina
2014-01-01
eduCRATE is a complex project proposal which aims to develop a virtual learning environment offering interactive digital content through original and integrated solutions using cloud computing, complex multimedia systems in virtual space and personalized design with avatars. Compared to existing similar products the project brings the novelty of using languages for medical guides in order to ensure a maximum of flexibility. The Virtual Hospital simulations will create interactive clinical scenarios for which students will find solutions for positive diagnosis and therapeutic management. The solution based on cloud computing and immersive multimedia is an attractive option in education because is economical and it matches the current working style of the young generation to whom it addresses.
Bootstrapping and Maintaining Trust in the Cloud
2016-12-01
simultaneous cloud nodes. 1. INTRODUCTION The proliferation and popularity of infrastructure-as-a- service (IaaS) cloud computing services such as...Amazon Web Services and Google Compute Engine means more cloud tenants are hosting sensitive, private, and business critical data and applications in the...thousands of IaaS resources as they are elastically instantiated and terminated. Prior cloud trusted computing solutions address a subset of these features
A case analysis of INFOMED: the Cuban national health care telecommunications network and portal.
Séror, Ann C
2006-01-27
The Internet and telecommunications technologies contribute to national health care system infrastructures and extend global health care services markets. The Cuban national health care system offers a model to show how a national information portal can contribute to system integration, including research, education, and service delivery as well as international trade in products and services. The objectives of this paper are (1) to present the context of the Cuban national health care system since the revolution in 1959, (2) to identify virtual institutional infrastructures of the system associated with the Cuban National Health Care Telecommunications Network and Portal (INFOMED), and (3) to show how they contribute to Cuban trade in international health care service markets. Qualitative case research methods were used to identify the integrated virtual infrastructure of INFOMED and to show how it reflects socialist ideology. Virtual institutional infrastructures include electronic medical and information services and the structure of national networks linking such services. Analysis of INFOMED infrastructures shows integration of health care information, research, and education as well as the interface between Cuban national information networks and the global Internet. System control mechanisms include horizontal integration and coordination through virtual institutions linked through INFOMED, and vertical control through the Ministry of Public Health and the government hierarchy. Telecommunications technology serves as a foundation for a dual market structure differentiating domestic services from international trade. INFOMED is a model of interest for integrating health care information, research, education, and services. The virtual infrastructures linked through INFOMED support the diffusion of Cuban health care products and services in global markets. Transferability of this model is contingent upon ideology and interpretation of values such as individual intellectual property and confidentiality of individual health information. Future research should focus on examination of these issues and their consequences for global markets in health care.
A Case Analysis of INFOMED: The Cuban National Health Care Telecommunications Network and Portal
2006-01-01
Background The Internet and telecommunications technologies contribute to national health care system infrastructures and extend global health care services markets. The Cuban national health care system offers a model to show how a national information portal can contribute to system integration, including research, education, and service delivery as well as international trade in products and services. Objective The objectives of this paper are (1) to present the context of the Cuban national health care system since the revolution in 1959, (2) to identify virtual institutional infrastructures of the system associated with the Cuban National Health Care Telecommunications Network and Portal (INFOMED), and (3) to show how they contribute to Cuban trade in international health care service markets. Methods Qualitative case research methods were used to identify the integrated virtual infrastructure of INFOMED and to show how it reflects socialist ideology. Virtual institutional infrastructures include electronic medical and information services and the structure of national networks linking such services. Results Analysis of INFOMED infrastructures shows integration of health care information, research, and education as well as the interface between Cuban national information networks and the global Internet. System control mechanisms include horizontal integration and coordination through virtual institutions linked through INFOMED, and vertical control through the Ministry of Public Health and the government hierarchy. Telecommunications technology serves as a foundation for a dual market structure differentiating domestic services from international trade. Conclusions INFOMED is a model of interest for integrating health care information, research, education, and services. The virtual infrastructures linked through INFOMED support the diffusion of Cuban health care products and services in global markets. Transferability of this model is contingent upon ideology and interpretation of values such as individual intellectual property and confidentiality of individual health information. Future research should focus on examination of these issues and their consequences for global markets in health care. PMID:16585025
NASA Astrophysics Data System (ADS)
Tudose, Alexandru; Terstyansky, Gabor; Kacsuk, Peter; Winter, Stephen
Grid Application Repositories vary greatly in terms of access interface, security system, implementation technology, communication protocols and repository model. This diversity has become a significant limitation in terms of interoperability and inter-repository access. This paper presents the Grid Application Meta-Repository System (GAMRS) as a solution that offers better options for the management of Grid applications. GAMRS proposes a generic repository architecture, which allows any Grid Application Repository (GAR) to be connected to the system independent of their underlying technology. It also presents applications in a uniform manner and makes applications from all connected repositories visible to web search engines, OGSI/WSRF Grid Services and other OAI (Open Archive Initiative)-compliant repositories. GAMRS can also function as a repository in its own right and can store applications under a new repository model. With the help of this model, applications can be presented as embedded in virtual machines (VM) and therefore they can be run in their native environments and can easily be deployed on virtualized infrastructures allowing interoperability with new generation technologies such as cloud computing, application-on-demand, automatic service/application deployments and automatic VM generation.
Hybrid architecture for building secure sensor networks
NASA Astrophysics Data System (ADS)
Owens, Ken R., Jr.; Watkins, Steve E.
2012-04-01
Sensor networks have various communication and security architectural concerns. Three approaches are defined to address these concerns for sensor networks. The first area is the utilization of new computing architectures that leverage embedded virtualization software on the sensor. Deploying a small, embedded virtualization operating system on the sensor nodes that is designed to communicate to low-cost cloud computing infrastructure in the network is the foundation to delivering low-cost, secure sensor networks. The second area focuses on securing the sensor. Sensor security components include developing an identification scheme, and leveraging authentication algorithms and protocols that address security assurance within the physical, communication network, and application layers. This function will primarily be accomplished through encrypting the communication channel and integrating sensor network firewall and intrusion detection/prevention components to the sensor network architecture. Hence, sensor networks will be able to maintain high levels of security. The third area addresses the real-time and high priority nature of the data that sensor networks collect. This function requires that a quality-of-service (QoS) definition and algorithm be developed for delivering the right data at the right time. A hybrid architecture is proposed that combines software and hardware features to handle network traffic with diverse QoS requirements.
ERIC Educational Resources Information Center
Islam, Muhammad Faysal
2013-01-01
Cloud computing offers the advantage of on-demand, reliable and cost efficient computing solutions without the capital investment and management resources to build and maintain in-house data centers and network infrastructures. Scalability of cloud solutions enable consumers to upgrade or downsize their services as needed. In a cloud environment,…
Making the most of cloud storage - a toolkit for exploitation by WLCG experiments
NASA Astrophysics Data System (ADS)
Alvarez Ayllon, Alejandro; Arsuaga Rios, Maria; Bitzes, Georgios; Furano, Fabrizio; Keeble, Oliver; Manzi, Andrea
2017-10-01
Understanding how cloud storage can be effectively used, either standalone or in support of its associated compute, is now an important consideration for WLCG. We report on a suite of extensions to familiar tools targeted at enabling the integration of cloud object stores into traditional grid infrastructures and workflows. Notable updates include support for a number of object store flavours in FTS3, Davix and gfal2, including mitigations for lack of vector reads; the extension of Dynafed to operate as a bridge between grid and cloud domains; protocol translation in FTS3; the implementation of extensions to DPM (also implemented by the dCache project) to allow 3rd party transfers over HTTP. The result is a toolkit which facilitates data movement and access between grid and cloud infrastructures, broadening the range of workflows suitable for cloud. We report on deployment scenarios and prototype experience, explaining how, for example, an Amazon S3 or Azure allocation can be exploited by grid workflows.
Arctic Boreal Vulnerability Experiment (ABoVE) Science Cloud
NASA Astrophysics Data System (ADS)
Duffy, D.; Schnase, J. L.; McInerney, M.; Webster, W. P.; Sinno, S.; Thompson, J. H.; Griffith, P. C.; Hoy, E.; Carroll, M.
2014-12-01
The effects of climate change are being revealed at alarming rates in the Arctic and Boreal regions of the planet. NASA's Terrestrial Ecology Program has launched a major field campaign to study these effects over the next 5 to 8 years. The Arctic Boreal Vulnerability Experiment (ABoVE) will challenge scientists to take measurements in the field, study remote observations, and even run models to better understand the impacts of a rapidly changing climate for areas of Alaska and western Canada. The NASA Center for Climate Simulation (NCCS) at the Goddard Space Flight Center (GSFC) has partnered with the Terrestrial Ecology Program to create a science cloud designed for this field campaign - the ABoVE Science Cloud. The cloud combines traditional high performance computing with emerging technologies to create an environment specifically designed for large-scale climate analytics. The ABoVE Science Cloud utilizes (1) virtualized high-speed InfiniBand networks, (2) a combination of high-performance file systems and object storage, and (3) virtual system environments tailored for data intensive, science applications. At the center of the architecture is a large object storage environment, much like a traditional high-performance file system, that supports data proximal processing using technologies like MapReduce on a Hadoop Distributed File System (HDFS). Surrounding the storage is a cloud of high performance compute resources with many processing cores and large memory coupled to the storage through an InfiniBand network. Virtual systems can be tailored to a specific scientist and provisioned on the compute resources with extremely high-speed network connectivity to the storage and to other virtual systems. In this talk, we will present the architectural components of the science cloud and examples of how it is being used to meet the needs of the ABoVE campaign. In our experience, the science cloud approach significantly lowers the barriers and risks to organizations that require high performance computing solutions and provides the NCCS with the agility required to meet our customers' rapidly increasing and evolving requirements.
77 FR 72673 - Critical Infrastructure Protection and Resilience Month, 2012
Federal Register 2010, 2011, 2012, 2013, 2014
2012-12-05
.... Cyber incidents can have devastating consequences on both physical and virtual infrastructure, which is... work within existing authorities to fortify our country against cyber risks, comprehensive legislation remains essential to improving infrastructure security, enhancing cyber information sharing between...
Remote laboratories for optical metrology: from the lab to the cloud
NASA Astrophysics Data System (ADS)
Osten, W.; Wilke, M.; Pedrini, G.
2012-10-01
The idea of remote and virtual metrology has been reported already in 2000 with a conceptual illustration by use of comparative digital holography, aimed at the comparison of two nominally identical but physically different objects, e.g., master and sample, in industrial inspection processes. However, the concept of remote and virtual metrology can be extended far beyond this. For example, it does not only allow for the transmission of static holograms over the Internet, but also provides an opportunity to communicate with and eventually control the physical set-up of a remote metrology system. Furthermore, the metrology system can be modeled in the environment of a 3D virtual reality using CAD or similar technology, providing a more intuitive interface to the physical setup within the virtual world. An engineer or scientist who would like to access the remote real world system can log on to the virtual system, moving and manipulating the setup through an avatar and take the desired measurements. The real metrology system responds to the interaction between the avatar and the 3D virtual representation, providing a more intuitive interface to the physical setup within the virtual world. The measurement data are stored and interpreted automatically for appropriate display within the virtual world, providing the necessary feedback to the experimenter. Such a system opens up many novel opportunities in industrial inspection such as the remote master-sample-comparison and the virtual assembling of parts that are fabricated at different places. Moreover, a multitude of new techniques can be envisaged. To them belong modern ways for documenting, efficient methods for metadata storage, the possibility for remote reviewing of experimental results, the adding of real experiments to publications by providing remote access to the metadata and to the experimental setup via Internet, the presentation of complex experiments in classrooms and lecture halls, the sharing of expensive and complex infrastructure within international collaborations, the implementation of new ways for the remote test of new devices, for their maintenance and service, and many more. The paper describes the idea of remote laboratories and illustrates the potential of the approach on selected examples with special attention to optical metrology.
High-Precision Registration of Point Clouds Based on Sphere Feature Constraints.
Huang, Junhui; Wang, Zhao; Gao, Jianmin; Huang, Youping; Towers, David Peter
2016-12-30
Point cloud registration is a key process in multi-view 3D measurements. Its precision affects the measurement precision directly. However, in the case of the point clouds with non-overlapping areas or curvature invariant surface, it is difficult to achieve a high precision. A high precision registration method based on sphere feature constraint is presented to overcome the difficulty in the paper. Some known sphere features with constraints are used to construct virtual overlapping areas. The virtual overlapping areas provide more accurate corresponding point pairs and reduce the influence of noise. Then the transformation parameters between the registered point clouds are solved by an optimization method with weight function. In that case, the impact of large noise in point clouds can be reduced and a high precision registration is achieved. Simulation and experiments validate the proposed method.
High-Precision Registration of Point Clouds Based on Sphere Feature Constraints
Huang, Junhui; Wang, Zhao; Gao, Jianmin; Huang, Youping; Towers, David Peter
2016-01-01
Point cloud registration is a key process in multi-view 3D measurements. Its precision affects the measurement precision directly. However, in the case of the point clouds with non-overlapping areas or curvature invariant surface, it is difficult to achieve a high precision. A high precision registration method based on sphere feature constraint is presented to overcome the difficulty in the paper. Some known sphere features with constraints are used to construct virtual overlapping areas. The virtual overlapping areas provide more accurate corresponding point pairs and reduce the influence of noise. Then the transformation parameters between the registered point clouds are solved by an optimization method with weight function. In that case, the impact of large noise in point clouds can be reduced and a high precision registration is achieved. Simulation and experiments validate the proposed method. PMID:28042846
An incremental anomaly detection model for virtual machines.
Zhang, Hancui; Chen, Shuyu; Liu, Jun; Zhou, Zhen; Wu, Tianshu
2017-01-01
Self-Organizing Map (SOM) algorithm as an unsupervised learning method has been applied in anomaly detection due to its capabilities of self-organizing and automatic anomaly prediction. However, because of the algorithm is initialized in random, it takes a long time to train a detection model. Besides, the Cloud platforms with large scale virtual machines are prone to performance anomalies due to their high dynamic and resource sharing characters, which makes the algorithm present a low accuracy and a low scalability. To address these problems, an Improved Incremental Self-Organizing Map (IISOM) model is proposed for anomaly detection of virtual machines. In this model, a heuristic-based initialization algorithm and a Weighted Euclidean Distance (WED) algorithm are introduced into SOM to speed up the training process and improve model quality. Meanwhile, a neighborhood-based searching algorithm is presented to accelerate the detection time by taking into account the large scale and high dynamic features of virtual machines on cloud platform. To demonstrate the effectiveness, experiments on a common benchmark KDD Cup dataset and a real dataset have been performed. Results suggest that IISOM has advantages in accuracy and convergence velocity of anomaly detection for virtual machines on cloud platform.
An incremental anomaly detection model for virtual machines
Zhang, Hancui; Chen, Shuyu; Liu, Jun; Zhou, Zhen; Wu, Tianshu
2017-01-01
Self-Organizing Map (SOM) algorithm as an unsupervised learning method has been applied in anomaly detection due to its capabilities of self-organizing and automatic anomaly prediction. However, because of the algorithm is initialized in random, it takes a long time to train a detection model. Besides, the Cloud platforms with large scale virtual machines are prone to performance anomalies due to their high dynamic and resource sharing characters, which makes the algorithm present a low accuracy and a low scalability. To address these problems, an Improved Incremental Self-Organizing Map (IISOM) model is proposed for anomaly detection of virtual machines. In this model, a heuristic-based initialization algorithm and a Weighted Euclidean Distance (WED) algorithm are introduced into SOM to speed up the training process and improve model quality. Meanwhile, a neighborhood-based searching algorithm is presented to accelerate the detection time by taking into account the large scale and high dynamic features of virtual machines on cloud platform. To demonstrate the effectiveness, experiments on a common benchmark KDD Cup dataset and a real dataset have been performed. Results suggest that IISOM has advantages in accuracy and convergence velocity of anomaly detection for virtual machines on cloud platform. PMID:29117245
e-Collaboration for Earth observation (E-CEO): the Cloud4SAR interferometry data challenge
NASA Astrophysics Data System (ADS)
Casu, Francesco; Manunta, Michele; Boissier, Enguerran; Brito, Fabrice; Aas, Christina; Lavender, Samantha; Ribeiro, Rita; Farres, Jordi
2014-05-01
The e-Collaboration for Earth Observation (E-CEO) project addresses the technologies and architectures needed to provide a collaborative research Platform for automating data mining and processing, and information extraction experiments. The Platform serves for the implementation of Data Challenge Contests focusing on Information Extraction for Earth Observations (EO) applications. The possibility to implement multiple processors within a Common Software Environment facilitates the validation, evaluation and transparent peer comparison among different methodologies, which is one of the main requirements rose by scientists who develop algorithms in the EO field. In this scenario, we set up a Data Challenge, referred to as Cloud4SAR (http://wiki.services.eoportal.org/tiki-index.php?page=ECEO), to foster the deployment of Interferometric SAR (InSAR) processing chains within a Cloud Computing platform. While a large variety of InSAR processing software tools are available, they require a high level of expertise and a complex user interaction to be effectively run. Computing a co-seismic interferogram or a 20-years deformation time series on a volcanic area are not easy tasks to be performed in a fully unsupervised way and/or in very short time (hours or less). Benefiting from ESA's E-CEO platform, participants can optimise algorithms on a Virtual Sandbox environment without being expert programmers, and compute results on high performing Cloud platforms. Cloud4SAR requires solving a relatively easy InSAR problem by trying to maximize the exploitation of the processing capabilities provided by a Cloud Computing infrastructure. The proposed challenge offers two different frameworks, each dedicated to participants with different skills, identified as Beginners and Experts. For both of them, the contest mainly resides in the degree of automation of the deployed algorithms, no matter which one is used, as well as in the capability of taking effective benefit from a parallel computing environment.
Na, Y; Suh, T; Xing, L
2012-06-01
Multi-objective (MO) plan optimization entails generation of an enormous number of IMRT or VMAT plans constituting the Pareto surface, which presents a computationally challenging task. The purpose of this work is to overcome the hurdle by developing an efficient MO method using emerging cloud computing platform. As a backbone of cloud computing for optimizing inverse treatment planning, Amazon Elastic Compute Cloud with a master node (17.1 GB memory, 2 virtual cores, 420 GB instance storage, 64-bit platform) is used. The master node is able to scale seamlessly a number of working group instances, called workers, based on the user-defined setting account for MO functions in clinical setting. Each worker solved the objective function with an efficient sparse decomposition method. The workers are automatically terminated if there are finished tasks. The optimized plans are archived to the master node to generate the Pareto solution set. Three clinical cases have been planned using the developed MO IMRT and VMAT planning tools to demonstrate the advantages of the proposed method. The target dose coverage and critical structure sparing of plans are comparable obtained using the cloud computing platform are identical to that obtained using desktop PC (Intel Xeon® CPU 2.33GHz, 8GB memory). It is found that the MO planning speeds up the processing of obtaining the Pareto set substantially for both types of plans. The speedup scales approximately linearly with the number of nodes used for computing. With the use of N nodes, the computational time is reduced by the fitting model, 0.2+2.3/N, with r̂2>0.99, on average of the cases making real-time MO planning possible. A cloud computing infrastructure is developed for MO optimization. The algorithm substantially improves the speed of inverse plan optimization. The platform is valuable for both MO planning and future off- or on-line adaptive re-planning. © 2012 American Association of Physicists in Medicine.
Unified Geophysical Cloud Platform (UGCP) for Seismic Monitoring and other Geophysical Applications.
NASA Astrophysics Data System (ADS)
Synytsky, R.; Starovoit, Y. O.; Henadiy, S.; Lobzakov, V.; Kolesnikov, L.
2016-12-01
We present Unified Geophysical Cloud Platform (UGCP) or UniGeoCloud as an innovative approach for geophysical data processing in the Cloud environment with the ability to run any type of data processing software in isolated environment within the single Cloud platform. We've developed a simple and quick method of several open-source widely known software seismic packages (SeisComp3, Earthworm, Geotool, MSNoise) installation which does not require knowledge of system administration, configuration, OS compatibility issues etc. and other often annoying details preventing time wasting for system configuration work. Installation process is simplified as "mouse click" on selected software package from the Cloud market place. The main objective of the developed capability was the software tools conception with which users are able to design and install quickly their own highly reliable and highly available virtual IT-infrastructure for the organization of seismic (and in future other geophysical) data processing for either research or monitoring purposes. These tools provide access to any seismic station data available in open IP configuration from the different networks affiliated with different Institutions and Organizations. It allows also setting up your own network as you desire by selecting either regionally deployed stations or the worldwide global network based on stations selection form the global map. The processing software and products and research results could be easily monitored from everywhere using variety of user's devices form desk top computers to IT gadgets. Currents efforts of the development team are directed to achieve Scalability, Reliability and Sustainability (SRS) of proposed solutions allowing any user to run their applications with the confidence of no data loss and no failure of the monitoring or research software components. The system is suitable for quick rollout of NDC-in-Box software package developed for State Signatories and aimed for promotion of data processing collected by the IMS Network.
High-Performance Compute Infrastructure in Astronomy: 2020 Is Only Months Away
NASA Astrophysics Data System (ADS)
Berriman, B.; Deelman, E.; Juve, G.; Rynge, M.; Vöckler, J. S.
2012-09-01
By 2020, astronomy will be awash with as much as 60 PB of public data. Full scientific exploitation of such massive volumes of data will require high-performance computing on server farms co-located with the data. Development of this computing model will be a community-wide enterprise that has profound cultural and technical implications. Astronomers must be prepared to develop environment-agnostic applications that support parallel processing. The community must investigate the applicability and cost-benefit of emerging technologies such as cloud computing to astronomy, and must engage the Computer Science community to develop science-driven cyberinfrastructure such as workflow schedulers and optimizers. We report here the results of collaborations between a science center, IPAC, and a Computer Science research institute, ISI. These collaborations may be considered pathfinders in developing a high-performance compute infrastructure in astronomy. These collaborations investigated two exemplar large-scale science-driver workflow applications: 1) Calculation of an infrared atlas of the Galactic Plane at 18 different wavelengths by placing data from multiple surveys on a common plate scale and co-registering all the pixels; 2) Calculation of an atlas of periodicities present in the public Kepler data sets, which currently contain 380,000 light curves. These products have been generated with two workflow applications, written in C for performance and designed to support parallel processing on multiple environments and platforms, but with different compute resource needs: the Montage image mosaic engine is I/O-bound, and the NASA Star and Exoplanet Database periodogram code is CPU-bound. Our presentation will report cost and performance metrics and lessons-learned for continuing development. Applicability of Cloud Computing: Commercial Cloud providers generally charge for all operations, including processing, transfer of input and output data, and for storage of data, and so the costs of running applications vary widely according to how they use resources. The cloud is well suited to processing CPU-bound (and memory bound) workflows such as the periodogram code, given the relatively low cost of processing in comparison with I/O operations. I/O-bound applications such as Montage perform best on high-performance clusters with fast networks and parallel file-systems. Science-driven Cyberinfrastructure: Montage has been widely used as a driver application to develop workflow management services, such as task scheduling in distributed environments, designing fault tolerance techniques for job schedulers, and developing workflow orchestration techniques. Running Parallel Applications Across Distributed Cloud Environments: Data processing will eventually take place in parallel distributed across cyber infrastructure environments having different architectures. We have used the Pegasus Work Management System (WMS) to successfully run applications across three very different environments: TeraGrid, OSG (Open Science Grid), and FutureGrid. Provisioning resources across different grids and clouds (also referred to as Sky Computing), involves establishing a distributed environment, where issues of, e.g, remote job submission, data management, and security need to be addressed. This environment also requires building virtual machine images that can run in different environments. Usually, each cloud provides basic images that can be customized with additional software and services. In most of our work, we provisioned compute resources using a custom application, called Wrangler. Pegasus WMS abstracts the architectures of the compute environments away from the end-user, and can be considered a first-generation tool suitable for scientists to run their applications on disparate environments.
Securing the Data Storage and Processing in Cloud Computing Environment
ERIC Educational Resources Information Center
Owens, Rodney
2013-01-01
Organizations increasingly utilize cloud computing architectures to reduce costs and energy consumption both in the data warehouse and on mobile devices by better utilizing the computing resources available. However, the security and privacy issues with publicly available cloud computing infrastructures have not been studied to a sufficient depth…
A service brokering and recommendation mechanism for better selecting cloud services.
Gui, Zhipeng; Yang, Chaowei; Xia, Jizhe; Huang, Qunying; Liu, Kai; Li, Zhenlong; Yu, Manzhu; Sun, Min; Zhou, Nanyin; Jin, Baoxuan
2014-01-01
Cloud computing is becoming the new generation computing infrastructure, and many cloud vendors provide different types of cloud services. How to choose the best cloud services for specific applications is very challenging. Addressing this challenge requires balancing multiple factors, such as business demands, technologies, policies and preferences in addition to the computing requirements. This paper recommends a mechanism for selecting the best public cloud service at the levels of Infrastructure as a Service (IaaS) and Platform as a Service (PaaS). A systematic framework and associated workflow include cloud service filtration, solution generation, evaluation, and selection of public cloud services. Specifically, we propose the following: a hierarchical information model for integrating heterogeneous cloud information from different providers and a corresponding cloud information collecting mechanism; a cloud service classification model for categorizing and filtering cloud services and an application requirement schema for providing rules for creating application-specific configuration solutions; and a preference-aware solution evaluation mode for evaluating and recommending solutions according to the preferences of application providers. To test the proposed framework and methodologies, a cloud service advisory tool prototype was developed after which relevant experiments were conducted. The results show that the proposed system collects/updates/records the cloud information from multiple mainstream public cloud services in real-time, generates feasible cloud configuration solutions according to user specifications and acceptable cost predication, assesses solutions from multiple aspects (e.g., computing capability, potential cost and Service Level Agreement, SLA) and offers rational recommendations based on user preferences and practical cloud provisioning; and visually presents and compares solutions through an interactive web Graphical User Interface (GUI).
NASA Astrophysics Data System (ADS)
Caumont, Herve; Brito, Fabrice; Mathot, Emmanuel; Barchetta, Francesco; Loeschau, Frank
2015-04-01
We present recent achievements with the Geohazards Exploitation Platform (GEP), a European contribution to the GEO SuperSites, and its interoperability with the MEDiterranean SUpersite Volcanoes (MED-SUV) e- infrastructure. The GEP is a catalyst for the use of satellite Earth observation missions, providing data to initiatives such as the GEO Geohazard Supersites and Natural Laboratories (GSNL), the Volcano and Seismic Hazards CEOS Pilots or the European Plate Observing System (EPOS). As satellite sensors are delivering increasing amounts of data, researchers need more computational science tools and services. The GEP contribution in this regard allows scientists to access different data types, relevant to the same area and phenomena and to directly stage selected inputs to scalable processing applications that deliver EO-based science products. With the GEP concept of operation for improved collaboration, a partner can bring its processing tools, use from his workspace other shared toolboxes and access large data repositories. GEP is based on Open Source Software components, on a Cloud Services architecture inheriting a range of ESA and EC funded innovations, and is associating the scientific community and SMEs in implementing new capabilities. Via MED-SUV, we are making discoverable and accessible a large number of products over the Mt. Etna, Vesu- vius/Campi Flegrei volcanic areas, which are of broader interest for Geosciences researchers, so they can process ENVISAT MERIS, ENVISAT ASAR, and ERS SAR data (both Level 1 and Level 2) hosted in the ESA clusters and in ESA's Virtual Archive, TerraSAR-X data hosted in DLR's Virtual Archive, as well as data hosted in other dedicated MED-SUV Virtual Archives (e.g. for LANDSAT, EOS-1). GEP will gradually access Sentinel-1A data, other space agencies data and value-added products. Processed products can also be published and archived on the MED-SUV e-Infrastructure. In this effort, data policy rules applied to the acquisitions are verified against the GEOSS Data Collection of Open Resources for Everyone (GEOSS Data-CORE) principles. The resulting infras- tructure repositories include connectivity to the GEOSS Data Access Broker (DAB), through the "OGC CS-W OpenSearch Geo and Time extensions" interface standard, a key interoperability arrangement used by the MED- SUV systems, making EO data products available to both the project partners and the broader initiatives. GEP is also proposing and further developing hosted processing, aimed at MED-SUV researchers' work on new methods to integrate in-situ and satellite sensors data: a set of users services (concept of Platform-as-a-Service, or PaaS) for generating value-added products, including tools to design and develop Hadoop-enabled processing chains. The PaaS core engine is the Developer Cloud Sandboxes service, where scalable processing chains are prepared and validated. The PaaS makes use of Virtual Machines technology, and of middleware for scaling-out processing tasks via interfaces to commercial Cloud Providers, or through research agreements to academic re- sources like EGI.eu. After integration, processors are deployed and invoked 'as-a-Service' by partners via OGC Web Processing Service standard interface, or shared as reusable virtualized resources. Recent integration work covered e.g. ROI_PAC, GMTSAR and DORIS ADORE toolboxes along with supporting processing services such as DEM generation. Such approach has been discussed also with the MARSite project, ensuring the adopted solu- tions are aligned. As part of the MED-SUV project, we are developing tools and services supporting researchers working on new data fusion methods, and fostering collaboration between different end users and partners, including towards the GEO communities. Overall, the approach provides an integrated European contribution for the exploitation of decades of scientific data gathered from Earth observation satellites.
Investigation into Cloud Computing for More Robust Automated Bulk Image Geoprocessing
NASA Technical Reports Server (NTRS)
Brown, Richard B.; Smoot, James C.; Underwood, Lauren; Armstrong, C. Duane
2012-01-01
Geospatial resource assessments frequently require timely geospatial data processing that involves large multivariate remote sensing data sets. In particular, for disasters, response requires rapid access to large data volumes, substantial storage space and high performance processing capability. The processing and distribution of this data into usable information products requires a processing pipeline that can efficiently manage the required storage, computing utilities, and data handling requirements. In recent years, with the availability of cloud computing technology, cloud processing platforms have made available a powerful new computing infrastructure resource that can meet this need. To assess the utility of this resource, this project investigates cloud computing platforms for bulk, automated geoprocessing capabilities with respect to data handling and application development requirements. This presentation is of work being conducted by Applied Sciences Program Office at NASA-Stennis Space Center. A prototypical set of image manipulation and transformation processes that incorporate sample Unmanned Airborne System data were developed to create value-added products and tested for implementation on the "cloud". This project outlines the steps involved in creating and testing of open source software developed process code on a local prototype platform, and then transitioning this code with associated environment requirements into an analogous, but memory and processor enhanced cloud platform. A data processing cloud was used to store both standard digital camera panchromatic and multi-band image data, which were subsequently subjected to standard image processing functions such as NDVI (Normalized Difference Vegetation Index), NDMI (Normalized Difference Moisture Index), band stacking, reprojection, and other similar type data processes. Cloud infrastructure service providers were evaluated by taking these locally tested processing functions, and then applying them to a given cloud-enabled infrastructure to assesses and compare environment setup options and enabled technologies. This project reviews findings that were observed when cloud platforms were evaluated for bulk geoprocessing capabilities based on data handling and application development requirements.
Now and Next-Generation Sequencing Techniques: Future of Sequence Analysis Using Cloud Computing
Thakur, Radhe Shyam; Bandopadhyay, Rajib; Chaudhary, Bratati; Chatterjee, Sourav
2012-01-01
Advances in the field of sequencing techniques have resulted in the greatly accelerated production of huge sequence datasets. This presents immediate challenges in database maintenance at datacenters. It provides additional computational challenges in data mining and sequence analysis. Together these represent a significant overburden on traditional stand-alone computer resources, and to reach effective conclusions quickly and efficiently, the virtualization of the resources and computation on a pay-as-you-go concept (together termed “cloud computing”) has recently appeared. The collective resources of the datacenter, including both hardware and software, can be available publicly, being then termed a public cloud, the resources being provided in a virtual mode to the clients who pay according to the resources they employ. Examples of public companies providing these resources include Amazon, Google, and Joyent. The computational workload is shifted to the provider, which also implements required hardware and software upgrades over time. A virtual environment is created in the cloud corresponding to the computational and data storage needs of the user via the internet. The task is then performed, the results transmitted to the user, and the environment finally deleted after all tasks are completed. In this discussion, we focus on the basics of cloud computing, and go on to analyze the prerequisites and overall working of clouds. Finally, the applications of cloud computing in biological systems, particularly in comparative genomics, genome informatics, and SNP detection are discussed with reference to traditional workflows. PMID:23248640
Now and next-generation sequencing techniques: future of sequence analysis using cloud computing.
Thakur, Radhe Shyam; Bandopadhyay, Rajib; Chaudhary, Bratati; Chatterjee, Sourav
2012-01-01
Advances in the field of sequencing techniques have resulted in the greatly accelerated production of huge sequence datasets. This presents immediate challenges in database maintenance at datacenters. It provides additional computational challenges in data mining and sequence analysis. Together these represent a significant overburden on traditional stand-alone computer resources, and to reach effective conclusions quickly and efficiently, the virtualization of the resources and computation on a pay-as-you-go concept (together termed "cloud computing") has recently appeared. The collective resources of the datacenter, including both hardware and software, can be available publicly, being then termed a public cloud, the resources being provided in a virtual mode to the clients who pay according to the resources they employ. Examples of public companies providing these resources include Amazon, Google, and Joyent. The computational workload is shifted to the provider, which also implements required hardware and software upgrades over time. A virtual environment is created in the cloud corresponding to the computational and data storage needs of the user via the internet. The task is then performed, the results transmitted to the user, and the environment finally deleted after all tasks are completed. In this discussion, we focus on the basics of cloud computing, and go on to analyze the prerequisites and overall working of clouds. Finally, the applications of cloud computing in biological systems, particularly in comparative genomics, genome informatics, and SNP detection are discussed with reference to traditional workflows.
NASA Astrophysics Data System (ADS)
Barth, Alexander; Troupin, Charles; Watelet, Sylvain; Alvera-Azcarate, Aida; Beckers, Jean-Marie
2017-04-01
The analysis tool DIVA (Data-Interpolating Variational Analysis) is designed to generate gridded fields or climatologies from in situ observations. The tool DIVA minimizes a cost function to ensure that the analysed field is relatively close to the observations and conforms at the same time to a set of dynamical constraints. In particular, DIVA naturally decouples water bodies which are not directly connected and it uses a (potentially spatial varying) correlation length to describe over which length-scale the analysed variable is correlated. In addition, DIVA can also take ocean currents into account to introduce a preferential direction for the correlation. The SeaDataCloud project aims to facilitate the access and use of ocean in situ data from 45 national oceanographic data centres and marine data centres from 35 countries riparian to all European seas. A central aspect is to provide web-based virtual research environment, where scientists can easily access and explore the data sets through the SeaDataCloud infrastructure. For users familiar with programming languages like Julia and Python, Jupyter (acronym for Julia, Python and R) notebooks provide an exciting way to analyse and to interact with ocean data. Jupyter notebooks are made up of cells that can be run individually and can contain text, formulas or code fragment. A complete notebook explains how to go from input data and parameters to a result, in this case a gridded field obtained executing DIVA. This presentation discusses this new web-based workflow for generating climatologies using DIVA. It explores its new possibilities in particular, in terms of improved ease of use and reproducibility of the results. The integration in the infrastructure of EUDAT is also addressed.
NASA Technical Reports Server (NTRS)
Murphy, James R.; Otto, Neil M.
2017-01-01
NASA's Unmanned Aircraft Systems Integration in the National Airspace System Project is conducting human in the loop simulations and flight testing intended to reduce barriers associated with enabling routine airspace access for unmanned aircraft. The primary focus of these tests is interaction of the unmanned aircraft pilot with the display of detect and avoid alerting and guidance information. The project's integrated test and evaluation team was charged with developing the test infrastructure. As with any development effort, compromises in the underlying system architecture and design were made to allow for the rapid prototyping and open-ended nature of the research. In order to accommodate these design choices, a distributed test environment was developed incorporating Live, Virtual, Constructive, (LVC) concepts. The LVC components form the core infrastructure support simulation of UAS operations by integrating live and virtual aircraft in a realistic air traffic environment. This LVC infrastructure enables efficient testing by leveraging the use of existing assets distributed across multiple NASA Centers. Using standard LVC concepts enable future integration with existing simulation infrastructure.
NASA Technical Reports Server (NTRS)
Murphy, Jim; Otto, Neil
2017-01-01
NASA's Unmanned Aircraft Systems Integration in the National Airspace System Project is conducting human in the loop simulations and flight testing intended to reduce barriers associated with enabling routine airspace access for unmanned aircraft. The primary focus of these tests is interaction of the unmanned aircraft pilot with the display of detect and avoid alerting and guidance information. The projects integrated test and evaluation team was charged with developing the test infrastructure. As with any development effort, compromises in the underlying system architecture and design were made to allow for the rapid prototyping and open-ended nature of the research. In order to accommodate these design choices, a distributed test environment was developed incorporating Live, Virtual, Constructive, (LVC) concepts. The LVC components form the core infrastructure support simulation of UAS operations by integrating live and virtual aircraft in a realistic air traffic environment. This LVC infrastructure enables efficient testing by leveraging the use of existing assets distributed across multiple NASA Centers. Using standard LVC concepts enable future integration with existing simulation infrastructure.
NASA Astrophysics Data System (ADS)
Ali, Mufajjul
This paper proposes a Green Cloud model for mobile Cloud computing. The proposed model leverage on the current trend of IaaS (Infrastructure as a Service), PaaS (Platform as a Service) and SaaS (Software as a Service), and look at new paradigm called "Network as a Service" (NaaS). The Green Cloud model proposes various Telco's revenue generating streams and services with the CaaS (Cloud as a Service) for the near future.
Krintz, Chandra
2013-01-01
AppScale is an open source distributed software system that implements a cloud platform as a service (PaaS). AppScale makes cloud applications easy to deploy and scale over disparate cloud fabrics, implementing a set of APIs and architecture that also makes apps portable across the services they employ. AppScale is API-compatible with Google App Engine (GAE) and thus executes GAE applications on-premise or over other cloud infrastructures, without modification. PMID:23828721
A national-scale authentication infrastructure.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Butler, R.; Engert, D.; Foster, I.
2000-12-01
Today, individuals and institutions in science and industry are increasingly forming virtual organizations to pool resources and tackle a common goal. Participants in virtual organizations commonly need to share resources such as data archives, computer cycles, and networks - resources usually available only with restrictions based on the requested resource's nature and the user's identity. Thus, any sharing mechanism must have the ability to authenticate the user's identity and determine if the user is authorized to request the resource. Virtual organizations tend to be fluid, however, so authentication mechanisms must be flexible and lightweight, allowing administrators to quickly establish andmore » change resource-sharing arrangements. However, because virtual organizations complement rather than replace existing institutions, sharing mechanisms cannot change local policies and must allow individual institutions to maintain control over their own resources. Our group has created and deployed an authentication and authorization infrastructure that meets these requirements: the Grid Security Infrastructure. GSI offers secure single sign-ons and preserves site control over access policies and local security. It provides its own versions of common applications, such as FTP and remote login, and a programming interface for creating secure applications.« less
NASA Astrophysics Data System (ADS)
Huang, Haibin; Guo, Bingli; Li, Xin; Yin, Shan; Zhou, Yu; Huang, Shanguo
2017-12-01
Virtualization of datacenter (DC) infrastructures enables infrastructure providers (InPs) to provide novel services like virtual networks (VNs). Furthermore, optical networks have been employed to connect the metro-scale geographically distributed DCs. The synergistic virtualization of the DC infrastructures and optical networks enables the efficient VN service over inter-DC optical networks (inter-DCONs). While the capacity of the used standard single-mode fiber (SSMF) is limited by their nonlinear characteristics. Thus, mode-division multiplexing (MDM) technology based on few-mode fibers (FMFs) could be employed to increase the capacity of optical networks. Whereas, modal crosstalk (XT) introduced by optical fibers and components deployed in the MDM optical networks impacts the performance of VN embedding (VNE) over inter-DCONs with FMFs. In this paper, we propose a XT-aware VNE mechanism over inter-DCONs with FMFs. The impact of XT is considered throughout the VNE procedures. The simulation results show that the proposed XT-aware VNE can achieves better performances of blocking probability and spectrum utilization compared to conventional VNE mechanisms.
The Czech National Grid Infrastructure
NASA Astrophysics Data System (ADS)
Chudoba, J.; Křenková, I.; Mulač, M.; Ruda, M.; Sitera, J.
2017-10-01
The Czech National Grid Infrastructure is operated by MetaCentrum, a CESNET department responsible for coordinating and managing activities related to distributed computing. CESNET as the Czech National Research and Education Network (NREN) provides many e-infrastructure services, which are used by 94% of the scientific and research community in the Czech Republic. Computing and storage resources owned by different organizations are connected by fast enough network to provide transparent access to all resources. We describe in more detail the computing infrastructure, which is based on several different technologies and covers grid, cloud and map-reduce environment. While the largest part of CPUs is still accessible via distributed torque servers, providing environment for long batch jobs, part of infrastructure is available via standard EGI tools in EGI, subset of NGI resources is provided into EGI FedCloud environment with cloud interface and there is also Hadoop cluster provided by the same e-infrastructure.A broad spectrum of computing servers is offered; users can choose from standard 2 CPU servers to large SMP machines with up to 6 TB of RAM or servers with GPU cards. Different groups have different priorities on various resources, resource owners can even have an exclusive access. The software is distributed via AFS. Storage servers offering up to tens of terabytes of disk space to individual users are connected via NFS4 on top of GPFS and access to long term HSM storage with peta-byte capacity is also provided. Overview of available resources and recent statistics of usage will be given.
A high performance scientific cloud computing environment for materials simulations
NASA Astrophysics Data System (ADS)
Jorissen, K.; Vila, F. D.; Rehr, J. J.
2012-09-01
We describe the development of a scientific cloud computing (SCC) platform that offers high performance computation capability. The platform consists of a scientific virtual machine prototype containing a UNIX operating system and several materials science codes, together with essential interface tools (an SCC toolset) that offers functionality comparable to local compute clusters. In particular, our SCC toolset provides automatic creation of virtual clusters for parallel computing, including tools for execution and monitoring performance, as well as efficient I/O utilities that enable seamless connections to and from the cloud. Our SCC platform is optimized for the Amazon Elastic Compute Cloud (EC2). We present benchmarks for prototypical scientific applications and demonstrate performance comparable to local compute clusters. To facilitate code execution and provide user-friendly access, we have also integrated cloud computing capability in a JAVA-based GUI. Our SCC platform may be an alternative to traditional HPC resources for materials science or quantum chemistry applications.
GISpark: A Geospatial Distributed Computing Platform for Spatiotemporal Big Data
NASA Astrophysics Data System (ADS)
Wang, S.; Zhong, E.; Wang, E.; Zhong, Y.; Cai, W.; Li, S.; Gao, S.
2016-12-01
Geospatial data are growing exponentially because of the proliferation of cost effective and ubiquitous positioning technologies such as global remote-sensing satellites and location-based devices. Analyzing large amounts of geospatial data can provide great value for both industrial and scientific applications. Data- and compute- intensive characteristics inherent in geospatial big data increasingly pose great challenges to technologies of data storing, computing and analyzing. Such challenges require a scalable and efficient architecture that can store, query, analyze, and visualize large-scale spatiotemporal data. Therefore, we developed GISpark - a geospatial distributed computing platform for processing large-scale vector, raster and stream data. GISpark is constructed based on the latest virtualized computing infrastructures and distributed computing architecture. OpenStack and Docker are used to build multi-user hosting cloud computing infrastructure for GISpark. The virtual storage systems such as HDFS, Ceph, MongoDB are combined and adopted for spatiotemporal data storage management. Spark-based algorithm framework is developed for efficient parallel computing. Within this framework, SuperMap GIScript and various open-source GIS libraries can be integrated into GISpark. GISpark can also integrated with scientific computing environment (e.g., Anaconda), interactive computing web applications (e.g., Jupyter notebook), and machine learning tools (e.g., TensorFlow/Orange). The associated geospatial facilities of GISpark in conjunction with the scientific computing environment, exploratory spatial data analysis tools, temporal data management and analysis systems make up a powerful geospatial computing tool. GISpark not only provides spatiotemporal big data processing capacity in the geospatial field, but also provides spatiotemporal computational model and advanced geospatial visualization tools that deals with other domains related with spatial property. We tested the performance of the platform based on taxi trajectory analysis. Results suggested that GISpark achieves excellent run time performance in spatiotemporal big data applications.
Prediction based proactive thermal virtual machine scheduling in green clouds.
Kinger, Supriya; Kumar, Rajesh; Sharma, Anju
2014-01-01
Cloud computing has rapidly emerged as a widely accepted computing paradigm, but the research on Cloud computing is still at an early stage. Cloud computing provides many advanced features but it still has some shortcomings such as relatively high operating cost and environmental hazards like increasing carbon footprints. These hazards can be reduced up to some extent by efficient scheduling of Cloud resources. Working temperature on which a machine is currently running can be taken as a criterion for Virtual Machine (VM) scheduling. This paper proposes a new proactive technique that considers current and maximum threshold temperature of Server Machines (SMs) before making scheduling decisions with the help of a temperature predictor, so that maximum temperature is never reached. Different workload scenarios have been taken into consideration. The results obtained show that the proposed system is better than existing systems of VM scheduling, which does not consider current temperature of nodes before making scheduling decisions. Thus, a reduction in need of cooling systems for a Cloud environment has been obtained and validated.
Integrated cloud infrastructure of the LIT JINR, PE "NULITS" and INP's Astana branch
NASA Astrophysics Data System (ADS)
Mazhitova, Yelena; Balashov, Nikita; Baranov, Aleksandr; Kutovskiy, Nikolay; Semenov, Roman
2018-04-01
The article describes the distributed cloud infrastructure deployed on the basis of the resources of the Laboratory of Information Technologies of the Joint Institute for Nuclear Research (LIT JINR) and some JINR Member State organizations. It explains a motivation of that work, an approach it is based on, lists of its participants among which there are private entity "Nazarbayev University Library and IT services" (PE "NULITS") Autonomous Education Organization "Nazarbayev University" (AO NU) and The Institute of Nuclear Physics' (INP's) Astana branch.
Yokohama, Noriya
2013-07-01
This report was aimed at structuring the design of architectures and studying performance measurement of a parallel computing environment using a Monte Carlo simulation for particle therapy using a high performance computing (HPC) instance within a public cloud-computing infrastructure. Performance measurements showed an approximately 28 times faster speed than seen with single-thread architecture, combined with improved stability. A study of methods of optimizing the system operations also indicated lower cost.
Grids, Clouds, and Virtualization
NASA Astrophysics Data System (ADS)
Cafaro, Massimo; Aloisio, Giovanni
This chapter introduces and puts in context Grids, Clouds, and Virtualization. Grids promised to deliver computing power on demand. However, despite a decade of active research, no viable commercial grid computing provider has emerged. On the other hand, it is widely believed - especially in the Business World - that HPC will eventually become a commodity. Just as some commercial consumers of electricity have mission requirements that necessitate they generate their own power, some consumers of computational resources will continue to need to provision their own supercomputers. Clouds are a recent business-oriented development with the potential to render this eventually as rare as organizations that generate their own electricity today, even among institutions who currently consider themselves the unassailable elite of the HPC business. Finally, Virtualization is one of the key technologies enabling many different Clouds. We begin with a brief history in order to put them in context, and recall the basic principles and concepts underlying and clearly differentiating them. A thorough overview and survey of existing technologies provides the basis to delve into details as the reader progresses through the book.
Architectural Implications of Cloud Computing
2011-10-24
Public Cloud Infrastructure-as-a- Service (IaaS) Software -as-a- Service ( SaaS ) Cloud Computing Types Platform-as-a- Service (PaaS) Based on Type of...Twitter #SEIVirtualForum © 2011 Carnegie Mellon University Software -as-a- Service ( SaaS ) Model of software deployment in which a third-party...and System Solutions (RTSS) Program. Her current interests and projects are in service -oriented architecture (SOA), cloud computing, and context
Efficient On-Demand Operations in Large-Scale Infrastructures
ERIC Educational Resources Information Center
Ko, Steven Y.
2009-01-01
In large-scale distributed infrastructures such as clouds, Grids, peer-to-peer systems, and wide-area testbeds, users and administrators typically desire to perform "on-demand operations" that deal with the most up-to-date state of the infrastructure. However, the scale and dynamism present in the operating environment make it challenging to…
NASA Astrophysics Data System (ADS)
Schnase, J. L.; Duffy, D. Q.; Tamkin, G. S.; Strong, S.; Ripley, D.; Gill, R.; Sinno, S. S.; Shen, Y.; Carriere, L. E.; Brieger, L.; Moore, R.; Rajasekar, A.; Schroeder, W.; Wan, M.
2011-12-01
Scientific data services are becoming an important part of the NASA Center for Climate Simulation's mission. Our technological response to this expanding role is built around the concept of specialized virtual climate data servers, repetitive cloud provisioning, image-based deployment and distribution, and virtualization-as-a-service. A virtual climate data server is an OAIS-compliant, iRODS-based data server designed to support a particular type of scientific data collection. iRODS is data grid middleware that provides policy-based control over collection-building, managing, querying, accessing, and preserving large scientific data sets. We have developed prototype vCDSs to manage NetCDF, HDF, and GeoTIF data products. We use RPM scripts to build vCDS images in our local computing environment, our local Virtual Machine Environment, NASA's Nebula Cloud Services, and Amazon's Elastic Compute Cloud. Once provisioned into these virtualized resources, multiple vCDSs can use iRODS's federation and realized object capabilities to create an integrated ecosystem of data servers that can scale and adapt to changing requirements. This approach enables platform- or software-as-a-service deployment of the vCDSs and allows the NCCS to offer virtualization-as-a-service, a capacity to respond in an agile way to new customer requests for data services, and a path for migrating existing services into the cloud. We have registered MODIS Atmosphere data products in a vCDS that contains 54 million registered files, 630TB of data, and over 300 million metadata values. We are now assembling IPCC AR5 data into a production vCDS that will provide the platform upon which NCCS's Earth System Grid (ESG) node publishes to the extended science community. In this talk, we describe our approach, experiences, lessons learned, and plans for the future.
ASSURED CLOUD COMPUTING UNIVERSITY CENTER OFEXCELLENCE (ACC UCOE)
2018-01-18
average 1 hour per response, including the time for reviewing instructions, searching existing data sources, gathering and maintaining the data needed...infrastructure security -Design of algorithms and techniques for real- time assuredness in cloud computing -Map-reduce task assignment with data locality...46 DESIGN OF ALGORITHMS AND TECHNIQUES FOR REAL- TIME ASSUREDNESS IN CLOUD COMPUTING
A Development of Lightweight Grid Interface
NASA Astrophysics Data System (ADS)
Iwai, G.; Kawai, Y.; Sasaki, T.; Watase, Y.
2011-12-01
In order to help a rapid development of Grid/Cloud aware applications, we have developed API to abstract the distributed computing infrastructures based on SAGA (A Simple API for Grid Applications). SAGA, which is standardized in the OGF (Open Grid Forum), defines API specifications to access distributed computing infrastructures, such as Grid, Cloud and local computing resources. The Universal Grid API (UGAPI), which is a set of command line interfaces (CLI) and APIs, aims to offer simpler API to combine several SAGA interfaces with richer functionalities. These CLIs of the UGAPI offer typical functionalities required by end users for job management and file access to the different distributed computing infrastructures as well as local computing resources. We have also built a web interface for the particle therapy simulation and demonstrated the large scale calculation using the different infrastructures at the same time. In this paper, we would like to present how the web interface based on UGAPI and SAGA achieve more efficient utilization of computing resources over the different infrastructures with technical details and practical experiences.
Genomic cloud computing: legal and ethical points to consider
Dove, Edward S; Joly, Yann; Tassé, Anne-Marie; Burton, Paul; Chisholm, Rex; Fortier, Isabel; Goodwin, Pat; Harris, Jennifer; Hveem, Kristian; Kaye, Jane; Kent, Alistair; Knoppers, Bartha Maria; Lindpaintner, Klaus; Little, Julian; Riegman, Peter; Ripatti, Samuli; Stolk, Ronald; Bobrow, Martin; Cambon-Thomsen, Anne; Dressler, Lynn; Joly, Yann; Kato, Kazuto; Knoppers, Bartha Maria; Rodriguez, Laura Lyman; McPherson, Treasa; Nicolás, Pilar; Ouellette, Francis; Romeo-Casabona, Carlos; Sarin, Rajiv; Wallace, Susan; Wiesner, Georgia; Wilson, Julia; Zeps, Nikolajs; Simkevitz, Howard; De Rienzo, Assunta; Knoppers, Bartha M
2015-01-01
The biggest challenge in twenty-first century data-intensive genomic science, is developing vast computer infrastructure and advanced software tools to perform comprehensive analyses of genomic data sets for biomedical research and clinical practice. Researchers are increasingly turning to cloud computing both as a solution to integrate data from genomics, systems biology and biomedical data mining and as an approach to analyze data to solve biomedical problems. Although cloud computing provides several benefits such as lower costs and greater efficiency, it also raises legal and ethical issues. In this article, we discuss three key ‘points to consider' (data control; data security, confidentiality and transfer; and accountability) based on a preliminary review of several publicly available cloud service providers' Terms of Service. These ‘points to consider' should be borne in mind by genomic research organizations when negotiating legal arrangements to store genomic data on a large commercial cloud service provider's servers. Diligent genomic cloud computing means leveraging security standards and evaluation processes as a means to protect data and entails many of the same good practices that researchers should always consider in securing their local infrastructure. PMID:25248396
Genomic cloud computing: legal and ethical points to consider.
Dove, Edward S; Joly, Yann; Tassé, Anne-Marie; Knoppers, Bartha M
2015-10-01
The biggest challenge in twenty-first century data-intensive genomic science, is developing vast computer infrastructure and advanced software tools to perform comprehensive analyses of genomic data sets for biomedical research and clinical practice. Researchers are increasingly turning to cloud computing both as a solution to integrate data from genomics, systems biology and biomedical data mining and as an approach to analyze data to solve biomedical problems. Although cloud computing provides several benefits such as lower costs and greater efficiency, it also raises legal and ethical issues. In this article, we discuss three key 'points to consider' (data control; data security, confidentiality and transfer; and accountability) based on a preliminary review of several publicly available cloud service providers' Terms of Service. These 'points to consider' should be borne in mind by genomic research organizations when negotiating legal arrangements to store genomic data on a large commercial cloud service provider's servers. Diligent genomic cloud computing means leveraging security standards and evaluation processes as a means to protect data and entails many of the same good practices that researchers should always consider in securing their local infrastructure.
Detecting Abnormal Machine Characteristics in Cloud Infrastructures
NASA Technical Reports Server (NTRS)
Bhaduri, Kanishka; Das, Kamalika; Matthews, Bryan L.
2011-01-01
In the cloud computing environment resources are accessed as services rather than as a product. Monitoring this system for performance is crucial because of typical pay-peruse packages bought by the users for their jobs. With the huge number of machines currently in the cloud system, it is often extremely difficult for system administrators to keep track of all machines using distributed monitoring programs such as Ganglia1 which lacks system health assessment and summarization capabilities. To overcome this problem, we propose a technique for automated anomaly detection using machine performance data in the cloud. Our algorithm is entirely distributed and runs locally on each computing machine on the cloud in order to rank the machines in order of their anomalous behavior for given jobs. There is no need to centralize any of the performance data for the analysis and at the end of the analysis, our algorithm generates error reports, thereby allowing the system administrators to take corrective actions. Experiments performed on real data sets collected for different jobs validate the fact that our algorithm has a low overhead for tracking anomalous machines in a cloud infrastructure.
ERIC Educational Resources Information Center
McCrea, Bridget; Weil, Marty
2011-01-01
Across the U.S., innovative collaboration practices are happening in the cloud: Sixth-graders participate in literary salons. Fourth-graders mentor kindergarteners. And teachers use virtual Post-it notes to advise students as they create their own television shows. In other words, cloud computing is no longer just used to manage administrative…
Generalized Intelligent Framework for Tutoring (GIFT) Cloud/Virtual Open Campus Quick-Start Guide
2016-03-01
distribution is unlimited. 13. SUPPLEMENTARY NOTES 14. ABSTRACT This document serves as the quick-start guide for GIFT Cloud, the web -based...to users with a GIFT Account at no cost. GIFT Cloud is a new implementation of GIFT. This web -based application allows learners, authors, and...distribution is unlimited. 3 3. Requirements for GIFT Cloud GIFT Cloud is accessed via a web browser. Officially, GIFT Cloud has been tested to work on
Security and Cloud Outsourcing Framework for Economic Dispatch
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sarker, Mushfiqur R.; Wang, Jianhui; Li, Zuyi
The computational complexity and problem sizes of power grid applications have increased significantly with the advent of renewable resources and smart grid technologies. The current paradigm of solving these issues consist of inhouse high performance computing infrastructures, which have drawbacks of high capital expenditures, maintenance, and limited scalability. Cloud computing is an ideal alternative due to its powerful computational capacity, rapid scalability, and high cost-effectiveness. A major challenge, however, remains in that the highly confidential grid data is susceptible for potential cyberattacks when outsourced to the cloud. In this work, a security and cloud outsourcing framework is developed for themore » Economic Dispatch (ED) linear programming application. As a result, the security framework transforms the ED linear program into a confidentiality-preserving linear program, that masks both the data and problem structure, thus enabling secure outsourcing to the cloud. Results show that for large grid test cases the performance gain and costs outperforms the in-house infrastructure.« less
Security and Cloud Outsourcing Framework for Economic Dispatch
Sarker, Mushfiqur R.; Wang, Jianhui; Li, Zuyi; ...
2017-04-24
The computational complexity and problem sizes of power grid applications have increased significantly with the advent of renewable resources and smart grid technologies. The current paradigm of solving these issues consist of inhouse high performance computing infrastructures, which have drawbacks of high capital expenditures, maintenance, and limited scalability. Cloud computing is an ideal alternative due to its powerful computational capacity, rapid scalability, and high cost-effectiveness. A major challenge, however, remains in that the highly confidential grid data is susceptible for potential cyberattacks when outsourced to the cloud. In this work, a security and cloud outsourcing framework is developed for themore » Economic Dispatch (ED) linear programming application. As a result, the security framework transforms the ED linear program into a confidentiality-preserving linear program, that masks both the data and problem structure, thus enabling secure outsourcing to the cloud. Results show that for large grid test cases the performance gain and costs outperforms the in-house infrastructure.« less
Bionimbus: a cloud for managing, analyzing and sharing large genomics datasets.
Heath, Allison P; Greenway, Matthew; Powell, Raymond; Spring, Jonathan; Suarez, Rafael; Hanley, David; Bandlamudi, Chai; McNerney, Megan E; White, Kevin P; Grossman, Robert L
2014-01-01
As large genomics and phenotypic datasets are becoming more common, it is increasingly difficult for most researchers to access, manage, and analyze them. One possible approach is to provide the research community with several petabyte-scale cloud-based computing platforms containing these data, along with tools and resources to analyze it. Bionimbus is an open source cloud-computing platform that is based primarily upon OpenStack, which manages on-demand virtual machines that provide the required computational resources, and GlusterFS, which is a high-performance clustered file system. Bionimbus also includes Tukey, which is a portal, and associated middleware that provides a single entry point and a single sign on for the various Bionimbus resources; and Yates, which automates the installation, configuration, and maintenance of the software infrastructure required. Bionimbus is used by a variety of projects to process genomics and phenotypic data. For example, it is used by an acute myeloid leukemia resequencing project at the University of Chicago. The project requires several computational pipelines, including pipelines for quality control, alignment, variant calling, and annotation. For each sample, the alignment step requires eight CPUs for about 12 h. BAM file sizes ranged from 5 GB to 10 GB for each sample. Most members of the research community have difficulty downloading large genomics datasets and obtaining sufficient storage and computer resources to manage and analyze the data. Cloud computing platforms, such as Bionimbus, with data commons that contain large genomics datasets, are one choice for broadening access to research data in genomics. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
NASA Astrophysics Data System (ADS)
Freer, J. E.; Bloomfield, J. P.; Johnes, P. J.; MacLeod, C.; Reaney, S.
2010-12-01
There are many challenges in developing effective and integrated catchment management solutions for hydrology and water quality issues. Such solutions should ideally build on current scientific evidence to inform policy makers and regulators and additionally allow stakeholders to take ownership of local and/or national issues, in effect bringing together ‘communities of practice’. A strategy being piloted in the UK as the Pilot Virtual Observatory (pVO), funded by NERC, is to demonstrate the use of cyber-infrastructure and cloud computing resources to investigate better methods of linking data and models and to demonstrate scenario analysis for research, policy and operational needs. The research will provide new ways the scientific and stakeholder communities come together to exploit current environmental information, knowledge and experience in an open framework. This poster presents the project scope and methodologies for the pVO work dealing with national modelling of hydrology and macro-nutrient biogeochemistry. We evaluate the strategies needed to robustly benchmark our current predictive capability of these resources through ensemble modelling. We explore the use of catchment similarity concepts to understand if national monitoring programs can inform us about the behaviour of catchments. We discuss the challenges to applying these strategies in an open access and integrated framework and finally we consider the future for such virtual observatory platforms for improving the way we iteratively improve our understanding of catchment science.
A Service Brokering and Recommendation Mechanism for Better Selecting Cloud Services
Gui, Zhipeng; Yang, Chaowei; Xia, Jizhe; Huang, Qunying; Liu, Kai; Li, Zhenlong; Yu, Manzhu; Sun, Min; Zhou, Nanyin; Jin, Baoxuan
2014-01-01
Cloud computing is becoming the new generation computing infrastructure, and many cloud vendors provide different types of cloud services. How to choose the best cloud services for specific applications is very challenging. Addressing this challenge requires balancing multiple factors, such as business demands, technologies, policies and preferences in addition to the computing requirements. This paper recommends a mechanism for selecting the best public cloud service at the levels of Infrastructure as a Service (IaaS) and Platform as a Service (PaaS). A systematic framework and associated workflow include cloud service filtration, solution generation, evaluation, and selection of public cloud services. Specifically, we propose the following: a hierarchical information model for integrating heterogeneous cloud information from different providers and a corresponding cloud information collecting mechanism; a cloud service classification model for categorizing and filtering cloud services and an application requirement schema for providing rules for creating application-specific configuration solutions; and a preference-aware solution evaluation mode for evaluating and recommending solutions according to the preferences of application providers. To test the proposed framework and methodologies, a cloud service advisory tool prototype was developed after which relevant experiments were conducted. The results show that the proposed system collects/updates/records the cloud information from multiple mainstream public cloud services in real-time, generates feasible cloud configuration solutions according to user specifications and acceptable cost predication, assesses solutions from multiple aspects (e.g., computing capability, potential cost and Service Level Agreement, SLA) and offers rational recommendations based on user preferences and practical cloud provisioning; and visually presents and compares solutions through an interactive web Graphical User Interface (GUI). PMID:25170937
Cloud Fingerprinting: Using Clock Skews To Determine Co Location Of Virtual Machines
2016-09-01
DISTRIBUTION CODE 13. ABSTRACT (maximum 200 words) Cloud computing has quickly revolutionized computing practices of organizations, to include the Department of... Cloud computing has quickly revolutionized computing practices of organizations, to in- clude the Department of Defense. However, security concerns...vi Table of Contents 1 Introduction 1 1.1 Proliferation of Cloud Computing . . . . . . . . . . . . . . . . . . 1 1.2 Problem Statement
Research on Key Technologies of Cloud Computing
NASA Astrophysics Data System (ADS)
Zhang, Shufen; Yan, Hongcan; Chen, Xuebin
With the development of multi-core processors, virtualization, distributed storage, broadband Internet and automatic management, a new type of computing mode named cloud computing is produced. It distributes computation task on the resource pool which consists of massive computers, so the application systems can obtain the computing power, the storage space and software service according to its demand. It can concentrate all the computing resources and manage them automatically by the software without intervene. This makes application offers not to annoy for tedious details and more absorbed in his business. It will be advantageous to innovation and reduce cost. It's the ultimate goal of cloud computing to provide calculation, services and applications as a public facility for the public, So that people can use the computer resources just like using water, electricity, gas and telephone. Currently, the understanding of cloud computing is developing and changing constantly, cloud computing still has no unanimous definition. This paper describes three main service forms of cloud computing: SAAS, PAAS, IAAS, compared the definition of cloud computing which is given by Google, Amazon, IBM and other companies, summarized the basic characteristics of cloud computing, and emphasized on the key technologies such as data storage, data management, virtualization and programming model.
2011-09-01
COMPUTING: EFFECTS AND APPLICATION OF HASTILY FORMED NETWORKS (HFN) FOR HUMANITARIAN ASSISTANCE/DISASTER RELIEF (HA/DR) MISSIONS by Mark K. Morris...i REPORT DOCUMENTATION PAGE Form Approved OMB No. 0704-0188 Public reporting burden for this collection of information is estimated to average 1 hour...SUBTITLE Virtual Cloud Computing: Effects and Application of Hastily Formed Networks (HFN) for Humanitarian Assistance/Disaster Relief (HA/DR) Missions
Proba-V Mission Exploitation Platform
NASA Astrophysics Data System (ADS)
Goor, E.
2017-12-01
VITO and partners developed the Proba-V Mission Exploitation Platform (MEP) as an end-to-end solution to drastically improve the exploitation of the Proba-V (an EC Copernicus contributing mission) EO-data archive, the past mission SPOT-VEGETATION and derived vegetation parameters by researchers, service providers (e.g. the EC Copernicus Global Land Service) and end-users. The analysis of time series of data (PB range) is addressed, as well as the large scale on-demand processing of near real-time data on a powerful and scalable processing environment. New features are still developed, but the platform is yet fully operational since November 2016 and offers A time series viewer (browser web client and API), showing the evolution of Proba-V bands and derived vegetation parameters for any country, region, pixel or polygon defined by the user. Full-resolution viewing services for the complete data archive. On-demand processing chains on a powerfull Hadoop/Spark backend. Virtual Machines can be requested by users with access to the complete data archive mentioned above and pre-configured tools to work with this data, e.g. various toolboxes and support for R and Python. This allows users to immediately work with the data without having to install tools or download data, but as well to design, debug and test applications on the platform. Jupyter Notebooks is available with some examples python and R projects worked out to show the potential of the data. Today the platform is already used by several international third party projects to perform R&D activities on the data, and to develop/host data analysis toolboxes. From the Proba-V MEP, access to other data sources such as Sentinel-2 and landsat data is also addressed. Selected components of the MEP are also deployed on public cloud infrastructures in various R&D projects. Users can make use of powerful Web based tools and can self-manage virtual machines to perform their work on the infrastructure at VITO with access to the complete data archive. To realise this, private cloud technology (openStack) is used and a distributed processing environment is built based on Hadoop. The Hadoop ecosystem offers a lot of technologies (Spark, Yarn, Accumulo) which we integrate with several open-source components (e.g. Geotrellis).
Bootstrapping and Maintaining Trust in the Cloud
2016-12-01
proliferation and popularity of infrastructure-as-a- service (IaaS) cloud computing services such as Amazon Web Services and Google Compute Engine means...IaaS trusted computing system: • Secure Bootstrapping – the system should enable the tenant to securely install an initial root secret into each cloud ...elastically instantiated and terminated. Prior cloud trusted computing solutions address a subset of these features, but none achieve all. Excalibur [31] sup
A Highly Scalable Data Service (HSDS) using Cloud-based Storage Technologies for Earth Science Data
NASA Astrophysics Data System (ADS)
Michaelis, A.; Readey, J.; Votava, P.; Henderson, J.; Willmore, F.
2017-12-01
Cloud based infrastructure may offer several key benefits of scalability, built in redundancy, security mechanisms and reduced total cost of ownership as compared with a traditional data center approach. However, most of the tools and legacy software systems developed for online data repositories within the federal government were not developed with a cloud based infrastructure in mind and do not fully take advantage of commonly available cloud-based technologies. Moreover, services bases on object storage are well established and provided through all the leading cloud service providers (Amazon Web Service, Microsoft Azure, Google Cloud, etc…) of which can often provide unmatched "scale-out" capabilities and data availability to a large and growing consumer base at a price point unachievable from in-house solutions. We describe a system that utilizes object storage rather than traditional file system based storage to vend earth science data. The system described is not only cost effective, but shows a performance advantage for running many different analytics tasks in the cloud. To enable compatibility with existing tools and applications, we outline client libraries that are API compatible with existing libraries for HDF5 and NetCDF4. Performance of the system is demonstrated using clouds services running on Amazon Web Services.
Selecting a Virtual World Platform for Learning
ERIC Educational Resources Information Center
Robbins, Russell W.; Butler, Brian S.
2009-01-01
Like any infrastructure technology, Virtual World (VW) platforms provide affordances that facilitate some activities and hinder others. Although it is theoretically possible for a VW platform to support all types of activities, designers make choices that lead technologies to be more or less suited for different learning objectives. Virtual World…
Improving Virtual Collaborative Learning through Canonical Action Research
ERIC Educational Resources Information Center
Weber, Peter; Lehr, Christian; Gersch, Martin
2014-01-01
Virtual collaboration continues to gain in significance and is attracting attention also as virtual collaborative learning (VCL) in education. This paper addresses aspects of VCL that we identified as critical in a series of courses named "Net Economy": (1) technical infrastructure, (2) motivation and collaboration, and (3) assessment…
Managing competing elastic Grid and Cloud scientific computing applications using OpenNebula
NASA Astrophysics Data System (ADS)
Bagnasco, S.; Berzano, D.; Lusso, S.; Masera, M.; Vallero, S.
2015-12-01
Elastic cloud computing applications, i.e. applications that automatically scale according to computing needs, work on the ideal assumption of infinite resources. While large public cloud infrastructures may be a reasonable approximation of this condition, scientific computing centres like WLCG Grid sites usually work in a saturated regime, in which applications compete for scarce resources through queues, priorities and scheduling policies, and keeping a fraction of the computing cores idle to allow for headroom is usually not an option. In our particular environment one of the applications (a WLCG Tier-2 Grid site) is much larger than all the others and cannot autoscale easily. Nevertheless, other smaller applications can benefit of automatic elasticity; the implementation of this property in our infrastructure, based on the OpenNebula cloud stack, will be described and the very first operational experiences with a small number of strategies for timely allocation and release of resources will be discussed.
NASA Astrophysics Data System (ADS)
Perez Montes, Diego A.; Añel Cabanelas, Juan A.; Wallom, David C. H.; Arribas, Alberto; Uhe, Peter; Caderno, Pablo V.; Pena, Tomas F.
2017-04-01
Cloud Computing is a technological option that offers great possibilities for modelling in geosciences. We have studied how two different climate models, HadAM3P-HadRM3P and CESM-WACCM, can be adapted in two different ways to run on Cloud Computing Environments from three different vendors: Amazon, Google and Microsoft. Also, we have evaluated qualitatively how the use of Cloud Computing can affect the allocation of resources by funding bodies and issues related to computing security, including scientific reproducibility. Our first experiments were developed using the well known ClimatePrediction.net (CPDN), that uses BOINC, over the infrastructure from two cloud providers, namely Microsoft Azure and Amazon Web Services (hereafter AWS). For this comparison we ran a set of thirteen month climate simulations for CPDN in Azure and AWS using a range of different virtual machines (VMs) for HadRM3P (50 km resolution over South America CORDEX region) nested in the global atmosphere-only model HadAM3P. These simulations were run on a single processor and took between 3 and 5 days to compute depending on the VM type. The last part of our simulation experiments was running WACCM over different VMS on the Google Compute Engine (GCE) and make a comparison with the supercomputer (SC) Finisterrae1 from the Centro de Supercomputacion de Galicia. It was shown that GCE gives better performance than the SC for smaller number of cores/MPI tasks but the model throughput shows clearly how the SC performance is better after approximately 100 cores (related with network speed and latency differences). From a cost point of view, Cloud Computing moves researchers from a traditional approach where experiments were limited by the available hardware resources to monetary resources (how many resources can be afforded). As there is an increasing movement and recommendation for budgeting HPC projects on this technology (budgets can be calculated in a more realistic way) we could see a shift on the trends over the next years to consolidate Cloud as the preferred solution.
Educational and Scientific Applications of Climate Model Diagnostic Analyzer
NASA Astrophysics Data System (ADS)
Lee, S.; Pan, L.; Zhai, C.; Tang, B.; Kubar, T. L.; Zhang, J.; Bao, Q.
2016-12-01
Climate Model Diagnostic Analyzer (CMDA) is a web-based information system designed for the climate modeling and model analysis community to analyze climate data from models and observations. CMDA provides tools to diagnostically analyze climate data for model validation and improvement, and to systematically manage analysis provenance for sharing results with other investigators. CMDA utilizes cloud computing resources, multi-threading computing, machine-learning algorithms, web service technologies, and provenance-supporting technologies to address technical challenges that the Earth science modeling and model analysis community faces in evaluating and diagnosing climate models. As CMDA infrastructure and technology have matured, we have developed the educational and scientific applications of CMDA. Educationally, CMDA supported the summer school of the JPL Center for Climate Sciences for three years since 2014. In the summer school, the students work on group research projects where CMDA provide datasets and analysis tools. Each student is assigned to a virtual machine with CMDA installed in Amazon Web Services. A provenance management system for CMDA is developed to keep track of students' usages of CMDA, and to recommend datasets and analysis tools for their research topic. The provenance system also allows students to revisit their analysis results and share them with their group. Scientifically, we have developed several science use cases of CMDA covering various topics, datasets, and analysis types. Each use case developed is described and listed in terms of a scientific goal, datasets used, the analysis tools used, scientific results discovered from the use case, an analysis result such as output plots and data files, and a link to the exact analysis service call with all the input arguments filled. For example, one science use case is the evaluation of NCAR CAM5 model with MODIS total cloud fraction. The analysis service used is Difference Plot Service of Two Variables, and the datasets used are NCAR CAM total cloud fraction and MODIS total cloud fraction. The scientific highlight of the use case is that the CAM5 model overall does a fairly decent job at simulating total cloud cover, though simulates too few clouds especially near and offshore of the eastern ocean basins where low clouds are dominant.
Architecture for the Next Generation System Management Tools
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gallard, Jerome; Lebre, I Adrien; Morin, Christine
2011-01-01
To get more results or greater accuracy, computational scientists execute their applications on distributed computing platforms such as Clusters, Grids and Clouds. These platforms are different in terms of hardware and software resources as well as locality: some span across multiple sites and multiple administrative domains whereas others are limited to a single site/domain. As a consequence, in order to scale their applica- tions up the scientists have to manage technical details for each target platform. From our point of view, this complexity should be hidden from the scientists who, in most cases, would prefer to focus on their researchmore » rather than spending time dealing with platform configuration concerns. In this article, we advocate for a system management framework that aims to automatically setup the whole run-time environment according to the applications needs. The main difference with regards to usual approaches is that they generally only focus on the software layer whereas we address both the hardware and the software expecta- tions through a unique system. For each application, scientists describe their requirements through the definition of a Virtual Platform (VP) and a Virtual System Environment (VSE). Relying on the VP/VSE definitions, the framework is in charge of: (i) the configuration of the physical infrastructure to satisfy the VP requirements, (ii) the setup of the VP, and (iii) the customization of the execution environment (VSE) upon the former VP. We propose a new formalism that the system can rely upon to successfully perform each of these three steps without burdening the user with the specifics of the configuration for the physical resources, and system management tools. This formalism leverages Goldberg s theory for recursive virtual machines by introducing new concepts based on system virtualization (identity, partitioning, aggregation) and emulation (simple, abstraction). This enables the definition of complex VP/VSE configurations without making assumptions about the hardware and the software re- sources. For each requirement, the system executes the corresponding operation with the appropriate management tool. As a proof of concept, we implemented a first prototype that currently interacts with several system management tools (e.g., OSCAR, the Grid 5000 toolkit, and XtreemOS) and that can be easily extended to integrate new resource brokers or cloud systems such as Nimbus, OpenNebula or Eucalyptus for instance.« less
Cloud Computing for Complex Performance Codes.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Appel, Gordon John; Hadgu, Teklu; Klein, Brandon Thorin
This report describes the use of cloud computing services for running complex public domain performance assessment problems. The work consisted of two phases: Phase 1 was to demonstrate complex codes, on several differently configured servers, could run and compute trivial small scale problems in a commercial cloud infrastructure. Phase 2 focused on proving non-trivial large scale problems could be computed in the commercial cloud environment. The cloud computing effort was successfully applied using codes of interest to the geohydrology and nuclear waste disposal modeling community.
An Efficient Virtual Machine Consolidation Scheme for Multimedia Cloud Computing.
Han, Guangjie; Que, Wenhui; Jia, Gangyong; Shu, Lei
2016-02-18
Cloud computing has innovated the IT industry in recent years, as it can delivery subscription-based services to users in the pay-as-you-go model. Meanwhile, multimedia cloud computing is emerging based on cloud computing to provide a variety of media services on the Internet. However, with the growing popularity of multimedia cloud computing, its large energy consumption cannot only contribute to greenhouse gas emissions, but also result in the rising of cloud users' costs. Therefore, the multimedia cloud providers should try to minimize its energy consumption as much as possible while satisfying the consumers' resource requirements and guaranteeing quality of service (QoS). In this paper, we have proposed a remaining utilization-aware (RUA) algorithm for virtual machine (VM) placement, and a power-aware algorithm (PA) is proposed to find proper hosts to shut down for energy saving. These two algorithms have been combined and applied to cloud data centers for completing the process of VM consolidation. Simulation results have shown that there exists a trade-off between the cloud data center's energy consumption and service-level agreement (SLA) violations. Besides, the RUA algorithm is able to deal with variable workload to prevent hosts from overloading after VM placement and to reduce the SLA violations dramatically.
An Efficient Virtual Machine Consolidation Scheme for Multimedia Cloud Computing
Han, Guangjie; Que, Wenhui; Jia, Gangyong; Shu, Lei
2016-01-01
Cloud computing has innovated the IT industry in recent years, as it can delivery subscription-based services to users in the pay-as-you-go model. Meanwhile, multimedia cloud computing is emerging based on cloud computing to provide a variety of media services on the Internet. However, with the growing popularity of multimedia cloud computing, its large energy consumption cannot only contribute to greenhouse gas emissions, but also result in the rising of cloud users’ costs. Therefore, the multimedia cloud providers should try to minimize its energy consumption as much as possible while satisfying the consumers’ resource requirements and guaranteeing quality of service (QoS). In this paper, we have proposed a remaining utilization-aware (RUA) algorithm for virtual machine (VM) placement, and a power-aware algorithm (PA) is proposed to find proper hosts to shut down for energy saving. These two algorithms have been combined and applied to cloud data centers for completing the process of VM consolidation. Simulation results have shown that there exists a trade-off between the cloud data center’s energy consumption and service-level agreement (SLA) violations. Besides, the RUA algorithm is able to deal with variable workload to prevent hosts from overloading after VM placement and to reduce the SLA violations dramatically. PMID:26901201
Easy, Collaborative and Engaging--The Use of Cloud Computing in the Design of Management Classrooms
ERIC Educational Resources Information Center
Schneckenberg, Dirk
2014-01-01
Background: Cloud computing has recently received interest in information systems research and practice as a new way to organise information with the help of an increasingly ubiquitous computer infrastructure. However, the use of cloud computing in higher education institutions and business schools, as well as its potential to create novel…
Evaluating virtual hosted desktops for graphics-intensive astronomy
NASA Astrophysics Data System (ADS)
Meade, B. F.; Fluke, C. J.
2018-04-01
Visualisation of data is critical to understanding astronomical phenomena. Today, many instruments produce datasets that are too big to be downloaded to a local computer, yet many of the visualisation tools used by astronomers are deployed only on desktop computers. Cloud computing is increasingly used to provide a computation and simulation platform in astronomy, but it also offers great potential as a visualisation platform. Virtual hosted desktops, with graphics processing unit (GPU) acceleration, allow interactive, graphics-intensive desktop applications to operate co-located with astronomy datasets stored in remote data centres. By combining benchmarking and user experience testing, with a cohort of 20 astronomers, we investigate the viability of replacing physical desktop computers with virtual hosted desktops. In our work, we compare two Apple MacBook computers (one old and one new, representing hardware and opposite ends of the useful lifetime) with two virtual hosted desktops: one commercial (Amazon Web Services) and one in a private research cloud (the Australian NeCTAR Research Cloud). For two-dimensional image-based tasks and graphics-intensive three-dimensional operations - typical of astronomy visualisation workflows - we found that benchmarks do not necessarily provide the best indication of performance. When compared to typical laptop computers, virtual hosted desktops can provide a better user experience, even with lower performing graphics cards. We also found that virtual hosted desktops are equally simple to use, provide greater flexibility in choice of configuration, and may actually be a more cost-effective option for typical usage profiles.
Auscope: Australian Earth Science Information Infrastructure using Free and Open Source Software
NASA Astrophysics Data System (ADS)
Woodcock, R.; Cox, S. J.; Fraser, R.; Wyborn, L. A.
2013-12-01
Since 2005 the Australian Government has supported a series of initiatives providing researchers with access to major research facilities and information networks necessary for world-class research. Starting with the National Collaborative Research Infrastructure Strategy (NCRIS) the Australian earth science community established an integrated national geoscience infrastructure system called AuScope. AuScope is now in operation, providing a number of components to assist in understanding the structure and evolution of the Australian continent. These include the acquisition of subsurface imaging , earth composition and age analysis, a virtual drill core library, geological process simulation, and a high resolution geospatial reference framework. To draw together information from across the earth science community in academia, industry and government, AuScope includes a nationally distributed information infrastructure. Free and Open Source Software (FOSS) has been a significant enabler in building the AuScope community and providing a range of interoperable services for accessing data and scientific software. A number of FOSS components have been created, adopted or upgraded to create a coherent, OGC compliant Spatial Information Services Stack (SISS). SISS is now deployed at all Australian Geological Surveys, many Universities and the CSIRO. Comprising a set of OGC catalogue and data services, and augmented with new vocabulary and identifier services, the SISS provides a comprehensive package for organisations to contribute their data to the AuScope network. This packaging and a variety of software testing and documentation activities enabled greater trust and notably reduced barriers to adoption. FOSS selection was important, not only for technical capability and robustness, but also for appropriate licensing and community models to ensure sustainability of the infrastructure in the long term. Government agencies were sensitive to these issues and AuScope's careful selection has been rewarded by adoption. In some cases the features provided by the SISS solution are now significantly in advance of COTS offerings which will create expectations that can be passed back from users to their preferred vendors. Using FOSS, AuScope has addressed the challenge of data exchange across organisations nationally. The data standards (e.g. GeosciML) and platforms that underpin AuScope provide important new datasets and multi-agency links independent of underlying software and hardware differences. AuScope has created an infrastructure, a platform of technologies and the opportunity for new ways of working with and integrating disparate data at much lower cost. Research activities are now exploiting the information infrastructure to create virtual laboratories for research ranging from geophysics through water and the environment. Once again the AuScope community is making heavy use of FOSS to provide access to processing software and Cloud computing and HPC. The successful use of FOSS by AuScope, and the efforts made to ensure it is suitable for adoption, have resulted in the SISS being selected as a reference implementation for a number of Australian Government initiatives beyond AuScope in environmental information and bioregional assessments.
Towards Cloud Processing of GGOS Big Data
NASA Astrophysics Data System (ADS)
Weston, Stuart; Kim, Bumjun; Litchfield, Alan; Gulyaev, Sergei; Hall, Dylan; Chorao, Carlos; Ruthven, Andrew; Davies, Glyn; Lagos, Bruno; Christie, Don
2017-04-01
We report on our initial steps towards development of a cloud-like correlation infrastructure for geodetic Very Long Baseline Interferometry (VLBI), which in its raw format is of the order of 10-100 TB (big data). Data is generated by multiple VLBI radio telescopes, and is then used by for geodetic, geophysical, and astrometric research and operational activities through the International VLBI Service (IVS), as well as for corrections of GPS satellite orbits. Currently IVS data is correlated in several international Correlators (Correlation Centres), which receive data from individual radio telescope stations either in hard drives via regular mail service or via fibre using e-transfer mode. The latter is strongly limited by connectivity of existing correlation centres, which creates bottle necks and slows down the turnover of the data. This becomes critical in many applications - for example, it currently takes 1-2 weeks to generate the dUT1 parameter for corrections of GNSS orbits while less than 1-2 days delay is desirable. We started with a blade server at the AUT campus to emulate a cloud server using Virtual Machines (VMWare). The New Zealand Data Head node is connected to the high speed (100 Gbps) network ring circuit courtesy of the Research and Education Advanced Network New Zealand (REANNZ), with the additional nodes at remote physical sites connected via 10 Gbps fibre. We use real Australian Long Baseline Array (LBA) observational data from 6 radio telescopes in Australia, South Africa and New Zealand (15 baselines) of 1.5 hours in duration making 8 TB to emulate data transfer from remote locations and to provide a meaningful benchmark dataset for correlation. Data was successfully transferred using bespoke UDT network transfer tools and correlated with the speed-up factor of 0.8 using DiFX software correlator. In partnership with the New Zealand office of Catalyst IT Ltd we have moved this environment into Catalyst Cloud and report on the first correlation of a VLBI Dataset in a true cloud environment.
The Ophidia framework: toward cloud-based data analytics for climate change
NASA Astrophysics Data System (ADS)
Fiore, Sandro; D'Anca, Alessandro; Elia, Donatello; Mancini, Marco; Mariello, Andrea; Mirto, Maria; Palazzo, Cosimo; Aloisio, Giovanni
2015-04-01
The Ophidia project is a research effort on big data analytics facing scientific data analysis challenges in the climate change domain. It provides parallel (server-side) data analysis, an internal storage model and a hierarchical data organization to manage large amount of multidimensional scientific data. The Ophidia analytics platform provides several MPI-based parallel operators to manipulate large datasets (data cubes) and array-based primitives to perform data analysis on large arrays of scientific data. The most relevant data analytics use cases implemented in national and international projects target fire danger prevention (OFIDIA), interactions between climate change and biodiversity (EUBrazilCC), climate indicators and remote data analysis (CLIP-C), sea situational awareness (TESSA), large scale data analytics on CMIP5 data in NetCDF format, Climate and Forecast (CF) convention compliant (ExArch). Two use cases regarding the EU FP7 EUBrazil Cloud Connect and the INTERREG OFIDIA projects will be presented during the talk. In the former case (EUBrazilCC) the Ophidia framework is being extended to integrate scalable VM-based solutions for the management of large volumes of scientific data (both climate and satellite data) in a cloud-based environment to study how climate change affects biodiversity. In the latter one (OFIDIA) the data analytics framework is being exploited to provide operational support regarding processing chains devoted to fire danger prevention. To tackle the project challenges, data analytics workflows consisting of about 130 operators perform, among the others, parallel data analysis, metadata management, virtual file system tasks, maps generation, rolling of datasets, import/export of datasets in NetCDF format. Finally, the entire Ophidia software stack has been deployed at CMCC on 24-nodes (16-cores/node) of the Athena HPC cluster. Moreover, a cloud-based release tested with OpenNebula is also available and running in the private cloud infrastructure of the CMCC Supercomputing Centre.
Department of Defense Use of Commercial Cloud Computing Capabilities and Services
2015-11-01
models (Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service ( SaaS )), and four deployment models (Public...NIST defines three main models for cloud computing: IaaS, PaaS, and SaaS . These models help differentiate the implementation responsibilities that fall...and SaaS . 3. Public, Private, Community, and Hybrid Clouds Cloud services come in different forms, depending on the customer’s specific needs
CloudStackProjectsNContributions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nielsen, Roy
2017-07-12
Collection of applications and cloud templates. Project currently based on www.packer.io for automation, www.github.com/boxcutter templates and the www.github.com/csd-dev-tools/ClockworkVMs application for wrapping both of the above for easy creation of virtual systems. Will in future also contain cloud templates tuned for various services, applications and purposes.
Prediction Based Proactive Thermal Virtual Machine Scheduling in Green Clouds
Kinger, Supriya; Kumar, Rajesh; Sharma, Anju
2014-01-01
Cloud computing has rapidly emerged as a widely accepted computing paradigm, but the research on Cloud computing is still at an early stage. Cloud computing provides many advanced features but it still has some shortcomings such as relatively high operating cost and environmental hazards like increasing carbon footprints. These hazards can be reduced up to some extent by efficient scheduling of Cloud resources. Working temperature on which a machine is currently running can be taken as a criterion for Virtual Machine (VM) scheduling. This paper proposes a new proactive technique that considers current and maximum threshold temperature of Server Machines (SMs) before making scheduling decisions with the help of a temperature predictor, so that maximum temperature is never reached. Different workload scenarios have been taken into consideration. The results obtained show that the proposed system is better than existing systems of VM scheduling, which does not consider current temperature of nodes before making scheduling decisions. Thus, a reduction in need of cooling systems for a Cloud environment has been obtained and validated. PMID:24737962
Change Analysis in Structural Laser Scanning Point Clouds: The Baseline Method
Shen, Yueqian; Lindenbergh, Roderik; Wang, Jinhu
2016-01-01
A method is introduced for detecting changes from point clouds that avoids registration. For many applications, changes are detected between two scans of the same scene obtained at different times. Traditionally, these scans are aligned to a common coordinate system having the disadvantage that this registration step introduces additional errors. In addition, registration requires stable targets or features. To avoid these issues, we propose a change detection method based on so-called baselines. Baselines connect feature points within one scan. To analyze changes, baselines connecting corresponding points in two scans are compared. As feature points either targets or virtual points corresponding to some reconstructable feature in the scene are used. The new method is implemented on two scans sampling a masonry laboratory building before and after seismic testing, that resulted in damages in the order of several centimeters. The centres of the bricks of the laboratory building are automatically extracted to serve as virtual points. Baselines connecting virtual points and/or target points are extracted and compared with respect to a suitable structural coordinate system. Changes detected from the baseline analysis are compared to a traditional cloud to cloud change analysis demonstrating the potential of the new method for structural analysis. PMID:28029121
Change Analysis in Structural Laser Scanning Point Clouds: The Baseline Method.
Shen, Yueqian; Lindenbergh, Roderik; Wang, Jinhu
2016-12-24
A method is introduced for detecting changes from point clouds that avoids registration. For many applications, changes are detected between two scans of the same scene obtained at different times. Traditionally, these scans are aligned to a common coordinate system having the disadvantage that this registration step introduces additional errors. In addition, registration requires stable targets or features. To avoid these issues, we propose a change detection method based on so-called baselines. Baselines connect feature points within one scan. To analyze changes, baselines connecting corresponding points in two scans are compared. As feature points either targets or virtual points corresponding to some reconstructable feature in the scene are used. The new method is implemented on two scans sampling a masonry laboratory building before and after seismic testing, that resulted in damages in the order of several centimeters. The centres of the bricks of the laboratory building are automatically extracted to serve as virtual points. Baselines connecting virtual points and/or target points are extracted and compared with respect to a suitable structural coordinate system. Changes detected from the baseline analysis are compared to a traditional cloud to cloud change analysis demonstrating the potential of the new method for structural analysis.
2017-06-01
for GIFT Cloud, the web -based application version of the Generalized Intelligent Framework for Tutoring (GIFT). GIFT is a modular, open-source...external applications. GIFT is available to users with a GIFT Account at no cost. GIFT Cloud is an implementation of GIFT. This web -based application...section. Approved for public release; distribution is unlimited. 3 3. Requirements for GIFT Cloud GIFT Cloud is accessed via a web browser
Monitoring of IaaS and scientific applications on the Cloud using the Elasticsearch ecosystem
NASA Astrophysics Data System (ADS)
Bagnasco, S.; Berzano, D.; Guarise, A.; Lusso, S.; Masera, M.; Vallero, S.
2015-05-01
The private Cloud at the Torino INFN computing centre offers IaaS services to different scientific computing applications. The infrastructure is managed with the OpenNebula cloud controller. The main stakeholders of the facility are a grid Tier-2 site for the ALICE collaboration at LHC, an interactive analysis facility for the same experiment and a grid Tier-2 site for the BES-III collaboration, plus an increasing number of other small tenants. Besides keeping track of the usage, the automation of dynamic allocation of resources to tenants requires detailed monitoring and accounting of the resource usage. As a first investigation towards this, we set up a monitoring system to inspect the site activities both in terms of IaaS and applications running on the hosted virtual instances. For this purpose we used the Elasticsearch, Logstash and Kibana stack. In the current implementation, the heterogeneous accounting information is fed to different MySQL databases and sent to Elasticsearch via a custom Logstash plugin. For the IaaS metering, we developed sensors for the OpenNebula API. The IaaS level information gathered through the API is sent to the MySQL database through an ad-hoc developed RESTful web service, which is also used for other accounting purposes. Concerning the application level, we used the Root plugin TProofMonSenderSQL to collect accounting data from the interactive analysis facility. The BES-III virtual instances used to be monitored with Zabbix, as a proof of concept we also retrieve the information contained in the Zabbix database. Each of these three cases is indexed separately in Elasticsearch. We are now starting to consider dismissing the intermediate level provided by the SQL database and evaluating a NoSQL option as a unique central database for all the monitoring information. We setup a set of Kibana dashboards with pre-defined queries in order to monitor the relevant information in each case. In this way we have achieved a uniform monitoring interface for both the IaaS and the scientific applications, mostly leveraging off-the-shelf tools.
Performance implications from sizing a VM on multi-core systems: A Data analytic application s view
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lim, Seung-Hwan; Horey, James L; Begoli, Edmon
In this paper, we present a quantitative performance analysis of data analytics applications running on multi-core virtual machines. Such environments form the core of cloud computing. In addition, data analytics applications, such as Cassandra and Hadoop, are becoming increasingly popular on cloud computing platforms. This convergence necessitates a better understanding of the performance and cost implications of such hybrid systems. For example, the very rst step in hosting applications in virtualized environments, requires the user to con gure the number of virtual processors and the size of memory. To understand performance implications of this step, we benchmarked three Yahoo Cloudmore » Serving Benchmark (YCSB) workloads in a virtualized multi-core environment. Our measurements indicate that the performance of Cassandra for YCSB workloads does not heavily depend on the processing capacity of a system, while the size of the data set is critical to performance relative to allocated memory. We also identi ed a strong relationship between the running time of workloads and various hardware events (last level cache loads, misses, and CPU migrations). From this analysis, we provide several suggestions to improve the performance of data analytics applications running on cloud computing environments.« less
Large-scale parallel genome assembler over cloud computing environment.
Das, Arghya Kusum; Koppa, Praveen Kumar; Goswami, Sayan; Platania, Richard; Park, Seung-Jong
2017-06-01
The size of high throughput DNA sequencing data has already reached the terabyte scale. To manage this huge volume of data, many downstream sequencing applications started using locality-based computing over different cloud infrastructures to take advantage of elastic (pay as you go) resources at a lower cost. However, the locality-based programming model (e.g. MapReduce) is relatively new. Consequently, developing scalable data-intensive bioinformatics applications using this model and understanding the hardware environment that these applications require for good performance, both require further research. In this paper, we present a de Bruijn graph oriented Parallel Giraph-based Genome Assembler (GiGA), as well as the hardware platform required for its optimal performance. GiGA uses the power of Hadoop (MapReduce) and Giraph (large-scale graph analysis) to achieve high scalability over hundreds of compute nodes by collocating the computation and data. GiGA achieves significantly higher scalability with competitive assembly quality compared to contemporary parallel assemblers (e.g. ABySS and Contrail) over traditional HPC cluster. Moreover, we show that the performance of GiGA is significantly improved by using an SSD-based private cloud infrastructure over traditional HPC cluster. We observe that the performance of GiGA on 256 cores of this SSD-based cloud infrastructure closely matches that of 512 cores of traditional HPC cluster.
Multi-Level Secure Information Sharing Between Smart Cloud Systems of Systems
2014-03-01
implementation of virtual hardware (VMWare), along with a commercial implementation of virtual networking (VPN), such as OpenVPN . 1. VMWare Virtualization...en.wikipedia.org/wiki/MongoDB. Wikipedia. 2014b. Accessed February 26. s.v. “Open VPN,” http://en.wikipedia.org/wiki/ OpenVPN . Wikipedia. 2014c. Accessed
Oh, Sungyoung; Cha, Jieun; Ji, Myungkyu; Kang, Hyekyung; Kim, Seok; Heo, Eunyoung; Han, Jong Soo; Kang, Hyunggoo; Chae, Hoseok; Hwang, Hee; Yoo, Sooyoung
2015-04-01
To design a cloud computing-based Healthcare Software-as-a-Service (SaaS) Platform (HSP) for delivering healthcare information services with low cost, high clinical value, and high usability. We analyzed the architecture requirements of an HSP, including the interface, business services, cloud SaaS, quality attributes, privacy and security, and multi-lingual capacity. For cloud-based SaaS services, we focused on Clinical Decision Service (CDS) content services, basic functional services, and mobile services. Microsoft's Azure cloud computing for Infrastructure-as-a-Service (IaaS) and Platform-as-a-Service (PaaS) was used. The functional and software views of an HSP were designed in a layered architecture. External systems can be interfaced with the HSP using SOAP and REST/JSON. The multi-tenancy model of the HSP was designed as a shared database, with a separate schema for each tenant through a single application, although healthcare data can be physically located on a cloud or in a hospital, depending on regulations. The CDS services were categorized into rule-based services for medications, alert registration services, and knowledge services. We expect that cloud-based HSPs will allow small and mid-sized hospitals, in addition to large-sized hospitals, to adopt information infrastructures and health information technology with low system operation and maintenance costs.
Virtual Sensors: Using Data Mining to Efficiently Estimate Spectra
NASA Technical Reports Server (NTRS)
Srivastava, Ashok; Oza, Nikunj; Stroeve, Julienne
2004-01-01
Detecting clouds within a satellite image is essential for retrieving surface geophysical parameters, such as albedo and temperature, from optical and thermal imagery because the retrieval methods tend to be valid for clear skies only. Thus, routine satellite data processing requires reliable automated cloud detection algorithms that are applicable to many surface types. Unfortunately, cloud detection over snow and ice is difficult due to the lack of spectral contrast between clouds and snow. Snow and clouds are both highly reflective in the visible wavelen,ats and often show little contrast in the thermal Infrared. However, at 1.6 microns, the spectral signatures of snow and clouds differ enough to allow improved snow/ice/cloud discrimination. The recent Terra and Aqua Moderate Resolution Imaging Spectro-Radiometer (MODIS) sensors have a channel (channel 6) at 1.6 microns. Presently the most comprehensive, long-term information on surface albedo and temperature over snow- and ice-covered surfaces comes from the Advanced Very High Resolution Radiometer ( AVHRR) sensor that has been providing imagery since July 1981. The earlier AVHRR sensors (e.g. AVHRR/2) did not however have a channel designed for discriminating clouds from snow, such as the 1.6 micron channel available on the more recent AVHRR/3 or the MODIS sensors. In the absence of the 1.6 micron channel, the AVHRR Polar Pathfinder (APP) product performs cloud detection using a combination of time-series analysis and multispectral threshold tests based on the satellite's measuring channels to produce a cloud mask. The method has been found to work reasonably well over sea ice, but not so well over the ice sheets. Thus, improving the cloud mask in the APP dataset would be extremely helpful toward increasing the accuracy of the albedo and temperature retrievals, as well as extending the time-series of albedo and temperature retrievals from the more recent sensors to the historical ones. In this work, we use data mining methods to construct a model of MODIS channel 6 as a function of other channels that are common to both MODIS and AVHRR. The idea is to use the model to generate the equivalent of MODIS channel 6 for AVHRR as a function of the AVHRR equivalents to MODIS channels. We call this a Virtual Sensor because it predicts unmeasured spectra. The goal is to use this virtual channel 6. to yield a cloud mask superior to what is currently used in APP . Our results show that several data mining methods such as multilayer perceptrons (MLPs), ensemble methods (e.g., bagging), and kernel methods (e.g., support vector machines) generate channel 6 for unseen MODIS images with high accuracy. Because the true channel 6 is not available for AVHRR images, we qualitatively assess the virtual channel 6 for several AVHRR images.
NASA Technical Reports Server (NTRS)
Schnase, John L.; Tamkin, Glenn S.; Ripley, W. David III; Stong, Savannah; Gill, Roger; Duffy, Daniel Q.
2012-01-01
Scientific data services are becoming an important part of the NASA Center for Climate Simulation's mission. Our technological response to this expanding role is built around the concept of a Virtual Climate Data Server (vCDS), repetitive provisioning, image-based deployment and distribution, and virtualization-as-a-service. The vCDS is an iRODS-based data server specialized to the needs of a particular data-centric application. We use RPM scripts to build vCDS images in our local computing environment, our local Virtual Machine Environment, NASA s Nebula Cloud Services, and Amazon's Elastic Compute Cloud. Once provisioned into one or more of these virtualized resource classes, vCDSs can use iRODS s federation capabilities to create an integrated ecosystem of managed collections that is scalable and adaptable to changing resource requirements. This approach enables platform- or software-asa- service deployment of vCDS and allows the NCCS to offer virtualization-as-a-service: a capacity to respond in an agile way to new customer requests for data services.
Software Reuse Methods to Improve Technological Infrastructure for e-Science
NASA Technical Reports Server (NTRS)
Marshall, James J.; Downs, Robert R.; Mattmann, Chris A.
2011-01-01
Social computing has the potential to contribute to scientific research. Ongoing developments in information and communications technology improve capabilities for enabling scientific research, including research fostered by social computing capabilities. The recent emergence of e-Science practices has demonstrated the benefits from improvements in the technological infrastructure, or cyber-infrastructure, that has been developed to support science. Cloud computing is one example of this e-Science trend. Our own work in the area of software reuse offers methods that can be used to improve new technological development, including cloud computing capabilities, to support scientific research practices. In this paper, we focus on software reuse and its potential to contribute to the development and evaluation of information systems and related services designed to support new capabilities for conducting scientific research.
Cloud Computing Security Issue: Survey
NASA Astrophysics Data System (ADS)
Kamal, Shailza; Kaur, Rajpreet
2011-12-01
Cloud computing is the growing field in IT industry since 2007 proposed by IBM. Another company like Google, Amazon, and Microsoft provides further products to cloud computing. The cloud computing is the internet based computing that shared recourses, information on demand. It provides the services like SaaS, IaaS and PaaS. The services and recourses are shared by virtualization that run multiple operation applications on cloud computing. This discussion gives the survey on the challenges on security issues during cloud computing and describes some standards and protocols that presents how security can be managed.
A Cloud-based Approach to Medical NLP
Chard, Kyle; Russell, Michael; Lussier, Yves A.; Mendonça, Eneida A; Silverstein, Jonathan C.
2011-01-01
Natural Language Processing (NLP) enables access to deep content embedded in medical texts. To date, NLP has not fulfilled its promise of enabling robust clinical encoding, clinical use, quality improvement, and research. We submit that this is in part due to poor accessibility, scalability, and flexibility of NLP systems. We describe here an approach and system which leverages cloud-based approaches such as virtual machines and Representational State Transfer (REST) to extract, process, synthesize, mine, compare/contrast, explore, and manage medical text data in a flexibly secure and scalable architecture. Available architectures in which our Smntx (pronounced as semantics) system can be deployed include: virtual machines in a HIPAA-protected hospital environment, brought up to run analysis over bulk data and destroyed in a local cloud; a commercial cloud for a large complex multi-institutional trial; and within other architectures such as caGrid, i2b2, or NHIN. PMID:22195072
A cloud-based approach to medical NLP.
Chard, Kyle; Russell, Michael; Lussier, Yves A; Mendonça, Eneida A; Silverstein, Jonathan C
2011-01-01
Natural Language Processing (NLP) enables access to deep content embedded in medical texts. To date, NLP has not fulfilled its promise of enabling robust clinical encoding, clinical use, quality improvement, and research. We submit that this is in part due to poor accessibility, scalability, and flexibility of NLP systems. We describe here an approach and system which leverages cloud-based approaches such as virtual machines and Representational State Transfer (REST) to extract, process, synthesize, mine, compare/contrast, explore, and manage medical text data in a flexibly secure and scalable architecture. Available architectures in which our Smntx (pronounced as semantics) system can be deployed include: virtual machines in a HIPAA-protected hospital environment, brought up to run analysis over bulk data and destroyed in a local cloud; a commercial cloud for a large complex multi-institutional trial; and within other architectures such as caGrid, i2b2, or NHIN.
Biomedical image analysis and processing in clouds
NASA Astrophysics Data System (ADS)
Bednarz, Tomasz; Szul, Piotr; Arzhaeva, Yulia; Wang, Dadong; Burdett, Neil; Khassapov, Alex; Chen, Shiping; Vallotton, Pascal; Lagerstrom, Ryan; Gureyev, Tim; Taylor, John
2013-10-01
Cloud-based Image Analysis and Processing Toolbox project runs on the Australian National eResearch Collaboration Tools and Resources (NeCTAR) cloud infrastructure and allows access to biomedical image processing and analysis services to researchers via remotely accessible user interfaces. By providing user-friendly access to cloud computing resources and new workflow-based interfaces, our solution enables researchers to carry out various challenging image analysis and reconstruction tasks. Several case studies will be presented during the conference.
Integration of end-user Cloud storage for CMS analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Riahi, Hassen; Aimar, Alberto; Ayllon, Alejandro Alvarez
End-user Cloud storage is increasing rapidly in popularity in research communities thanks to the collaboration capabilities it offers, namely synchronisation and sharing. CERN IT has implemented a model of such storage named, CERNBox, integrated with the CERN AuthN and AuthZ services. To exploit the use of the end-user Cloud storage for the distributed data analysis activity, the CMS experiment has started the integration of CERNBox as a Grid resource. This will allow CMS users to make use of their own storage in the Cloud for their analysis activities as well as to benefit from synchronisation and sharing capabilities to achievemore » results faster and more effectively. It will provide an integration model of Cloud storages in the Grid, which is implemented and commissioned over the world’s largest computing Grid infrastructure, Worldwide LHC Computing Grid (WLCG). In this paper, we present the integration strategy and infrastructure changes needed in order to transparently integrate end-user Cloud storage with the CMS distributed computing model. We describe the new challenges faced in data management between Grid and Cloud and how they were addressed, along with details of the support for Cloud storage recently introduced into the WLCG data movement middleware, FTS3. Finally, the commissioning experience of CERNBox for the distributed data analysis activity is also presented.« less
Integration of end-user Cloud storage for CMS analysis
Riahi, Hassen; Aimar, Alberto; Ayllon, Alejandro Alvarez; ...
2017-05-19
End-user Cloud storage is increasing rapidly in popularity in research communities thanks to the collaboration capabilities it offers, namely synchronisation and sharing. CERN IT has implemented a model of such storage named, CERNBox, integrated with the CERN AuthN and AuthZ services. To exploit the use of the end-user Cloud storage for the distributed data analysis activity, the CMS experiment has started the integration of CERNBox as a Grid resource. This will allow CMS users to make use of their own storage in the Cloud for their analysis activities as well as to benefit from synchronisation and sharing capabilities to achievemore » results faster and more effectively. It will provide an integration model of Cloud storages in the Grid, which is implemented and commissioned over the world’s largest computing Grid infrastructure, Worldwide LHC Computing Grid (WLCG). In this paper, we present the integration strategy and infrastructure changes needed in order to transparently integrate end-user Cloud storage with the CMS distributed computing model. We describe the new challenges faced in data management between Grid and Cloud and how they were addressed, along with details of the support for Cloud storage recently introduced into the WLCG data movement middleware, FTS3. Finally, the commissioning experience of CERNBox for the distributed data analysis activity is also presented.« less
Security-aware Virtual Machine Allocation in the Cloud: A Game Theoretic Approach
2015-01-13
predecessor, however, this paper used empirical evidence and actual data from running experiments on the Amazon EC2 cloud . They began by running all 5...is through effective VM allocation management of the cloud provider to ensure delivery of maximum security for all cloud users. The negative... Cloud : A Game Theoretic Approach 5a. CONTRACT NUMBER 5b. GRANT NUMBER 5c. PROGRAM ELEMENT NUMBER 6. AUTHOR(S) 5d. PROJECT NUMBER 5e. TASK NUMBER 5f
Elastic extension of a local analysis facility on external clouds for the LHC experiments
NASA Astrophysics Data System (ADS)
Ciaschini, V.; Codispoti, G.; Rinaldi, L.; Aiftimiei, D. C.; Bonacorsi, D.; Calligola, P.; Dal Pra, S.; De Girolamo, D.; Di Maria, R.; Grandi, C.; Michelotto, D.; Panella, M.; Taneja, S.; Semeria, F.
2017-10-01
The computing infrastructures serving the LHC experiments have been designed to cope at most with the average amount of data recorded. The usage peaks, as already observed in Run-I, may however originate large backlogs, thus delaying the completion of the data reconstruction and ultimately the data availability for physics analysis. In order to cope with the production peaks, the LHC experiments are exploring the opportunity to access Cloud resources provided by external partners or commercial providers. In this work we present the proof of concept of the elastic extension of a local analysis facility, specifically the Bologna Tier-3 Grid site, for the LHC experiments hosted at the site, on an external OpenStack infrastructure. We focus on the Cloud Bursting of the Grid site using DynFarm, a newly designed tool that allows the dynamic registration of new worker nodes to LSF. In this approach, the dynamically added worker nodes instantiated on an OpenStack infrastructure are transparently accessed by the LHC Grid tools and at the same time they serve as an extension of the farm for the local usage.
NASA Astrophysics Data System (ADS)
Wyborn, L. A.; Woodcock, R.
2013-12-01
One of the greatest drivers for change in the way scientific research is undertaken in Australia was the development of the Australian eResearch Infrastructure which was coordinated by the then Australian Government Department of Innovation, Industry, Science and Research. There were two main tranches of funding: the 2007-2013 National Collaborative Research Infrastructure Strategy (NCRIS) and the 2009 Education and Investment Framework (EIF) Super Science Initiative. Investments were in two areas: the Australian e-Research Infrastructure and domain specific capabilities: combined investment in both is 1,452M with at least 456M being invested in eResearch infrastructure. NCRIS was specifically designed as a community-guided process to provide researchers, both academic and government, with major research facilities, supporting infrastructures and networks necessary for world-class research. Extensive community engagement was sought to inform decisions on where Australia could best make strategic infrastructure investments to further develop its research capacity and improve research outcomes over the next 5 to 10years. The current (2007-2014) Australian e-Research Infrastructure has 2 components: 1. The National eResearch physical infrastructure which includes two petascale HPC facilities (one in Canberra and one in Perth), a 10 Gbps national network (National Research Network), a national data storage infrastructure comprising 8 multi petabyte data stores and shared access methods (Australian Access Federation). 2. A second component is focused on research integration infrastructures and includes the Australian National Data Service, which is concerned with better management, description and access to distributed research data in Australia and the National eResearch Collaboration Tools and Resources (NeCTAR) project. NeCTAR is centred on developing problem oriented digital laboratories which provide better and coordinated access to research tools, data environments and workflows. The eResearch Infrastructure Stack is designed to support 12 individual domain-specific capabilities. Four are relevant to the Earth and Space Sciences: (1) AuScope (a national Earth Science Infrastructure Program), (2) the Integrated Marine Observing System (IMOS), (3) the Terrestrial Ecosystems Research Network (TERN) and (4) the Australian Urban Research Infrastructure Network (AURIN). The two main research integration infrastructures, ANDS and NeCTAR, are seen as pivotal to the success of the Australian eResearch Infrastructure. Without them, there was a risk that that the investments in new computers and data storage would provide physical infrastructure, but few would come to use it as the skills barriers to entry were too high. ANDS focused on transforming Australia's research data environment. Its flagship is Research Data Australia, an Internet-based discovery service designed to provide rich connections between data, projects, researchers and institutions, and promote visibility of Australian research data collections in search engines. NeCTAR focused on building eResearch infrastructure in four areas: virtual laboratories, tools, a federated research cloud and a hosting service. Combined, ANDS and NeCTAR are ensuring that people ARE coming and ARE using the physical infrastructures that were built.
Transportation Infrastructure Design and Construction \\0x16 Virtual Training Tools
DOT National Transportation Integrated Search
2003-09-01
This project will develop 3D interactive computer-training environments for a major element of transportation infrastructure : hot mix asphalt paving. These tools will include elements of hot mix design (including laboratory equipment) and constructi...
The effective use of virtualization for selection of data centers in a cloud computing environment
NASA Astrophysics Data System (ADS)
Kumar, B. Santhosh; Parthiban, Latha
2018-04-01
Data centers are the places which consist of network of remote servers to store, access and process the data. Cloud computing is a technology where users worldwide will submit the tasks and the service providers will direct the requests to the data centers which are responsible for execution of tasks. The servers in the data centers need to employ the virtualization concept so that multiple tasks can be executed simultaneously. In this paper we proposed an algorithm for data center selection based on energy of virtual machines created in server. The virtualization energy in each of the server is calculated and total energy of the data center is obtained by the summation of individual server energy. The tasks submitted are routed to the data center with least energy consumption which will result in minimizing the operational expenses of a service provider.
CloudMC: a cloud computing application for Monte Carlo simulation.
Miras, H; Jiménez, R; Miras, C; Gomà, C
2013-04-21
This work presents CloudMC, a cloud computing application-developed in Windows Azure®, the platform of the Microsoft® cloud-for the parallelization of Monte Carlo simulations in a dynamic virtual cluster. CloudMC is a web application designed to be independent of the Monte Carlo code in which the simulations are based-the simulations just need to be of the form: input files → executable → output files. To study the performance of CloudMC in Windows Azure®, Monte Carlo simulations with penelope were performed on different instance (virtual machine) sizes, and for different number of instances. The instance size was found to have no effect on the simulation runtime. It was also found that the decrease in time with the number of instances followed Amdahl's law, with a slight deviation due to the increase in the fraction of non-parallelizable time with increasing number of instances. A simulation that would have required 30 h of CPU on a single instance was completed in 48.6 min when executed on 64 instances in parallel (speedup of 37 ×). Furthermore, the use of cloud computing for parallel computing offers some advantages over conventional clusters: high accessibility, scalability and pay per usage. Therefore, it is strongly believed that cloud computing will play an important role in making Monte Carlo dose calculation a reality in future clinical practice.
Lawrence, Daphne
2009-03-01
Blade servers and virtualization can reduce infrastructure, maintenance, heating, electric, cooling and equipment costs. Blade server technology is evolving and some elements may become obsolete. There is very little interoperability between blades. Hospitals can virtualize 40 to 60 percent of their servers, and old servers can be reused for testing. Not all applications lend themselves to virtualization--especially those with high memory requirements. CIOs should engage their vendors in virtualization discussions.
Innovative Decentralized Decision-Making Enabling Capability on Mobile Edge Devices
2015-09-01
feasibility of adapting mobile device infrastructure into a future tactical cloud ecosystem. F. SCOPE The scope of this research is focused on the...critical to mobility : wireless infrastructure , the mobile device itself, and mobile applications” (Office of the Department of Defense Chief Information... Infrastructure to a Cost Effective and Platform Agnostic Environment; 3) Collaborate with DOD and Industry Partners to Develop a Classified Mobile Device
Integrating Network Management for Cloud Computing Services
2015-06-01
abstraction and system design. In this dissertation, we make three major contributions. We rst propose to consolidate the tra c and infrastructure management...abstraction and system design. In this dissertation, we make three major contributions. We first propose to consolidate the traffic and infrastructure ...1.3.1 Safe Datacenter Traffic/ Infrastructure Management . . . . . . 9 1.3.2 End-host/Network Cooperative Traffic Management . . . . . . 10 1.3.3 Direct
NASA Astrophysics Data System (ADS)
Lucero, D. A.; Ivey, M.; Helsel, F.; Hardesty, J.; Dexheimer, D.
2015-12-01
Scientific infrastructure to support atmospheric science and aerosol science for the Department of Energy's Atmospheric Radiation Measurement programs at Barrow, Alaska.The Atmospheric Radiation Measurement (ARM) Program's located at Barrow, Alaska is a U.S. Department of Energy (DOE) site. The site provides a scientific infrastructure and data archives for the international Arctic research community. The infrastructure at Barrow has been in place since 1998, with many improvements since then. Barrow instruments include: scanning precipitation Radar-cloud radar, Doppler Lidar, Eddy correlation flux systems, Ceilometer, Manual and state-of-art automatic Balloon sounding systems, Atmospheric Emitted Radiance Interferometer (AERI), Micro-pulse Lidar (MPL), Millimeter cloud radar, High Spectral Resolution Lidar (HSRL) along with all the standard metrological measurements. Data from these instruments is placed in the ARM data archives and are available to the international research community. This poster will discuss what instruments are at Barrow and the challenges of maintaining these instruments in an Arctic site.
NASA Technical Reports Server (NTRS)
Maluf, David A.; Shetye, Sandeep D.; Chilukuri, Sri; Sturken, Ian
2012-01-01
Cloud computing can reduce cost significantly because businesses can share computing resources. In recent years Small and Medium Businesses (SMB) have used Cloud effectively for cost saving and for sharing IT expenses. With the success of SMBs, many perceive that the larger enterprises ought to move into Cloud environment as well. Government agency s stove-piped environments are being considered as candidates for potential use of Cloud either as an enterprise entity or pockets of small communities. Cloud Computing is the delivery of computing as a service rather than as a product, whereby shared resources, software, and information are provided to computers and other devices as a utility over a network. Underneath the offered services, there exists a modern infrastructure cost of which is often spread across its services or its investors. As NASA is considered as an Enterprise class organization, like other enterprises, a shift has been occurring in perceiving its IT services as candidates for Cloud services. This paper discusses market trends in cloud computing from an enterprise angle and then addresses the topic of Cloud Computing for NASA in two possible forms. First, in the form of a public Cloud to support it as an enterprise, as well as to share it with the commercial and public at large. Second, as a private Cloud wherein the infrastructure is operated solely for NASA, whether managed internally or by a third-party and hosted internally or externally. The paper addresses the strengths and weaknesses of both paradigms of public and private Clouds, in both internally and externally operated settings. The content of the paper is from a NASA perspective but is applicable to any large enterprise with thousands of employees and contractors.
Integration of Openstack cloud resources in BES III computing cluster
NASA Astrophysics Data System (ADS)
Li, Haibo; Cheng, Yaodong; Huang, Qiulan; Cheng, Zhenjing; Shi, Jingyan
2017-10-01
Cloud computing provides a new technical means for data processing of high energy physics experiment. However, the resource of each queue is fixed and the usage of the resource is static in traditional job management system. In order to make it simple and transparent for physicist to use, we developed a virtual cluster system (vpmanager) to integrate IHEPCloud and different batch systems such as Torque and HTCondor. Vpmanager provides dynamic virtual machines scheduling according to the job queue. The BES III use case results show that resource efficiency is greatly improved.
Intelligent cloud computing security using genetic algorithm as a computational tools
NASA Astrophysics Data System (ADS)
Razuky AL-Shaikhly, Mazin H.
2018-05-01
An essential change had occurred in the field of Information Technology which represented with cloud computing, cloud giving virtual assets by means of web yet awesome difficulties in the field of information security and security assurance. Currently main problem with cloud computing is how to improve privacy and security for cloud “cloud is critical security”. This paper attempts to solve cloud security by using intelligent system with genetic algorithm as wall to provide cloud data secure, all services provided by cloud must detect who receive and register it to create list of users (trusted or un-trusted) depend on behavior. The execution of present proposal has shown great outcome.
A Virtual Environment for Resilient Infrastructure Modeling and Design
2015-09-01
Security CI Critical Infrastructure CID Center for Infrastructure Defense CSV Comma Separated Value DAD Defender-Attacker-Defender DHS Department...responses to disruptive events (e.g., cascading failure behavior) in a context- rich , controlled environment for exercises, education, and training...The general attacker-defender (AD) and defender-attacker-defender ( DAD ) models for CI are defined in Brown et al. (2006). These models help
49 CFR 15.5 - Sensitive security information.
Code of Federal Regulations, 2014 CFR
2014-10-01
... sources and methods used to gather or develop threat information, including threats against cyber infrastructure. (8) Security measures. Specific details of aviation or maritime transportation security measures... infrastructure asset information. Any list identifying systems or assets, whether physical or virtual, so vital...
49 CFR 15.5 - Sensitive security information.
Code of Federal Regulations, 2011 CFR
2011-10-01
... sources and methods used to gather or develop threat information, including threats against cyber infrastructure. (8) Security measures. Specific details of aviation or maritime transportation security measures... infrastructure asset information. Any list identifying systems or assets, whether physical or virtual, so vital...
49 CFR 15.5 - Sensitive security information.
Code of Federal Regulations, 2013 CFR
2013-10-01
... sources and methods used to gather or develop threat information, including threats against cyber infrastructure. (8) Security measures. Specific details of aviation or maritime transportation security measures... infrastructure asset information. Any list identifying systems or assets, whether physical or virtual, so vital...
49 CFR 15.5 - Sensitive security information.
Code of Federal Regulations, 2012 CFR
2012-10-01
... sources and methods used to gather or develop threat information, including threats against cyber infrastructure. (8) Security measures. Specific details of aviation or maritime transportation security measures... infrastructure asset information. Any list identifying systems or assets, whether physical or virtual, so vital...
NASA Astrophysics Data System (ADS)
López García, Álvaro; Fernández del Castillo, Enol; Orviz Fernández, Pablo
In this document we present an implementation of the Open Grid Forum's Open Cloud Computing Interface (OCCI) for OpenStack, namely ooi (Openstack occi interface, 2015) [1]. OCCI is an open standard for management tasks over cloud resources, focused on interoperability, portability and integration. ooi aims to implement this open interface for the OpenStack cloud middleware, promoting interoperability with other OCCI-enabled cloud management frameworks and infrastructures. ooi focuses on being non-invasive with a vanilla OpenStack installation, not tied to a particular OpenStack release version.
Monte Carlo verification of radiotherapy treatments with CloudMC.
Miras, Hector; Jiménez, Rubén; Perales, Álvaro; Terrón, José Antonio; Bertolet, Alejandro; Ortiz, Antonio; Macías, José
2018-06-27
A new implementation has been made on CloudMC, a cloud-based platform presented in a previous work, in order to provide services for radiotherapy treatment verification by means of Monte Carlo in a fast, easy and economical way. A description of the architecture of the application and the new developments implemented is presented together with the results of the tests carried out to validate its performance. CloudMC has been developed over Microsoft Azure cloud. It is based on a map/reduce implementation for Monte Carlo calculations distribution over a dynamic cluster of virtual machines in order to reduce calculation time. CloudMC has been updated with new methods to read and process the information related to radiotherapy treatment verification: CT image set, treatment plan, structures and dose distribution files in DICOM format. Some tests have been designed in order to determine, for the different tasks, the most suitable type of virtual machines from those available in Azure. Finally, the performance of Monte Carlo verification in CloudMC is studied through three real cases that involve different treatment techniques, linac models and Monte Carlo codes. Considering computational and economic factors, D1_v2 and G1 virtual machines were selected as the default type for the Worker Roles and the Reducer Role respectively. Calculation times up to 33 min and costs of 16 € were achieved for the verification cases presented when a statistical uncertainty below 2% (2σ) was required. The costs were reduced to 3-6 € when uncertainty requirements are relaxed to 4%. Advantages like high computational power, scalability, easy access and pay-per-usage model, make Monte Carlo cloud-based solutions, like the one presented in this work, an important step forward to solve the long-lived problem of truly introducing the Monte Carlo algorithms in the daily routine of the radiotherapy planning process.
A location selection policy of live virtual machine migration for power saving and load balancing.
Zhao, Jia; Ding, Yan; Xu, Gaochao; Hu, Liang; Dong, Yushuang; Fu, Xiaodong
2013-01-01
Green cloud data center has become a research hotspot of virtualized cloud computing architecture. And load balancing has also been one of the most important goals in cloud data centers. Since live virtual machine (VM) migration technology is widely used and studied in cloud computing, we have focused on location selection (migration policy) of live VM migration for power saving and load balancing. We propose a novel approach MOGA-LS, which is a heuristic and self-adaptive multiobjective optimization algorithm based on the improved genetic algorithm (GA). This paper has presented the specific design and implementation of MOGA-LS such as the design of the genetic operators, fitness values, and elitism. We have introduced the Pareto dominance theory and the simulated annealing (SA) idea into MOGA-LS and have presented the specific process to get the final solution, and thus, the whole approach achieves a long-term efficient optimization for power saving and load balancing. The experimental results demonstrate that MOGA-LS evidently reduces the total incremental power consumption and better protects the performance of VM migration and achieves the balancing of system load compared with the existing research. It makes the result of live VM migration more high-effective and meaningful.
A Location Selection Policy of Live Virtual Machine Migration for Power Saving and Load Balancing
Xu, Gaochao; Hu, Liang; Dong, Yushuang; Fu, Xiaodong
2013-01-01
Green cloud data center has become a research hotspot of virtualized cloud computing architecture. And load balancing has also been one of the most important goals in cloud data centers. Since live virtual machine (VM) migration technology is widely used and studied in cloud computing, we have focused on location selection (migration policy) of live VM migration for power saving and load balancing. We propose a novel approach MOGA-LS, which is a heuristic and self-adaptive multiobjective optimization algorithm based on the improved genetic algorithm (GA). This paper has presented the specific design and implementation of MOGA-LS such as the design of the genetic operators, fitness values, and elitism. We have introduced the Pareto dominance theory and the simulated annealing (SA) idea into MOGA-LS and have presented the specific process to get the final solution, and thus, the whole approach achieves a long-term efficient optimization for power saving and load balancing. The experimental results demonstrate that MOGA-LS evidently reduces the total incremental power consumption and better protects the performance of VM migration and achieves the balancing of system load compared with the existing research. It makes the result of live VM migration more high-effective and meaningful. PMID:24348165
Exploiting GPUs in Virtual Machine for BioCloud
Jo, Heeseung; Jeong, Jinkyu; Lee, Myoungho; Choi, Dong Hoon
2013-01-01
Recently, biological applications start to be reimplemented into the applications which exploit many cores of GPUs for better computation performance. Therefore, by providing virtualized GPUs to VMs in cloud computing environment, many biological applications will willingly move into cloud environment to enhance their computation performance and utilize infinite cloud computing resource while reducing expenses for computations. In this paper, we propose a BioCloud system architecture that enables VMs to use GPUs in cloud environment. Because much of the previous research has focused on the sharing mechanism of GPUs among VMs, they cannot achieve enough performance for biological applications of which computation throughput is more crucial rather than sharing. The proposed system exploits the pass-through mode of PCI express (PCI-E) channel. By making each VM be able to access underlying GPUs directly, applications can show almost the same performance as when those are in native environment. In addition, our scheme multiplexes GPUs by using hot plug-in/out device features of PCI-E channel. By adding or removing GPUs in each VM in on-demand manner, VMs in the same physical host can time-share their GPUs. We implemented the proposed system using the Xen VMM and NVIDIA GPUs and showed that our prototype is highly effective for biological GPU applications in cloud environment. PMID:23710465
Exploiting GPUs in virtual machine for BioCloud.
Jo, Heeseung; Jeong, Jinkyu; Lee, Myoungho; Choi, Dong Hoon
2013-01-01
Recently, biological applications start to be reimplemented into the applications which exploit many cores of GPUs for better computation performance. Therefore, by providing virtualized GPUs to VMs in cloud computing environment, many biological applications will willingly move into cloud environment to enhance their computation performance and utilize infinite cloud computing resource while reducing expenses for computations. In this paper, we propose a BioCloud system architecture that enables VMs to use GPUs in cloud environment. Because much of the previous research has focused on the sharing mechanism of GPUs among VMs, they cannot achieve enough performance for biological applications of which computation throughput is more crucial rather than sharing. The proposed system exploits the pass-through mode of PCI express (PCI-E) channel. By making each VM be able to access underlying GPUs directly, applications can show almost the same performance as when those are in native environment. In addition, our scheme multiplexes GPUs by using hot plug-in/out device features of PCI-E channel. By adding or removing GPUs in each VM in on-demand manner, VMs in the same physical host can time-share their GPUs. We implemented the proposed system using the Xen VMM and NVIDIA GPUs and showed that our prototype is highly effective for biological GPU applications in cloud environment.
Cloud Computing Based E-Learning System
ERIC Educational Resources Information Center
Al-Zoube, Mohammed; El-Seoud, Samir Abou; Wyne, Mudasser F.
2010-01-01
Cloud computing technologies although in their early stages, have managed to change the way applications are going to be developed and accessed. These technologies are aimed at running applications as services over the internet on a flexible infrastructure. Microsoft office applications, such as word processing, excel spreadsheet, access database…
USDA-ARS?s Scientific Manuscript database
Service oriented architectures allow modelling engines to be hosted over the Internet abstracting physical hardware configuration and software deployments from model users. Many existing environmental models are deployed as desktop applications running on user's personal computers (PCs). Migration ...
Development of a cloud-based Bioinformatics Training Platform.
Revote, Jerico; Watson-Haigh, Nathan S; Quenette, Steve; Bethwaite, Blair; McGrath, Annette; Shang, Catherine A
2017-05-01
The Bioinformatics Training Platform (BTP) has been developed to provide access to the computational infrastructure required to deliver sophisticated hands-on bioinformatics training courses. The BTP is a cloud-based solution that is in active use for delivering next-generation sequencing training to Australian researchers at geographically dispersed locations. The BTP was built to provide an easy, accessible, consistent and cost-effective approach to delivering workshops at host universities and organizations with a high demand for bioinformatics training but lacking the dedicated bioinformatics training suites required. To support broad uptake of the BTP, the platform has been made compatible with multiple cloud infrastructures. The BTP is an open-source and open-access resource. To date, 20 training workshops have been delivered to over 700 trainees at over 10 venues across Australia using the BTP. © The Author 2016. Published by Oxford University Press.
Development of a cloud-based Bioinformatics Training Platform
Revote, Jerico; Watson-Haigh, Nathan S.; Quenette, Steve; Bethwaite, Blair; McGrath, Annette
2017-01-01
Abstract The Bioinformatics Training Platform (BTP) has been developed to provide access to the computational infrastructure required to deliver sophisticated hands-on bioinformatics training courses. The BTP is a cloud-based solution that is in active use for delivering next-generation sequencing training to Australian researchers at geographically dispersed locations. The BTP was built to provide an easy, accessible, consistent and cost-effective approach to delivering workshops at host universities and organizations with a high demand for bioinformatics training but lacking the dedicated bioinformatics training suites required. To support broad uptake of the BTP, the platform has been made compatible with multiple cloud infrastructures. The BTP is an open-source and open-access resource. To date, 20 training workshops have been delivered to over 700 trainees at over 10 venues across Australia using the BTP. PMID:27084333
Australian DefenceScience. Volume 16, Number 2, Winter
2008-01-01
Making Virtual Advisers speedily interactive To provide an authentically interactive experience for humans working with Virtual Advisers, the Virtual...peer trusted and strong authentication for checking of security credentials without recourse to third parties or infrastructure, thus eliminating...multiple passwords, or carry around multiple security tokens.” Each CodeStick device is readied for use with a biometric authentication process. Since
CloudMan as a platform for tool, data, and analysis distribution.
Afgan, Enis; Chapman, Brad; Taylor, James
2012-11-27
Cloud computing provides an infrastructure that facilitates large scale computational analysis in a scalable, democratized fashion, However, in this context it is difficult to ensure sharing of an analysis environment and associated data in a scalable and precisely reproducible way. CloudMan (usecloudman.org) enables individual researchers to easily deploy, customize, and share their entire cloud analysis environment, including data, tools, and configurations. With the enabled customization and sharing of instances, CloudMan can be used as a platform for collaboration. The presented solution improves accessibility of cloud resources, tools, and data to the level of an individual researcher and contributes toward reproducibility and transparency of research solutions.
Context-aware distributed cloud computing using CloudScheduler
NASA Astrophysics Data System (ADS)
Seuster, R.; Leavett-Brown, CR; Casteels, K.; Driemel, C.; Paterson, M.; Ring, D.; Sobie, RJ; Taylor, RP; Weldon, J.
2017-10-01
The distributed cloud using the CloudScheduler VM provisioning service is one of the longest running systems for HEP workloads. It has run millions of jobs for ATLAS and Belle II over the past few years using private and commercial clouds around the world. Our goal is to scale the distributed cloud to the 10,000-core level, with the ability to run any type of application (low I/O, high I/O and high memory) on any cloud. To achieve this goal, we have been implementing changes that utilize context-aware computing designs that are currently employed in the mobile communication industry. Context-awareness makes use of real-time and archived data to respond to user or system requirements. In our distributed cloud, we have many opportunistic clouds with no local HEP services, software or storage repositories. A context-aware design significantly improves the reliability and performance of our system by locating the nearest location of the required services. We describe how we are collecting and managing contextual information from our workload management systems, the clouds, the virtual machines and our services. This information is used not only to monitor the system but also to carry out automated corrective actions. We are incrementally adding new alerting and response services to our distributed cloud. This will enable us to scale the number of clouds and virtual machines. Further, a context-aware design will enable us to run analysis or high I/O application on opportunistic clouds. We envisage an open-source HTTP data federation (for example, the DynaFed system at CERN) as a service that would provide us access to existing storage elements used by the HEP experiments.
NASA Astrophysics Data System (ADS)
Heus, Thijs; Jonker, Harm J. J.; van den Akker, Harry E. A.; Griffith, Eric J.; Koutek, Michal; Post, Frits H.
2009-03-01
In this study, a new method is developed to investigate the entire life cycle of shallow cumuli in large eddy simulations. Although trained observers have no problem in distinguishing the different life stages of a cloud, this process proves difficult to automate, because cloud-splitting and cloud-merging events complicate the distinction between a single system divided in several cloudy parts and two independent systems that collided. Because the human perception is well equipped to capture and to make sense of these time-dependent three-dimensional features, a combination of automated constraints and human inspection in a three-dimensional virtual reality environment is used to select clouds that are exemplary in their behavior throughout their entire life span. Three specific cases (ARM, BOMEX, and BOMEX without large-scale forcings) are analyzed in this way, and the considerable number of selected clouds warrants reliable statistics of cloud properties conditioned on the phase in their life cycle. The most dominant feature in this statistical life cycle analysis is the pulsating growth that is present throughout the entire lifetime of the cloud, independent of the case and of the large-scale forcings. The pulses are a self-sustained phenomenon, driven by a balance between buoyancy and horizontal convergence of dry air. The convective inhibition just above the cloud base plays a crucial role as a barrier for the cloud to overcome in its infancy stage, and as a buffer region later on, ensuring a steady supply of buoyancy into the cloud.
Layer 1 VPN services in distributed next-generation SONET/SDH networks with inverse multiplexing
NASA Astrophysics Data System (ADS)
Ghani, N.; Muthalaly, M. V.; Benhaddou, D.; Alanqar, W.
2006-05-01
Advances in next-generation SONET/SDH along with GMPLS control architectures have enabled many new service provisioning capabilities. In particular, a key services paradigm is the emergent Layer 1 virtual private network (L1 VPN) framework, which allows multiple clients to utilize a common physical infrastructure and provision their own 'virtualized' circuit-switched networks. This precludes expensive infrastructure builds and increases resource utilization for carriers. Along these lines, a novel L1 VPN services resource management scheme for next-generation SONET/SDH networks is proposed that fully leverages advanced virtual concatenation and inverse multiplexing features. Additionally, both centralized and distributed GMPLS-based implementations are also tabled to support the proposed L1 VPN services model. Detailed performance analysis results are presented along with avenues for future research.
Challenges and opportunities of cloud computing for atmospheric sciences
NASA Astrophysics Data System (ADS)
Pérez Montes, Diego A.; Añel, Juan A.; Pena, Tomás F.; Wallom, David C. H.
2016-04-01
Cloud computing is an emerging technological solution widely used in many fields. Initially developed as a flexible way of managing peak demand it has began to make its way in scientific research. One of the greatest advantages of cloud computing for scientific research is independence of having access to a large cyberinfrastructure to fund or perform a research project. Cloud computing can avoid maintenance expenses for large supercomputers and has the potential to 'democratize' the access to high-performance computing, giving flexibility to funding bodies for allocating budgets for the computational costs associated with a project. Two of the most challenging problems in atmospheric sciences are computational cost and uncertainty in meteorological forecasting and climate projections. Both problems are closely related. Usually uncertainty can be reduced with the availability of computational resources to better reproduce a phenomenon or to perform a larger number of experiments. Here we expose results of the application of cloud computing resources for climate modeling using cloud computing infrastructures of three major vendors and two climate models. We show how the cloud infrastructure compares in performance to traditional supercomputers and how it provides the capability to complete experiments in shorter periods of time. The monetary cost associated is also analyzed. Finally we discuss the future potential of this technology for meteorological and climatological applications, both from the point of view of operational use and research.
Cloud computing for comparative genomics
2010-01-01
Background Large comparative genomics studies and tools are becoming increasingly more compute-expensive as the number of available genome sequences continues to rise. The capacity and cost of local computing infrastructures are likely to become prohibitive with the increase, especially as the breadth of questions continues to rise. Alternative computing architectures, in particular cloud computing environments, may help alleviate this increasing pressure and enable fast, large-scale, and cost-effective comparative genomics strategies going forward. To test this, we redesigned a typical comparative genomics algorithm, the reciprocal smallest distance algorithm (RSD), to run within Amazon's Elastic Computing Cloud (EC2). We then employed the RSD-cloud for ortholog calculations across a wide selection of fully sequenced genomes. Results We ran more than 300,000 RSD-cloud processes within the EC2. These jobs were farmed simultaneously to 100 high capacity compute nodes using the Amazon Web Service Elastic Map Reduce and included a wide mix of large and small genomes. The total computation time took just under 70 hours and cost a total of $6,302 USD. Conclusions The effort to transform existing comparative genomics algorithms from local compute infrastructures is not trivial. However, the speed and flexibility of cloud computing environments provides a substantial boost with manageable cost. The procedure designed to transform the RSD algorithm into a cloud-ready application is readily adaptable to similar comparative genomics problems. PMID:20482786
Cloud computing for comparative genomics.
Wall, Dennis P; Kudtarkar, Parul; Fusaro, Vincent A; Pivovarov, Rimma; Patil, Prasad; Tonellato, Peter J
2010-05-18
Large comparative genomics studies and tools are becoming increasingly more compute-expensive as the number of available genome sequences continues to rise. The capacity and cost of local computing infrastructures are likely to become prohibitive with the increase, especially as the breadth of questions continues to rise. Alternative computing architectures, in particular cloud computing environments, may help alleviate this increasing pressure and enable fast, large-scale, and cost-effective comparative genomics strategies going forward. To test this, we redesigned a typical comparative genomics algorithm, the reciprocal smallest distance algorithm (RSD), to run within Amazon's Elastic Computing Cloud (EC2). We then employed the RSD-cloud for ortholog calculations across a wide selection of fully sequenced genomes. We ran more than 300,000 RSD-cloud processes within the EC2. These jobs were farmed simultaneously to 100 high capacity compute nodes using the Amazon Web Service Elastic Map Reduce and included a wide mix of large and small genomes. The total computation time took just under 70 hours and cost a total of $6,302 USD. The effort to transform existing comparative genomics algorithms from local compute infrastructures is not trivial. However, the speed and flexibility of cloud computing environments provides a substantial boost with manageable cost. The procedure designed to transform the RSD algorithm into a cloud-ready application is readily adaptable to similar comparative genomics problems.
Oh, Sungyoung; Cha, Jieun; Ji, Myungkyu; Kang, Hyekyung; Kim, Seok; Heo, Eunyoung; Han, Jong Soo; Kang, Hyunggoo; Chae, Hoseok; Hwang, Hee
2015-01-01
Objectives To design a cloud computing-based Healthcare Software-as-a-Service (SaaS) Platform (HSP) for delivering healthcare information services with low cost, high clinical value, and high usability. Methods We analyzed the architecture requirements of an HSP, including the interface, business services, cloud SaaS, quality attributes, privacy and security, and multi-lingual capacity. For cloud-based SaaS services, we focused on Clinical Decision Service (CDS) content services, basic functional services, and mobile services. Microsoft's Azure cloud computing for Infrastructure-as-a-Service (IaaS) and Platform-as-a-Service (PaaS) was used. Results The functional and software views of an HSP were designed in a layered architecture. External systems can be interfaced with the HSP using SOAP and REST/JSON. The multi-tenancy model of the HSP was designed as a shared database, with a separate schema for each tenant through a single application, although healthcare data can be physically located on a cloud or in a hospital, depending on regulations. The CDS services were categorized into rule-based services for medications, alert registration services, and knowledge services. Conclusions We expect that cloud-based HSPs will allow small and mid-sized hospitals, in addition to large-sized hospitals, to adopt information infrastructures and health information technology with low system operation and maintenance costs. PMID:25995962
Modelling operations and security of cloud systems using Z-notation and Chinese Wall security policy
NASA Astrophysics Data System (ADS)
Basu, Srijita; Sengupta, Anirban; Mazumdar, Chandan
2016-11-01
Enterprises are increasingly using cloud computing for hosting their applications. Availability of fast Internet and cheap bandwidth are causing greater number of people to use cloud-based services. This has the advantage of lower cost and minimum maintenance. However, ensuring security of user data and proper management of cloud infrastructure remain major areas of concern. Existing techniques are either too complex, or fail to properly represent the actual cloud scenario. This article presents a formal cloud model using the constructs of Z-notation. Principles of the Chinese Wall security policy have been applied to design secure cloud-specific operations. The proposed methodology will enable users to safely host their services, as well as process sensitive data, on cloud.
Creating a Rackspace and NASA Nebula compatible cloud using the OpenStack project (Invited)
NASA Astrophysics Data System (ADS)
Clark, R.
2010-12-01
NASA and Rackspace have both provided technology to the OpenStack that allows anyone to create a private Infrastructure as a Service (IaaS) cloud using open source software and commodity hardware. OpenStack is designed and developed completely in the open and with an open governance process. NASA donated Nova, which powers the compute portion of NASA Nebula Cloud Computing Platform, and Rackspace donated Swift, which powers Rackspace Cloud Files. The project is now in continuous development by NASA, Rackspace, and hundreds of other participants. When you create a private cloud using Openstack, you will have the ability to easily interact with your private cloud, a government cloud, and an ecosystem of public cloud providers, using the same API.
Universal Keyword Classifier on Public Key Based Encrypted Multikeyword Fuzzy Search in Public Cloud
Munisamy, Shyamala Devi; Chokkalingam, Arun
2015-01-01
Cloud computing has pioneered the emerging world by manifesting itself as a service through internet and facilitates third party infrastructure and applications. While customers have no visibility on how their data is stored on service provider's premises, it offers greater benefits in lowering infrastructure costs and delivering more flexibility and simplicity in managing private data. The opportunity to use cloud services on pay-per-use basis provides comfort for private data owners in managing costs and data. With the pervasive usage of internet, the focus has now shifted towards effective data utilization on the cloud without compromising security concerns. In the pursuit of increasing data utilization on public cloud storage, the key is to make effective data access through several fuzzy searching techniques. In this paper, we have discussed the existing fuzzy searching techniques and focused on reducing the searching time on the cloud storage server for effective data utilization. Our proposed Asymmetric Classifier Multikeyword Fuzzy Search method provides classifier search server that creates universal keyword classifier for the multiple keyword request which greatly reduces the searching time by learning the search path pattern for all the keywords in the fuzzy keyword set. The objective of using BTree fuzzy searchable index is to resolve typos and representation inconsistencies and also to facilitate effective data utilization. PMID:26380364
Munisamy, Shyamala Devi; Chokkalingam, Arun
2015-01-01
Cloud computing has pioneered the emerging world by manifesting itself as a service through internet and facilitates third party infrastructure and applications. While customers have no visibility on how their data is stored on service provider's premises, it offers greater benefits in lowering infrastructure costs and delivering more flexibility and simplicity in managing private data. The opportunity to use cloud services on pay-per-use basis provides comfort for private data owners in managing costs and data. With the pervasive usage of internet, the focus has now shifted towards effective data utilization on the cloud without compromising security concerns. In the pursuit of increasing data utilization on public cloud storage, the key is to make effective data access through several fuzzy searching techniques. In this paper, we have discussed the existing fuzzy searching techniques and focused on reducing the searching time on the cloud storage server for effective data utilization. Our proposed Asymmetric Classifier Multikeyword Fuzzy Search method provides classifier search server that creates universal keyword classifier for the multiple keyword request which greatly reduces the searching time by learning the search path pattern for all the keywords in the fuzzy keyword set. The objective of using BTree fuzzy searchable index is to resolve typos and representation inconsistencies and also to facilitate effective data utilization.
Optimization of over-provisioned clouds
NASA Astrophysics Data System (ADS)
Balashov, N.; Baranov, A.; Korenkov, V.
2016-09-01
The functioning of modern applications in cloud-centers is characterized by a huge variety of computational workloads generated. This causes uneven workload distribution and as a result leads to ineffective utilization of cloud-centers' hardware. The proposed article addresses the possible ways to solve this issue and demonstrates that it is a matter of necessity to optimize cloud-centers' hardware utilization. As one of the possible ways to solve the problem of the inefficient resource utilization in heterogeneous cloud-environments an algorithm of dynamic re-allocation of virtual resources is suggested.
2010-09-01
Cloud computing describes a new distributed computing paradigm for IT data and services that involves over-the-Internet provision of dynamically scalable and often virtualized resources. While cost reduction and flexibility in storage, services, and maintenance are important considerations when deciding on whether or how to migrate data and applications to the cloud, large organizations like the Department of Defense need to consider the organization and structure of data on the cloud and the operations on such data in order to reap the full benefit of cloud
Liu, Yong-Kuo; Chao, Nan; Xia, Hong; Peng, Min-Jun; Ayodeji, Abiodun
2018-05-17
This paper presents an improved and efficient virtual reality-based adaptive dose assessment method (VRBAM) applicable to the cutting and dismantling tasks in nuclear facility decommissioning. The method combines the modeling strength of virtual reality with the flexibility of adaptive technology. The initial geometry is designed with the three-dimensional computer-aided design tools, and a hybrid model composed of cuboids and a point-cloud is generated automatically according to the virtual model of the object. In order to improve the efficiency of dose calculation while retaining accuracy, the hybrid model is converted to a weighted point-cloud model, and the point kernels are generated by adaptively simplifying the weighted point-cloud model according to the detector position, an approach that is suitable for arbitrary geometries. The dose rates are calculated with the Point-Kernel method. To account for radiation scattering effects, buildup factors are calculated with the Geometric-Progression formula in the fitting function. The geometric modeling capability of VRBAM was verified by simulating basic geometries, which included a convex surface, a concave surface, a flat surface and their combination. The simulation results show that the VRBAM is more flexible and superior to other approaches in modeling complex geometries. In this paper, the computation time and dose rate results obtained from the proposed method were also compared with those obtained using the MCNP code and an earlier virtual reality-based method (VRBM) developed by the same authors. © 2018 IOP Publishing Ltd.
Building a Cloud Computing and Big Data Infrastructure for Cybersecurity Research and Education
2015-04-17
408 1,408 312,912 17 Hadoop- Integration M/D Node R720xd 2 24 128 3,600 5 Subtotal: 120 640 18,000 5 Cloud - Production VRTX M620 2 16 256 30,720...4 Subtotal: 8 64 1,024 30,720 4 Cloud - Integration IBM HS22 7870H5U 2 12 84 4,800 5 Subtotal: 10 60 420 4,800 5 TOTAL: 62 652 3,492 366,432...3,492 366,432 Cloud - Integration Hadoop- Production Hadoop- Integration Cloud - Production September 2014 8 Exploring New Opportunities (Cybersecurity
The Virtual Xenbase: transitioning an online bioinformatics resource to a private cloud
Karimi, Kamran; Vize, Peter D.
2014-01-01
As a model organism database, Xenbase has been providing informatics and genomic data on Xenopus (Silurana) tropicalis and Xenopus laevis frogs for more than a decade. The Xenbase database contains curated, as well as community-contributed and automatically harvested literature, gene and genomic data. A GBrowse genome browser, a BLAST+ server and stock center support are available on the site. When this resource was first built, all software services and components in Xenbase ran on a single physical server, with inherent reliability, scalability and inter-dependence issues. Recent advances in networking and virtualization techniques allowed us to move Xenbase to a virtual environment, and more specifically to a private cloud. To do so we decoupled the different software services and components, such that each would run on a different virtual machine. In the process, we also upgraded many of the components. The resulting system is faster and more reliable. System maintenance is easier, as individual virtual machines can now be updated, backed up and changed independently. We are also experiencing more effective resource allocation and utilization. Database URL: www.xenbase.org PMID:25380782
Exploring Cloud Computing for Distance Learning
ERIC Educational Resources Information Center
He, Wu; Cernusca, Dan; Abdous, M'hammed
2011-01-01
The use of distance courses in learning is growing exponentially. To better support faculty and students for teaching and learning, distance learning programs need to constantly innovate and optimize their IT infrastructures. The new IT paradigm called "cloud computing" has the potential to transform the way that IT resources are utilized and…
Looking at Clouds from All Sides Now
ERIC Educational Resources Information Center
Katz, Richard N.
2010-01-01
On February 9 and 10, 2010, fifty leaders from colleges, universities, corporations, professional associations, and state networks met in Tempe, Arizona, to discuss cloud computing and the impending shift in the mix of where infrastructure, applications, and services are sourced. This group identified a set of actions that colleges and…
A Novel Artificial Bee Colony Approach of Live Virtual Machine Migration Policy Using Bayes Theorem
Xu, Gaochao; Hu, Liang; Fu, Xiaodong
2013-01-01
Green cloud data center has become a research hotspot of virtualized cloud computing architecture. Since live virtual machine (VM) migration technology is widely used and studied in cloud computing, we have focused on the VM placement selection of live migration for power saving. We present a novel heuristic approach which is called PS-ABC. Its algorithm includes two parts. One is that it combines the artificial bee colony (ABC) idea with the uniform random initialization idea, the binary search idea, and Boltzmann selection policy to achieve an improved ABC-based approach with better global exploration's ability and local exploitation's ability. The other one is that it uses the Bayes theorem to further optimize the improved ABC-based process to faster get the final optimal solution. As a result, the whole approach achieves a longer-term efficient optimization for power saving. The experimental results demonstrate that PS-ABC evidently reduces the total incremental power consumption and better protects the performance of VM running and migrating compared with the existing research. It makes the result of live VM migration more high-effective and meaningful. PMID:24385877
Providing Assistive Technology Applications as a Service Through Cloud Computing.
Mulfari, Davide; Celesti, Antonio; Villari, Massimo; Puliafito, Antonio
2015-01-01
Users with disabilities interact with Personal Computers (PCs) using Assistive Technology (AT) software solutions. Such applications run on a PC that a person with a disability commonly uses. However the configuration of AT applications is not trivial at all, especially whenever the user needs to work on a PC that does not allow him/her to rely on his / her AT tools (e.g., at work, at university, in an Internet point). In this paper, we discuss how cloud computing provides a valid technological solution to enhance such a scenario.With the emergence of cloud computing, many applications are executed on top of virtual machines (VMs). Virtualization allows us to achieve a software implementation of a real computer able to execute a standard operating system and any kind of application. In this paper we propose to build personalized VMs running AT programs and settings. By using the remote desktop technology, our solution enables users to control their customized virtual desktop environment by means of an HTML5-based web interface running on any computer equipped with a browser, whenever they are.
A novel artificial bee colony approach of live virtual machine migration policy using Bayes theorem.
Xu, Gaochao; Ding, Yan; Zhao, Jia; Hu, Liang; Fu, Xiaodong
2013-01-01
Green cloud data center has become a research hotspot of virtualized cloud computing architecture. Since live virtual machine (VM) migration technology is widely used and studied in cloud computing, we have focused on the VM placement selection of live migration for power saving. We present a novel heuristic approach which is called PS-ABC. Its algorithm includes two parts. One is that it combines the artificial bee colony (ABC) idea with the uniform random initialization idea, the binary search idea, and Boltzmann selection policy to achieve an improved ABC-based approach with better global exploration's ability and local exploitation's ability. The other one is that it uses the Bayes theorem to further optimize the improved ABC-based process to faster get the final optimal solution. As a result, the whole approach achieves a longer-term efficient optimization for power saving. The experimental results demonstrate that PS-ABC evidently reduces the total incremental power consumption and better protects the performance of VM running and migrating compared with the existing research. It makes the result of live VM migration more high-effective and meaningful.
NASA Astrophysics Data System (ADS)
Garcia Fernandez, J.; Tammi, K.; Joutsiniemi, A.
2017-02-01
Recent advances in Terrestrial Laser Scanner (TLS), in terms of cost and flexibility, have consolidated this technology as an essential tool for the documentation and digitalization of Cultural Heritage. However, once the TLS data is used, it basically remains stored and left to waste.How can highly accurate and dense point clouds (of the built heritage) be processed for its reuse, especially to engage a broader audience? This paper aims to answer this question by a channel that minimizes the need for expert knowledge, while enhancing the interactivity with the as-built digital data: Virtual Heritage Dissemination through the production of VR content. Driven by the ProDigiOUs project's guidelines on data dissemination (EU funded), this paper advances in a production path to transform the point cloud into virtual stereoscopic spherical images, taking into account the different visual features that produce depth perception, and especially those prompting visual fatigue while experiencing the VR content. Finally, we present the results of the Hiedanranta's scans transformed into stereoscopic spherical animations.
The computing and data infrastructure to interconnect EEE stations
NASA Astrophysics Data System (ADS)
Noferini, F.; EEE Collaboration
2016-07-01
The Extreme Energy Event (EEE) experiment is devoted to the search of high energy cosmic rays through a network of telescopes installed in about 50 high schools distributed throughout the Italian territory. This project requires a peculiar data management infrastructure to collect data registered in stations very far from each other and to allow a coordinated analysis. Such an infrastructure is realized at INFN-CNAF, which operates a Cloud facility based on the OpenStack opensource Cloud framework and provides Infrastructure as a Service (IaaS) for its users. In 2014 EEE started to use it for collecting, monitoring and reconstructing the data acquired in all the EEE stations. For the synchronization between the stations and the INFN-CNAF infrastructure we used BitTorrent Sync, a free peer-to-peer software designed to optimize data syncronization between distributed nodes. All data folders are syncronized with the central repository in real time to allow an immediate reconstruction of the data and their publication in a monitoring webpage. We present the architecture and the functionalities of this data management system that provides a flexible environment for the specific needs of the EEE project.
Workflow4Metabolomics: a collaborative research infrastructure for computational metabolomics
Giacomoni, Franck; Le Corguillé, Gildas; Monsoor, Misharl; Landi, Marion; Pericard, Pierre; Pétéra, Mélanie; Duperier, Christophe; Tremblay-Franco, Marie; Martin, Jean-François; Jacob, Daniel; Goulitquer, Sophie; Thévenot, Etienne A.; Caron, Christophe
2015-01-01
Summary: The complex, rapidly evolving field of computational metabolomics calls for collaborative infrastructures where the large volume of new algorithms for data pre-processing, statistical analysis and annotation can be readily integrated whatever the language, evaluated on reference datasets and chained to build ad hoc workflows for users. We have developed Workflow4Metabolomics (W4M), the first fully open-source and collaborative online platform for computational metabolomics. W4M is a virtual research environment built upon the Galaxy web-based platform technology. It enables ergonomic integration, exchange and running of individual modules and workflows. Alternatively, the whole W4M framework and computational tools can be downloaded as a virtual machine for local installation. Availability and implementation: http://workflow4metabolomics.org homepage enables users to open a private account and access the infrastructure. W4M is developed and maintained by the French Bioinformatics Institute (IFB) and the French Metabolomics and Fluxomics Infrastructure (MetaboHUB). Contact: contact@workflow4metabolomics.org PMID:25527831
Workflow4Metabolomics: a collaborative research infrastructure for computational metabolomics.
Giacomoni, Franck; Le Corguillé, Gildas; Monsoor, Misharl; Landi, Marion; Pericard, Pierre; Pétéra, Mélanie; Duperier, Christophe; Tremblay-Franco, Marie; Martin, Jean-François; Jacob, Daniel; Goulitquer, Sophie; Thévenot, Etienne A; Caron, Christophe
2015-05-01
The complex, rapidly evolving field of computational metabolomics calls for collaborative infrastructures where the large volume of new algorithms for data pre-processing, statistical analysis and annotation can be readily integrated whatever the language, evaluated on reference datasets and chained to build ad hoc workflows for users. We have developed Workflow4Metabolomics (W4M), the first fully open-source and collaborative online platform for computational metabolomics. W4M is a virtual research environment built upon the Galaxy web-based platform technology. It enables ergonomic integration, exchange and running of individual modules and workflows. Alternatively, the whole W4M framework and computational tools can be downloaded as a virtual machine for local installation. http://workflow4metabolomics.org homepage enables users to open a private account and access the infrastructure. W4M is developed and maintained by the French Bioinformatics Institute (IFB) and the French Metabolomics and Fluxomics Infrastructure (MetaboHUB). contact@workflow4metabolomics.org. © The Author 2014. Published by Oxford University Press.
Silva, Luís A Bastião; Costa, Carlos; Oliveira, José Luis
2013-05-01
Healthcare institutions worldwide have adopted picture archiving and communication system (PACS) for enterprise access to images, relying on Digital Imaging Communication in Medicine (DICOM) standards for data exchange. However, communication over a wider domain of independent medical institutions is not well standardized. A DICOM-compliant bridge was developed for extending and sharing DICOM services across healthcare institutions without requiring complex network setups or dedicated communication channels. A set of DICOM routers interconnected through a public cloud infrastructure was implemented to support medical image exchange among institutions. Despite the advantages of cloud computing, new challenges were encountered regarding data privacy, particularly when medical data are transmitted over different domains. To address this issue, a solution was introduced by creating a ciphered data channel between the entities sharing DICOM services. Two main DICOM services were implemented in the bridge: Storage and Query/Retrieve. The performance measures demonstrated it is quite simple to exchange information and processes between several institutions. The solution can be integrated with any currently installed PACS-DICOM infrastructure. This method works transparently with well-known cloud service providers. Cloud computing was introduced to augment enterprise PACS by providing standard medical imaging services across different institutions, offering communication privacy and enabling creation of wider PACS scenarios with suitable technical solutions.
Feasibility of Virtual Machine and Cloud Computing Technologies for High Performance Computing
2014-05-01
Hat Enterprise Linux SaaS software as a service VM virtual machine vNUMA virtual non-uniform memory access WRF weather research and forecasting...previously mentioned in Chapter I Section B1 of this paper, which is used to run the weather research and forecasting ( WRF ) model in their experiments...against a VMware virtualization solution of WRF . The experiment consisted of running WRF in a standard configuration between the D-VTM and VMware while
ATLAS computing on Swiss Cloud SWITCHengines
NASA Astrophysics Data System (ADS)
Haug, S.; Sciacca, F. G.; ATLAS Collaboration
2017-10-01
Consolidation towards more computing at flat budgets beyond what pure chip technology can offer, is a requirement for the full scientific exploitation of the future data from the Large Hadron Collider at CERN in Geneva. One consolidation measure is to exploit cloud infrastructures whenever they are financially competitive. We report on the technical solutions and the performances used and achieved running simulation tasks for the ATLAS experiment on SWITCHengines. SWITCHengines is a new infrastructure as a service offered to Swiss academia by the National Research and Education Network SWITCH. While solutions and performances are general, financial considerations and policies, on which we also report, are country specific.
Neves Tafula, Sérgio M; Moreira da Silva, Nádia; Rozanski, Verena E; Silva Cunha, João Paulo
2014-01-01
Neuroscience is an increasingly multidisciplinary and highly cooperative field where neuroimaging plays an important role. Neuroimaging rapid evolution is demanding for a growing number of computing resources and skills that need to be put in place at every lab. Typically each group tries to setup their own servers and workstations to support their neuroimaging needs, having to learn from Operating System management to specific neuroscience software tools details before any results can be obtained from each setup. This setup and learning process is replicated in every lab, even if a strong collaboration among several groups is going on. In this paper we present a new cloud service model - Brain Imaging Application as a Service (BiAaaS) - and one of its implementation - Advanced Brain Imaging Lab (ABrIL) - in the form of an ubiquitous virtual desktop remote infrastructure that offers a set of neuroimaging computational services in an interactive neuroscientist-friendly graphical user interface (GUI). This remote desktop has been used for several multi-institution cooperative projects with different neuroscience objectives that already achieved important results, such as the contribution to a high impact paper published in the January issue of the Neuroimage journal. The ABrIL system has shown its applicability in several neuroscience projects with a relatively low-cost, promoting truly collaborative actions and speeding up project results and their clinical applicability.
NASA Astrophysics Data System (ADS)
Garov, A. S.; Karachevtseva, I. P.; Matveev, E. V.; Zubarev, A. E.; Florinsky, I. V.
2016-06-01
We are developing a unified distributed communication environment for processing of spatial data which integrates web-, desktop- and mobile platforms and combines volunteer computing model and public cloud possibilities. The main idea is to create a flexible working environment for research groups, which may be scaled according to required data volume and computing power, while keeping infrastructure costs at minimum. It is based upon the "single window" principle, which combines data access via geoportal functionality, processing possibilities and communication between researchers. Using an innovative software environment the recently developed planetary information system (http://cartsrv.mexlab.ru/geoportal) will be updated. The new system will provide spatial data processing, analysis and 3D-visualization and will be tested based on freely available Earth remote sensing data as well as Solar system planetary images from various missions. Based on this approach it will be possible to organize the research and representation of results on a new technology level, which provides more possibilities for immediate and direct reuse of research materials, including data, algorithms, methodology, and components. The new software environment is targeted at remote scientific teams, and will provide access to existing spatial distributed information for which we suggest implementation of a user interface as an advanced front-end, e.g., for virtual globe system.
NASA Technical Reports Server (NTRS)
Kempler, Steven; Stephens, Graeme; Winkler, Dave; Leptoukh, Greg; Reinke, Don; Smith, Peter
2006-01-01
The succession of US and international Earth observing satellites that follow each other, seconds to minutes apart, across the local afternoon equator crossing is called the ATrain. The A-Train consists of the following satellites, in order of equator crossing: OCO, EOS Aqua, CloudSat, CALIPSO, PARASOL, and EOS Aura. Flying in such formation increases the number of observations, validates observations, and enables coordination between science observations, resulting in a more complete virtual science platform (Kelly, 2000). The goal of this project is to create the first ever A-Train virtual data portal/center, the A-Train Data Depot (ATDD), to process, archive, access, visualize, analyze and correlate distributed atmosphere measurements from various A-Train instruments along A-Train tracks. The ATDD will enable the free movement of remotely located A-Train data so that they are combined to create a consolidated vertical view of the Earth's Atmosphere along the A-Train tracks. Once the infrastructure of the ATDD is in place, it will be easily evolved to serve data from all A-Train data measurements: one stop shopping. The innovative approach of analyzing and visualizing atmospheric profiles along the platforms track (i.e., time) will be accommodated by reusing the GSFC Atmospheric Composition Data and Information Services Center (ACDISC) visualization and analysis tool, GIOVANNI, existing data reduction tools, on-line archiving for fast data access, access to remote data without unnecessary data transfers, and data retrieval by users finding data desirable for further study. Initial measurements utilized include CALIPSO lidar backscatter, CloudSat radar reflectivity, clear air relative humidity, water vapor and temperature from AIRS, and cloud properties and aerosols from both MODIS. This will be foilowed by associated measurements from TVILS, =MI, HIRDLS, sad TES. Given the independent nature of instrumentlplatform development, the ATDD project has been met with many interesting challenges that, once resolved, will provide a much greater understanding of the relative flight dynamics and data co-registration of the suite of A-Train instruments, thus greatly increasing the accuracy of A-Train data analysis. Some of these challenges will be illustrated and discussed. The project's early visualizations and analysis efforts illustrate the importance of managing data so that measurements from various missions can be combined to enhance the understanding of the atmosphere. A-Train data management coordination, as performed here, is extremely significant in facilitating the A-Train science of clouds, precipitation, aerosol and chemistry.
A-Train Data Depot: Integrating and Visualizing Atmospheric Measurements Along the A-Train Tracks
NASA Technical Reports Server (NTRS)
Kempler, Steven; Stephens, Graeme; Winker, Dave; Leptoukh, Greg; Reinke, Don; Smith, Peter
2006-01-01
The succession of US and international satellites that follow each other, seconds to minutes apart, across the local afternoon equator crossing is called the ATrain. The A-Train consists of the following satellites, in order of equator crossing: OCO, EOS Aqua, CloudSat, CALIPSO, PARASOL, and EOS Aura. Flying in such formation increases the number of observations, validates observations, and enables coordination between science observations, resulting in a more complete virtual science platform (Kelly, 2000) The goal of this project is to create the first ever A-Train virtual data portal/center, the A-Train Data Depot, to process, archive, access, visualize, analyze and correlate distributed atmosphere measurements from various A-Train instruments along A-Train tracks. The A-Train Data Depot (ATDD) will enable the free movement of remotely located A-Train data so that they are combined to create a consolidated vertical view of the Earth s Atmosphere along the A-Train tracks. Once the infrastructure of the ATDD is in place, it will be easily evolved to serve data from all A-Train data measurements: one stop shopping. The innovative approach of analyzing and visualizing atmospheric profiles along the platforms track (i.e., time) will be accommodated by reusing the GSFC Atmospheric Composition Data and Information Services Center (ACDISC) visualization and analysis tool, GIOVANNI, existing data reduction tools, on-line archwing for fast data access, and Cooperative Institute for Research in the Atmosphere (CRA) data co-registration tools. Initial measurements utilized include CALIPSO lidar backscatter, CloudSat radar reflectivity, clear air relative humidity, water vapor and temperature from AIRS, and cloud properties and aerosols from both MODIS. This will be followed by associated measurements from MLS, OMI, HIRDLS, and TES. Given the independent nature of instrument/platform development, the ATDD project has been met with many interesting challenges that, once resolved, will provide a much greater understanding of the relative flight dynamics and data co-registration of the suite of A-Train instruments, thus greatly increasing the accuracy of A-Train data analysis. Some of these challenges will be discussed. The project s resulting visualizations and analysis illustrate the importance of managing data so that measurements from various missions can be combined to enhance the understanding of the atmosphere. A-Train data management coordination, as performed here, is extremely significant in facilitating the A-Train science of clouds, precipitation, aerosol and chemistry.
NASA Astrophysics Data System (ADS)
Kempler, S.; Stephens, G.; Winker, D.; Leptoukh, G.; Reinke, D.; Smith, P.; Savtchenko, A.; Kummerer, R.; Mao, J.
2006-12-01
The succession of US and international Earth observing satellites that follow each other, seconds to minutes apart, across the local afternoon equator crossing is called the A-Train. The A-Train consists of the following satellites, in order of equator crossing: OCO, EOS Aqua, CloudSat, CALIPSO, PARASOL, and EOS Aura. Flying in such formation increases the number of observations, validates observations, and enables coordination between science observations, resulting in a more complete virtual science platform (Kelly, 2000) The goal of this project is to create the first ever A-Train virtual data portal/center, the A-Train Data Depot (ATDD), to process, archive, access, visualize, analyze and correlate distributed atmosphere measurements from various A-Train instruments along A-Train tracks. The ATDD will enable the free movement of remotely located A-Train data so that they are combined to create a consolidated vertical view of the Earth's Atmosphere along the A-Train tracks. Once the infrastructure of the ATDD is in place, it will be easily evolved to serve data from all A-Train data measurements: one stop shopping. The innovative approach of analyzing and visualizing atmospheric profiles along the platforms track (i.e., time) will be accommodated by reusing the GSFC Atmospheric Composition Data and Information Services Center (ACDISC) visualization and analysis tool, GIOVANNI, existing data reduction tools, on-line archiving for fast data access, access to remote data without unnecessary data transfers, and data retrieval by users finding data desirable for further study. Initial measurements utilized include CALIPSO lidar backscatter, CloudSat radar reflectivity, clear air relative humidity, water vapor and temperature from AIRS, and cloud properties and aerosols from both MODIS. This will be followed by associated measurements from MLS, OMI, HIRDLS, and TES. Given the independent nature of instrument/platform development, the ATDD project has been met with many interesting challenges that, once resolved, will provide a much greater understanding of the relative flight dynamics and data co-registration of the suite of A-Train instruments, thus greatly increasing the accuracy of A-Train data analysis. Some of these challenges will be illustrated and discussed. The project's early visualizations and analysis efforts illustrate the importance of managing data so that measurements from various missions can be combined to enhance the understanding of the atmosphere. A-Train data management coordination, as performed here, is extremely significant in facilitating the A-Train science of clouds, precipitation, aerosol and chemistry.
Enabling Research without Geographical Boundaries via Collaborative Research Infrastructures
NASA Astrophysics Data System (ADS)
Gesing, S.
2016-12-01
Collaborative research infrastructures on global scale for earth and space sciences face a plethora of challenges from technical implementations to organizational aspects. Science gateways - also known as virtual research environments (VREs) or virtual laboratories - address part of such challenges by providing end-to-end solutions to aid researchers to focus on their specific research questions without the need to become acquainted with the technical details of the complex underlying infrastructures. In general, they provide a single point of entry to tools and data irrespective of organizational boundaries and thus make scientific discoveries easier and faster. The importance of science gateways has been recognized on national as well as on international level by funding bodies and by organizations. For example, the US NSF has just funded a Science Gateways Community Institute, which offers support, consultancy and open accessible software repositories for users and developers; Horizon 2020 provides funding for virtual research environments in Europe, which has led to projects such as VRE4EIC (A Europe-wide Interoperable Virtual Research Environment to Empower Multidisciplinary Research Communities and Accelerate Innovation and Collaboration); national or continental research infrastructures such as XSEDE in the USA, Nectar in Australia or EGI in Europe support the development and uptake of science gateways; the global initiatives International Coalition on Science Gateways, the RDA Virtual Research Environment Interest Group as well as the IEEE Technical Area on Science Gateways have been founded to provide global leadership on future directions for science gateways in general and facilitate awareness for science gateways. This presentation will give an overview on these projects and initiatives aiming at supporting domain researchers and developers with measures for the efficient creation of science gateways, for increasing their usability and sustainability under consideration of the breadth of topics in the context of science gateways. It will go into detail for the challenges the community faces for collaborative research on global scale without geographical boundaries and will provide suggestions for further enhancing the outreach to domain researchers.
Business Case Analysis of the Marine Corps Base Pendleton Virtual Smart Grid
2017-06-01
Metering Infrastructure on DOD installations. An examination of five case studies highlights the costs and benefits of the Virtual Smart Grid (VSG...studies highlights the costs and benefits of the Virtual Smart Grid (VSG) developed by Space and Naval Warfare Systems Command for use at Marine Corps...41 A. SMART GRID BENEFITS .....................................................................41 B. SUMMARY OF VSG ESTIMATED COSTS AND BENEFITS
Desktop Virtualization: Applications and Considerations
ERIC Educational Resources Information Center
Hodgman, Matthew R.
2013-01-01
As educational technology continues to rapidly become a vital part of a school district's infrastructure, desktop virtualization promises to provide cost-effective and education-enhancing solutions to school-based computer technology problems in school systems locally and abroad. This article outlines the history of and basic concepts behind…
ERIC Educational Resources Information Center
Sclater, Niall
2010-01-01
Elearning has grown rapidly in importance for institutions and has been largely facilitated through the "walled garden" of the virtual learning environment. Meanwhile many students are creating their own personal learning environments by combining the various Web 2.0 services they find most useful. Cloud computing offers new…
Realistic natural atmospheric phenomena and weather effects for interactive virtual environments
NASA Astrophysics Data System (ADS)
McLoughlin, Leigh
Clouds and the weather are important aspects of any natural outdoor scene, but existing dynamic techniques within computer graphics only offer the simplest of cloud representations. The problem that this work looks to address is how to provide a means of simulating clouds and weather features such as precipitation, that are suitable for virtual environments. Techniques for cloud simulation are available within the area of meteorology, but numerical weather prediction systems are computationally expensive, give more numerical accuracy than we require for graphics and are restricted to the laws of physics. Within computer graphics, we often need to direct and adjust physical features or to bend reality to meet artistic goals, which is a key difference between the subjects of computer graphics and physical science. Pure physically-based simulations, however, evolve their solutions according to pre-set rules and are notoriously difficult to control. The challenge then is for the solution to be computationally lightweight and able to be directed in some measure while at the same time producing believable results. This work presents a lightweight physically-based cloud simulation scheme that simulates the dynamic properties of cloud formation and weather effects. The system simulates water vapour, cloud water, cloud ice, rain, snow and hail. The water model incorporates control parameters and the cloud model uses an arbitrary vertical temperature profile, with a tool described to allow the user to define this. The result of this work is that clouds can now be simulated in near real-time complete with precipitation. The temperature profile and tool then provide a means of directing the resulting formation..
A Novel Market-Oriented Dynamic Collaborative Cloud Service Platform
NASA Astrophysics Data System (ADS)
Hassan, Mohammad Mehedi; Huh, Eui-Nam
In today's world the emerging Cloud computing (Weiss, 2007) offer a new computing model where resources such as computing power, storage, online applications and networking infrastructures can be shared as "services" over the internet. Cloud providers (CPs) are incentivized by the profits to be made by charging consumers for accessing these services. Consumers, such as enterprises, are attracted by the opportunity for reducing or eliminating costs associated with "in-house" provision of these services.
The CMS Tier0 goes cloud and grid for LHC Run 2
Hufnagel, Dirk
2015-12-23
In 2015, CMS will embark on a new era of collecting LHC collisions at unprecedented rates and complexity. This will put a tremendous stress on our computing systems. Prompt Processing of the raw data by the Tier-0 infrastructure will no longer be constrained to CERN alone due to the significantly increased resource requirements. In LHC Run 2, we will need to operate it as a distributed system utilizing both the CERN Cloud-based Agile Infrastructure and a significant fraction of the CMS Tier-1 Grid resources. In another big change for LHC Run 2, we will process all data using the multi-threadedmore » framework to deal with the increased event complexity and to ensure efficient use of the resources. Furthermore, this contribution will cover the evolution of the Tier-0 infrastructure and present scale testing results and experiences from the first data taking in 2015.« less
Cafe: A Generic Configurable Customizable Composite Cloud Application Framework
NASA Astrophysics Data System (ADS)
Mietzner, Ralph; Unger, Tobias; Leymann, Frank
In this paper we present Cafe (Composite Application Framework) an approach to describe configurable composite service-oriented applications and to automatically provision them across different providers. Cafe enables independent software vendors to describe their composite service-oriented applications and the components that are used to assemble them. Components can be internal to the application or external and can be deployed in any of the delivery models present in the cloud. The components are annotated with requirements for the infrastructure they later need to be run on. Providers on the other hand advertise their infrastructure services by describing them as infrastructure capabilities. The separation of software vendors and providers enables end users and providers to follow a best-of-breed strategy by combining arbitrary applications with arbitrary providers. We show how such applications can be automatically provisioned and present an architecture and a prototype that implements the concepts.
The CMS TierO goes Cloud and Grid for LHC Run 2
NASA Astrophysics Data System (ADS)
Hufnagel, Dirk
2015-12-01
In 2015, CMS will embark on a new era of collecting LHC collisions at unprecedented rates and complexity. This will put a tremendous stress on our computing systems. Prompt Processing of the raw data by the Tier-0 infrastructure will no longer be constrained to CERN alone due to the significantly increased resource requirements. In LHC Run 2, we will need to operate it as a distributed system utilizing both the CERN Cloud-based Agile Infrastructure and a significant fraction of the CMS Tier-1 Grid resources. In another big change for LHC Run 2, we will process all data using the multi-threaded framework to deal with the increased event complexity and to ensure efficient use of the resources. This contribution will cover the evolution of the Tier-0 infrastructure and present scale testing results and experiences from the first data taking in 2015.
A cyber infrastructure for the SKA Telescope Manager
NASA Astrophysics Data System (ADS)
Barbosa, Domingos; Barraca, João. P.; Carvalho, Bruno; Maia, Dalmiro; Gupta, Yashwant; Natarajan, Swaminathan; Le Roux, Gerhard; Swart, Paul
2016-07-01
The Square Kilometre Array Telescope Manager (SKA TM) will be responsible for assisting the SKA Operations and Observation Management, carrying out System diagnosis and collecting Monitoring and Control data from the SKA subsystems and components. To provide adequate compute resources, scalability, operation continuity and high availability, as well as strict Quality of Service, the TM cyber-infrastructure (embodied in the Local Infrastructure - LINFRA) consists of COTS hardware and infrastructural software (for example: server monitoring software, host operating system, virtualization software, device firmware), providing a specially tailored Infrastructure as a Service (IaaS) and Platform as a Service (PaaS) solution. The TM infrastructure provides services in the form of computational power, software defined networking, power, storage abstractions, and high level, state of the art IaaS and PaaS management interfaces. This cyber platform will be tailored to each of the two SKA Phase 1 telescopes (SKA_MID in South Africa and SKA_LOW in Australia) instances, each presenting different computational and storage infrastructures and conditioned by location. This cyber platform will provide a compute model enabling TM to manage the deployment and execution of its multiple components (observation scheduler, proposal submission tools, MandC components, Forensic tools and several Databases, etc). In this sense, the TM LINFRA is primarily focused towards the provision of isolated instances, mostly resorting to virtualization technologies, while defaulting to bare hardware if specifically required due to performance, security, availability, or other requirement.