Sample records for cloud computing platform

  1. Energy Consumption Management of Virtual Cloud Computing Platform

    NASA Astrophysics Data System (ADS)

    Li, Lin

    2017-11-01

    For energy consumption management research on virtual cloud computing platforms, energy consumption management of virtual computers and cloud computing platform should be understood deeper. Only in this way can problems faced by energy consumption management be solved. In solving problems, the key to solutions points to data centers with high energy consumption, so people are in great need to use a new scientific technique. Virtualization technology and cloud computing have become powerful tools in people’s real life, work and production because they have strong strength and many advantages. Virtualization technology and cloud computing now is in a rapid developing trend. It has very high resource utilization rate. In this way, the presence of virtualization and cloud computing technologies is very necessary in the constantly developing information age. This paper has summarized, explained and further analyzed energy consumption management questions of the virtual cloud computing platform. It eventually gives people a clearer understanding of energy consumption management of virtual cloud computing platform and brings more help to various aspects of people’s live, work and son on.

  2. Uncover the Cloud for Geospatial Sciences and Applications to Adopt Cloud Computing

    NASA Astrophysics Data System (ADS)

    Yang, C.; Huang, Q.; Xia, J.; Liu, K.; Li, J.; Xu, C.; Sun, M.; Bambacus, M.; Xu, Y.; Fay, D.

    2012-12-01

    Cloud computing is emerging as the future infrastructure for providing computing resources to support and enable scientific research, engineering development, and application construction, as well as work force education. On the other hand, there is a lot of doubt about the readiness of cloud computing to support a variety of scientific research, development and educations. This research is a project funded by NASA SMD to investigate through holistic studies how ready is the cloud computing to support geosciences. Four applications with different computing characteristics including data, computing, concurrent, and spatiotemporal intensities are taken to test the readiness of cloud computing to support geosciences. Three popular and representative cloud platforms including Amazon EC2, Microsoft Azure, and NASA Nebula as well as a traditional cluster are utilized in the study. Results illustrates that cloud is ready to some degree but more research needs to be done to fully implemented the cloud benefit as advertised by many vendors and defined by NIST. Specifically, 1) most cloud platform could help stand up new computing instances, a new computer, in a few minutes as envisioned, therefore, is ready to support most computing needs in an on demand fashion; 2) the load balance and elasticity, a defining characteristic, is ready in some cloud platforms, such as Amazon EC2, to support bigger jobs, e.g., needs response in minutes, while some are not ready to support the elasticity and load balance well. All cloud platform needs further research and development to support real time application at subminute level; 3) the user interface and functionality of cloud platforms vary a lot and some of them are very professional and well supported/documented, such as Amazon EC2, some of them needs significant improvement for the general public to adopt cloud computing without professional training or knowledge about computing infrastructure; 4) the security is a big concern in cloud computing platform, with the sharing spirit of cloud computing, it is very hard to ensure higher level security, except a private cloud is built for a specific organization without public access, public cloud platform does not support FISMA medium level yet and may never be able to support FISMA high level; 5) HPC jobs needs of cloud computing is not well supported and only Amazon EC2 supports this well. The research is being taken by NASA and other agencies to consider cloud computing adoption. We hope the publication of the research would also benefit the public to adopt cloud computing.

  3. Cloud Based Applications and Platforms (Presentation)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brodt-Giles, D.

    2014-05-15

    Presentation to the Cloud Computing East 2014 Conference, where we are highlighting our cloud computing strategy, describing the platforms on the cloud (including Smartgrid.gov), and defining our process for implementing cloud based applications.

  4. Research on private cloud computing based on analysis on typical opensource platform: a case study with Eucalyptus and Wavemaker

    NASA Astrophysics Data System (ADS)

    Yu, Xiaoyuan; Yuan, Jian; Chen, Shi

    2013-03-01

    Cloud computing is one of the most popular topics in the IT industry and is recently being adopted by many companies. It has four development models, as: public cloud, community cloud, hybrid cloud and private cloud. Except others, private cloud can be implemented in a private network, and delivers some benefits of cloud computing without pitfalls. This paper makes a comparison of typical open source platforms through which we can implement a private cloud. After this comparison, we choose Eucalyptus and Wavemaker to do a case study on the private cloud. We also do some performance estimation of cloud platform services and development of prototype software as cloud services.

  5. A high performance scientific cloud computing environment for materials simulations

    NASA Astrophysics Data System (ADS)

    Jorissen, K.; Vila, F. D.; Rehr, J. J.

    2012-09-01

    We describe the development of a scientific cloud computing (SCC) platform that offers high performance computation capability. The platform consists of a scientific virtual machine prototype containing a UNIX operating system and several materials science codes, together with essential interface tools (an SCC toolset) that offers functionality comparable to local compute clusters. In particular, our SCC toolset provides automatic creation of virtual clusters for parallel computing, including tools for execution and monitoring performance, as well as efficient I/O utilities that enable seamless connections to and from the cloud. Our SCC platform is optimized for the Amazon Elastic Compute Cloud (EC2). We present benchmarks for prototypical scientific applications and demonstrate performance comparable to local compute clusters. To facilitate code execution and provide user-friendly access, we have also integrated cloud computing capability in a JAVA-based GUI. Our SCC platform may be an alternative to traditional HPC resources for materials science or quantum chemistry applications.

  6. Study on the application of mobile internet cloud computing platform

    NASA Astrophysics Data System (ADS)

    Gong, Songchun; Fu, Songyin; Chen, Zheng

    2012-04-01

    The innovative development of computer technology promotes the application of the cloud computing platform, which actually is the substitution and exchange of a sort of resource service models and meets the needs of users on the utilization of different resources after changes and adjustments of multiple aspects. "Cloud computing" owns advantages in many aspects which not merely reduce the difficulties to apply the operating system and also make it easy for users to search, acquire and process the resources. In accordance with this point, the author takes the management of digital libraries as the research focus in this paper, and analyzes the key technologies of the mobile internet cloud computing platform in the operation process. The popularization and promotion of computer technology drive people to create the digital library models, and its core idea is to strengthen the optimal management of the library resource information through computers and construct an inquiry and search platform with high performance, allowing the users to access to the necessary information resources at any time. However, the cloud computing is able to promote the computations within the computers to distribute in a large number of distributed computers, and hence implement the connection service of multiple computers. The digital libraries, as a typical representative of the applications of the cloud computing, can be used to carry out an analysis on the key technologies of the cloud computing.

  7. Cloud Computing Fundamentals

    NASA Astrophysics Data System (ADS)

    Furht, Borko

    In the introductory chapter we define the concept of cloud computing and cloud services, and we introduce layers and types of cloud computing. We discuss the differences between cloud computing and cloud services. New technologies that enabled cloud computing are presented next. We also discuss cloud computing features, standards, and security issues. We introduce the key cloud computing platforms, their vendors, and their offerings. We discuss cloud computing challenges and the future of cloud computing.

  8. Transitioning ISR architecture into the cloud

    NASA Astrophysics Data System (ADS)

    Lash, Thomas D.

    2012-06-01

    Emerging cloud computing platforms offer an ideal opportunity for Intelligence, Surveillance, and Reconnaissance (ISR) intelligence analysis. Cloud computing platforms help overcome challenges and limitations of traditional ISR architectures. Modern ISR architectures can benefit from examining commercial cloud applications, especially as they relate to user experience, usage profiling, and transformational business models. This paper outlines legacy ISR architectures and their limitations, presents an overview of cloud technologies and their applications to the ISR intelligence mission, and presents an idealized ISR architecture implemented with cloud computing.

  9. GATECloud.net: a platform for large-scale, open-source text processing on the cloud.

    PubMed

    Tablan, Valentin; Roberts, Ian; Cunningham, Hamish; Bontcheva, Kalina

    2013-01-28

    Cloud computing is increasingly being regarded as a key enabler of the 'democratization of science', because on-demand, highly scalable cloud computing facilities enable researchers anywhere to carry out data-intensive experiments. In the context of natural language processing (NLP), algorithms tend to be complex, which makes their parallelization and deployment on cloud platforms a non-trivial task. This study presents a new, unique, cloud-based platform for large-scale NLP research--GATECloud. net. It enables researchers to carry out data-intensive NLP experiments by harnessing the vast, on-demand compute power of the Amazon cloud. Important infrastructural issues are dealt with by the platform, completely transparently for the researcher: load balancing, efficient data upload and storage, deployment on the virtual machines, security and fault tolerance. We also include a cost-benefit analysis and usage evaluation.

  10. Homomorphic encryption experiments on IBM's cloud quantum computing platform

    NASA Astrophysics Data System (ADS)

    Huang, He-Liang; Zhao, You-Wei; Li, Tan; Li, Feng-Guang; Du, Yu-Tao; Fu, Xiang-Qun; Zhang, Shuo; Wang, Xiang; Bao, Wan-Su

    2017-02-01

    Quantum computing has undergone rapid development in recent years. Owing to limitations on scalability, personal quantum computers still seem slightly unrealistic in the near future. The first practical quantum computer for ordinary users is likely to be on the cloud. However, the adoption of cloud computing is possible only if security is ensured. Homomorphic encryption is a cryptographic protocol that allows computation to be performed on encrypted data without decrypting them, so it is well suited to cloud computing. Here, we first applied homomorphic encryption on IBM's cloud quantum computer platform. In our experiments, we successfully implemented a quantum algorithm for linear equations while protecting our privacy. This demonstration opens a feasible path to the next stage of development of cloud quantum information technology.

  11. A cloud computing based platform for sleep behavior and chronic diseases collaborative research.

    PubMed

    Kuo, Mu-Hsing; Borycki, Elizabeth; Kushniruk, Andre; Huang, Yueh-Min; Hung, Shu-Hui

    2014-01-01

    The objective of this study is to propose a Cloud Computing based platform for sleep behavior and chronic disease collaborative research. The platform consists of two main components: (1) a sensing bed sheet with textile sensors to automatically record patient's sleep behaviors and vital signs, and (2) a service-oriented cloud computing architecture (SOCCA) that provides a data repository and allows for sharing and analysis of collected data. Also, we describe our systematic approach to implementing the SOCCA. We believe that the new cloud-based platform can provide nurse and other health professional researchers located in differing geographic locations with a cost effective, flexible, secure and privacy-preserved research environment.

  12. Global Software Development with Cloud Platforms

    NASA Astrophysics Data System (ADS)

    Yara, Pavan; Ramachandran, Ramaseshan; Balasubramanian, Gayathri; Muthuswamy, Karthik; Chandrasekar, Divya

    Offshore and outsourced distributed software development models and processes are facing challenges, previously unknown, with respect to computing capacity, bandwidth, storage, security, complexity, reliability, and business uncertainty. Clouds promise to address these challenges by adopting recent advances in virtualization, parallel and distributed systems, utility computing, and software services. In this paper, we envision a cloud-based platform that addresses some of these core problems. We outline a generic cloud architecture, its design and our first implementation results for three cloud forms - a compute cloud, a storage cloud and a cloud-based software service- in the context of global distributed software development (GSD). Our ”compute cloud” provides computational services such as continuous code integration and a compile server farm, ”storage cloud” offers storage (block or file-based) services with an on-line virtual storage service, whereas the on-line virtual labs represent a useful cloud service. We note some of the use cases for clouds in GSD, the lessons learned with our prototypes and identify challenges that must be conquered before realizing the full business benefits. We believe that in the future, software practitioners will focus more on these cloud computing platforms and see clouds as a means to supporting a ecosystem of clients, developers and other key stakeholders.

  13. [The Key Technology Study on Cloud Computing Platform for ECG Monitoring Based on Regional Internet of Things].

    PubMed

    Yang, Shu; Qiu, Yuyan; Shi, Bo

    2016-09-01

    This paper explores the methods of building the internet of things of a regional ECG monitoring, focused on the implementation of ECG monitoring center based on cloud computing platform. It analyzes implementation principles of automatic identifi cation in the types of arrhythmia. It also studies the system architecture and key techniques of cloud computing platform, including server load balancing technology, reliable storage of massive smalfi les and the implications of quick search function.

  14. Application verification research of cloud computing technology in the field of real time aerospace experiment

    NASA Astrophysics Data System (ADS)

    Wan, Junwei; Chen, Hongyan; Zhao, Jing

    2017-08-01

    According to the requirements of real-time, reliability and safety for aerospace experiment, the single center cloud computing technology application verification platform is constructed. At the IAAS level, the feasibility of the cloud computing technology be applied to the field of aerospace experiment is tested and verified. Based on the analysis of the test results, a preliminary conclusion is obtained: Cloud computing platform can be applied to the aerospace experiment computing intensive business. For I/O intensive business, it is recommended to use the traditional physical machine.

  15. Cloud Computing for Geosciences--GeoCloud for standardized geospatial service platforms (Invited)

    NASA Astrophysics Data System (ADS)

    Nebert, D. D.; Huang, Q.; Yang, C.

    2013-12-01

    The 21st century geoscience faces challenges of Big Data, spike computing requirements (e.g., when natural disaster happens), and sharing resources through cyberinfrastructure across different organizations (Yang et al., 2011). With flexibility and cost-efficiency of computing resources a primary concern, cloud computing emerges as a promising solution to provide core capabilities to address these challenges. Many governmental and federal agencies are adopting cloud technologies to cut costs and to make federal IT operations more efficient (Huang et al., 2010). However, it is still difficult for geoscientists to take advantage of the benefits of cloud computing to facilitate the scientific research and discoveries. This presentation reports using GeoCloud to illustrate the process and strategies used in building a common platform for geoscience communities to enable the sharing, integration of geospatial data, information and knowledge across different domains. GeoCloud is an annual incubator project coordinated by the Federal Geographic Data Committee (FGDC) in collaboration with the U.S. General Services Administration (GSA) and the Department of Health and Human Services. It is designed as a staging environment to test and document the deployment of a common GeoCloud community platform that can be implemented by multiple agencies. With these standardized virtual geospatial servers, a variety of government geospatial applications can be quickly migrated to the cloud. In order to achieve this objective, multiple projects are nominated each year by federal agencies as existing public-facing geospatial data services. From the initial candidate projects, a set of common operating system and software requirements was identified as the baseline for platform as a service (PaaS) packages. Based on these developed common platform packages, each project deploys and monitors its web application, develops best practices, and documents cost and performance information. This paper presents the background, architectural design, and activities of GeoCloud in support of the Geospatial Platform Initiative. System security strategies and approval processes for migrating federal geospatial data, information, and applications into cloud, and cost estimation for cloud operations are covered. Finally, some lessons learned from the GeoCloud project are discussed as reference for geoscientists to consider in the adoption of cloud computing.

  16. Cloud Computing with iPlant Atmosphere.

    PubMed

    McKay, Sheldon J; Skidmore, Edwin J; LaRose, Christopher J; Mercer, Andre W; Noutsos, Christos

    2013-10-15

    Cloud Computing refers to distributed computing platforms that use virtualization software to provide easy access to physical computing infrastructure and data storage, typically administered through a Web interface. Cloud-based computing provides access to powerful servers, with specific software and virtual hardware configurations, while eliminating the initial capital cost of expensive computers and reducing the ongoing operating costs of system administration, maintenance contracts, power consumption, and cooling. This eliminates a significant barrier to entry into bioinformatics and high-performance computing for many researchers. This is especially true of free or modestly priced cloud computing services. The iPlant Collaborative offers a free cloud computing service, Atmosphere, which allows users to easily create and use instances on virtual servers preconfigured for their analytical needs. Atmosphere is a self-service, on-demand platform for scientific computing. This unit demonstrates how to set up, access and use cloud computing in Atmosphere. Copyright © 2013 John Wiley & Sons, Inc.

  17. Hybrid Cloud Computing Environment for EarthCube and Geoscience Community

    NASA Astrophysics Data System (ADS)

    Yang, C. P.; Qin, H.

    2016-12-01

    The NSF EarthCube Integration and Test Environment (ECITE) has built a hybrid cloud computing environment to provides cloud resources from private cloud environments by using cloud system software - OpenStack and Eucalyptus, and also manages public cloud - Amazon Web Service that allow resource synchronizing and bursting between private and public cloud. On ECITE hybrid cloud platform, EarthCube and geoscience community can deploy and manage the applications by using base virtual machine images or customized virtual machines, analyze big datasets by using virtual clusters, and real-time monitor the virtual resource usage on the cloud. Currently, a number of EarthCube projects have deployed or started migrating their projects to this platform, such as CHORDS, BCube, CINERGI, OntoSoft, and some other EarthCube building blocks. To accomplish the deployment or migration, administrator of ECITE hybrid cloud platform prepares the specific needs (e.g. images, port numbers, usable cloud capacity, etc.) of each project in advance base on the communications between ECITE and participant projects, and then the scientists or IT technicians in those projects launch one or multiple virtual machines, access the virtual machine(s) to set up computing environment if need be, and migrate their codes, documents or data without caring about the heterogeneity in structure and operations among different cloud platforms.

  18. CloVR: a virtual machine for automated and portable sequence analysis from the desktop using cloud computing.

    PubMed

    Angiuoli, Samuel V; Matalka, Malcolm; Gussman, Aaron; Galens, Kevin; Vangala, Mahesh; Riley, David R; Arze, Cesar; White, James R; White, Owen; Fricke, W Florian

    2011-08-30

    Next-generation sequencing technologies have decentralized sequence acquisition, increasing the demand for new bioinformatics tools that are easy to use, portable across multiple platforms, and scalable for high-throughput applications. Cloud computing platforms provide on-demand access to computing infrastructure over the Internet and can be used in combination with custom built virtual machines to distribute pre-packaged with pre-configured software. We describe the Cloud Virtual Resource, CloVR, a new desktop application for push-button automated sequence analysis that can utilize cloud computing resources. CloVR is implemented as a single portable virtual machine (VM) that provides several automated analysis pipelines for microbial genomics, including 16S, whole genome and metagenome sequence analysis. The CloVR VM runs on a personal computer, utilizes local computer resources and requires minimal installation, addressing key challenges in deploying bioinformatics workflows. In addition CloVR supports use of remote cloud computing resources to improve performance for large-scale sequence processing. In a case study, we demonstrate the use of CloVR to automatically process next-generation sequencing data on multiple cloud computing platforms. The CloVR VM and associated architecture lowers the barrier of entry for utilizing complex analysis protocols on both local single- and multi-core computers and cloud systems for high throughput data processing.

  19. Cloud computing for comparative genomics with windows azure platform.

    PubMed

    Kim, Insik; Jung, Jae-Yoon; Deluca, Todd F; Nelson, Tristan H; Wall, Dennis P

    2012-01-01

    Cloud computing services have emerged as a cost-effective alternative for cluster systems as the number of genomes and required computation power to analyze them increased in recent years. Here we introduce the Microsoft Azure platform with detailed execution steps and a cost comparison with Amazon Web Services.

  20. Cloud Computing for Comparative Genomics with Windows Azure Platform

    PubMed Central

    Kim, Insik; Jung, Jae-Yoon; DeLuca, Todd F.; Nelson, Tristan H.; Wall, Dennis P.

    2012-01-01

    Cloud computing services have emerged as a cost-effective alternative for cluster systems as the number of genomes and required computation power to analyze them increased in recent years. Here we introduce the Microsoft Azure platform with detailed execution steps and a cost comparison with Amazon Web Services. PMID:23032609

  1. Cloud Computing for DoD

    DTIC Science & Technology

    2012-05-01

    cloud computing 17 NASA Nebula Platform •  Cloud computing pilot program at NASA Ames •  Integrates open-source components into seamless, self...Mission support •  Education and public outreach (NASA Nebula , 2010) 18 NSF Supported Cloud Research •  Support for Cloud Computing in...Mell, P. & Grance, T. (2011). The NIST Definition of Cloud Computing. NIST Special Publication 800-145 •  NASA Nebula (2010). Retrieved from

  2. Facilitating NASA Earth Science Data Processing Using Nebula Cloud Computing

    NASA Technical Reports Server (NTRS)

    Pham, Long; Chen, Aijun; Kempler, Steven; Lynnes, Christopher; Theobald, Michael; Asghar, Esfandiari; Campino, Jane; Vollmer, Bruce

    2011-01-01

    Cloud Computing has been implemented in several commercial arenas. The NASA Nebula Cloud Computing platform is an Infrastructure as a Service (IaaS) built in 2008 at NASA Ames Research Center and 2010 at GSFC. Nebula is an open source Cloud platform intended to: a) Make NASA realize significant cost savings through efficient resource utilization, reduced energy consumption, and reduced labor costs. b) Provide an easier way for NASA scientists and researchers to efficiently explore and share large and complex data sets. c) Allow customers to provision, manage, and decommission computing capabilities on an as-needed bases

  3. RAPPORT: running scientific high-performance computing applications on the cloud.

    PubMed

    Cohen, Jeremy; Filippis, Ioannis; Woodbridge, Mark; Bauer, Daniela; Hong, Neil Chue; Jackson, Mike; Butcher, Sarah; Colling, David; Darlington, John; Fuchs, Brian; Harvey, Matt

    2013-01-28

    Cloud computing infrastructure is now widely used in many domains, but one area where there has been more limited adoption is research computing, in particular for running scientific high-performance computing (HPC) software. The Robust Application Porting for HPC in the Cloud (RAPPORT) project took advantage of existing links between computing researchers and application scientists in the fields of bioinformatics, high-energy physics (HEP) and digital humanities, to investigate running a set of scientific HPC applications from these domains on cloud infrastructure. In this paper, we focus on the bioinformatics and HEP domains, describing the applications and target cloud platforms. We conclude that, while there are many factors that need consideration, there is no fundamental impediment to the use of cloud infrastructure for running many types of HPC applications and, in some cases, there is potential for researchers to benefit significantly from the flexibility offered by cloud platforms.

  4. CloVR: A virtual machine for automated and portable sequence analysis from the desktop using cloud computing

    PubMed Central

    2011-01-01

    Background Next-generation sequencing technologies have decentralized sequence acquisition, increasing the demand for new bioinformatics tools that are easy to use, portable across multiple platforms, and scalable for high-throughput applications. Cloud computing platforms provide on-demand access to computing infrastructure over the Internet and can be used in combination with custom built virtual machines to distribute pre-packaged with pre-configured software. Results We describe the Cloud Virtual Resource, CloVR, a new desktop application for push-button automated sequence analysis that can utilize cloud computing resources. CloVR is implemented as a single portable virtual machine (VM) that provides several automated analysis pipelines for microbial genomics, including 16S, whole genome and metagenome sequence analysis. The CloVR VM runs on a personal computer, utilizes local computer resources and requires minimal installation, addressing key challenges in deploying bioinformatics workflows. In addition CloVR supports use of remote cloud computing resources to improve performance for large-scale sequence processing. In a case study, we demonstrate the use of CloVR to automatically process next-generation sequencing data on multiple cloud computing platforms. Conclusion The CloVR VM and associated architecture lowers the barrier of entry for utilizing complex analysis protocols on both local single- and multi-core computers and cloud systems for high throughput data processing. PMID:21878105

  5. Applications integration in a hybrid cloud computing environment: modelling and platform

    NASA Astrophysics Data System (ADS)

    Li, Qing; Wang, Ze-yuan; Li, Wei-hua; Li, Jun; Wang, Cheng; Du, Rui-yang

    2013-08-01

    With the development of application services providers and cloud computing, more and more small- and medium-sized business enterprises use software services and even infrastructure services provided by professional information service companies to replace all or part of their information systems (ISs). These information service companies provide applications, such as data storage, computing processes, document sharing and even management information system services as public resources to support the business process management of their customers. However, no cloud computing service vendor can satisfy the full functional IS requirements of an enterprise. As a result, enterprises often have to simultaneously use systems distributed in different clouds and their intra enterprise ISs. Thus, this article presents a framework to integrate applications deployed in public clouds and intra ISs. A run-time platform is developed and a cross-computing environment process modelling technique is also developed to improve the feasibility of ISs under hybrid cloud computing environments.

  6. [Porting Radiotherapy Software of Varian to Cloud Platform].

    PubMed

    Zou, Lian; Zhang, Weisha; Liu, Xiangxiang; Xie, Zhao; Xie, Yaoqin

    2017-09-30

    To develop a low-cost private cloud platform of radiotherapy software. First, a private cloud platform which was based on OpenStack and the virtual GPU hardware was builded. Then on the private cloud platform, all the Varian radiotherapy software modules were installed to the virtual machine, and the corresponding function configuration was completed. Finally the software on the cloud was able to be accessed by virtual desktop client. The function test results of the cloud workstation show that a cloud workstation is equivalent to an isolated physical workstation, and any clients on the LAN can use the cloud workstation smoothly. The cloud platform transplantation in this study is economical and practical. The project not only improves the utilization rates of radiotherapy software, but also makes it possible that the cloud computing technology can expand its applications to the field of radiation oncology.

  7. Investigation into Cloud Computing for More Robust Automated Bulk Image Geoprocessing

    NASA Technical Reports Server (NTRS)

    Brown, Richard B.; Smoot, James C.; Underwood, Lauren; Armstrong, C. Duane

    2012-01-01

    Geospatial resource assessments frequently require timely geospatial data processing that involves large multivariate remote sensing data sets. In particular, for disasters, response requires rapid access to large data volumes, substantial storage space and high performance processing capability. The processing and distribution of this data into usable information products requires a processing pipeline that can efficiently manage the required storage, computing utilities, and data handling requirements. In recent years, with the availability of cloud computing technology, cloud processing platforms have made available a powerful new computing infrastructure resource that can meet this need. To assess the utility of this resource, this project investigates cloud computing platforms for bulk, automated geoprocessing capabilities with respect to data handling and application development requirements. This presentation is of work being conducted by Applied Sciences Program Office at NASA-Stennis Space Center. A prototypical set of image manipulation and transformation processes that incorporate sample Unmanned Airborne System data were developed to create value-added products and tested for implementation on the "cloud". This project outlines the steps involved in creating and testing of open source software developed process code on a local prototype platform, and then transitioning this code with associated environment requirements into an analogous, but memory and processor enhanced cloud platform. A data processing cloud was used to store both standard digital camera panchromatic and multi-band image data, which were subsequently subjected to standard image processing functions such as NDVI (Normalized Difference Vegetation Index), NDMI (Normalized Difference Moisture Index), band stacking, reprojection, and other similar type data processes. Cloud infrastructure service providers were evaluated by taking these locally tested processing functions, and then applying them to a given cloud-enabled infrastructure to assesses and compare environment setup options and enabled technologies. This project reviews findings that were observed when cloud platforms were evaluated for bulk geoprocessing capabilities based on data handling and application development requirements.

  8. Cloud Computing in Support of Applied Learning: A Baseline Study of Infrastructure Design at Southern Polytechnic State University

    ERIC Educational Resources Information Center

    Conn, Samuel S.; Reichgelt, Han

    2013-01-01

    Cloud computing represents an architecture and paradigm of computing designed to deliver infrastructure, platforms, and software as constructible computing resources on demand to networked users. As campuses are challenged to better accommodate academic needs for applications and computing environments, cloud computing can provide an accommodating…

  9. Key Technology Research on Open Architecture for The Sharing of Heterogeneous Geographic Analysis Models

    NASA Astrophysics Data System (ADS)

    Yue, S. S.; Wen, Y. N.; Lv, G. N.; Hu, D.

    2013-10-01

    In recent years, the increasing development of cloud computing technologies laid critical foundation for efficiently solving complicated geographic issues. However, it is still difficult to realize the cooperative operation of massive heterogeneous geographical models. Traditional cloud architecture is apt to provide centralized solution to end users, while all the required resources are often offered by large enterprises or special agencies. Thus, it's a closed framework from the perspective of resource utilization. Solving comprehensive geographic issues requires integrating multifarious heterogeneous geographical models and data. In this case, an open computing platform is in need, with which the model owners can package and deploy their models into cloud conveniently, while model users can search, access and utilize those models with cloud facility. Based on this concept, the open cloud service strategies for the sharing of heterogeneous geographic analysis models is studied in this article. The key technology: unified cloud interface strategy, sharing platform based on cloud service, and computing platform based on cloud service are discussed in detail, and related experiments are conducted for further verification.

  10. Performance, Agility and Cost of Cloud Computing Services for NASA GES DISC Giovanni Application

    NASA Astrophysics Data System (ADS)

    Pham, L.; Chen, A.; Wharton, S.; Winter, E. L.; Lynnes, C.

    2013-12-01

    The NASA Goddard Earth Science Data and Information Services Center (GES DISC) is investigating the performance, agility and cost of Cloud computing for GES DISC applications. Giovanni (Geospatial Interactive Online Visualization ANd aNalysis Infrastructure), one of the core applications at the GES DISC for online climate-related Earth science data access, subsetting, analysis, visualization, and downloading, was used to evaluate the feasibility and effort of porting an application to the Amazon Cloud Services platform. The performance and the cost of running Giovanni on the Amazon Cloud were compared to similar parameters for the GES DISC local operational system. A Giovanni Time-Series analysis of aerosol absorption optical depth (388nm) from OMI (Ozone Monitoring Instrument)/Aura was selected for these comparisons. All required data were pre-cached in both the Cloud and local system to avoid data transfer delays. The 3-, 6-, 12-, and 24-month data were used for analysis on the Cloud and local system respectively, and the processing times for the analysis were used to evaluate system performance. To investigate application agility, Giovanni was installed and tested on multiple Cloud platforms. The cost of using a Cloud computing platform mainly consists of: computing, storage, data requests, and data transfer in/out. The Cloud computing cost is calculated based on the hourly rate, and the storage cost is calculated based on the rate of Gigabytes per month. Cost for incoming data transfer is free, and for data transfer out, the cost is based on the rate in Gigabytes. The costs for a local server system consist of buying hardware/software, system maintenance/updating, and operating cost. The results showed that the Cloud platform had a 38% better performance and cost 36% less than the local system. This investigation shows the potential of cloud computing to increase system performance and lower the overall cost of system management.

  11. CloudMan as a platform for tool, data, and analysis distribution.

    PubMed

    Afgan, Enis; Chapman, Brad; Taylor, James

    2012-11-27

    Cloud computing provides an infrastructure that facilitates large scale computational analysis in a scalable, democratized fashion, However, in this context it is difficult to ensure sharing of an analysis environment and associated data in a scalable and precisely reproducible way. CloudMan (usecloudman.org) enables individual researchers to easily deploy, customize, and share their entire cloud analysis environment, including data, tools, and configurations. With the enabled customization and sharing of instances, CloudMan can be used as a platform for collaboration. The presented solution improves accessibility of cloud resources, tools, and data to the level of an individual researcher and contributes toward reproducibility and transparency of research solutions.

  12. ProteoCloud: a full-featured open source proteomics cloud computing pipeline.

    PubMed

    Muth, Thilo; Peters, Julian; Blackburn, Jonathan; Rapp, Erdmann; Martens, Lennart

    2013-08-02

    We here present the ProteoCloud pipeline, a freely available, full-featured cloud-based platform to perform computationally intensive, exhaustive searches in a cloud environment using five different peptide identification algorithms. ProteoCloud is entirely open source, and is built around an easy to use and cross-platform software client with a rich graphical user interface. This client allows full control of the number of cloud instances to initiate and of the spectra to assign for identification. It also enables the user to track progress, and to visualize and interpret the results in detail. Source code, binaries and documentation are all available at http://proteocloud.googlecode.com. Copyright © 2012 Elsevier B.V. All rights reserved.

  13. Future Naval Use of COTS Networking Infrastructure

    DTIC Science & Technology

    2009-07-01

    user to benefit from Google’s vast databases and computational resources. Obviously, the ability to harness the full power of the Cloud could be... Computing Impact Findings Action Items Take-Aways Appendices: Pages 54-68 A. Terms of Reference Document B. Sample Definitions of Cloud ...and definition of Cloud Computing . While Cloud Computing is developing in many variations – including Infrastructure as a Service (IaaS), Platform as

  14. Genomics Virtual Laboratory: A Practical Bioinformatics Workbench for the Cloud

    PubMed Central

    Afgan, Enis; Sloggett, Clare; Goonasekera, Nuwan; Makunin, Igor; Benson, Derek; Crowe, Mark; Gladman, Simon; Kowsar, Yousef; Pheasant, Michael; Horst, Ron; Lonie, Andrew

    2015-01-01

    Background Analyzing high throughput genomics data is a complex and compute intensive task, generally requiring numerous software tools and large reference data sets, tied together in successive stages of data transformation and visualisation. A computational platform enabling best practice genomics analysis ideally meets a number of requirements, including: a wide range of analysis and visualisation tools, closely linked to large user and reference data sets; workflow platform(s) enabling accessible, reproducible, portable analyses, through a flexible set of interfaces; highly available, scalable computational resources; and flexibility and versatility in the use of these resources to meet demands and expertise of a variety of users. Access to an appropriate computational platform can be a significant barrier to researchers, as establishing such a platform requires a large upfront investment in hardware, experience, and expertise. Results We designed and implemented the Genomics Virtual Laboratory (GVL) as a middleware layer of machine images, cloud management tools, and online services that enable researchers to build arbitrarily sized compute clusters on demand, pre-populated with fully configured bioinformatics tools, reference datasets and workflow and visualisation options. The platform is flexible in that users can conduct analyses through web-based (Galaxy, RStudio, IPython Notebook) or command-line interfaces, and add/remove compute nodes and data resources as required. Best-practice tutorials and protocols provide a path from introductory training to practice. The GVL is available on the OpenStack-based Australian Research Cloud (http://nectar.org.au) and the Amazon Web Services cloud. The principles, implementation and build process are designed to be cloud-agnostic. Conclusions This paper provides a blueprint for the design and implementation of a cloud-based Genomics Virtual Laboratory. We discuss scope, design considerations and technical and logistical constraints, and explore the value added to the research community through the suite of services and resources provided by our implementation. PMID:26501966

  15. Genomics Virtual Laboratory: A Practical Bioinformatics Workbench for the Cloud.

    PubMed

    Afgan, Enis; Sloggett, Clare; Goonasekera, Nuwan; Makunin, Igor; Benson, Derek; Crowe, Mark; Gladman, Simon; Kowsar, Yousef; Pheasant, Michael; Horst, Ron; Lonie, Andrew

    2015-01-01

    Analyzing high throughput genomics data is a complex and compute intensive task, generally requiring numerous software tools and large reference data sets, tied together in successive stages of data transformation and visualisation. A computational platform enabling best practice genomics analysis ideally meets a number of requirements, including: a wide range of analysis and visualisation tools, closely linked to large user and reference data sets; workflow platform(s) enabling accessible, reproducible, portable analyses, through a flexible set of interfaces; highly available, scalable computational resources; and flexibility and versatility in the use of these resources to meet demands and expertise of a variety of users. Access to an appropriate computational platform can be a significant barrier to researchers, as establishing such a platform requires a large upfront investment in hardware, experience, and expertise. We designed and implemented the Genomics Virtual Laboratory (GVL) as a middleware layer of machine images, cloud management tools, and online services that enable researchers to build arbitrarily sized compute clusters on demand, pre-populated with fully configured bioinformatics tools, reference datasets and workflow and visualisation options. The platform is flexible in that users can conduct analyses through web-based (Galaxy, RStudio, IPython Notebook) or command-line interfaces, and add/remove compute nodes and data resources as required. Best-practice tutorials and protocols provide a path from introductory training to practice. The GVL is available on the OpenStack-based Australian Research Cloud (http://nectar.org.au) and the Amazon Web Services cloud. The principles, implementation and build process are designed to be cloud-agnostic. This paper provides a blueprint for the design and implementation of a cloud-based Genomics Virtual Laboratory. We discuss scope, design considerations and technical and logistical constraints, and explore the value added to the research community through the suite of services and resources provided by our implementation.

  16. Reprocessing Multiyear GPS Data from Continuously Operating Reference Stations on Cloud Computing Platform

    NASA Astrophysics Data System (ADS)

    Yoon, S.

    2016-12-01

    To define geodetic reference frame using GPS data collected by Continuously Operating Reference Stations (CORS) network, historical GPS data needs to be reprocessed regularly. Reprocessing GPS data collected by upto 2000 CORS sites for the last two decades requires a lot of computational resource. At National Geodetic Survey (NGS), there has been one completed reprocessing in 2011, and currently, the second reprocessing is undergoing. For the first reprocessing effort, in-house computing resource was utilized. In the current second reprocessing effort, outsourced cloud computing platform is being utilized. In this presentation, the outline of data processing strategy at NGS is described as well as the effort to parallelize the data processing procedure in order to maximize the benefit of the cloud computing. The time and cost savings realized by utilizing cloud computing approach will also be discussed.

  17. Cloud computing geospatial application for water resources based on free and open source software and open standards - a prototype

    NASA Astrophysics Data System (ADS)

    Delipetrev, Blagoj

    2016-04-01

    Presently, most of the existing software is desktop-based, designed to work on a single computer, which represents a major limitation in many ways, starting from limited computer processing, storage power, accessibility, availability, etc. The only feasible solution lies in the web and cloud. This abstract presents research and development of a cloud computing geospatial application for water resources based on free and open source software and open standards using hybrid deployment model of public - private cloud, running on two separate virtual machines (VMs). The first one (VM1) is running on Amazon web services (AWS) and the second one (VM2) is running on a Xen cloud platform. The presented cloud application is developed using free and open source software, open standards and prototype code. The cloud application presents a framework how to develop specialized cloud geospatial application that needs only a web browser to be used. This cloud application is the ultimate collaboration geospatial platform because multiple users across the globe with internet connection and browser can jointly model geospatial objects, enter attribute data and information, execute algorithms, and visualize results. The presented cloud application is: available all the time, accessible from everywhere, it is scalable, works in a distributed computer environment, it creates a real-time multiuser collaboration platform, the programing languages code and components are interoperable, and it is flexible in including additional components. The cloud geospatial application is implemented as a specialized water resources application with three web services for 1) data infrastructure (DI), 2) support for water resources modelling (WRM), 3) user management. The web services are running on two VMs that are communicating over the internet providing services to users. The application was tested on the Zletovica river basin case study with concurrent multiple users. The application is a state-of-the-art cloud geospatial collaboration platform. The presented solution is a prototype and can be used as a foundation for developing of any specialized cloud geospatial applications. Further research will be focused on distributing the cloud application on additional VMs, testing the scalability and availability of services.

  18. CloudMan as a platform for tool, data, and analysis distribution

    PubMed Central

    2012-01-01

    Background Cloud computing provides an infrastructure that facilitates large scale computational analysis in a scalable, democratized fashion, However, in this context it is difficult to ensure sharing of an analysis environment and associated data in a scalable and precisely reproducible way. Results CloudMan (usecloudman.org) enables individual researchers to easily deploy, customize, and share their entire cloud analysis environment, including data, tools, and configurations. Conclusions With the enabled customization and sharing of instances, CloudMan can be used as a platform for collaboration. The presented solution improves accessibility of cloud resources, tools, and data to the level of an individual researcher and contributes toward reproducibility and transparency of research solutions. PMID:23181507

  19. Jungle Computing: Distributed Supercomputing Beyond Clusters, Grids, and Clouds

    NASA Astrophysics Data System (ADS)

    Seinstra, Frank J.; Maassen, Jason; van Nieuwpoort, Rob V.; Drost, Niels; van Kessel, Timo; van Werkhoven, Ben; Urbani, Jacopo; Jacobs, Ceriel; Kielmann, Thilo; Bal, Henri E.

    In recent years, the application of high-performance and distributed computing in scientific practice has become increasingly wide spread. Among the most widely available platforms to scientists are clusters, grids, and cloud systems. Such infrastructures currently are undergoing revolutionary change due to the integration of many-core technologies, providing orders-of-magnitude speed improvements for selected compute kernels. With high-performance and distributed computing systems thus becoming more heterogeneous and hierarchical, programming complexity is vastly increased. Further complexities arise because urgent desire for scalability and issues including data distribution, software heterogeneity, and ad hoc hardware availability commonly force scientists into simultaneous use of multiple platforms (e.g., clusters, grids, and clouds used concurrently). A true computing jungle.

  20. A Novel College Network Resource Management Method using Cloud Computing

    NASA Astrophysics Data System (ADS)

    Lin, Chen

    At present information construction of college mainly has construction of college networks and management information system; there are many problems during the process of information. Cloud computing is development of distributed processing, parallel processing and grid computing, which make data stored on the cloud, make software and services placed in the cloud and build on top of various standards and protocols, you can get it through all kinds of equipments. This article introduces cloud computing and function of cloud computing, then analyzes the exiting problems of college network resource management, the cloud computing technology and methods are applied in the construction of college information sharing platform.

  1. Making Spatial Statistics Service Accessible On Cloud Platform

    NASA Astrophysics Data System (ADS)

    Mu, X.; Wu, J.; Li, T.; Zhong, Y.; Gao, X.

    2014-04-01

    Web service can bring together applications running on diverse platforms, users can access and share various data, information and models more effectively and conveniently from certain web service platform. Cloud computing emerges as a paradigm of Internet computing in which dynamical, scalable and often virtualized resources are provided as services. With the rampant growth of massive data and restriction of net, traditional web services platforms have some prominent problems existing in development such as calculation efficiency, maintenance cost and data security. In this paper, we offer a spatial statistics service based on Microsoft cloud. An experiment was carried out to evaluate the availability and efficiency of this service. The results show that this spatial statistics service is accessible for the public conveniently with high processing efficiency.

  2. Cloud Computing for the Grid: GridControl: A Software Platform to Support the Smart Grid

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    None

    GENI Project: Cornell University is creating a new software platform for grid operators called GridControl that will utilize cloud computing to more efficiently control the grid. In a cloud computing system, there are minimal hardware and software demands on users. The user can tap into a network of computers that is housed elsewhere (the cloud) and the network runs computer applications for the user. The user only needs interface software to access all of the cloud’s data resources, which can be as simple as a web browser. Cloud computing can reduce costs, facilitate innovation through sharing, empower users, and improvemore » the overall reliability of a dispersed system. Cornell’s GridControl will focus on 4 elements: delivering the state of the grid to users quickly and reliably; building networked, scalable grid-control software; tailoring services to emerging smart grid uses; and simulating smart grid behavior under various conditions.« less

  3. Design of Control Plane Architecture Based on Cloud Platform and Experimental Network Demonstration for Multi-domain SDON

    NASA Astrophysics Data System (ADS)

    Li, Ming; Yin, Hongxi; Xing, Fangyuan; Wang, Jingchao; Wang, Honghuan

    2016-02-01

    With the features of network virtualization and resource programming, Software Defined Optical Network (SDON) is considered as the future development trend of optical network, provisioning a more flexible, efficient and open network function, supporting intraconnection and interconnection of data centers. Meanwhile cloud platform can provide powerful computing, storage and management capabilities. In this paper, with the coordination of SDON and cloud platform, a multi-domain SDON architecture based on cloud control plane has been proposed, which is composed of data centers with database (DB), path computation element (PCE), SDON controller and orchestrator. In addition, the structure of the multidomain SDON orchestrator and OpenFlow-enabled optical node are proposed to realize the combination of centralized and distributed effective management and control platform. Finally, the functional verification and demonstration are performed through our optical experiment network.

  4. A New Cloud Architecture of Virtual Trusted Platform Modules

    NASA Astrophysics Data System (ADS)

    Liu, Dongxi; Lee, Jack; Jang, Julian; Nepal, Surya; Zic, John

    We propose and implement a cloud architecture of virtual Trusted Platform Modules (TPMs) to improve the usability of TPMs. In this architecture, virtual TPMs can be obtained from the TPM cloud on demand. Hence, the TPM functionality is available for applications that do not have physical TPMs in their local platforms. Moreover, the TPM cloud allows users to access their keys and data in the same virtual TPM even if they move to untrusted platforms. The TPM cloud is easy to access for applications in different languages since cloud computing delivers services in standard protocols. The functionality of the TPM cloud is demonstrated by applying it to implement the Needham-Schroeder public-key protocol for web authentications, such that the strong security provided by TPMs is integrated into high level applications. The chain of trust based on the TPM cloud is discussed and the security properties of the virtual TPMs in the cloud is analyzed.

  5. Cloud Based Earth Observation Data Exploitation Platforms

    NASA Astrophysics Data System (ADS)

    Romeo, A.; Pinto, S.; Loekken, S.; Marin, A.

    2017-12-01

    In the last few years data produced daily by several private and public Earth Observation (EO) satellites reached the order of tens of Terabytes, representing for scientists and commercial application developers both a big opportunity for their exploitation and a challenge for their management. New IT technologies, such as Big Data and cloud computing, enable the creation of web-accessible data exploitation platforms, which offer to scientists and application developers the means to access and use EO data in a quick and cost effective way. RHEA Group is particularly active in this sector, supporting the European Space Agency (ESA) in the Exploitation Platforms (EP) initiative, developing technology to build multi cloud platforms for the processing and analysis of Earth Observation data, and collaborating with larger European initiatives such as the European Plate Observing System (EPOS) and the European Open Science Cloud (EOSC). An EP is a virtual workspace, providing a user community with access to (i) large volume of data, (ii) algorithm development and integration environment, (iii) processing software and services (e.g. toolboxes, visualization routines), (iv) computing resources, (v) collaboration tools (e.g. forums, wiki, etc.). When an EP is dedicated to a specific Theme, it becomes a Thematic Exploitation Platform (TEP). Currently, ESA has seven TEPs in a pre-operational phase dedicated to geo-hazards monitoring and prevention, costal zones, forestry areas, hydrology, polar regions, urban areas and food security. On the technology development side, solutions like the multi cloud EO data processing platform provides the technology to integrate ICT resources and EO data from different vendors in a single platform. In particular it offers (i) Multi-cloud data discovery, (ii) Multi-cloud data management and access and (iii) Multi-cloud application deployment. This platform has been demonstrated with the EGI Federated Cloud, Innovation Platform Testbed Poland and the Amazon Web Services cloud. This work will present an overview of the TEPs and the multi-cloud EO data processing platform, and discuss their main achievements and their impacts in the context of distributed Research Infrastructures such as EPOS and EOSC.

  6. Architectural Implications of Cloud Computing

    DTIC Science & Technology

    2011-10-24

    Public Cloud Infrastructure-as-a- Service (IaaS) Software -as-a- Service ( SaaS ) Cloud Computing Types Platform-as-a- Service (PaaS) Based on Type of...Twitter #SEIVirtualForum © 2011 Carnegie Mellon University Software -as-a- Service ( SaaS ) Model of software deployment in which a third-party...and System Solutions (RTSS) Program. Her current interests and projects are in service -oriented architecture (SOA), cloud computing, and context

  7. Development of a Cloud Computing-Based Pier Type Port Structure Stability Evaluation Platform Using Fiber Bragg Grating Sensors.

    PubMed

    Jo, Byung Wan; Jo, Jun Ho; Khan, Rana Muhammad Asad; Kim, Jung Hoon; Lee, Yun Sung

    2018-05-23

    Structure Health Monitoring is a topic of great interest in port structures due to the ageing of structures and the limitations of evaluating structures. This paper presents a cloud computing-based stability evaluation platform for a pier type port structure using Fiber Bragg Grating (FBG) sensors in a system consisting of a FBG strain sensor, FBG displacement gauge, FBG angle meter, gateway, and cloud computing-based web server. The sensors were installed on core components of the structure and measurements were taken to evaluate the structures. The measurement values were transmitted to the web server via the gateway to analyze and visualize them. All data were analyzed and visualized in the web server to evaluate the structure based on the safety evaluation index (SEI). The stability evaluation platform for pier type port structures involves the efficient monitoring of the structures which can be carried out easily anytime and anywhere by converging new technologies such as cloud computing and FBG sensors. In addition, the platform has been successfully implemented at “Maryang Harbor” situated in Maryang-Meyon of Korea to test its durability.

  8. Design and Development of ChemInfoCloud: An Integrated Cloud Enabled Platform for Virtual Screening.

    PubMed

    Karthikeyan, Muthukumarasamy; Pandit, Deepak; Bhavasar, Arvind; Vyas, Renu

    2015-01-01

    The power of cloud computing and distributed computing has been harnessed to handle vast and heterogeneous data required to be processed in any virtual screening protocol. A cloud computing platorm ChemInfoCloud was built and integrated with several chemoinformatics and bioinformatics tools. The robust engine performs the core chemoinformatics tasks of lead generation, lead optimisation and property prediction in a fast and efficient manner. It has also been provided with some of the bioinformatics functionalities including sequence alignment, active site pose prediction and protein ligand docking. Text mining, NMR chemical shift (1H, 13C) prediction and reaction fingerprint generation modules for efficient lead discovery are also implemented in this platform. We have developed an integrated problem solving cloud environment for virtual screening studies that also provides workflow management, better usability and interaction with end users using container based virtualization, OpenVz.

  9. SU-E-T-628: A Cloud Computing Based Multi-Objective Optimization Method for Inverse Treatment Planning.

    PubMed

    Na, Y; Suh, T; Xing, L

    2012-06-01

    Multi-objective (MO) plan optimization entails generation of an enormous number of IMRT or VMAT plans constituting the Pareto surface, which presents a computationally challenging task. The purpose of this work is to overcome the hurdle by developing an efficient MO method using emerging cloud computing platform. As a backbone of cloud computing for optimizing inverse treatment planning, Amazon Elastic Compute Cloud with a master node (17.1 GB memory, 2 virtual cores, 420 GB instance storage, 64-bit platform) is used. The master node is able to scale seamlessly a number of working group instances, called workers, based on the user-defined setting account for MO functions in clinical setting. Each worker solved the objective function with an efficient sparse decomposition method. The workers are automatically terminated if there are finished tasks. The optimized plans are archived to the master node to generate the Pareto solution set. Three clinical cases have been planned using the developed MO IMRT and VMAT planning tools to demonstrate the advantages of the proposed method. The target dose coverage and critical structure sparing of plans are comparable obtained using the cloud computing platform are identical to that obtained using desktop PC (Intel Xeon® CPU 2.33GHz, 8GB memory). It is found that the MO planning speeds up the processing of obtaining the Pareto set substantially for both types of plans. The speedup scales approximately linearly with the number of nodes used for computing. With the use of N nodes, the computational time is reduced by the fitting model, 0.2+2.3/N, with r̂2>0.99, on average of the cases making real-time MO planning possible. A cloud computing infrastructure is developed for MO optimization. The algorithm substantially improves the speed of inverse plan optimization. The platform is valuable for both MO planning and future off- or on-line adaptive re-planning. © 2012 American Association of Physicists in Medicine.

  10. Application of microarray analysis on computer cluster and cloud platforms.

    PubMed

    Bernau, C; Boulesteix, A-L; Knaus, J

    2013-01-01

    Analysis of recent high-dimensional biological data tends to be computationally intensive as many common approaches such as resampling or permutation tests require the basic statistical analysis to be repeated many times. A crucial advantage of these methods is that they can be easily parallelized due to the computational independence of the resampling or permutation iterations, which has induced many statistics departments to establish their own computer clusters. An alternative is to rent computing resources in the cloud, e.g. at Amazon Web Services. In this article we analyze whether a selection of statistical projects, recently implemented at our department, can be efficiently realized on these cloud resources. Moreover, we illustrate an opportunity to combine computer cluster and cloud resources. In order to compare the efficiency of computer cluster and cloud implementations and their respective parallelizations we use microarray analysis procedures and compare their runtimes on the different platforms. Amazon Web Services provide various instance types which meet the particular needs of the different statistical projects we analyzed in this paper. Moreover, the network capacity is sufficient and the parallelization is comparable in efficiency to standard computer cluster implementations. Our results suggest that many statistical projects can be efficiently realized on cloud resources. It is important to mention, however, that workflows can change substantially as a result of a shift from computer cluster to cloud computing.

  11. Platform for High-Assurance Cloud Computing

    DTIC Science & Technology

    2016-06-01

    to create today’s standard cloud computing applications and services. Additionally , our SuperCloud (a related but distinct project under the same... Additionally , our SuperCloud (a related but distinct project under the same MRC funding) reduces vendor lock-in and permits application to migrate, to follow...managing key- value storage with strong assurance properties. This first accomplishment allows us to climb the cloud technical stack, by offering

  12. 3D Viewer Platform of Cloud Clustering Management System: Google Map 3D

    NASA Astrophysics Data System (ADS)

    Choi, Sung-Ja; Lee, Gang-Soo

    The new management system of framework for cloud envrionemnt is needed by the platfrom of convergence according to computing environments of changes. A ISV and small business model is hard to adapt management system of platform which is offered from super business. This article suggest the clustering management system of cloud computing envirionments for ISV and a man of enterprise in small business model. It applies the 3D viewer adapt from map3D & earth of google. It is called 3DV_CCMS as expand the CCMS[1].

  13. Cloud computing task scheduling strategy based on improved differential evolution algorithm

    NASA Astrophysics Data System (ADS)

    Ge, Junwei; He, Qian; Fang, Yiqiu

    2017-04-01

    In order to optimize the cloud computing task scheduling scheme, an improved differential evolution algorithm for cloud computing task scheduling is proposed. Firstly, the cloud computing task scheduling model, according to the model of the fitness function, and then used improved optimization calculation of the fitness function of the evolutionary algorithm, according to the evolution of generation of dynamic selection strategy through dynamic mutation strategy to ensure the global and local search ability. The performance test experiment was carried out in the CloudSim simulation platform, the experimental results show that the improved differential evolution algorithm can reduce the cloud computing task execution time and user cost saving, good implementation of the optimal scheduling of cloud computing tasks.

  14. e-Collaboration for Earth observation (E-CEO): the Cloud4SAR interferometry data challenge

    NASA Astrophysics Data System (ADS)

    Casu, Francesco; Manunta, Michele; Boissier, Enguerran; Brito, Fabrice; Aas, Christina; Lavender, Samantha; Ribeiro, Rita; Farres, Jordi

    2014-05-01

    The e-Collaboration for Earth Observation (E-CEO) project addresses the technologies and architectures needed to provide a collaborative research Platform for automating data mining and processing, and information extraction experiments. The Platform serves for the implementation of Data Challenge Contests focusing on Information Extraction for Earth Observations (EO) applications. The possibility to implement multiple processors within a Common Software Environment facilitates the validation, evaluation and transparent peer comparison among different methodologies, which is one of the main requirements rose by scientists who develop algorithms in the EO field. In this scenario, we set up a Data Challenge, referred to as Cloud4SAR (http://wiki.services.eoportal.org/tiki-index.php?page=ECEO), to foster the deployment of Interferometric SAR (InSAR) processing chains within a Cloud Computing platform. While a large variety of InSAR processing software tools are available, they require a high level of expertise and a complex user interaction to be effectively run. Computing a co-seismic interferogram or a 20-years deformation time series on a volcanic area are not easy tasks to be performed in a fully unsupervised way and/or in very short time (hours or less). Benefiting from ESA's E-CEO platform, participants can optimise algorithms on a Virtual Sandbox environment without being expert programmers, and compute results on high performing Cloud platforms. Cloud4SAR requires solving a relatively easy InSAR problem by trying to maximize the exploitation of the processing capabilities provided by a Cloud Computing infrastructure. The proposed challenge offers two different frameworks, each dedicated to participants with different skills, identified as Beginners and Experts. For both of them, the contest mainly resides in the degree of automation of the deployed algorithms, no matter which one is used, as well as in the capability of taking effective benefit from a parallel computing environment.

  15. Bigdata Driven Cloud Security: A Survey

    NASA Astrophysics Data System (ADS)

    Raja, K.; Hanifa, Sabibullah Mohamed

    2017-08-01

    Cloud Computing (CC) is a fast-growing technology to perform massive-scale and complex computing. It eliminates the need to maintain expensive computing hardware, dedicated space, and software. Recently, it has been observed that massive growth in the scale of data or big data generated through cloud computing. CC consists of a front-end, includes the users’ computers and software required to access the cloud network, and back-end consists of various computers, servers and database systems that create the cloud. In SaaS (Software as-a-Service - end users to utilize outsourced software), PaaS (Platform as-a-Service-platform is provided) and IaaS (Infrastructure as-a-Service-physical environment is outsourced), and DaaS (Database as-a-Service-data can be housed within a cloud), where leading / traditional cloud ecosystem delivers the cloud services become a powerful and popular architecture. Many challenges and issues are in security or threats, most vital barrier for cloud computing environment. The main barrier to the adoption of CC in health care relates to Data security. When placing and transmitting data using public networks, cyber attacks in any form are anticipated in CC. Hence, cloud service users need to understand the risk of data breaches and adoption of service delivery model during deployment. This survey deeply covers the CC security issues (covering Data Security in Health care) so as to researchers can develop the robust security application models using Big Data (BD) on CC (can be created / deployed easily). Since, BD evaluation is driven by fast-growing cloud-based applications developed using virtualized technologies. In this purview, MapReduce [12] is a good example of big data processing in a cloud environment, and a model for Cloud providers.

  16. Bioinformatics and Microarray Data Analysis on the Cloud.

    PubMed

    Calabrese, Barbara; Cannataro, Mario

    2016-01-01

    High-throughput platforms such as microarray, mass spectrometry, and next-generation sequencing are producing an increasing volume of omics data that needs large data storage and computing power. Cloud computing offers massive scalable computing and storage, data sharing, on-demand anytime and anywhere access to resources and applications, and thus, it may represent the key technology for facing those issues. In fact, in the recent years it has been adopted for the deployment of different bioinformatics solutions and services both in academia and in the industry. Although this, cloud computing presents several issues regarding the security and privacy of data, that are particularly important when analyzing patients data, such as in personalized medicine. This chapter reviews main academic and industrial cloud-based bioinformatics solutions; with a special focus on microarray data analysis solutions and underlines main issues and problems related to the use of such platforms for the storage and analysis of patients data.

  17. CSNS computing environment Based on OpenStack

    NASA Astrophysics Data System (ADS)

    Li, Yakang; Qi, Fazhi; Chen, Gang; Wang, Yanming; Hong, Jianshu

    2017-10-01

    Cloud computing can allow for more flexible configuration of IT resources and optimized hardware utilization, it also can provide computing service according to the real need. We are applying this computing mode to the China Spallation Neutron Source(CSNS) computing environment. So, firstly, CSNS experiment and its computing scenarios and requirements are introduced in this paper. Secondly, the design and practice of cloud computing platform based on OpenStack are mainly demonstrated from the aspects of cloud computing system framework, network, storage and so on. Thirdly, some improvments to openstack we made are discussed further. Finally, current status of CSNS cloud computing environment are summarized in the ending of this paper.

  18. Federated and Cloud Enabled Resources for Data Management and Utilization

    NASA Astrophysics Data System (ADS)

    Rankin, R.; Gordon, M.; Potter, R. G.; Satchwill, B.

    2011-12-01

    The emergence of cloud computing over the past three years has led to a paradigm shift in how data can be managed, processed and made accessible. Building on the federated data management system offered through the Canadian Space Science Data Portal (www.cssdp.ca), we demonstrate how heterogeneous and geographically distributed data sets and modeling tools have been integrated to form a virtual data center and computational modeling platform that has services for data processing and visualization embedded within it. We also discuss positive and negative experiences in utilizing Eucalyptus and OpenStack cloud applications, and job scheduling facilitated by Condor and Star Cluster. We summarize our findings by demonstrating use of these technologies in the Cloud Enabled Space Weather Data Assimilation and Modeling Platform CESWP (www.ceswp.ca), which is funded through Canarie's (canarie.ca) Network Enabled Platforms program in Canada.

  19. A Geospatial Information Grid Framework for Geological Survey.

    PubMed

    Wu, Liang; Xue, Lei; Li, Chaoling; Lv, Xia; Chen, Zhanlong; Guo, Mingqiang; Xie, Zhong

    2015-01-01

    The use of digital information in geological fields is becoming very important. Thus, informatization in geological surveys should not stagnate as a result of the level of data accumulation. The integration and sharing of distributed, multi-source, heterogeneous geological information is an open problem in geological domains. Applications and services use geological spatial data with many features, including being cross-region and cross-domain and requiring real-time updating. As a result of these features, desktop and web-based geographic information systems (GISs) experience difficulties in meeting the demand for geological spatial information. To facilitate the real-time sharing of data and services in distributed environments, a GIS platform that is open, integrative, reconfigurable, reusable and elastic would represent an indispensable tool. The purpose of this paper is to develop a geological cloud-computing platform for integrating and sharing geological information based on a cloud architecture. Thus, the geological cloud-computing platform defines geological ontology semantics; designs a standard geological information framework and a standard resource integration model; builds a peer-to-peer node management mechanism; achieves the description, organization, discovery, computing and integration of the distributed resources; and provides the distributed spatial meta service, the spatial information catalog service, the multi-mode geological data service and the spatial data interoperation service. The geological survey information cloud-computing platform has been implemented, and based on the platform, some geological data services and geological processing services were developed. Furthermore, an iron mine resource forecast and an evaluation service is introduced in this paper.

  20. A Geospatial Information Grid Framework for Geological Survey

    PubMed Central

    Wu, Liang; Xue, Lei; Li, Chaoling; Lv, Xia; Chen, Zhanlong; Guo, Mingqiang; Xie, Zhong

    2015-01-01

    The use of digital information in geological fields is becoming very important. Thus, informatization in geological surveys should not stagnate as a result of the level of data accumulation. The integration and sharing of distributed, multi-source, heterogeneous geological information is an open problem in geological domains. Applications and services use geological spatial data with many features, including being cross-region and cross-domain and requiring real-time updating. As a result of these features, desktop and web-based geographic information systems (GISs) experience difficulties in meeting the demand for geological spatial information. To facilitate the real-time sharing of data and services in distributed environments, a GIS platform that is open, integrative, reconfigurable, reusable and elastic would represent an indispensable tool. The purpose of this paper is to develop a geological cloud-computing platform for integrating and sharing geological information based on a cloud architecture. Thus, the geological cloud-computing platform defines geological ontology semantics; designs a standard geological information framework and a standard resource integration model; builds a peer-to-peer node management mechanism; achieves the description, organization, discovery, computing and integration of the distributed resources; and provides the distributed spatial meta service, the spatial information catalog service, the multi-mode geological data service and the spatial data interoperation service. The geological survey information cloud-computing platform has been implemented, and based on the platform, some geological data services and geological processing services were developed. Furthermore, an iron mine resource forecast and an evaluation service is introduced in this paper. PMID:26710255

  1. Toward a Proof of Concept Cloud Framework for Physics Applications on Blue Gene Supercomputers

    NASA Astrophysics Data System (ADS)

    Dreher, Patrick; Scullin, William; Vouk, Mladen

    2015-09-01

    Traditional high performance supercomputers are capable of delivering large sustained state-of-the-art computational resources to physics applications over extended periods of time using batch processing mode operating environments. However, today there is an increasing demand for more complex workflows that involve large fluctuations in the levels of HPC physics computational requirements during the simulations. Some of the workflow components may also require a richer set of operating system features and schedulers than normally found in a batch oriented HPC environment. This paper reports on progress toward a proof of concept design that implements a cloud framework onto BG/P and BG/Q platforms at the Argonne Leadership Computing Facility. The BG/P implementation utilizes the Kittyhawk utility and the BG/Q platform uses an experimental heterogeneous FusedOS operating system environment. Both platforms use the Virtual Computing Laboratory as the cloud computing system embedded within the supercomputer. This proof of concept design allows a cloud to be configured so that it can capitalize on the specialized infrastructure capabilities of a supercomputer and the flexible cloud configurations without resorting to virtualization. Initial testing of the proof of concept system is done using the lattice QCD MILC code. These types of user reconfigurable environments have the potential to deliver experimental schedulers and operating systems within a working HPC environment for physics computations that may be different from the native OS and schedulers on production HPC supercomputers.

  2. Cloud computing applications for biomedical science: A perspective.

    PubMed

    Navale, Vivek; Bourne, Philip E

    2018-06-01

    Biomedical research has become a digital data-intensive endeavor, relying on secure and scalable computing, storage, and network infrastructure, which has traditionally been purchased, supported, and maintained locally. For certain types of biomedical applications, cloud computing has emerged as an alternative to locally maintained traditional computing approaches. Cloud computing offers users pay-as-you-go access to services such as hardware infrastructure, platforms, and software for solving common biomedical computational problems. Cloud computing services offer secure on-demand storage and analysis and are differentiated from traditional high-performance computing by their rapid availability and scalability of services. As such, cloud services are engineered to address big data problems and enhance the likelihood of data and analytics sharing, reproducibility, and reuse. Here, we provide an introductory perspective on cloud computing to help the reader determine its value to their own research.

  3. Cloud computing applications for biomedical science: A perspective

    PubMed Central

    2018-01-01

    Biomedical research has become a digital data–intensive endeavor, relying on secure and scalable computing, storage, and network infrastructure, which has traditionally been purchased, supported, and maintained locally. For certain types of biomedical applications, cloud computing has emerged as an alternative to locally maintained traditional computing approaches. Cloud computing offers users pay-as-you-go access to services such as hardware infrastructure, platforms, and software for solving common biomedical computational problems. Cloud computing services offer secure on-demand storage and analysis and are differentiated from traditional high-performance computing by their rapid availability and scalability of services. As such, cloud services are engineered to address big data problems and enhance the likelihood of data and analytics sharing, reproducibility, and reuse. Here, we provide an introductory perspective on cloud computing to help the reader determine its value to their own research. PMID:29902176

  4. A Platform for Scalable Satellite and Geospatial Data Analysis

    NASA Astrophysics Data System (ADS)

    Beneke, C. M.; Skillman, S.; Warren, M. S.; Kelton, T.; Brumby, S. P.; Chartrand, R.; Mathis, M.

    2017-12-01

    At Descartes Labs, we use the commercial cloud to run global-scale machine learning applications over satellite imagery. We have processed over 5 Petabytes of public and commercial satellite imagery, including the full Landsat and Sentinel archives. By combining open-source tools with a FUSE-based filesystem for cloud storage, we have enabled a scalable compute platform that has demonstrated reading over 200 GB/s of satellite imagery into cloud compute nodes. In one application, we generated global 15m Landsat-8, 20m Sentinel-1, and 10m Sentinel-2 composites from 15 trillion pixels, using over 10,000 CPUs. We recently created a public open-source Python client library that can be used to query and access preprocessed public satellite imagery from within our platform, and made this platform available to researchers for non-commercial projects. In this session, we will describe how you can use the Descartes Labs Platform for rapid prototyping and scaling of geospatial analyses and demonstrate examples in land cover classification.

  5. An Application-Based Performance Evaluation of NASAs Nebula Cloud Computing Platform

    NASA Technical Reports Server (NTRS)

    Saini, Subhash; Heistand, Steve; Jin, Haoqiang; Chang, Johnny; Hood, Robert T.; Mehrotra, Piyush; Biswas, Rupak

    2012-01-01

    The high performance computing (HPC) community has shown tremendous interest in exploring cloud computing as it promises high potential. In this paper, we examine the feasibility, performance, and scalability of production quality scientific and engineering applications of interest to NASA on NASA's cloud computing platform, called Nebula, hosted at Ames Research Center. This work represents the comprehensive evaluation of Nebula using NUTTCP, HPCC, NPB, I/O, and MPI function benchmarks as well as four applications representative of the NASA HPC workload. Specifically, we compare Nebula performance on some of these benchmarks and applications to that of NASA s Pleiades supercomputer, a traditional HPC system. We also investigate the impact of virtIO and jumbo frames on interconnect performance. Overall results indicate that on Nebula (i) virtIO and jumbo frames improve network bandwidth by a factor of 5x, (ii) there is a significant virtualization layer overhead of about 10% to 25%, (iii) write performance is lower by a factor of 25x, (iv) latency for short MPI messages is very high, and (v) overall performance is 15% to 48% lower than that on Pleiades for NASA HPC applications. We also comment on the usability of the cloud platform.

  6. Toward ubiquitous healthcare services with a novel efficient cloud platform.

    PubMed

    He, Chenguang; Fan, Xiaomao; Li, Ye

    2013-01-01

    Ubiquitous healthcare services are becoming more and more popular, especially under the urgent demand of the global aging issue. Cloud computing owns the pervasive and on-demand service-oriented natures, which can fit the characteristics of healthcare services very well. However, the abilities in dealing with multimodal, heterogeneous, and nonstationary physiological signals to provide persistent personalized services, meanwhile keeping high concurrent online analysis for public, are challenges to the general cloud. In this paper, we proposed a private cloud platform architecture which includes six layers according to the specific requirements. This platform utilizes message queue as a cloud engine, and each layer thereby achieves relative independence by this loosely coupled means of communications with publish/subscribe mechanism. Furthermore, a plug-in algorithm framework is also presented, and massive semistructure or unstructured medical data are accessed adaptively by this cloud architecture. As the testing results showing, this proposed cloud platform, with robust, stable, and efficient features, can satisfy high concurrent requests from ubiquitous healthcare services.

  7. Cloud BioLinux: pre-configured and on-demand bioinformatics computing for the genomics community.

    PubMed

    Krampis, Konstantinos; Booth, Tim; Chapman, Brad; Tiwari, Bela; Bicak, Mesude; Field, Dawn; Nelson, Karen E

    2012-03-19

    A steep drop in the cost of next-generation sequencing during recent years has made the technology affordable to the majority of researchers, but downstream bioinformatic analysis still poses a resource bottleneck for smaller laboratories and institutes that do not have access to substantial computational resources. Sequencing instruments are typically bundled with only the minimal processing and storage capacity required for data capture during sequencing runs. Given the scale of sequence datasets, scientific value cannot be obtained from acquiring a sequencer unless it is accompanied by an equal investment in informatics infrastructure. Cloud BioLinux is a publicly accessible Virtual Machine (VM) that enables scientists to quickly provision on-demand infrastructures for high-performance bioinformatics computing using cloud platforms. Users have instant access to a range of pre-configured command line and graphical software applications, including a full-featured desktop interface, documentation and over 135 bioinformatics packages for applications including sequence alignment, clustering, assembly, display, editing, and phylogeny. Each tool's functionality is fully described in the documentation directly accessible from the graphical interface of the VM. Besides the Amazon EC2 cloud, we have started instances of Cloud BioLinux on a private Eucalyptus cloud installed at the J. Craig Venter Institute, and demonstrated access to the bioinformatic tools interface through a remote connection to EC2 instances from a local desktop computer. Documentation for using Cloud BioLinux on EC2 is available from our project website, while a Eucalyptus cloud image and VirtualBox Appliance is also publicly available for download and use by researchers with access to private clouds. Cloud BioLinux provides a platform for developing bioinformatics infrastructures on the cloud. An automated and configurable process builds Virtual Machines, allowing the development of highly customized versions from a shared code base. This shared community toolkit enables application specific analysis platforms on the cloud by minimizing the effort required to prepare and maintain them.

  8. Cloud BioLinux: pre-configured and on-demand bioinformatics computing for the genomics community

    PubMed Central

    2012-01-01

    Background A steep drop in the cost of next-generation sequencing during recent years has made the technology affordable to the majority of researchers, but downstream bioinformatic analysis still poses a resource bottleneck for smaller laboratories and institutes that do not have access to substantial computational resources. Sequencing instruments are typically bundled with only the minimal processing and storage capacity required for data capture during sequencing runs. Given the scale of sequence datasets, scientific value cannot be obtained from acquiring a sequencer unless it is accompanied by an equal investment in informatics infrastructure. Results Cloud BioLinux is a publicly accessible Virtual Machine (VM) that enables scientists to quickly provision on-demand infrastructures for high-performance bioinformatics computing using cloud platforms. Users have instant access to a range of pre-configured command line and graphical software applications, including a full-featured desktop interface, documentation and over 135 bioinformatics packages for applications including sequence alignment, clustering, assembly, display, editing, and phylogeny. Each tool's functionality is fully described in the documentation directly accessible from the graphical interface of the VM. Besides the Amazon EC2 cloud, we have started instances of Cloud BioLinux on a private Eucalyptus cloud installed at the J. Craig Venter Institute, and demonstrated access to the bioinformatic tools interface through a remote connection to EC2 instances from a local desktop computer. Documentation for using Cloud BioLinux on EC2 is available from our project website, while a Eucalyptus cloud image and VirtualBox Appliance is also publicly available for download and use by researchers with access to private clouds. Conclusions Cloud BioLinux provides a platform for developing bioinformatics infrastructures on the cloud. An automated and configurable process builds Virtual Machines, allowing the development of highly customized versions from a shared code base. This shared community toolkit enables application specific analysis platforms on the cloud by minimizing the effort required to prepare and maintain them. PMID:22429538

  9. Big data mining analysis method based on cloud computing

    NASA Astrophysics Data System (ADS)

    Cai, Qing Qiu; Cui, Hong Gang; Tang, Hao

    2017-08-01

    Information explosion era, large data super-large, discrete and non-(semi) structured features have gone far beyond the traditional data management can carry the scope of the way. With the arrival of the cloud computing era, cloud computing provides a new technical way to analyze the massive data mining, which can effectively solve the problem that the traditional data mining method cannot adapt to massive data mining. This paper introduces the meaning and characteristics of cloud computing, analyzes the advantages of using cloud computing technology to realize data mining, designs the mining algorithm of association rules based on MapReduce parallel processing architecture, and carries out the experimental verification. The algorithm of parallel association rule mining based on cloud computing platform can greatly improve the execution speed of data mining.

  10. Long Read Alignment with Parallel MapReduce Cloud Platform

    PubMed Central

    Al-Absi, Ahmed Abdulhakim; Kang, Dae-Ki

    2015-01-01

    Genomic sequence alignment is an important technique to decode genome sequences in bioinformatics. Next-Generation Sequencing technologies produce genomic data of longer reads. Cloud platforms are adopted to address the problems arising from storage and analysis of large genomic data. Existing genes sequencing tools for cloud platforms predominantly consider short read gene sequences and adopt the Hadoop MapReduce framework for computation. However, serial execution of map and reduce phases is a problem in such systems. Therefore, in this paper, we introduce Burrows-Wheeler Aligner's Smith-Waterman Alignment on Parallel MapReduce (BWASW-PMR) cloud platform for long sequence alignment. The proposed cloud platform adopts a widely accepted and accurate BWA-SW algorithm for long sequence alignment. A custom MapReduce platform is developed to overcome the drawbacks of the Hadoop framework. A parallel execution strategy of the MapReduce phases and optimization of Smith-Waterman algorithm are considered. Performance evaluation results exhibit an average speed-up of 6.7 considering BWASW-PMR compared with the state-of-the-art Bwasw-Cloud. An average reduction of 30% in the map phase makespan is reported across all experiments comparing BWASW-PMR with Bwasw-Cloud. Optimization of Smith-Waterman results in reducing the execution time by 91.8%. The experimental study proves the efficiency of BWASW-PMR for aligning long genomic sequences on cloud platforms. PMID:26839887

  11. Long Read Alignment with Parallel MapReduce Cloud Platform.

    PubMed

    Al-Absi, Ahmed Abdulhakim; Kang, Dae-Ki

    2015-01-01

    Genomic sequence alignment is an important technique to decode genome sequences in bioinformatics. Next-Generation Sequencing technologies produce genomic data of longer reads. Cloud platforms are adopted to address the problems arising from storage and analysis of large genomic data. Existing genes sequencing tools for cloud platforms predominantly consider short read gene sequences and adopt the Hadoop MapReduce framework for computation. However, serial execution of map and reduce phases is a problem in such systems. Therefore, in this paper, we introduce Burrows-Wheeler Aligner's Smith-Waterman Alignment on Parallel MapReduce (BWASW-PMR) cloud platform for long sequence alignment. The proposed cloud platform adopts a widely accepted and accurate BWA-SW algorithm for long sequence alignment. A custom MapReduce platform is developed to overcome the drawbacks of the Hadoop framework. A parallel execution strategy of the MapReduce phases and optimization of Smith-Waterman algorithm are considered. Performance evaluation results exhibit an average speed-up of 6.7 considering BWASW-PMR compared with the state-of-the-art Bwasw-Cloud. An average reduction of 30% in the map phase makespan is reported across all experiments comparing BWASW-PMR with Bwasw-Cloud. Optimization of Smith-Waterman results in reducing the execution time by 91.8%. The experimental study proves the efficiency of BWASW-PMR for aligning long genomic sequences on cloud platforms.

  12. The Advance of Computing from the Ground to the Cloud

    ERIC Educational Resources Information Center

    Breeding, Marshall

    2009-01-01

    A trend toward the abstraction of computing platforms that has been developing in the broader IT arena over the last few years is just beginning to make inroads into the library technology scene. Cloud computing offers for libraries many interesting possibilities that may help reduce technology costs and increase capacity, reliability, and…

  13. A Big Data Platform for Storing, Accessing, Mining and Learning Geospatial Data

    NASA Astrophysics Data System (ADS)

    Yang, C. P.; Bambacus, M.; Duffy, D.; Little, M. M.

    2017-12-01

    Big Data is becoming a norm in geoscience domains. A platform that is capable to effiently manage, access, analyze, mine, and learn the big data for new information and knowledge is desired. This paper introduces our latest effort on developing such a platform based on our past years' experiences on cloud and high performance computing, analyzing big data, comparing big data containers, and mining big geospatial data for new information. The platform includes four layers: a) the bottom layer includes a computing infrastructure with proper network, computer, and storage systems; b) the 2nd layer is a cloud computing layer based on virtualization to provide on demand computing services for upper layers; c) the 3rd layer is big data containers that are customized for dealing with different types of data and functionalities; d) the 4th layer is a big data presentation layer that supports the effient management, access, analyses, mining and learning of big geospatial data.

  14. Symmetrical compression distance for arrhythmia discrimination in cloud-based big-data services.

    PubMed

    Lillo-Castellano, J M; Mora-Jiménez, I; Santiago-Mozos, R; Chavarría-Asso, F; Cano-González, A; García-Alberola, A; Rojo-Álvarez, J L

    2015-07-01

    The current development of cloud computing is completely changing the paradigm of data knowledge extraction in huge databases. An example of this technology in the cardiac arrhythmia field is the SCOOP platform, a national-level scientific cloud-based big data service for implantable cardioverter defibrillators. In this scenario, we here propose a new methodology for automatic classification of intracardiac electrograms (EGMs) in a cloud computing system, designed for minimal signal preprocessing. A new compression-based similarity measure (CSM) is created for low computational burden, so-called weighted fast compression distance, which provides better performance when compared with other CSMs in the literature. Using simple machine learning techniques, a set of 6848 EGMs extracted from SCOOP platform were classified into seven cardiac arrhythmia classes and one noise class, reaching near to 90% accuracy when previous patient arrhythmia information was available and 63% otherwise, hence overcoming in all cases the classification provided by the majority class. Results show that this methodology can be used as a high-quality service of cloud computing, providing support to physicians for improving the knowledge on patient diagnosis.

  15. Secure public cloud platform for medical images sharing.

    PubMed

    Pan, Wei; Coatrieux, Gouenou; Bouslimi, Dalel; Prigent, Nicolas

    2015-01-01

    Cloud computing promises medical imaging services offering large storage and computing capabilities for limited costs. In this data outsourcing framework, one of the greatest issues to deal with is data security. To do so, we propose to secure a public cloud platform devoted to medical image sharing by defining and deploying a security policy so as to control various security mechanisms. This policy stands on a risk assessment we conducted so as to identify security objectives with a special interest for digital content protection. These objectives are addressed by means of different security mechanisms like access and usage control policy, partial-encryption and watermarking.

  16. MC-GenomeKey: a multicloud system for the detection and annotation of genomic variants.

    PubMed

    Elshazly, Hatem; Souilmi, Yassine; Tonellato, Peter J; Wall, Dennis P; Abouelhoda, Mohamed

    2017-01-20

    Next Generation Genome sequencing techniques became affordable for massive sequencing efforts devoted to clinical characterization of human diseases. However, the cost of providing cloud-based data analysis of the mounting datasets remains a concerning bottleneck for providing cost-effective clinical services. To address this computational problem, it is important to optimize the variant analysis workflow and the used analysis tools to reduce the overall computational processing time, and concomitantly reduce the processing cost. Furthermore, it is important to capitalize on the use of the recent development in the cloud computing market, which have witnessed more providers competing in terms of products and prices. In this paper, we present a new package called MC-GenomeKey (Multi-Cloud GenomeKey) that efficiently executes the variant analysis workflow for detecting and annotating mutations using cloud resources from different commercial cloud providers. Our package supports Amazon, Google, and Azure clouds, as well as, any other cloud platform based on OpenStack. Our package allows different scenarios of execution with different levels of sophistication, up to the one where a workflow can be executed using a cluster whose nodes come from different clouds. MC-GenomeKey also supports scenarios to exploit the spot instance model of Amazon in combination with the use of other cloud platforms to provide significant cost reduction. To the best of our knowledge, this is the first solution that optimizes the execution of the workflow using computational resources from different cloud providers. MC-GenomeKey provides an efficient multicloud based solution to detect and annotate mutations. The package can run in different commercial cloud platforms, which enables the user to seize the best offers. The package also provides a reliable means to make use of the low-cost spot instance model of Amazon, as it provides an efficient solution to the sudden termination of spot machines as a result of a sudden price increase. The package has a web-interface and it is available for free for academic use.

  17. A Novel Market-Oriented Dynamic Collaborative Cloud Service Platform

    NASA Astrophysics Data System (ADS)

    Hassan, Mohammad Mehedi; Huh, Eui-Nam

    In today's world the emerging Cloud computing (Weiss, 2007) offer a new computing model where resources such as computing power, storage, online applications and networking infrastructures can be shared as "services" over the internet. Cloud providers (CPs) are incentivized by the profits to be made by charging consumers for accessing these services. Consumers, such as enterprises, are attracted by the opportunity for reducing or eliminating costs associated with "in-house" provision of these services.

  18. Automating NEURON Simulation Deployment in Cloud Resources.

    PubMed

    Stockton, David B; Santamaria, Fidel

    2017-01-01

    Simulations in neuroscience are performed on local servers or High Performance Computing (HPC) facilities. Recently, cloud computing has emerged as a potential computational platform for neuroscience simulation. In this paper we compare and contrast HPC and cloud resources for scientific computation, then report how we deployed NEURON, a widely used simulator of neuronal activity, in three clouds: Chameleon Cloud, a hybrid private academic cloud for cloud technology research based on the OpenStack software; Rackspace, a public commercial cloud, also based on OpenStack; and Amazon Elastic Cloud Computing, based on Amazon's proprietary software. We describe the manual procedures and how to automate cloud operations. We describe extending our simulation automation software called NeuroManager (Stockton and Santamaria, Frontiers in Neuroinformatics, 2015), so that the user is capable of recruiting private cloud, public cloud, HPC, and local servers simultaneously with a simple common interface. We conclude by performing several studies in which we examine speedup, efficiency, total session time, and cost for sets of simulations of a published NEURON model.

  19. Automating NEURON Simulation Deployment in Cloud Resources

    PubMed Central

    Santamaria, Fidel

    2016-01-01

    Simulations in neuroscience are performed on local servers or High Performance Computing (HPC) facilities. Recently, cloud computing has emerged as a potential computational platform for neuroscience simulation. In this paper we compare and contrast HPC and cloud resources for scientific computation, then report how we deployed NEURON, a widely used simulator of neuronal activity, in three clouds: Chameleon Cloud, a hybrid private academic cloud for cloud technology research based on the Open-Stack software; Rackspace, a public commercial cloud, also based on OpenStack; and Amazon Elastic Cloud Computing, based on Amazon’s proprietary software. We describe the manual procedures and how to automate cloud operations. We describe extending our simulation automation software called NeuroManager (Stockton and Santamaria, Frontiers in Neuroinformatics, 2015), so that the user is capable of recruiting private cloud, public cloud, HPC, and local servers simultaneously with a simple common interface. We conclude by performing several studies in which we examine speedup, efficiency, total session time, and cost for sets of simulations of a published NEURON model. PMID:27655341

  20. Department of Defense Use of Commercial Cloud Computing Capabilities and Services

    DTIC Science & Technology

    2015-11-01

    models (Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service ( SaaS )), and four deployment models (Public...NIST defines three main models for cloud computing: IaaS, PaaS, and SaaS . These models help differentiate the implementation responsibilities that fall...and SaaS . 3. Public, Private, Community, and Hybrid Clouds Cloud services come in different forms, depending on the customer’s specific needs

  1. Volunteered Cloud Computing for Disaster Management

    NASA Astrophysics Data System (ADS)

    Evans, J. D.; Hao, W.; Chettri, S. R.

    2014-12-01

    Disaster management relies increasingly on interpreting earth observations and running numerical models; which require significant computing capacity - usually on short notice and at irregular intervals. Peak computing demand during event detection, hazard assessment, or incident response may exceed agency budgets; however some of it can be met through volunteered computing, which distributes subtasks to participating computers via the Internet. This approach has enabled large projects in mathematics, basic science, and climate research to harness the slack computing capacity of thousands of desktop computers. This capacity is likely to diminish as desktops give way to battery-powered mobile devices (laptops, smartphones, tablets) in the consumer market; but as cloud computing becomes commonplace, it may offer significant slack capacity -- if its users are given an easy, trustworthy mechanism for participating. Such a "volunteered cloud computing" mechanism would also offer several advantages over traditional volunteered computing: tasks distributed within a cloud have fewer bandwidth limitations; granular billing mechanisms allow small slices of "interstitial" computing at no marginal cost; and virtual storage volumes allow in-depth, reversible machine reconfiguration. Volunteered cloud computing is especially suitable for "embarrassingly parallel" tasks, including ones requiring large data volumes: examples in disaster management include near-real-time image interpretation, pattern / trend detection, or large model ensembles. In the context of a major disaster, we estimate that cloud users (if suitably informed) might volunteer hundreds to thousands of CPU cores across a large provider such as Amazon Web Services. To explore this potential, we are building a volunteered cloud computing platform and targeting it to a disaster management context. Using a lightweight, fault-tolerant network protocol, this platform helps cloud users join parallel computing projects; automates reconfiguration of their virtual machines; ensures accountability for donated computing; and optimizes the use of "interstitial" computing. Initial applications include fire detection from multispectral satellite imagery and flood risk mapping through hydrological simulations.

  2. Research on the application in disaster reduction for using cloud computing technology

    NASA Astrophysics Data System (ADS)

    Tao, Liang; Fan, Yida; Wang, Xingling

    Cloud Computing technology has been rapidly applied in different domains recently, promotes the progress of the domain's informatization. Based on the analysis of the state of application requirement in disaster reduction and combining the characteristics of Cloud Computing technology, we present the research on the application of Cloud Computing technology in disaster reduction. First of all, we give the architecture of disaster reduction cloud, which consists of disaster reduction infrastructure as a service (IAAS), disaster reduction cloud application platform as a service (PAAS) and disaster reduction software as a service (SAAS). Secondly, we talk about the standard system of disaster reduction in five aspects. Thirdly, we indicate the security system of disaster reduction cloud. Finally, we draw a conclusion the use of cloud computing technology will help us to solve the problems for disaster reduction and promote the development of disaster reduction.

  3. Model-as-a-service (MaaS) using the cloud service innovation platform (CSIP)

    USDA-ARS?s Scientific Manuscript database

    Cloud infrastructures for modelling activities such as data processing, performing environmental simulations, or conducting model calibrations/optimizations provide a cost effective alternative to traditional high performance computing approaches. Cloud-based modelling examples emerged into the more...

  4. The cloud services innovation platform- enabling service-based environmental modelling using infrastructure-as-a-service cloud computing

    USDA-ARS?s Scientific Manuscript database

    Service oriented architectures allow modelling engines to be hosted over the Internet abstracting physical hardware configuration and software deployments from model users. Many existing environmental models are deployed as desktop applications running on user's personal computers (PCs). Migration ...

  5. EduCloud: PaaS versus IaaS Cloud Usage for an Advanced Computer Science Course

    ERIC Educational Resources Information Center

    Vaquero, L. M.

    2011-01-01

    The cloud has become a widely used term in academia and the industry. Education has not remained unaware of this trend, and several educational solutions based on cloud technologies are already in place, especially for software as a service cloud. However, an evaluation of the educational potential of infrastructure and platform clouds has not…

  6. Analysis and Research on Spatial Data Storage Model Based on Cloud Computing Platform

    NASA Astrophysics Data System (ADS)

    Hu, Yong

    2017-12-01

    In this paper, the data processing and storage characteristics of cloud computing are analyzed and studied. On this basis, a cloud computing data storage model based on BP neural network is proposed. In this data storage model, it can carry out the choice of server cluster according to the different attributes of the data, so as to complete the spatial data storage model with load balancing function, and have certain feasibility and application advantages.

  7. Use of cloud computing in biomedicine.

    PubMed

    Sobeslav, Vladimir; Maresova, Petra; Krejcar, Ondrej; Franca, Tanos C C; Kuca, Kamil

    2016-12-01

    Nowadays, biomedicine is characterised by a growing need for processing of large amounts of data in real time. This leads to new requirements for information and communication technologies (ICT). Cloud computing offers a solution to these requirements and provides many advantages, such as cost savings, elasticity and scalability of using ICT. The aim of this paper is to explore the concept of cloud computing and the related use of this concept in the area of biomedicine. Authors offer a comprehensive analysis of the implementation of the cloud computing approach in biomedical research, decomposed into infrastructure, platform and service layer, and a recommendation for processing large amounts of data in biomedicine. Firstly, the paper describes the appropriate forms and technological solutions of cloud computing. Secondly, the high-end computing paradigm of cloud computing aspects is analysed. Finally, the potential and current use of applications in scientific research of this technology in biomedicine is discussed.

  8. GATE Monte Carlo simulation in a cloud computing environment

    NASA Astrophysics Data System (ADS)

    Rowedder, Blake Austin

    The GEANT4-based GATE is a unique and powerful Monte Carlo (MC) platform, which provides a single code library allowing the simulation of specific medical physics applications, e.g. PET, SPECT, CT, radiotherapy, and hadron therapy. However, this rigorous yet flexible platform is used only sparingly in the clinic due to its lengthy calculation time. By accessing the powerful computational resources of a cloud computing environment, GATE's runtime can be significantly reduced to clinically feasible levels without the sizable investment of a local high performance cluster. This study investigated a reliable and efficient execution of GATE MC simulations using a commercial cloud computing services. Amazon's Elastic Compute Cloud was used to launch several nodes equipped with GATE. Job data was initially broken up on the local computer, then uploaded to the worker nodes on the cloud. The results were automatically downloaded and aggregated on the local computer for display and analysis. Five simulations were repeated for every cluster size between 1 and 20 nodes. Ultimately, increasing cluster size resulted in a decrease in calculation time that could be expressed with an inverse power model. Comparing the benchmark results to the published values and error margins indicated that the simulation results were not affected by the cluster size and thus that integrity of a calculation is preserved in a cloud computing environment. The runtime of a 53 minute long simulation was decreased to 3.11 minutes when run on a 20-node cluster. The ability to improve the speed of simulation suggests that fast MC simulations are viable for imaging and radiotherapy applications. With high power computing continuing to lower in price and accessibility, implementing Monte Carlo techniques with cloud computing for clinical applications will continue to become more attractive.

  9. Architecture Design of Healthcare Software-as-a-Service Platform for Cloud-Based Clinical Decision Support Service.

    PubMed

    Oh, Sungyoung; Cha, Jieun; Ji, Myungkyu; Kang, Hyekyung; Kim, Seok; Heo, Eunyoung; Han, Jong Soo; Kang, Hyunggoo; Chae, Hoseok; Hwang, Hee; Yoo, Sooyoung

    2015-04-01

    To design a cloud computing-based Healthcare Software-as-a-Service (SaaS) Platform (HSP) for delivering healthcare information services with low cost, high clinical value, and high usability. We analyzed the architecture requirements of an HSP, including the interface, business services, cloud SaaS, quality attributes, privacy and security, and multi-lingual capacity. For cloud-based SaaS services, we focused on Clinical Decision Service (CDS) content services, basic functional services, and mobile services. Microsoft's Azure cloud computing for Infrastructure-as-a-Service (IaaS) and Platform-as-a-Service (PaaS) was used. The functional and software views of an HSP were designed in a layered architecture. External systems can be interfaced with the HSP using SOAP and REST/JSON. The multi-tenancy model of the HSP was designed as a shared database, with a separate schema for each tenant through a single application, although healthcare data can be physically located on a cloud or in a hospital, depending on regulations. The CDS services were categorized into rule-based services for medications, alert registration services, and knowledge services. We expect that cloud-based HSPs will allow small and mid-sized hospitals, in addition to large-sized hospitals, to adopt information infrastructures and health information technology with low system operation and maintenance costs.

  10. Application of Cloud Computing at KTU: MS Live@Edu Case

    ERIC Educational Resources Information Center

    Miseviciene, Regina; Budnikas, Germanas; Ambraziene, Danute

    2011-01-01

    Cloud computing is a significant alternative in today's educational perspective. The technology gives the students and teachers the opportunity to quickly access various application platforms and resources through the web pages on-demand. Unfortunately, not all educational institutions often have an ability to take full advantages of the newest…

  11. Enterprise Cloud Architecture for Chinese Ministry of Railway

    NASA Astrophysics Data System (ADS)

    Shan, Xumei; Liu, Hefeng

    Enterprise like PRC Ministry of Railways (MOR), is facing various challenges ranging from highly distributed computing environment and low legacy system utilization, Cloud Computing is increasingly regarded as one workable solution to address this. This article describes full scale cloud solution with Intel Tashi as virtual machine infrastructure layer, Hadoop HDFS as computing platform, and self developed SaaS interface, gluing virtual machine and HDFS with Xen hypervisor. As a result, on demand computing task application and deployment have been tackled per MOR real working scenarios at the end of article.

  12. Dynamic VM Provisioning for TORQUE in a Cloud Environment

    NASA Astrophysics Data System (ADS)

    Zhang, S.; Boland, L.; Coddington, P.; Sevior, M.

    2014-06-01

    Cloud computing, also known as an Infrastructure-as-a-Service (IaaS), is attracting more interest from the commercial and educational sectors as a way to provide cost-effective computational infrastructure. It is an ideal platform for researchers who must share common resources but need to be able to scale up to massive computational requirements for specific periods of time. This paper presents the tools and techniques developed to allow the open source TORQUE distributed resource manager and Maui cluster scheduler to dynamically integrate OpenStack cloud resources into existing high throughput computing clusters.

  13. Consolidation of cloud computing in ATLAS

    NASA Astrophysics Data System (ADS)

    Taylor, Ryan P.; Domingues Cordeiro, Cristovao Jose; Giordano, Domenico; Hover, John; Kouba, Tomas; Love, Peter; McNab, Andrew; Schovancova, Jaroslava; Sobie, Randall; ATLAS Collaboration

    2017-10-01

    Throughout the first half of LHC Run 2, ATLAS cloud computing has undergone a period of consolidation, characterized by building upon previously established systems, with the aim of reducing operational effort, improving robustness, and reaching higher scale. This paper describes the current state of ATLAS cloud computing. Cloud activities are converging on a common contextualization approach for virtual machines, and cloud resources are sharing monitoring and service discovery components. We describe the integration of Vacuum resources, streamlined usage of the Simulation at Point 1 cloud for offline processing, extreme scaling on Amazon compute resources, and procurement of commercial cloud capacity in Europe. Finally, building on the previously established monitoring infrastructure, we have deployed a real-time monitoring and alerting platform which coalesces data from multiple sources, provides flexible visualization via customizable dashboards, and issues alerts and carries out corrective actions in response to problems.

  14. A Distributed Parallel Genetic Algorithm of Placement Strategy for Virtual Machines Deployment on Cloud Platform

    PubMed Central

    Dong, Yu-Shuang; Xu, Gao-Chao; Fu, Xiao-Dong

    2014-01-01

    The cloud platform provides various services to users. More and more cloud centers provide infrastructure as the main way of operating. To improve the utilization rate of the cloud center and to decrease the operating cost, the cloud center provides services according to requirements of users by sharding the resources with virtualization. Considering both QoS for users and cost saving for cloud computing providers, we try to maximize performance and minimize energy cost as well. In this paper, we propose a distributed parallel genetic algorithm (DPGA) of placement strategy for virtual machines deployment on cloud platform. It executes the genetic algorithm parallelly and distributedly on several selected physical hosts in the first stage. Then it continues to execute the genetic algorithm of the second stage with solutions obtained from the first stage as the initial population. The solution calculated by the genetic algorithm of the second stage is the optimal one of the proposed approach. The experimental results show that the proposed placement strategy of VM deployment can ensure QoS for users and it is more effective and more energy efficient than other placement strategies on the cloud platform. PMID:25097872

  15. A distributed parallel genetic algorithm of placement strategy for virtual machines deployment on cloud platform.

    PubMed

    Dong, Yu-Shuang; Xu, Gao-Chao; Fu, Xiao-Dong

    2014-01-01

    The cloud platform provides various services to users. More and more cloud centers provide infrastructure as the main way of operating. To improve the utilization rate of the cloud center and to decrease the operating cost, the cloud center provides services according to requirements of users by sharding the resources with virtualization. Considering both QoS for users and cost saving for cloud computing providers, we try to maximize performance and minimize energy cost as well. In this paper, we propose a distributed parallel genetic algorithm (DPGA) of placement strategy for virtual machines deployment on cloud platform. It executes the genetic algorithm parallelly and distributedly on several selected physical hosts in the first stage. Then it continues to execute the genetic algorithm of the second stage with solutions obtained from the first stage as the initial population. The solution calculated by the genetic algorithm of the second stage is the optimal one of the proposed approach. The experimental results show that the proposed placement strategy of VM deployment can ensure QoS for users and it is more effective and more energy efficient than other placement strategies on the cloud platform.

  16. Teaching Cybersecurity Using the Cloud

    ERIC Educational Resources Information Center

    Salah, Khaled; Hammoud, Mohammad; Zeadally, Sherali

    2015-01-01

    Cloud computing platforms can be highly attractive to conduct course assignments and empower students with valuable and indispensable hands-on experience. In particular, the cloud can offer teaching staff and students (whether local or remote) on-demand, elastic, dedicated, isolated, (virtually) unlimited, and easily configurable virtual machines.…

  17. Mobile healthcare information management utilizing Cloud Computing and Android OS.

    PubMed

    Doukas, Charalampos; Pliakas, Thomas; Maglogiannis, Ilias

    2010-01-01

    Cloud Computing provides functionality for managing information data in a distributed, ubiquitous and pervasive manner supporting several platforms, systems and applications. This work presents the implementation of a mobile system that enables electronic healthcare data storage, update and retrieval using Cloud Computing. The mobile application is developed using Google's Android operating system and provides management of patient health records and medical images (supporting DICOM format and JPEG2000 coding). The developed system has been evaluated using the Amazon's S3 cloud service. This article summarizes the implementation details and presents initial results of the system in practice.

  18. Retrieving and Indexing Spatial Data in the Cloud Computing Environment

    NASA Astrophysics Data System (ADS)

    Wang, Yonggang; Wang, Sheng; Zhou, Daliang

    In order to solve the drawbacks of spatial data storage in common Cloud Computing platform, we design and present a framework for retrieving, indexing, accessing and managing spatial data in the Cloud environment. An interoperable spatial data object model is provided based on the Simple Feature Coding Rules from the OGC such as Well Known Binary (WKB) and Well Known Text (WKT). And the classic spatial indexing algorithms like Quad-Tree and R-Tree are re-designed in the Cloud Computing environment. In the last we develop a prototype software based on Google App Engine to implement the proposed model.

  19. Satellite Cloud and Radiative Property Processing and Distribution System on the NASA Langley ASDC OpenStack and OpenShift Cloud Platform

    NASA Astrophysics Data System (ADS)

    Nguyen, L.; Chee, T.; Palikonda, R.; Smith, W. L., Jr.; Bedka, K. M.; Spangenberg, D.; Vakhnin, A.; Lutz, N. E.; Walter, J.; Kusterer, J.

    2017-12-01

    Cloud Computing offers new opportunities for large-scale scientific data producers to utilize Infrastructure-as-a-Service (IaaS) and Platform-as-a-Service (PaaS) IT resources to process and deliver data products in an operational environment where timely delivery, reliability, and availability are critical. The NASA Langley Research Center Atmospheric Science Data Center (ASDC) is building and testing a private and public facing cloud for users in the Science Directorate to utilize as an everyday production environment. The NASA SatCORPS (Satellite ClOud and Radiation Property Retrieval System) team processes and derives near real-time (NRT) global cloud products from operational geostationary (GEO) satellite imager datasets. To deliver these products, we will utilize the public facing cloud and OpenShift to deploy a load-balanced webserver for data storage, access, and dissemination. The OpenStack private cloud will host data ingest and computational capabilities for SatCORPS processing. This paper will discuss the SatCORPS migration towards, and usage of, the ASDC Cloud Services in an operational environment. Detailed lessons learned from use of prior cloud providers, specifically the Amazon Web Services (AWS) GovCloud and the Government Cloud administered by the Langley Managed Cloud Environment (LMCE) will also be discussed.

  20. Cloud computing in medical imaging.

    PubMed

    Kagadis, George C; Kloukinas, Christos; Moore, Kevin; Philbin, Jim; Papadimitroulas, Panagiotis; Alexakos, Christos; Nagy, Paul G; Visvikis, Dimitris; Hendee, William R

    2013-07-01

    Over the past century technology has played a decisive role in defining, driving, and reinventing procedures, devices, and pharmaceuticals in healthcare. Cloud computing has been introduced only recently but is already one of the major topics of discussion in research and clinical settings. The provision of extensive, easily accessible, and reconfigurable resources such as virtual systems, platforms, and applications with low service cost has caught the attention of many researchers and clinicians. Healthcare researchers are moving their efforts to the cloud, because they need adequate resources to process, store, exchange, and use large quantities of medical data. This Vision 20/20 paper addresses major questions related to the applicability of advanced cloud computing in medical imaging. The paper also considers security and ethical issues that accompany cloud computing.

  1. The performance of low-cost commercial cloud computing as an alternative in computational chemistry.

    PubMed

    Thackston, Russell; Fortenberry, Ryan C

    2015-05-05

    The growth of commercial cloud computing (CCC) as a viable means of computational infrastructure is largely unexplored for the purposes of quantum chemistry. In this work, the PSI4 suite of computational chemistry programs is installed on five different types of Amazon World Services CCC platforms. The performance for a set of electronically excited state single-point energies is compared between these CCC platforms and typical, "in-house" physical machines. Further considerations are made for the number of cores or virtual CPUs (vCPUs, for the CCC platforms), but no considerations are made for full parallelization of the program (even though parallelization of the BLAS library is implemented), complete high-performance computing cluster utilization, or steal time. Even with this most pessimistic view of the computations, CCC resources are shown to be more cost effective for significant numbers of typical quantum chemistry computations. Large numbers of large computations are still best utilized by more traditional means, but smaller-scale research may be more effectively undertaken through CCC services. © 2015 Wiley Periodicals, Inc.

  2. Cloud Computing and Validated Learning for Accelerating Innovation in IoT

    ERIC Educational Resources Information Center

    Suciu, George; Todoran, Gyorgy; Vulpe, Alexandru; Suciu, Victor; Bulca, Cristina; Cheveresan, Romulus

    2015-01-01

    Innovation in Internet of Things (IoT) requires more than just creation of technology and use of cloud computing or big data platforms. It requires accelerated commercialization or aptly called go-to-market processes. To successfully accelerate, companies need a new type of product development, the so-called validated learning process.…

  3. A Framework for Collaborative and Convenient Learning on Cloud Computing Platforms

    ERIC Educational Resources Information Center

    Sharma, Deepika; Kumar, Vikas

    2017-01-01

    The depth of learning resides in collaborative work with more engagement and fun. Technology can enhance collaboration with a higher level of convenience and cloud computing can facilitate this in a cost effective and scalable manner. However, to deploy a successful online learning environment, elementary components of learning pedagogy must be…

  4. Elastic Cloud Computing Architecture and System for Heterogeneous Spatiotemporal Computing

    NASA Astrophysics Data System (ADS)

    Shi, X.

    2017-10-01

    Spatiotemporal computation implements a variety of different algorithms. When big data are involved, desktop computer or standalone application may not be able to complete the computation task due to limited memory and computing power. Now that a variety of hardware accelerators and computing platforms are available to improve the performance of geocomputation, different algorithms may have different behavior on different computing infrastructure and platforms. Some are perfect for implementation on a cluster of graphics processing units (GPUs), while GPUs may not be useful on certain kind of spatiotemporal computation. This is the same situation in utilizing a cluster of Intel's many-integrated-core (MIC) or Xeon Phi, as well as Hadoop or Spark platforms, to handle big spatiotemporal data. Furthermore, considering the energy efficiency requirement in general computation, Field Programmable Gate Array (FPGA) may be a better solution for better energy efficiency when the performance of computation could be similar or better than GPUs and MICs. It is expected that an elastic cloud computing architecture and system that integrates all of GPUs, MICs, and FPGAs could be developed and deployed to support spatiotemporal computing over heterogeneous data types and computational problems.

  5. Cloudbus Toolkit for Market-Oriented Cloud Computing

    NASA Astrophysics Data System (ADS)

    Buyya, Rajkumar; Pandey, Suraj; Vecchiola, Christian

    This keynote paper: (1) presents the 21st century vision of computing and identifies various IT paradigms promising to deliver computing as a utility; (2) defines the architecture for creating market-oriented Clouds and computing atmosphere by leveraging technologies such as virtual machines; (3) provides thoughts on market-based resource management strategies that encompass both customer-driven service management and computational risk management to sustain SLA-oriented resource allocation; (4) presents the work carried out as part of our new Cloud Computing initiative, called Cloudbus: (i) Aneka, a Platform as a Service software system containing SDK (Software Development Kit) for construction of Cloud applications and deployment on private or public Clouds, in addition to supporting market-oriented resource management; (ii) internetworking of Clouds for dynamic creation of federated computing environments for scaling of elastic applications; (iii) creation of 3rd party Cloud brokering services for building content delivery networks and e-Science applications and their deployment on capabilities of IaaS providers such as Amazon along with Grid mashups; (iv) CloudSim supporting modelling and simulation of Clouds for performance studies; (v) Energy Efficient Resource Allocation Mechanisms and Techniques for creation and management of Green Clouds; and (vi) pathways for future research.

  6. Finding Tropical Cyclones on a Cloud Computing Cluster: Using Parallel Virtualization for Large-Scale Climate Simulation Analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hasenkamp, Daren; Sim, Alexander; Wehner, Michael

    Extensive computing power has been used to tackle issues such as climate changes, fusion energy, and other pressing scientific challenges. These computations produce a tremendous amount of data; however, many of the data analysis programs currently only run a single processor. In this work, we explore the possibility of using the emerging cloud computing platform to parallelize such sequential data analysis tasks. As a proof of concept, we wrap a program for analyzing trends of tropical cyclones in a set of virtual machines (VMs). This approach allows the user to keep their familiar data analysis environment in the VMs, whilemore » we provide the coordination and data transfer services to ensure the necessary input and output are directed to the desired locations. This work extensively exercises the networking capability of the cloud computing systems and has revealed a number of weaknesses in the current cloud system software. In our tests, we are able to scale the parallel data analysis job to a modest number of VMs and achieve a speedup that is comparable to running the same analysis task using MPI. However, compared to MPI based parallelization, the cloud-based approach has a number of advantages. The cloud-based approach is more flexible because the VMs can capture arbitrary software dependencies without requiring the user to rewrite their programs. The cloud-based approach is also more resilient to failure; as long as a single VM is running, it can make progress while as soon as one MPI node fails the whole analysis job fails. In short, this initial work demonstrates that a cloud computing system is a viable platform for distributed scientific data analyses traditionally conducted on dedicated supercomputing systems.« less

  7. Toward a web-based real-time radiation treatment planning system in a cloud computing environment.

    PubMed

    Na, Yong Hum; Suh, Tae-Suk; Kapp, Daniel S; Xing, Lei

    2013-09-21

    To exploit the potential dosimetric advantages of intensity modulated radiation therapy (IMRT) and volumetric modulated arc therapy (VMAT), an in-depth approach is required to provide efficient computing methods. This needs to incorporate clinically related organ specific constraints, Monte Carlo (MC) dose calculations, and large-scale plan optimization. This paper describes our first steps toward a web-based real-time radiation treatment planning system in a cloud computing environment (CCE). The Amazon Elastic Compute Cloud (EC2) with a master node (named m2.xlarge containing 17.1 GB of memory, two virtual cores with 3.25 EC2 Compute Units each, 420 GB of instance storage, 64-bit platform) is used as the backbone of cloud computing for dose calculation and plan optimization. The master node is able to scale the workers on an 'on-demand' basis. MC dose calculation is employed to generate accurate beamlet dose kernels by parallel tasks. The intensity modulation optimization uses total-variation regularization (TVR) and generates piecewise constant fluence maps for each initial beam direction in a distributed manner over the CCE. The optimized fluence maps are segmented into deliverable apertures. The shape of each aperture is iteratively rectified to be a sequence of arcs using the manufacture's constraints. The output plan file from the EC2 is sent to the simple storage service. Three de-identified clinical cancer treatment plans have been studied for evaluating the performance of the new planning platform with 6 MV flattening filter free beams (40 × 40 cm(2)) from the Varian TrueBeam(TM) STx linear accelerator. A CCE leads to speed-ups of up to 14-fold for both dose kernel calculations and plan optimizations in the head and neck, lung, and prostate cancer cases considered in this study. The proposed system relies on a CCE that is able to provide an infrastructure for parallel and distributed computing. The resultant plans from the cloud computing are identical to PC-based IMRT and VMAT plans, confirming the reliability of the cloud computing platform. This cloud computing infrastructure has been established for a radiation treatment planning. It substantially improves the speed of inverse planning and makes future on-treatment adaptive re-planning possible.

  8. Toward a web-based real-time radiation treatment planning system in a cloud computing environment

    NASA Astrophysics Data System (ADS)

    Hum Na, Yong; Suh, Tae-Suk; Kapp, Daniel S.; Xing, Lei

    2013-09-01

    To exploit the potential dosimetric advantages of intensity modulated radiation therapy (IMRT) and volumetric modulated arc therapy (VMAT), an in-depth approach is required to provide efficient computing methods. This needs to incorporate clinically related organ specific constraints, Monte Carlo (MC) dose calculations, and large-scale plan optimization. This paper describes our first steps toward a web-based real-time radiation treatment planning system in a cloud computing environment (CCE). The Amazon Elastic Compute Cloud (EC2) with a master node (named m2.xlarge containing 17.1 GB of memory, two virtual cores with 3.25 EC2 Compute Units each, 420 GB of instance storage, 64-bit platform) is used as the backbone of cloud computing for dose calculation and plan optimization. The master node is able to scale the workers on an ‘on-demand’ basis. MC dose calculation is employed to generate accurate beamlet dose kernels by parallel tasks. The intensity modulation optimization uses total-variation regularization (TVR) and generates piecewise constant fluence maps for each initial beam direction in a distributed manner over the CCE. The optimized fluence maps are segmented into deliverable apertures. The shape of each aperture is iteratively rectified to be a sequence of arcs using the manufacture’s constraints. The output plan file from the EC2 is sent to the simple storage service. Three de-identified clinical cancer treatment plans have been studied for evaluating the performance of the new planning platform with 6 MV flattening filter free beams (40 × 40 cm2) from the Varian TrueBeamTM STx linear accelerator. A CCE leads to speed-ups of up to 14-fold for both dose kernel calculations and plan optimizations in the head and neck, lung, and prostate cancer cases considered in this study. The proposed system relies on a CCE that is able to provide an infrastructure for parallel and distributed computing. The resultant plans from the cloud computing are identical to PC-based IMRT and VMAT plans, confirming the reliability of the cloud computing platform. This cloud computing infrastructure has been established for a radiation treatment planning. It substantially improves the speed of inverse planning and makes future on-treatment adaptive re-planning possible.

  9. Architecture Design of Healthcare Software-as-a-Service Platform for Cloud-Based Clinical Decision Support Service

    PubMed Central

    Oh, Sungyoung; Cha, Jieun; Ji, Myungkyu; Kang, Hyekyung; Kim, Seok; Heo, Eunyoung; Han, Jong Soo; Kang, Hyunggoo; Chae, Hoseok; Hwang, Hee

    2015-01-01

    Objectives To design a cloud computing-based Healthcare Software-as-a-Service (SaaS) Platform (HSP) for delivering healthcare information services with low cost, high clinical value, and high usability. Methods We analyzed the architecture requirements of an HSP, including the interface, business services, cloud SaaS, quality attributes, privacy and security, and multi-lingual capacity. For cloud-based SaaS services, we focused on Clinical Decision Service (CDS) content services, basic functional services, and mobile services. Microsoft's Azure cloud computing for Infrastructure-as-a-Service (IaaS) and Platform-as-a-Service (PaaS) was used. Results The functional and software views of an HSP were designed in a layered architecture. External systems can be interfaced with the HSP using SOAP and REST/JSON. The multi-tenancy model of the HSP was designed as a shared database, with a separate schema for each tenant through a single application, although healthcare data can be physically located on a cloud or in a hospital, depending on regulations. The CDS services were categorized into rule-based services for medications, alert registration services, and knowledge services. Conclusions We expect that cloud-based HSPs will allow small and mid-sized hospitals, in addition to large-sized hospitals, to adopt information infrastructures and health information technology with low system operation and maintenance costs. PMID:25995962

  10. Enabling BOINC in infrastructure as a service cloud system

    NASA Astrophysics Data System (ADS)

    Montes, Diego; Añel, Juan A.; Pena, Tomás F.; Uhe, Peter; Wallom, David C. H.

    2017-02-01

    Volunteer or crowd computing is becoming increasingly popular for solving complex research problems from an increasingly diverse range of areas. The majority of these have been built using the Berkeley Open Infrastructure for Network Computing (BOINC) platform, which provides a range of different services to manage all computation aspects of a project. The BOINC system is ideal in those cases where not only does the research community involved need low-cost access to massive computing resources but also where there is a significant public interest in the research being done.We discuss the way in which cloud services can help BOINC-based projects to deliver results in a fast, on demand manner. This is difficult to achieve using volunteers, and at the same time, using scalable cloud resources for short on demand projects can optimize the use of the available resources. We show how this design can be used as an efficient distributed computing platform within the cloud, and outline new approaches that could open up new possibilities in this field, using Climateprediction.net (http://www.climateprediction.net/) as a case study.

  11. Construction and application of Red5 cluster based on OpenStack

    NASA Astrophysics Data System (ADS)

    Wang, Jiaqing; Song, Jianxin

    2017-08-01

    With the application and development of cloud computing technology in various fields, the resource utilization rate of the data center has been improved obviously, and the system based on cloud computing platform has also improved the expansibility and stability. In the traditional way, Red5 cluster resource utilization is low and the system stability is poor. This paper uses cloud computing to efficiently calculate the resource allocation ability, and builds a Red5 server cluster based on OpenStack. Multimedia applications can be published to the Red5 cloud server cluster. The system achieves the flexible construction of computing resources, but also greatly improves the stability of the cluster and service efficiency.

  12. Capturing and analyzing wheelchair maneuvering patterns with mobile cloud computing.

    PubMed

    Fu, Jicheng; Hao, Wei; White, Travis; Yan, Yuqing; Jones, Maria; Jan, Yih-Kuen

    2013-01-01

    Power wheelchairs have been widely used to provide independent mobility to people with disabilities. Despite great advancements in power wheelchair technology, research shows that wheelchair related accidents occur frequently. To ensure safe maneuverability, capturing wheelchair maneuvering patterns is fundamental to enable other research, such as safe robotic assistance for wheelchair users. In this study, we propose to record, store, and analyze wheelchair maneuvering data by means of mobile cloud computing. Specifically, the accelerometer and gyroscope sensors in smart phones are used to record wheelchair maneuvering data in real-time. Then, the recorded data are periodically transmitted to the cloud for storage and analysis. The analyzed results are then made available to various types of users, such as mobile phone users, traditional desktop users, etc. The combination of mobile computing and cloud computing leverages the advantages of both techniques and extends the smart phone's capabilities of computing and data storage via the Internet. We performed a case study to implement the mobile cloud computing framework using Android smart phones and Google App Engine, a popular cloud computing platform. Experimental results demonstrated the feasibility of the proposed mobile cloud computing framework.

  13. Interfacing HTCondor-CE with OpenStack

    NASA Astrophysics Data System (ADS)

    Bockelman, B.; Caballero Bejar, J.; Hover, J.

    2017-10-01

    Over the past few years, Grid Computing technologies have reached a high level of maturity. One key aspect of this success has been the development and adoption of newer Compute Elements to interface the external Grid users with local batch systems. These new Compute Elements allow for better handling of jobs requirements and a more precise management of diverse local resources. However, despite this level of maturity, the Grid Computing world is lacking diversity in local execution platforms. As Grid Computing technologies have historically been driven by the needs of the High Energy Physics community, most resource providers run the platform (operating system version and architecture) that best suits the needs of their particular users. In parallel, the development of virtualization and cloud technologies has accelerated recently, making available a variety of solutions, both commercial and academic, proprietary and open source. Virtualization facilitates performing computational tasks on platforms not available at most computing sites. This work attempts to join the technologies, allowing users to interact with computing sites through one of the standard Computing Elements, HTCondor-CE, but running their jobs within VMs on a local cloud platform, OpenStack, when needed. The system will re-route, in a transparent way, end user jobs into dynamically-launched VM worker nodes when they have requirements that cannot be satisfied by the static local batch system nodes. Also, once the automated mechanisms are in place, it becomes straightforward to allow an end user to invoke a custom Virtual Machine at the site. This will allow cloud resources to be used without requiring the user to establish a separate account. Both scenarios are described in this work.

  14. Cloud computing and validation of expandable in silico livers.

    PubMed

    Ropella, Glen E P; Hunt, C Anthony

    2010-12-03

    In Silico Livers (ISLs) are works in progress. They are used to challenge multilevel, multi-attribute, mechanistic hypotheses about the hepatic disposition of xenobiotics coupled with hepatic responses. To enhance ISL-to-liver mappings, we added discrete time metabolism, biliary elimination, and bolus dosing features to a previously validated ISL and initiated re-validated experiments that required scaling experiments to use more simulated lobules than previously, more than could be achieved using the local cluster technology. Rather than dramatically increasing the size of our local cluster we undertook the re-validation experiments using the Amazon EC2 cloud platform. So doing required demonstrating the efficacy of scaling a simulation to use more cluster nodes and assessing the scientific equivalence of local cluster validation experiments with those executed using the cloud platform. The local cluster technology was duplicated in the Amazon EC2 cloud platform. Synthetic modeling protocols were followed to identify a successful parameterization. Experiment sample sizes (number of simulated lobules) on both platforms were 49, 70, 84, and 152 (cloud only). Experimental indistinguishability was demonstrated for ISL outflow profiles of diltiazem using both platforms for experiments consisting of 84 or more samples. The process was analogous to demonstration of results equivalency from two different wet-labs. The results provide additional evidence that disposition simulations using ISLs can cover the behavior space of liver experiments in distinct experimental contexts (there is in silico-to-wet-lab phenotype similarity). The scientific value of experimenting with multiscale biomedical models has been limited to research groups with access to computer clusters. The availability of cloud technology coupled with the evidence of scientific equivalency has lowered the barrier and will greatly facilitate model sharing as well as provide straightforward tools for scaling simulations to encompass greater detail with no extra investment in hardware.

  15. IAServ: an intelligent home care web services platform in a cloud for aging-in-place.

    PubMed

    Su, Chuan-Jun; Chiang, Chang-Yu

    2013-11-12

    As the elderly population has been rapidly expanding and the core tax-paying population has been shrinking, the need for adequate elderly health and housing services continues to grow while the resources to provide such services are becoming increasingly scarce. Thus, increasing the efficiency of the delivery of healthcare services through the use of modern technology is a pressing issue. The seamless integration of such enabling technologies as ontology, intelligent agents, web services, and cloud computing is transforming healthcare from hospital-based treatments to home-based self-care and preventive care. A ubiquitous healthcare platform based on this technological integration, which synergizes service providers with patients' needs to be developed to provide personalized healthcare services at the right time, in the right place, and the right manner. This paper presents the development and overall architecture of IAServ (the Intelligent Aging-in-place Home care Web Services Platform) to provide personalized healthcare service ubiquitously in a cloud computing setting to support the most desirable and cost-efficient method of care for the aged-aging in place. The IAServ is expected to offer intelligent, pervasive, accurate and contextually-aware personal care services. Architecturally the implemented IAServ leverages web services and cloud computing to provide economic, scalable, and robust healthcare services over the Internet.

  16. IAServ: An Intelligent Home Care Web Services Platform in a Cloud for Aging-in-Place

    PubMed Central

    Su, Chuan-Jun; Chiang, Chang-Yu

    2013-01-01

    As the elderly population has been rapidly expanding and the core tax-paying population has been shrinking, the need for adequate elderly health and housing services continues to grow while the resources to provide such services are becoming increasingly scarce. Thus, increasing the efficiency of the delivery of healthcare services through the use of modern technology is a pressing issue. The seamless integration of such enabling technologies as ontology, intelligent agents, web services, and cloud computing is transforming healthcare from hospital-based treatments to home-based self-care and preventive care. A ubiquitous healthcare platform based on this technological integration, which synergizes service providers with patients’ needs to be developed to provide personalized healthcare services at the right time, in the right place, and the right manner. This paper presents the development and overall architecture of IAServ (the Intelligent Aging-in-place Home care Web Services Platform) to provide personalized healthcare service ubiquitously in a cloud computing setting to support the most desirable and cost-efficient method of care for the aged-aging in place. The IAServ is expected to offer intelligent, pervasive, accurate and contextually-aware personal care services. Architecturally the implemented IAServ leverages web services and cloud computing to provide economic, scalable, and robust healthcare services over the Internet. PMID:24225647

  17. Bionimbus: a cloud for managing, analyzing and sharing large genomics datasets

    PubMed Central

    Heath, Allison P; Greenway, Matthew; Powell, Raymond; Spring, Jonathan; Suarez, Rafael; Hanley, David; Bandlamudi, Chai; McNerney, Megan E; White, Kevin P; Grossman, Robert L

    2014-01-01

    Background As large genomics and phenotypic datasets are becoming more common, it is increasingly difficult for most researchers to access, manage, and analyze them. One possible approach is to provide the research community with several petabyte-scale cloud-based computing platforms containing these data, along with tools and resources to analyze it. Methods Bionimbus is an open source cloud-computing platform that is based primarily upon OpenStack, which manages on-demand virtual machines that provide the required computational resources, and GlusterFS, which is a high-performance clustered file system. Bionimbus also includes Tukey, which is a portal, and associated middleware that provides a single entry point and a single sign on for the various Bionimbus resources; and Yates, which automates the installation, configuration, and maintenance of the software infrastructure required. Results Bionimbus is used by a variety of projects to process genomics and phenotypic data. For example, it is used by an acute myeloid leukemia resequencing project at the University of Chicago. The project requires several computational pipelines, including pipelines for quality control, alignment, variant calling, and annotation. For each sample, the alignment step requires eight CPUs for about 12 h. BAM file sizes ranged from 5 GB to 10 GB for each sample. Conclusions Most members of the research community have difficulty downloading large genomics datasets and obtaining sufficient storage and computer resources to manage and analyze the data. Cloud computing platforms, such as Bionimbus, with data commons that contain large genomics datasets, are one choice for broadening access to research data in genomics. PMID:24464852

  18. MOLNs: A CLOUD PLATFORM FOR INTERACTIVE, REPRODUCIBLE, AND SCALABLE SPATIAL STOCHASTIC COMPUTATIONAL EXPERIMENTS IN SYSTEMS BIOLOGY USING PyURDME.

    PubMed

    Drawert, Brian; Trogdon, Michael; Toor, Salman; Petzold, Linda; Hellander, Andreas

    2016-01-01

    Computational experiments using spatial stochastic simulations have led to important new biological insights, but they require specialized tools and a complex software stack, as well as large and scalable compute and data analysis resources due to the large computational cost associated with Monte Carlo computational workflows. The complexity of setting up and managing a large-scale distributed computation environment to support productive and reproducible modeling can be prohibitive for practitioners in systems biology. This results in a barrier to the adoption of spatial stochastic simulation tools, effectively limiting the type of biological questions addressed by quantitative modeling. In this paper, we present PyURDME, a new, user-friendly spatial modeling and simulation package, and MOLNs, a cloud computing appliance for distributed simulation of stochastic reaction-diffusion models. MOLNs is based on IPython and provides an interactive programming platform for development of sharable and reproducible distributed parallel computational experiments.

  19. Bioinformatics clouds for big data manipulation.

    PubMed

    Dai, Lin; Gao, Xin; Guo, Yan; Xiao, Jingfa; Zhang, Zhang

    2012-11-28

    As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics. This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor.

  20. Mapping urban green open space in Bontang city using QGIS and cloud computing

    NASA Astrophysics Data System (ADS)

    Agus, F.; Ramadiani; Silalahi, W.; Armanda, A.; Kusnandar

    2018-04-01

    Digital mapping techniques are available freely and openly so that map-based application development is easier, faster and cheaper. A rapid development of Cloud Computing Geographic Information System makes this system can help the needs of the community for the provision of geospatial information online. The presence of urban Green Open Space (GOS) provide great benefits as an oxygen supplier, carbon-binding agent and can contribute to providing comfort and beauty of city life. This study aims to propose a platform application of GIS Cloud Computing (CC) of Bontang City GOS mapping. The GIS-CC platform uses the basic map available that’s free and open source. The research used survey method to collect GOS data obtained from Bontang City Government, while application developing works Quantum GIS-CC. The result section describes the existence of GOS Bontang City and the design of GOS mapping application.

  1. Launching genomics into the cloud: deployment of Mercury, a next generation sequence analysis pipeline.

    PubMed

    Reid, Jeffrey G; Carroll, Andrew; Veeraraghavan, Narayanan; Dahdouli, Mahmoud; Sundquist, Andreas; English, Adam; Bainbridge, Matthew; White, Simon; Salerno, William; Buhay, Christian; Yu, Fuli; Muzny, Donna; Daly, Richard; Duyk, Geoff; Gibbs, Richard A; Boerwinkle, Eric

    2014-01-29

    Massively parallel DNA sequencing generates staggering amounts of data. Decreasing cost, increasing throughput, and improved annotation have expanded the diversity of genomics applications in research and clinical practice. This expanding scale creates analytical challenges: accommodating peak compute demand, coordinating secure access for multiple analysts, and sharing validated tools and results. To address these challenges, we have developed the Mercury analysis pipeline and deployed it in local hardware and the Amazon Web Services cloud via the DNAnexus platform. Mercury is an automated, flexible, and extensible analysis workflow that provides accurate and reproducible genomic results at scales ranging from individuals to large cohorts. By taking advantage of cloud computing and with Mercury implemented on the DNAnexus platform, we have demonstrated a powerful combination of a robust and fully validated software pipeline and a scalable computational resource that, to date, we have applied to more than 10,000 whole genome and whole exome samples.

  2. [Construction and analysis of a monitoring system with remote real-time multiple physiological parameters based on cloud computing].

    PubMed

    Zhu, Lingyun; Li, Lianjie; Meng, Chunyan

    2014-12-01

    There have been problems in the existing multiple physiological parameter real-time monitoring system, such as insufficient server capacity for physiological data storage and analysis so that data consistency can not be guaranteed, poor performance in real-time, and other issues caused by the growing scale of data. We therefore pro posed a new solution which was with multiple physiological parameters and could calculate clustered background data storage and processing based on cloud computing. Through our studies, a batch processing for longitudinal analysis of patients' historical data was introduced. The process included the resource virtualization of IaaS layer for cloud platform, the construction of real-time computing platform of PaaS layer, the reception and analysis of data stream of SaaS layer, and the bottleneck problem of multi-parameter data transmission, etc. The results were to achieve in real-time physiological information transmission, storage and analysis of a large amount of data. The simulation test results showed that the remote multiple physiological parameter monitoring system based on cloud platform had obvious advantages in processing time and load balancing over the traditional server model. This architecture solved the problems including long turnaround time, poor performance of real-time analysis, lack of extensibility and other issues, which exist in the traditional remote medical services. Technical support was provided in order to facilitate a "wearable wireless sensor plus mobile wireless transmission plus cloud computing service" mode moving towards home health monitoring for multiple physiological parameter wireless monitoring.

  3. The Globus Galaxies Platform. Delivering Science Gateways as a Service

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Madduri, Ravi; Chard, Kyle; Chard, Ryan

    We use public cloud computers to host sophisticated scientific data; software is then used to transform scientific practice by enabling broad access to capabilities previously available only to the few. The primary obstacle to more widespread use of public clouds to host scientific software (‘cloud-based science gateways’) has thus far been the considerable gap between the specialized needs of science applications and the capabilities provided by cloud infrastructures. We describe here a domain-independent, cloud-based science gateway platform, the Globus Galaxies platform, which overcomes this gap by providing a set of hosted services that directly address the needs of science gatewaymore » developers. The design and implementation of this platform leverages our several years of experience with Globus Genomics, a cloud-based science gateway that has served more than 200 genomics researchers across 30 institutions. Building on that foundation, we have also implemented a platform that leverages the popular Galaxy system for application hosting and workflow execution; Globus services for data transfer, user and group management, and authentication; and a cost-aware elastic provisioning model specialized for public cloud resources. We describe here the capabilities and architecture of this platform, present six scientific domains in which we have successfully applied it, report on user experiences, and analyze the economics of our deployments. Published 2015. This article is a U.S. Government work and is in the public domain in the USA.« less

  4. Creating a Rackspace and NASA Nebula compatible cloud using the OpenStack project (Invited)

    NASA Astrophysics Data System (ADS)

    Clark, R.

    2010-12-01

    NASA and Rackspace have both provided technology to the OpenStack that allows anyone to create a private Infrastructure as a Service (IaaS) cloud using open source software and commodity hardware. OpenStack is designed and developed completely in the open and with an open governance process. NASA donated Nova, which powers the compute portion of NASA Nebula Cloud Computing Platform, and Rackspace donated Swift, which powers Rackspace Cloud Files. The project is now in continuous development by NASA, Rackspace, and hundreds of other participants. When you create a private cloud using Openstack, you will have the ability to easily interact with your private cloud, a government cloud, and an ecosystem of public cloud providers, using the same API.

  5. Virtualization and cloud computing in dentistry.

    PubMed

    Chow, Frank; Muftu, Ali; Shorter, Richard

    2014-01-01

    The use of virtualization and cloud computing has changed the way we use computers. Virtualization is a method of placing software called a hypervisor on the hardware of a computer or a host operating system. It allows a guest operating system to run on top of the physical computer with a virtual machine (i.e., virtual computer). Virtualization allows multiple virtual computers to run on top of one physical computer and to share its hardware resources, such as printers, scanners, and modems. This increases the efficient use of the computer by decreasing costs (e.g., hardware, electricity administration, and management) since only one physical computer is needed and running. This virtualization platform is the basis for cloud computing. It has expanded into areas of server and storage virtualization. One of the commonly used dental storage systems is cloud storage. Patient information is encrypted as required by the Health Insurance Portability and Accountability Act (HIPAA) and stored on off-site private cloud services for a monthly service fee. As computer costs continue to increase, so too will the need for more storage and processing power. Virtual and cloud computing will be a method for dentists to minimize costs and maximize computer efficiency in the near future. This article will provide some useful information on current uses of cloud computing.

  6. Trusted computing strengthens cloud authentication.

    PubMed

    Ghazizadeh, Eghbal; Zamani, Mazdak; Ab Manan, Jamalul-lail; Alizadeh, Mojtaba

    2014-01-01

    Cloud computing is a new generation of technology which is designed to provide the commercial necessities, solve the IT management issues, and run the appropriate applications. Another entry on the list of cloud functions which has been handled internally is Identity Access Management (IAM). Companies encounter IAM as security challenges while adopting more technologies became apparent. Trust Multi-tenancy and trusted computing based on a Trusted Platform Module (TPM) are great technologies for solving the trust and security concerns in the cloud identity environment. Single sign-on (SSO) and OpenID have been released to solve security and privacy problems for cloud identity. This paper proposes the use of trusted computing, Federated Identity Management, and OpenID Web SSO to solve identity theft in the cloud. Besides, this proposed model has been simulated in .Net environment. Security analyzing, simulation, and BLP confidential model are three ways to evaluate and analyze our proposed model.

  7. Trusted Computing Strengthens Cloud Authentication

    PubMed Central

    2014-01-01

    Cloud computing is a new generation of technology which is designed to provide the commercial necessities, solve the IT management issues, and run the appropriate applications. Another entry on the list of cloud functions which has been handled internally is Identity Access Management (IAM). Companies encounter IAM as security challenges while adopting more technologies became apparent. Trust Multi-tenancy and trusted computing based on a Trusted Platform Module (TPM) are great technologies for solving the trust and security concerns in the cloud identity environment. Single sign-on (SSO) and OpenID have been released to solve security and privacy problems for cloud identity. This paper proposes the use of trusted computing, Federated Identity Management, and OpenID Web SSO to solve identity theft in the cloud. Besides, this proposed model has been simulated in .Net environment. Security analyzing, simulation, and BLP confidential model are three ways to evaluate and analyze our proposed model. PMID:24701149

  8. The design of an m-Health monitoring system based on a cloud computing platform

    NASA Astrophysics Data System (ADS)

    Xu, Boyi; Xu, Lida; Cai, Hongming; Jiang, Lihong; Luo, Yang; Gu, Yizhi

    2017-01-01

    Compared to traditional medical services provided within hospitals, m-Health monitoring systems (MHMSs) face more challenges in personalised health data processing. To achieve personalised and high-quality health monitoring by means of new technologies, such as mobile network and cloud computing, in this paper, a framework of an m-Health monitoring system based on a cloud computing platform (Cloud-MHMS) is designed to implement pervasive health monitoring. Furthermore, the modules of the framework, which are Cloud Storage and Multiple Tenants Access Control Layer, Healthcare Data Annotation Layer, and Healthcare Data Analysis Layer, are discussed. In the data storage layer, a multiple tenant access method is designed to protect patient privacy. In the data annotation layer, linked open data are adopted to augment health data interoperability semantically. In the data analysis layer, the process mining algorithm and similarity calculating method are implemented to support personalised treatment plan selection. These three modules cooperate to implement the core functions in the process of health monitoring, which are data storage, data processing, and data analysis. Finally, we study the application of our architecture in the monitoring of antimicrobial drug usage to demonstrate the usability of our method in personal healthcare analysis.

  9. Facilitating NASA Earth Science Data Processing Using Nebula Cloud Computing

    NASA Astrophysics Data System (ADS)

    Chen, A.; Pham, L.; Kempler, S.; Theobald, M.; Esfandiari, A.; Campino, J.; Vollmer, B.; Lynnes, C.

    2011-12-01

    Cloud Computing technology has been used to offer high-performance and low-cost computing and storage resources for both scientific problems and business services. Several cloud computing services have been implemented in the commercial arena, e.g. Amazon's EC2 & S3, Microsoft's Azure, and Google App Engine. There are also some research and application programs being launched in academia and governments to utilize Cloud Computing. NASA launched the Nebula Cloud Computing platform in 2008, which is an Infrastructure as a Service (IaaS) to deliver on-demand distributed virtual computers. Nebula users can receive required computing resources as a fully outsourced service. NASA Goddard Earth Science Data and Information Service Center (GES DISC) migrated several GES DISC's applications to the Nebula as a proof of concept, including: a) The Simple, Scalable, Script-based Science Processor for Measurements (S4PM) for processing scientific data; b) the Atmospheric Infrared Sounder (AIRS) data process workflow for processing AIRS raw data; and c) the GES-DISC Interactive Online Visualization ANd aNalysis Infrastructure (GIOVANNI) for online access to, analysis, and visualization of Earth science data. This work aims to evaluate the practicability and adaptability of the Nebula. The initial work focused on the AIRS data process workflow to evaluate the Nebula. The AIRS data process workflow consists of a series of algorithms being used to process raw AIRS level 0 data and output AIRS level 2 geophysical retrievals. Migrating the entire workflow to the Nebula platform is challenging, but practicable. After installing several supporting libraries and the processing code itself, the workflow is able to process AIRS data in a similar fashion to its current (non-cloud) configuration. We compared the performance of processing 2 days of AIRS level 0 data through level 2 using a Nebula virtual computer and a local Linux computer. The result shows that Nebula has significantly better performance than the local machine. Much of the difference was due to newer equipment in the Nebula than the legacy computer, which is suggestive of a potential economic advantage beyond elastic power, i.e., access to up-to-date hardware vs. legacy hardware that must be maintained past its prime to amortize the cost. In addition to a trade study of advantages and challenges of porting complex processing to the cloud, a tutorial was developed to enable further progress in utilizing the Nebula for Earth Science applications and understanding better the potential for Cloud Computing in further data- and computing-intensive Earth Science research. In particular, highly bursty computing such as that experienced in the user-demand-driven Giovanni system may become more tractable in a Cloud environment. Our future work will continue to focus on migrating more GES DISC's applications/instances, e.g. Giovanni instances, to the Nebula platform and making matured migrated applications to be in operation on the Nebula.

  10. Cloudgene: A graphical execution platform for MapReduce programs on private and public clouds

    PubMed Central

    2012-01-01

    Background The MapReduce framework enables a scalable processing and analyzing of large datasets by distributing the computational load on connected computer nodes, referred to as a cluster. In Bioinformatics, MapReduce has already been adopted to various case scenarios such as mapping next generation sequencing data to a reference genome, finding SNPs from short read data or matching strings in genotype files. Nevertheless, tasks like installing and maintaining MapReduce on a cluster system, importing data into its distributed file system or executing MapReduce programs require advanced knowledge in computer science and could thus prevent scientists from usage of currently available and useful software solutions. Results Here we present Cloudgene, a freely available platform to improve the usability of MapReduce programs in Bioinformatics by providing a graphical user interface for the execution, the import and export of data and the reproducibility of workflows on in-house (private clouds) and rented clusters (public clouds). The aim of Cloudgene is to build a standardized graphical execution environment for currently available and future MapReduce programs, which can all be integrated by using its plug-in interface. Since Cloudgene can be executed on private clusters, sensitive datasets can be kept in house at all time and data transfer times are therefore minimized. Conclusions Our results show that MapReduce programs can be integrated into Cloudgene with little effort and without adding any computational overhead to existing programs. This platform gives developers the opportunity to focus on the actual implementation task and provides scientists a platform with the aim to hide the complexity of MapReduce. In addition to MapReduce programs, Cloudgene can also be used to launch predefined systems (e.g. Cloud BioLinux, RStudio) in public clouds. Currently, five different bioinformatic programs using MapReduce and two systems are integrated and have been successfully deployed. Cloudgene is freely available at http://cloudgene.uibk.ac.at. PMID:22888776

  11. Bioinformatics clouds for big data manipulation

    PubMed Central

    2012-01-01

    Abstract As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics. Reviewers This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor. PMID:23190475

  12. Large Spatial Scale Ground Displacement Mapping through the P-SBAS Processing of Sentinel-1 Data on a Cloud Computing Environment

    NASA Astrophysics Data System (ADS)

    Casu, F.; Bonano, M.; de Luca, C.; Lanari, R.; Manunta, M.; Manzo, M.; Zinno, I.

    2017-12-01

    Since its launch in 2014, the Sentinel-1 (S1) constellation has played a key role on SAR data availability and dissemination all over the World. Indeed, the free and open access data policy adopted by the European Copernicus program together with the global coverage acquisition strategy, make the Sentinel constellation as a game changer in the Earth Observation scenario. Being the SAR data become ubiquitous, the technological and scientific challenge is focused on maximizing the exploitation of such huge data flow. In this direction, the use of innovative processing algorithms and distributed computing infrastructures, such as the Cloud Computing platforms, can play a crucial role. In this work we present a Cloud Computing solution for the advanced interferometric (DInSAR) processing chain based on the Parallel SBAS (P-SBAS) approach, aimed at processing S1 Interferometric Wide Swath (IWS) data for the generation of large spatial scale deformation time series in efficient, automatic and systematic way. Such a DInSAR chain ingests Sentinel 1 SLC images and carries out several processing steps, to finally compute deformation time series and mean deformation velocity maps. Different parallel strategies have been designed ad hoc for each processing step of the P-SBAS S1 chain, encompassing both multi-core and multi-node programming techniques, in order to maximize the computational efficiency achieved within a Cloud Computing environment and cut down the relevant processing times. The presented P-SBAS S1 processing chain has been implemented on the Amazon Web Services platform and a thorough analysis of the attained parallel performances has been performed to identify and overcome the major bottlenecks to the scalability. The presented approach is used to perform national-scale DInSAR analyses over Italy, involving the processing of more than 3000 S1 IWS images acquired from both ascending and descending orbits. Such an experiment confirms the big advantage of exploiting large computational and storage resources of Cloud Computing platforms for large scale DInSAR analysis. The presented Cloud Computing P-SBAS processing chain can be a precious tool in the perspective of developing operational services disposable for the EO scientific community related to hazard monitoring and risk prevention and mitigation.

  13. Industrial Cloud: Toward Inter-enterprise Integration

    NASA Astrophysics Data System (ADS)

    Wlodarczyk, Tomasz Wiktor; Rong, Chunming; Thorsen, Kari Anne Haaland

    Industrial cloud is introduced as a new inter-enterprise integration concept in cloud computing. The characteristics of an industrial cloud are given by its definition and architecture and compared with other general cloud concepts. The concept is then demonstrated by a practical use case, based on Integrated Operations (IO) in the Norwegian Continental Shelf (NCS), showing how industrial digital information integration platform gives competitive advantage to the companies involved. Further research and development challenges are also discussed.

  14. Green Cloud on the Horizon

    NASA Astrophysics Data System (ADS)

    Ali, Mufajjul

    This paper proposes a Green Cloud model for mobile Cloud computing. The proposed model leverage on the current trend of IaaS (Infrastructure as a Service), PaaS (Platform as a Service) and SaaS (Software as a Service), and look at new paradigm called "Network as a Service" (NaaS). The Green Cloud model proposes various Telco's revenue generating streams and services with the CaaS (Cloud as a Service) for the near future.

  15. Cloud Computing for Protein-Ligand Binding Site Comparison

    PubMed Central

    2013-01-01

    The proteome-wide analysis of protein-ligand binding sites and their interactions with ligands is important in structure-based drug design and in understanding ligand cross reactivity and toxicity. The well-known and commonly used software, SMAP, has been designed for 3D ligand binding site comparison and similarity searching of a structural proteome. SMAP can also predict drug side effects and reassign existing drugs to new indications. However, the computing scale of SMAP is limited. We have developed a high availability, high performance system that expands the comparison scale of SMAP. This cloud computing service, called Cloud-PLBS, combines the SMAP and Hadoop frameworks and is deployed on a virtual cloud computing platform. To handle the vast amount of experimental data on protein-ligand binding site pairs, Cloud-PLBS exploits the MapReduce paradigm as a management and parallelizing tool. Cloud-PLBS provides a web portal and scalability through which biologists can address a wide range of computer-intensive questions in biology and drug discovery. PMID:23762824

  16. Cloud computing for protein-ligand binding site comparison.

    PubMed

    Hung, Che-Lun; Hua, Guan-Jie

    2013-01-01

    The proteome-wide analysis of protein-ligand binding sites and their interactions with ligands is important in structure-based drug design and in understanding ligand cross reactivity and toxicity. The well-known and commonly used software, SMAP, has been designed for 3D ligand binding site comparison and similarity searching of a structural proteome. SMAP can also predict drug side effects and reassign existing drugs to new indications. However, the computing scale of SMAP is limited. We have developed a high availability, high performance system that expands the comparison scale of SMAP. This cloud computing service, called Cloud-PLBS, combines the SMAP and Hadoop frameworks and is deployed on a virtual cloud computing platform. To handle the vast amount of experimental data on protein-ligand binding site pairs, Cloud-PLBS exploits the MapReduce paradigm as a management and parallelizing tool. Cloud-PLBS provides a web portal and scalability through which biologists can address a wide range of computer-intensive questions in biology and drug discovery.

  17. Waggle: A Framework for Intelligent Attentive Sensing and Actuation

    NASA Astrophysics Data System (ADS)

    Sankaran, R.; Jacob, R. L.; Beckman, P. H.; Catlett, C. E.; Keahey, K.

    2014-12-01

    Advances in sensor-driven computation and computationally steered sensing will greatly enable future research in fields including environmental and atmospheric sciences. We will present "Waggle," an open-source hardware and software infrastructure developed with two goals: (1) reducing the separation and latency between sensing and computing and (2) improving the reliability and longevity of sensing-actuation platforms in challenging and costly deployments. Inspired by "deep-space probe" systems, the Waggle platform design includes features that can support longitudinal studies, deployments with varying communication links, and remote management capabilities. Waggle lowers the barrier for scientists to incorporate real-time data from their sensors into their computations and to manipulate the sensors or provide feedback through actuators. A standardized software and hardware design allows quick addition of new sensors/actuators and associated software in the nodes and enables them to be coupled with computational codes both insitu and on external compute infrastructure. The Waggle framework currently drives the deployment of two observational systems - a portable and self-sufficient weather platform for study of small-scale effects in Chicago's urban core and an open-ended distributed instrument in Chicago that aims to support several research pursuits across a broad range of disciplines including urban planning, microbiology and computer science. Built around open-source software, hardware, and Linux OS, the Waggle system comprises two components - the Waggle field-node and Waggle cloud-computing infrastructure. Waggle field-node affords a modular, scalable, fault-tolerant, secure, and extensible platform for hosting sensors and actuators in the field. It supports insitu computation and data storage, and integration with cloud-computing infrastructure. The Waggle cloud infrastructure is designed with the goal of scaling to several hundreds of thousands of Waggle nodes. It supports aggregating data from sensors hosted by the nodes, staging computation, relaying feedback to the nodes and serving data to end-users. We will discuss the Waggle design principles and their applicability to various observational research pursuits, and demonstrate its capabilities.

  18. Scientific Services on the Cloud

    NASA Astrophysics Data System (ADS)

    Chapman, David; Joshi, Karuna P.; Yesha, Yelena; Halem, Milt; Yesha, Yaacov; Nguyen, Phuong

    Scientific Computing was one of the first every applications for parallel and distributed computation. To this date, scientific applications remain some of the most compute intensive, and have inspired creation of petaflop compute infrastructure such as the Oak Ridge Jaguar and Los Alamos RoadRunner. Large dedicated hardware infrastructure has become both a blessing and a curse to the scientific community. Scientists are interested in cloud computing for much the same reason as businesses and other professionals. The hardware is provided, maintained, and administrated by a third party. Software abstraction and virtualization provide reliability, and fault tolerance. Graduated fees allow for multi-scale prototyping and execution. Cloud computing resources are only a few clicks away, and by far the easiest high performance distributed platform to gain access to. There may still be dedicated infrastructure for ultra-scale science, but the cloud can easily play a major part of the scientific computing initiative.

  19. Bionimbus: a cloud for managing, analyzing and sharing large genomics datasets.

    PubMed

    Heath, Allison P; Greenway, Matthew; Powell, Raymond; Spring, Jonathan; Suarez, Rafael; Hanley, David; Bandlamudi, Chai; McNerney, Megan E; White, Kevin P; Grossman, Robert L

    2014-01-01

    As large genomics and phenotypic datasets are becoming more common, it is increasingly difficult for most researchers to access, manage, and analyze them. One possible approach is to provide the research community with several petabyte-scale cloud-based computing platforms containing these data, along with tools and resources to analyze it. Bionimbus is an open source cloud-computing platform that is based primarily upon OpenStack, which manages on-demand virtual machines that provide the required computational resources, and GlusterFS, which is a high-performance clustered file system. Bionimbus also includes Tukey, which is a portal, and associated middleware that provides a single entry point and a single sign on for the various Bionimbus resources; and Yates, which automates the installation, configuration, and maintenance of the software infrastructure required. Bionimbus is used by a variety of projects to process genomics and phenotypic data. For example, it is used by an acute myeloid leukemia resequencing project at the University of Chicago. The project requires several computational pipelines, including pipelines for quality control, alignment, variant calling, and annotation. For each sample, the alignment step requires eight CPUs for about 12 h. BAM file sizes ranged from 5 GB to 10 GB for each sample. Most members of the research community have difficulty downloading large genomics datasets and obtaining sufficient storage and computer resources to manage and analyze the data. Cloud computing platforms, such as Bionimbus, with data commons that contain large genomics datasets, are one choice for broadening access to research data in genomics. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.

  20. Google Earth Engine: a new cloud-computing platform for global-scale earth observation data and analysis

    NASA Astrophysics Data System (ADS)

    Moore, R. T.; Hansen, M. C.

    2011-12-01

    Google Earth Engine is a new technology platform that enables monitoring and measurement of changes in the earth's environment, at planetary scale, on a large catalog of earth observation data. The platform offers intrinsically-parallel computational access to thousands of computers in Google's data centers. Initial efforts have focused primarily on global forest monitoring and measurement, in support of REDD+ activities in the developing world. The intent is to put this platform into the hands of scientists and developing world nations, in order to advance the broader operational deployment of existing scientific methods, and strengthen the ability for public institutions and civil society to better understand, manage and report on the state of their natural resources. Earth Engine currently hosts online nearly the complete historical Landsat archive of L5 and L7 data collected over more than twenty-five years. Newly-collected Landsat imagery is downloaded from USGS EROS Center into Earth Engine on a daily basis. Earth Engine also includes a set of historical and current MODIS data products. The platform supports generation, on-demand, of spatial and temporal mosaics, "best-pixel" composites (for example to remove clouds and gaps in satellite imagery), as well as a variety of spectral indices. Supervised learning methods are available over the Landsat data catalog. The platform also includes a new application programming framework, or "API", that allows scientists access to these computational and data resources, to scale their current algorithms or develop new ones. Under the covers of the Google Earth Engine API is an intrinsically-parallel image-processing system. Several forest monitoring applications powered by this API are currently in development and expected to be operational in 2011. Combining science with massive data and technology resources in a cloud-computing framework can offer advantages of computational speed, ease-of-use and collaboration, as well as transparency in data and methods. Methods developed for global processing of MODIS data to map land cover are being adopted for use with Landsat data. Specifically, the MODIS Vegetation Continuous Field product methodology has been applied for mapping forest extent and change at national scales using Landsat time-series data sets. Scaling this method to continental and global scales is enabled by Google Earth Engine computing capabilities. By combining the supervised learning VCF approach with the Landsat archive and cloud computing, unprecedented monitoring of land cover dynamics is enabled.

  1. Heart beats in the cloud: distributed analysis of electrophysiological ‘Big Data’ using cloud computing for epilepsy clinical research

    PubMed Central

    Sahoo, Satya S; Jayapandian, Catherine; Garg, Gaurav; Kaffashi, Farhad; Chung, Stephanie; Bozorgi, Alireza; Chen, Chien-Hun; Loparo, Kenneth; Lhatoo, Samden D; Zhang, Guo-Qiang

    2014-01-01

    Objective The rapidly growing volume of multimodal electrophysiological signal data is playing a critical role in patient care and clinical research across multiple disease domains, such as epilepsy and sleep medicine. To facilitate secondary use of these data, there is an urgent need to develop novel algorithms and informatics approaches using new cloud computing technologies as well as ontologies for collaborative multicenter studies. Materials and methods We present the Cloudwave platform, which (a) defines parallelized algorithms for computing cardiac measures using the MapReduce parallel programming framework, (b) supports real-time interaction with large volumes of electrophysiological signals, and (c) features signal visualization and querying functionalities using an ontology-driven web-based interface. Cloudwave is currently used in the multicenter National Institute of Neurological Diseases and Stroke (NINDS)-funded Prevention and Risk Identification of SUDEP (sudden unexplained death in epilepsy) Mortality (PRISM) project to identify risk factors for sudden death in epilepsy. Results Comparative evaluations of Cloudwave with traditional desktop approaches to compute cardiac measures (eg, QRS complexes, RR intervals, and instantaneous heart rate) on epilepsy patient data show one order of magnitude improvement for single-channel ECG data and 20 times improvement for four-channel ECG data. This enables Cloudwave to support real-time user interaction with signal data, which is semantically annotated with a novel epilepsy and seizure ontology. Discussion Data privacy is a critical issue in using cloud infrastructure, and cloud platforms, such as Amazon Web Services, offer features to support Health Insurance Portability and Accountability Act standards. Conclusion The Cloudwave platform is a new approach to leverage of large-scale electrophysiological data for advancing multicenter clinical research. PMID:24326538

  2. Heart beats in the cloud: distributed analysis of electrophysiological 'Big Data' using cloud computing for epilepsy clinical research.

    PubMed

    Sahoo, Satya S; Jayapandian, Catherine; Garg, Gaurav; Kaffashi, Farhad; Chung, Stephanie; Bozorgi, Alireza; Chen, Chien-Hun; Loparo, Kenneth; Lhatoo, Samden D; Zhang, Guo-Qiang

    2014-01-01

    The rapidly growing volume of multimodal electrophysiological signal data is playing a critical role in patient care and clinical research across multiple disease domains, such as epilepsy and sleep medicine. To facilitate secondary use of these data, there is an urgent need to develop novel algorithms and informatics approaches using new cloud computing technologies as well as ontologies for collaborative multicenter studies. We present the Cloudwave platform, which (a) defines parallelized algorithms for computing cardiac measures using the MapReduce parallel programming framework, (b) supports real-time interaction with large volumes of electrophysiological signals, and (c) features signal visualization and querying functionalities using an ontology-driven web-based interface. Cloudwave is currently used in the multicenter National Institute of Neurological Diseases and Stroke (NINDS)-funded Prevention and Risk Identification of SUDEP (sudden unexplained death in epilepsy) Mortality (PRISM) project to identify risk factors for sudden death in epilepsy. Comparative evaluations of Cloudwave with traditional desktop approaches to compute cardiac measures (eg, QRS complexes, RR intervals, and instantaneous heart rate) on epilepsy patient data show one order of magnitude improvement for single-channel ECG data and 20 times improvement for four-channel ECG data. This enables Cloudwave to support real-time user interaction with signal data, which is semantically annotated with a novel epilepsy and seizure ontology. Data privacy is a critical issue in using cloud infrastructure, and cloud platforms, such as Amazon Web Services, offer features to support Health Insurance Portability and Accountability Act standards. The Cloudwave platform is a new approach to leverage of large-scale electrophysiological data for advancing multicenter clinical research.

  3. AstroCloud: An Agile platform for data visualization and specific analyzes in 2D and 3D

    NASA Astrophysics Data System (ADS)

    Molina, F. Z.; Salgado, R.; Bergel, A.; Infante, A.

    2017-07-01

    Nowadays, astronomers commonly run their own tools, or distributed computational packages, for data analysis and then visualizing the results with generic applications. This chain of processes comes at high cost: (a) analyses are manually applied, they are therefore difficult to be automatized, and (b) data have to be serialized, thus increasing the cost of parsing and saving intermediary data. We are developing AstroCloud, an agile visualization multipurpose platform intended for specific analyses of astronomical images (https://astrocloudy.wordpress.com). This platform incorporates domain-specific languages which make it easily extensible. AstroCloud supports customized plug-ins, which translate into time reduction on data analysis. Moreover, it also supports 2D and 3D rendering, including interactive features in real time. AstroCloud is under development, we are currently implementing different choices for data reduction and physical analyzes.

  4. GATE Monte Carlo simulation of dose distribution using MapReduce in a cloud computing environment.

    PubMed

    Liu, Yangchuan; Tang, Yuguo; Gao, Xin

    2017-12-01

    The GATE Monte Carlo simulation platform has good application prospects of treatment planning and quality assurance. However, accurate dose calculation using GATE is time consuming. The purpose of this study is to implement a novel cloud computing method for accurate GATE Monte Carlo simulation of dose distribution using MapReduce. An Amazon Machine Image installed with Hadoop and GATE is created to set up Hadoop clusters on Amazon Elastic Compute Cloud (EC2). Macros, the input files for GATE, are split into a number of self-contained sub-macros. Through Hadoop Streaming, the sub-macros are executed by GATE in Map tasks and the sub-results are aggregated into final outputs in Reduce tasks. As an evaluation, GATE simulations were performed in a cubical water phantom for X-ray photons of 6 and 18 MeV. The parallel simulation on the cloud computing platform is as accurate as the single-threaded simulation on a local server and the simulation correctness is not affected by the failure of some worker nodes. The cloud-based simulation time is approximately inversely proportional to the number of worker nodes. For the simulation of 10 million photons on a cluster with 64 worker nodes, time decreases of 41× and 32× were achieved compared to the single worker node case and the single-threaded case, respectively. The test of Hadoop's fault tolerance showed that the simulation correctness was not affected by the failure of some worker nodes. The results verify that the proposed method provides a feasible cloud computing solution for GATE.

  5. Cloud Computing

    DTIC Science & Technology

    2009-11-12

    Service (IaaS) Software -as-a- Service ( SaaS ) Cloud Computing Types Platform-as-a- Service (PaaS) Based on Type of Capability Based on access Based...Mellon University Software -as-a- Service ( SaaS ) Application-specific capabilities, e.g., service that provides customer management Allows organizations...as a Service ( SaaS ) Model of software deployment in which a provider licenses an application to customers for use as a service on

  6. MOLNs: A CLOUD PLATFORM FOR INTERACTIVE, REPRODUCIBLE, AND SCALABLE SPATIAL STOCHASTIC COMPUTATIONAL EXPERIMENTS IN SYSTEMS BIOLOGY USING PyURDME

    PubMed Central

    Drawert, Brian; Trogdon, Michael; Toor, Salman; Petzold, Linda; Hellander, Andreas

    2017-01-01

    Computational experiments using spatial stochastic simulations have led to important new biological insights, but they require specialized tools and a complex software stack, as well as large and scalable compute and data analysis resources due to the large computational cost associated with Monte Carlo computational workflows. The complexity of setting up and managing a large-scale distributed computation environment to support productive and reproducible modeling can be prohibitive for practitioners in systems biology. This results in a barrier to the adoption of spatial stochastic simulation tools, effectively limiting the type of biological questions addressed by quantitative modeling. In this paper, we present PyURDME, a new, user-friendly spatial modeling and simulation package, and MOLNs, a cloud computing appliance for distributed simulation of stochastic reaction-diffusion models. MOLNs is based on IPython and provides an interactive programming platform for development of sharable and reproducible distributed parallel computational experiments. PMID:28190948

  7. The monitoring and managing application of cloud computing based on Internet of Things.

    PubMed

    Luo, Shiliang; Ren, Bin

    2016-07-01

    Cloud computing and the Internet of Things are the two hot points in the Internet application field. The application of the two new technologies is in hot discussion and research, but quite less on the field of medical monitoring and managing application. Thus, in this paper, we study and analyze the application of cloud computing and the Internet of Things on the medical field. And we manage to make a combination of the two techniques in the medical monitoring and managing field. The model architecture for remote monitoring cloud platform of healthcare information (RMCPHI) was established firstly. Then the RMCPHI architecture was analyzed. Finally an efficient PSOSAA algorithm was proposed for the medical monitoring and managing application of cloud computing. Simulation results showed that our proposed scheme can improve the efficiency about 50%. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  8. Launching genomics into the cloud: deployment of Mercury, a next generation sequence analysis pipeline

    PubMed Central

    2014-01-01

    Background Massively parallel DNA sequencing generates staggering amounts of data. Decreasing cost, increasing throughput, and improved annotation have expanded the diversity of genomics applications in research and clinical practice. This expanding scale creates analytical challenges: accommodating peak compute demand, coordinating secure access for multiple analysts, and sharing validated tools and results. Results To address these challenges, we have developed the Mercury analysis pipeline and deployed it in local hardware and the Amazon Web Services cloud via the DNAnexus platform. Mercury is an automated, flexible, and extensible analysis workflow that provides accurate and reproducible genomic results at scales ranging from individuals to large cohorts. Conclusions By taking advantage of cloud computing and with Mercury implemented on the DNAnexus platform, we have demonstrated a powerful combination of a robust and fully validated software pipeline and a scalable computational resource that, to date, we have applied to more than 10,000 whole genome and whole exome samples. PMID:24475911

  9. Arkas: Rapid reproducible RNAseq analysis

    PubMed Central

    Colombo, Anthony R.; J. Triche Jr, Timothy; Ramsingh, Giridharan

    2017-01-01

    The recently introduced Kallisto pseudoaligner has radically simplified the quantification of transcripts in RNA-sequencing experiments.  We offer cloud-scale RNAseq pipelines Arkas-Quantification, and Arkas-Analysis available within Illumina’s BaseSpace cloud application platform which expedites Kallisto preparatory routines, reliably calculates differential expression, and performs gene-set enrichment of REACTOME pathways .  Due to inherit inefficiencies of scale, Illumina's BaseSpace computing platform offers a massively parallel distributive environment improving data management services and data importing.   Arkas-Quantification deploys Kallisto for parallel cloud computations and is conveniently integrated downstream from the BaseSpace Sequence Read Archive (SRA) import/conversion application titled SRA Import.  Arkas-Analysis annotates the Kallisto results by extracting structured information directly from source FASTA files with per-contig metadata, calculates the differential expression and gene-set enrichment analysis on both coding genes and transcripts. The Arkas cloud pipeline supports ENSEMBL transcriptomes and can be used downstream from the SRA Import facilitating raw sequencing importing, SRA FASTQ conversion, RNA quantification and analysis steps. PMID:28868134

  10. Cloud computing and validation of expandable in silico livers

    PubMed Central

    2010-01-01

    Background In Silico Livers (ISLs) are works in progress. They are used to challenge multilevel, multi-attribute, mechanistic hypotheses about the hepatic disposition of xenobiotics coupled with hepatic responses. To enhance ISL-to-liver mappings, we added discrete time metabolism, biliary elimination, and bolus dosing features to a previously validated ISL and initiated re-validated experiments that required scaling experiments to use more simulated lobules than previously, more than could be achieved using the local cluster technology. Rather than dramatically increasing the size of our local cluster we undertook the re-validation experiments using the Amazon EC2 cloud platform. So doing required demonstrating the efficacy of scaling a simulation to use more cluster nodes and assessing the scientific equivalence of local cluster validation experiments with those executed using the cloud platform. Results The local cluster technology was duplicated in the Amazon EC2 cloud platform. Synthetic modeling protocols were followed to identify a successful parameterization. Experiment sample sizes (number of simulated lobules) on both platforms were 49, 70, 84, and 152 (cloud only). Experimental indistinguishability was demonstrated for ISL outflow profiles of diltiazem using both platforms for experiments consisting of 84 or more samples. The process was analogous to demonstration of results equivalency from two different wet-labs. Conclusions The results provide additional evidence that disposition simulations using ISLs can cover the behavior space of liver experiments in distinct experimental contexts (there is in silico-to-wet-lab phenotype similarity). The scientific value of experimenting with multiscale biomedical models has been limited to research groups with access to computer clusters. The availability of cloud technology coupled with the evidence of scientific equivalency has lowered the barrier and will greatly facilitate model sharing as well as provide straightforward tools for scaling simulations to encompass greater detail with no extra investment in hardware. PMID:21129207

  11. Cloud Computing Boosts Business Intelligence of Telecommunication Industry

    NASA Astrophysics Data System (ADS)

    Xu, Meng; Gao, Dan; Deng, Chao; Luo, Zhiguo; Sun, Shaoling

    Business Intelligence becomes an attracting topic in today's data intensive applications, especially in telecommunication industry. Meanwhile, Cloud Computing providing IT supporting Infrastructure with excellent scalability, large scale storage, and high performance becomes an effective way to implement parallel data processing and data mining algorithms. BC-PDM (Big Cloud based Parallel Data Miner) is a new MapReduce based parallel data mining platform developed by CMRI (China Mobile Research Institute) to fit the urgent requirements of business intelligence in telecommunication industry. In this paper, the architecture, functionality and performance of BC-PDM are presented, together with the experimental evaluation and case studies of its applications. The evaluation result demonstrates both the usability and the cost-effectiveness of Cloud Computing based Business Intelligence system in applications of telecommunication industry.

  12. Trust-Enhanced Cloud Service Selection Model Based on QoS Analysis.

    PubMed

    Pan, Yuchen; Ding, Shuai; Fan, Wenjuan; Li, Jing; Yang, Shanlin

    2015-01-01

    Cloud computing technology plays a very important role in many areas, such as in the construction and development of the smart city. Meanwhile, numerous cloud services appear on the cloud-based platform. Therefore how to how to select trustworthy cloud services remains a significant problem in such platforms, and extensively investigated owing to the ever-growing needs of users. However, trust relationship in social network has not been taken into account in existing methods of cloud service selection and recommendation. In this paper, we propose a cloud service selection model based on the trust-enhanced similarity. Firstly, the direct, indirect, and hybrid trust degrees are measured based on the interaction frequencies among users. Secondly, we estimate the overall similarity by combining the experience usability measured based on Jaccard's Coefficient and the numerical distance computed by Pearson Correlation Coefficient. Then through using the trust degree to modify the basic similarity, we obtain a trust-enhanced similarity. Finally, we utilize the trust-enhanced similarity to find similar trusted neighbors and predict the missing QoS values as the basis of cloud service selection and recommendation. The experimental results show that our approach is able to obtain optimal results via adjusting parameters and exhibits high effectiveness. The cloud services ranking by our model also have better QoS properties than other methods in the comparison experiments.

  13. Trust-Enhanced Cloud Service Selection Model Based on QoS Analysis

    PubMed Central

    Pan, Yuchen; Ding, Shuai; Fan, Wenjuan; Li, Jing; Yang, Shanlin

    2015-01-01

    Cloud computing technology plays a very important role in many areas, such as in the construction and development of the smart city. Meanwhile, numerous cloud services appear on the cloud-based platform. Therefore how to how to select trustworthy cloud services remains a significant problem in such platforms, and extensively investigated owing to the ever-growing needs of users. However, trust relationship in social network has not been taken into account in existing methods of cloud service selection and recommendation. In this paper, we propose a cloud service selection model based on the trust-enhanced similarity. Firstly, the direct, indirect, and hybrid trust degrees are measured based on the interaction frequencies among users. Secondly, we estimate the overall similarity by combining the experience usability measured based on Jaccard’s Coefficient and the numerical distance computed by Pearson Correlation Coefficient. Then through using the trust degree to modify the basic similarity, we obtain a trust-enhanced similarity. Finally, we utilize the trust-enhanced similarity to find similar trusted neighbors and predict the missing QoS values as the basis of cloud service selection and recommendation. The experimental results show that our approach is able to obtain optimal results via adjusting parameters and exhibits high effectiveness. The cloud services ranking by our model also have better QoS properties than other methods in the comparison experiments. PMID:26606388

  14. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vang, Leng; Prescott, Steven R; Smith, Curtis

    In collaborating scientific research arena it is important to have an environment where analysts have access to a shared of information documents, software tools and be able to accurately maintain and track historical changes in models. A new cloud-based environment would be accessible remotely from anywhere regardless of computing platforms given that the platform has available of Internet access and proper browser capabilities. Information stored at this environment would be restricted based on user assigned credentials. This report reviews development of a Cloud-based Architecture Capabilities (CAC) as a web portal for PRA tools.

  15. Cloud-Based Virtual Laboratory for Network Security Education

    ERIC Educational Resources Information Center

    Xu, Le; Huang, Dijiang; Tsai, Wei-Tek

    2014-01-01

    Hands-on experiments are essential for computer network security education. Existing laboratory solutions usually require significant effort to build, configure, and maintain and often do not support reconfigurability, flexibility, and scalability. This paper presents a cloud-based virtual laboratory education platform called V-Lab that provides a…

  16. From Analysis to Impact: Challenges and Outcomes from Google's Cloud-based Platforms for Analyzing and Leveraging Petapixels of Geospatial Data

    NASA Astrophysics Data System (ADS)

    Thau, D.

    2017-12-01

    For the past seven years, Google has made petabytes of Earth observation data, and the tools to analyze it, freely available to researchers around the world via cloud computing. These data and tools were initially available via Google Earth Engine and are increasingly available on the Google Cloud Platform. We have introduced a number of APIs for both the analysis and presentation of geospatial data that have been successfully used to create impactful datasets and web applications, including studies of global surface water availability, global tree cover change, and crop yield estimation. Each of these projects used the cloud to analyze thousands to millions of Landsat scenes. The APIs support a range of publishing options, from outputting imagery and data for inclusion in papers, to providing tools for full scale web applications that provide analysis tools of their own. Over the course of developing these tools, we have learned a number of lessons about how to build a publicly available cloud platform for geospatial analysis, and about how the characteristics of an API can affect the kinds of impacts a platform can enable. This study will present an overview of how Google Earth Engine works and how Google's geospatial capabilities are extending to Google Cloud Platform. We will provide a number of case studies describing how these platforms, and the data they host, have been leveraged to build impactful decision support tools used by governments, researchers, and other institutions, and we will describe how the available APIs have shaped (or constrained) those tools. [Image Credit: Tyler A. Erickson

  17. Arc4nix: A cross-platform geospatial analytical library for cluster and cloud computing

    NASA Astrophysics Data System (ADS)

    Tang, Jingyin; Matyas, Corene J.

    2018-02-01

    Big Data in geospatial technology is a grand challenge for processing capacity. The ability to use a GIS for geospatial analysis on Cloud Computing and High Performance Computing (HPC) clusters has emerged as a new approach to provide feasible solutions. However, users lack the ability to migrate existing research tools to a Cloud Computing or HPC-based environment because of the incompatibility of the market-dominating ArcGIS software stack and Linux operating system. This manuscript details a cross-platform geospatial library "arc4nix" to bridge this gap. Arc4nix provides an application programming interface compatible with ArcGIS and its Python library "arcpy". Arc4nix uses a decoupled client-server architecture that permits geospatial analytical functions to run on the remote server and other functions to run on the native Python environment. It uses functional programming and meta-programming language to dynamically construct Python codes containing actual geospatial calculations, send them to a server and retrieve results. Arc4nix allows users to employ their arcpy-based script in a Cloud Computing and HPC environment with minimal or no modification. It also supports parallelizing tasks using multiple CPU cores and nodes for large-scale analyses. A case study of geospatial processing of a numerical weather model's output shows that arcpy scales linearly in a distributed environment. Arc4nix is open-source software.

  18. Nephele: a cloud platform for simplified, standardized and reproducible microbiome data analysis.

    PubMed

    Weber, Nick; Liou, David; Dommer, Jennifer; MacMenamin, Philip; Quiñones, Mariam; Misner, Ian; Oler, Andrew J; Wan, Joe; Kim, Lewis; Coakley McCarthy, Meghan; Ezeji, Samuel; Noble, Karlynn; Hurt, Darrell E

    2018-04-15

    Widespread interest in the study of the microbiome has resulted in data proliferation and the development of powerful computational tools. However, many scientific researchers lack the time, training, or infrastructure to work with large datasets or to install and use command line tools. The National Institute of Allergy and Infectious Diseases (NIAID) has created Nephele, a cloud-based microbiome data analysis platform with standardized pipelines and a simple web interface for transforming raw data into biological insights. Nephele integrates common microbiome analysis tools as well as valuable reference datasets like the healthy human subjects cohort of the Human Microbiome Project (HMP). Nephele is built on the Amazon Web Services cloud, which provides centralized and automated storage and compute capacity, thereby reducing the burden on researchers and their institutions. https://nephele.niaid.nih.gov and https://github.com/niaid/Nephele. darrell.hurt@nih.gov.

  19. Nephele: a cloud platform for simplified, standardized and reproducible microbiome data analysis

    PubMed Central

    Weber, Nick; Liou, David; Dommer, Jennifer; MacMenamin, Philip; Quiñones, Mariam; Misner, Ian; Oler, Andrew J; Wan, Joe; Kim, Lewis; Coakley McCarthy, Meghan; Ezeji, Samuel; Noble, Karlynn; Hurt, Darrell E

    2018-01-01

    Abstract Motivation Widespread interest in the study of the microbiome has resulted in data proliferation and the development of powerful computational tools. However, many scientific researchers lack the time, training, or infrastructure to work with large datasets or to install and use command line tools. Results The National Institute of Allergy and Infectious Diseases (NIAID) has created Nephele, a cloud-based microbiome data analysis platform with standardized pipelines and a simple web interface for transforming raw data into biological insights. Nephele integrates common microbiome analysis tools as well as valuable reference datasets like the healthy human subjects cohort of the Human Microbiome Project (HMP). Nephele is built on the Amazon Web Services cloud, which provides centralized and automated storage and compute capacity, thereby reducing the burden on researchers and their institutions. Availability and implementation https://nephele.niaid.nih.gov and https://github.com/niaid/Nephele Contact darrell.hurt@nih.gov PMID:29028892

  20. Evaluating open-source cloud computing solutions for geosciences

    NASA Astrophysics Data System (ADS)

    Huang, Qunying; Yang, Chaowei; Liu, Kai; Xia, Jizhe; Xu, Chen; Li, Jing; Gui, Zhipeng; Sun, Min; Li, Zhenglong

    2013-09-01

    Many organizations start to adopt cloud computing for better utilizing computing resources by taking advantage of its scalability, cost reduction, and easy to access characteristics. Many private or community cloud computing platforms are being built using open-source cloud solutions. However, little has been done to systematically compare and evaluate the features and performance of open-source solutions in supporting Geosciences. This paper provides a comprehensive study of three open-source cloud solutions, including OpenNebula, Eucalyptus, and CloudStack. We compared a variety of features, capabilities, technologies and performances including: (1) general features and supported services for cloud resource creation and management, (2) advanced capabilities for networking and security, and (3) the performance of the cloud solutions in provisioning and operating the cloud resources as well as the performance of virtual machines initiated and managed by the cloud solutions in supporting selected geoscience applications. Our study found that: (1) no significant performance differences in central processing unit (CPU), memory and I/O of virtual machines created and managed by different solutions, (2) OpenNebula has the fastest internal network while both Eucalyptus and CloudStack have better virtual machine isolation and security strategies, (3) Cloudstack has the fastest operations in handling virtual machines, images, snapshots, volumes and networking, followed by OpenNebula, and (4) the selected cloud computing solutions are capable for supporting concurrent intensive web applications, computing intensive applications, and small-scale model simulations without intensive data communication.

  1. COMBAT: mobile-Cloud-based cOmpute/coMmunications infrastructure for BATtlefield applications

    NASA Astrophysics Data System (ADS)

    Soyata, Tolga; Muraleedharan, Rajani; Langdon, Jonathan; Funai, Colin; Ames, Scott; Kwon, Minseok; Heinzelman, Wendi

    2012-05-01

    The amount of data processed annually over the Internet has crossed the zetabyte boundary, yet this Big Data cannot be efficiently processed or stored using today's mobile devices. Parallel to this explosive growth in data, a substantial increase in mobile compute-capability and the advances in cloud computing have brought the state-of-the- art in mobile-cloud computing to an inflection point, where the right architecture may allow mobile devices to run applications utilizing Big Data and intensive computing. In this paper, we propose the MObile Cloud-based Hybrid Architecture (MOCHA), which formulates a solution to permit mobile-cloud computing applications such as object recognition in the battlefield by introducing a mid-stage compute- and storage-layer, called the cloudlet. MOCHA is built on the key observation that many mobile-cloud applications have the following characteristics: 1) they are compute-intensive, requiring the compute-power of a supercomputer, and 2) they use Big Data, requiring a communications link to cloud-based database sources in near-real-time. In this paper, we describe the operation of MOCHA in battlefield applications, by formulating the aforementioned mobile and cloudlet to be housed within a soldier's vest and inside a military vehicle, respectively, and enabling access to the cloud through high latency satellite links. We provide simulations using the traditional mobile-cloud approach as well as utilizing MOCHA with a mid-stage cloudlet to quantify the utility of this architecture. We show that the MOCHA platform for mobile-cloud computing promises a future for critical battlefield applications that access Big Data, which is currently not possible using existing technology.

  2. Development of a cloud-based Bioinformatics Training Platform.

    PubMed

    Revote, Jerico; Watson-Haigh, Nathan S; Quenette, Steve; Bethwaite, Blair; McGrath, Annette; Shang, Catherine A

    2017-05-01

    The Bioinformatics Training Platform (BTP) has been developed to provide access to the computational infrastructure required to deliver sophisticated hands-on bioinformatics training courses. The BTP is a cloud-based solution that is in active use for delivering next-generation sequencing training to Australian researchers at geographically dispersed locations. The BTP was built to provide an easy, accessible, consistent and cost-effective approach to delivering workshops at host universities and organizations with a high demand for bioinformatics training but lacking the dedicated bioinformatics training suites required. To support broad uptake of the BTP, the platform has been made compatible with multiple cloud infrastructures. The BTP is an open-source and open-access resource. To date, 20 training workshops have been delivered to over 700 trainees at over 10 venues across Australia using the BTP. © The Author 2016. Published by Oxford University Press.

  3. Development of a cloud-based Bioinformatics Training Platform

    PubMed Central

    Revote, Jerico; Watson-Haigh, Nathan S.; Quenette, Steve; Bethwaite, Blair; McGrath, Annette

    2017-01-01

    Abstract The Bioinformatics Training Platform (BTP) has been developed to provide access to the computational infrastructure required to deliver sophisticated hands-on bioinformatics training courses. The BTP is a cloud-based solution that is in active use for delivering next-generation sequencing training to Australian researchers at geographically dispersed locations. The BTP was built to provide an easy, accessible, consistent and cost-effective approach to delivering workshops at host universities and organizations with a high demand for bioinformatics training but lacking the dedicated bioinformatics training suites required. To support broad uptake of the BTP, the platform has been made compatible with multiple cloud infrastructures. The BTP is an open-source and open-access resource. To date, 20 training workshops have been delivered to over 700 trainees at over 10 venues across Australia using the BTP. PMID:27084333

  4. Trusted measurement model based on multitenant behaviors.

    PubMed

    Ning, Zhen-Hu; Shen, Chang-Xiang; Zhao, Yong; Liang, Peng

    2014-01-01

    With a fast growing pervasive computing, especially cloud computing, the behaviour measurement is at the core and plays a vital role. A new behaviour measurement tailored for Multitenants in cloud computing is needed urgently to fundamentally establish trust relationship. Based on our previous research, we propose an improved trust relationship scheme which captures the world of cloud computing where multitenants share the same physical computing platform. Here, we first present the related work on multitenant behaviour; secondly, we give the scheme of behaviour measurement where decoupling of multitenants is taken into account; thirdly, we explicitly explain our decoupling algorithm for multitenants; fourthly, we introduce a new way of similarity calculation for deviation control, which fits the coupled multitenants under study well; lastly, we design the experiments to test our scheme.

  5. Trusted Measurement Model Based on Multitenant Behaviors

    PubMed Central

    Ning, Zhen-Hu; Shen, Chang-Xiang; Zhao, Yong; Liang, Peng

    2014-01-01

    With a fast growing pervasive computing, especially cloud computing, the behaviour measurement is at the core and plays a vital role. A new behaviour measurement tailored for Multitenants in cloud computing is needed urgently to fundamentally establish trust relationship. Based on our previous research, we propose an improved trust relationship scheme which captures the world of cloud computing where multitenants share the same physical computing platform. Here, we first present the related work on multitenant behaviour; secondly, we give the scheme of behaviour measurement where decoupling of multitenants is taken into account; thirdly, we explicitly explain our decoupling algorithm for multitenants; fourthly, we introduce a new way of similarity calculation for deviation control, which fits the coupled multitenants under study well; lastly, we design the experiments to test our scheme. PMID:24987731

  6. Seismic waveform modeling over cloud

    NASA Astrophysics Data System (ADS)

    Luo, Cong; Friederich, Wolfgang

    2016-04-01

    With the fast growing computational technologies, numerical simulation of seismic wave propagation achieved huge successes. Obtaining the synthetic waveforms through numerical simulation receives an increasing amount of attention from seismologists. However, computational seismology is a data-intensive research field, and the numerical packages usually come with a steep learning curve. Users are expected to master considerable amount of computer knowledge and data processing skills. Training users to use the numerical packages, correctly access and utilize the computational resources is a troubled task. In addition to that, accessing to HPC is also a common difficulty for many users. To solve these problems, a cloud based solution dedicated on shallow seismic waveform modeling has been developed with the state-of-the-art web technologies. It is a web platform integrating both software and hardware with multilayer architecture: a well designed SQL database serves as the data layer, HPC and dedicated pipeline for it is the business layer. Through this platform, users will no longer need to compile and manipulate various packages on the local machine within local network to perform a simulation. By providing users professional access to the computational code through its interfaces and delivering our computational resources to the users over cloud, users can customize the simulation at expert-level, submit and run the job through it.

  7. User Inspired Management of Scientific Jobs in Grids and Clouds

    ERIC Educational Resources Information Center

    Withana, Eran Chinthaka

    2011-01-01

    From time-critical, real time computational experimentation to applications which process petabytes of data there is a continuing search for faster, more responsive computing platforms capable of supporting computational experimentation. Weather forecast models, for instance, process gigabytes of data to produce regional (mesoscale) predictions on…

  8. Cloud-based Jupyter Notebooks for Water Data Analysis

    NASA Astrophysics Data System (ADS)

    Castronova, A. M.; Brazil, L.; Seul, M.

    2017-12-01

    The development and adoption of technologies by the water science community to improve our ability to openly collaborate and share workflows will have a transformative impact on how we address the challenges associated with collaborative and reproducible scientific research. Jupyter notebooks offer one solution by providing an open-source platform for creating metadata-rich toolchains for modeling and data analysis applications. Adoption of this technology within the water sciences, coupled with publicly available datasets from agencies such as USGS, NASA, and EPA enables researchers to easily prototype and execute data intensive toolchains. Moreover, implementing this software stack in a cloud-based environment extends its native functionality to provide researchers a mechanism to build and execute toolchains that are too large or computationally demanding for typical desktop computers. Additionally, this cloud-based solution enables scientists to disseminate data processing routines alongside journal publications in an effort to support reproducibility. For example, these data collection and analysis toolchains can be shared, archived, and published using the HydroShare platform or downloaded and executed locally to reproduce scientific analysis. This work presents the design and implementation of a cloud-based Jupyter environment and its application for collecting, aggregating, and munging various datasets in a transparent, sharable, and self-documented manner. The goals of this work are to establish a free and open source platform for domain scientists to (1) conduct data intensive and computationally intensive collaborative research, (2) utilize high performance libraries, models, and routines within a pre-configured cloud environment, and (3) enable dissemination of research products. This presentation will discuss recent efforts towards achieving these goals, and describe the architectural design of the notebook server in an effort to support collaborative and reproducible science.

  9. Design and deployment of an elastic network test-bed in IHEP data center based on SDN

    NASA Astrophysics Data System (ADS)

    Zeng, Shan; Qi, Fazhi; Chen, Gang

    2017-10-01

    High energy physics experiments produce huge amounts of raw data, while because of the sharing characteristics of the network resources, there is no guarantee of the available bandwidth for each experiment which may cause link congestion problems. On the other side, with the development of cloud computing technologies, IHEP have established a cloud platform based on OpenStack which can ensure the flexibility of the computing and storage resources, and more and more computing applications have been deployed on virtual machines established by OpenStack. However, under the traditional network architecture, network capability can’t be required elastically, which becomes the bottleneck of restricting the flexible application of cloud computing. In order to solve the above problems, we propose an elastic cloud data center network architecture based on SDN, and we also design a high performance controller cluster based on OpenDaylight. In the end, we present our current test results.

  10. Computational biology in the cloud: methods and new insights from computing at scale.

    PubMed

    Kasson, Peter M

    2013-01-01

    The past few years have seen both explosions in the size of biological data sets and the proliferation of new, highly flexible on-demand computing capabilities. The sheer amount of information available from genomic and metagenomic sequencing, high-throughput proteomics, experimental and simulation datasets on molecular structure and dynamics affords an opportunity for greatly expanded insight, but it creates new challenges of scale for computation, storage, and interpretation of petascale data. Cloud computing resources have the potential to help solve these problems by offering a utility model of computing and storage: near-unlimited capacity, the ability to burst usage, and cheap and flexible payment models. Effective use of cloud computing on large biological datasets requires dealing with non-trivial problems of scale and robustness, since performance-limiting factors can change substantially when a dataset grows by a factor of 10,000 or more. New computing paradigms are thus often needed. The use of cloud platforms also creates new opportunities to share data, reduce duplication, and to provide easy reproducibility by making the datasets and computational methods easily available.

  11. Three-Dimensional Space to Assess Cloud Interoperability

    DTIC Science & Technology

    2013-03-01

    12 1. Portability and Mobility ...collection of network-enabled services that guarantees to provide a scalable, easy accessible, reliable, and personalized computing infrastructure , based on...are used in research to describe cloud models, such as SaaS (Software as a Service), PaaS (Platform as a service), IaaS ( Infrastructure as a Service

  12. Secure and Resilient Cloud Computing for the Department of Defense

    DTIC Science & Technology

    2015-11-16

    platform as a service (PaaS), and software as a service ( SaaS )—that target system administrators, developers, and end-users respectively (see Table 2...interfaces (API) and services Medium Amazon Elastic MapReduce, MathWorks Cloud, Red Hat OpenShift SaaS Full-fledged applications Low Google gMail

  13. JINR cloud infrastructure evolution

    NASA Astrophysics Data System (ADS)

    Baranov, A. V.; Balashov, N. A.; Kutovskiy, N. A.; Semenov, R. N.

    2016-09-01

    To fulfil JINR commitments in different national and international projects related to the use of modern information technologies such as cloud and grid computing as well as to provide a modern tool for JINR users for their scientific research a cloud infrastructure was deployed at Laboratory of Information Technologies of Joint Institute for Nuclear Research. OpenNebula software was chosen as a cloud platform. Initially it was set up in simple configuration with single front-end host and a few cloud nodes. Some custom development was done to tune JINR cloud installation to fit local needs: web form in the cloud web-interface for resources request, a menu item with cloud utilization statistics, user authentication via Kerberos, custom driver for OpenVZ containers. Because of high demand in that cloud service and its resources over-utilization it was re-designed to cover increasing users' needs in capacity, availability and reliability. Recently a new cloud instance has been deployed in high-availability configuration with distributed network file system and additional computing power.

  14. Framework Development Supporting the Safety Portal

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Prescott, Steven Ralph; Kvarfordt, Kellie Jean; Vang, Leng

    2015-07-01

    In a collaborating scientific research arena it is important to have an environment where analysts have access to a shared repository of information, documents, and software tools, and be able to accurately maintain and track historical changes in models. The new Safety Portal cloud-based environment will be accessible remotely from anywhere regardless of computing platforms given that the platform has available Internet access and proper browser capabilities. Information stored at this environment would be restricted based on user assigned credentials. This report discusses current development of a cloud-based web portal for PRA tools.

  15. SU-D-BRD-02: A Web-Based Image Processing and Plan Evaluation Platform (WIPPEP) for Future Cloud-Based Radiotherapy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chai, X; Liu, L; Xing, L

    Purpose: Visualization and processing of medical images and radiation treatment plan evaluation have traditionally been constrained to local workstations with limited computation power and ability of data sharing and software update. We present a web-based image processing and planning evaluation platform (WIPPEP) for radiotherapy applications with high efficiency, ubiquitous web access, and real-time data sharing. Methods: This software platform consists of three parts: web server, image server and computation server. Each independent server communicates with each other through HTTP requests. The web server is the key component that provides visualizations and user interface through front-end web browsers and relay informationmore » to the backend to process user requests. The image server serves as a PACS system. The computation server performs the actual image processing and dose calculation. The web server backend is developed using Java Servlets and the frontend is developed using HTML5, Javascript, and jQuery. The image server is based on open source DCME4CHEE PACS system. The computation server can be written in any programming language as long as it can send/receive HTTP requests. Our computation server was implemented in Delphi, Python and PHP, which can process data directly or via a C++ program DLL. Results: This software platform is running on a 32-core CPU server virtually hosting the web server, image server, and computation servers separately. Users can visit our internal website with Chrome browser, select a specific patient, visualize image and RT structures belonging to this patient and perform image segmentation running Delphi computation server and Monte Carlo dose calculation on Python or PHP computation server. Conclusion: We have developed a webbased image processing and plan evaluation platform prototype for radiotherapy. This system has clearly demonstrated the feasibility of performing image processing and plan evaluation platform through a web browser and exhibited potential for future cloud based radiotherapy.« less

  16. Web-based Tsunami Early Warning System with instant Tsunami Propagation Calculations in the GPU Cloud

    NASA Astrophysics Data System (ADS)

    Hammitzsch, M.; Spazier, J.; Reißland, S.

    2014-12-01

    Usually, tsunami early warning and mitigation systems (TWS or TEWS) are based on several software components deployed in a client-server based infrastructure. The vast majority of systems importantly include desktop-based clients with a graphical user interface (GUI) for the operators in early warning centers. However, in times of cloud computing and ubiquitous computing the use of concepts and paradigms, introduced by continuously evolving approaches in information and communications technology (ICT), have to be considered even for early warning systems (EWS). Based on the experiences and the knowledge gained in three research projects - 'German Indonesian Tsunami Early Warning System' (GITEWS), 'Distant Early Warning System' (DEWS), and 'Collaborative, Complex, and Critical Decision-Support in Evolving Crises' (TRIDEC) - new technologies are exploited to implement a cloud-based and web-based prototype to open up new prospects for EWS. This prototype, named 'TRIDEC Cloud', merges several complementary external and in-house cloud-based services into one platform for automated background computation with graphics processing units (GPU), for web-mapping of hazard specific geospatial data, and for serving relevant functionality to handle, share, and communicate threat specific information in a collaborative and distributed environment. The prototype in its current version addresses tsunami early warning and mitigation. The integration of GPU accelerated tsunami simulation computations have been an integral part of this prototype to foster early warning with on-demand tsunami predictions based on actual source parameters. However, the platform is meant for researchers around the world to make use of the cloud-based GPU computation to analyze other types of geohazards and natural hazards and react upon the computed situation picture with a web-based GUI in a web browser at remote sites. The current website is an early alpha version for demonstration purposes to give the concept a whirl and to shape science's future. Further functionality, improvements and possible profound changes have to implemented successively based on the users' evolving needs.

  17. CloudMC: a cloud computing application for Monte Carlo simulation.

    PubMed

    Miras, H; Jiménez, R; Miras, C; Gomà, C

    2013-04-21

    This work presents CloudMC, a cloud computing application-developed in Windows Azure®, the platform of the Microsoft® cloud-for the parallelization of Monte Carlo simulations in a dynamic virtual cluster. CloudMC is a web application designed to be independent of the Monte Carlo code in which the simulations are based-the simulations just need to be of the form: input files → executable → output files. To study the performance of CloudMC in Windows Azure®, Monte Carlo simulations with penelope were performed on different instance (virtual machine) sizes, and for different number of instances. The instance size was found to have no effect on the simulation runtime. It was also found that the decrease in time with the number of instances followed Amdahl's law, with a slight deviation due to the increase in the fraction of non-parallelizable time with increasing number of instances. A simulation that would have required 30 h of CPU on a single instance was completed in 48.6 min when executed on 64 instances in parallel (speedup of 37 ×). Furthermore, the use of cloud computing for parallel computing offers some advantages over conventional clusters: high accessibility, scalability and pay per usage. Therefore, it is strongly believed that cloud computing will play an important role in making Monte Carlo dose calculation a reality in future clinical practice.

  18. Capabilities and Advantages of Cloud Computing in the Implementation of Electronic Health Record.

    PubMed

    Ahmadi, Maryam; Aslani, Nasim

    2018-01-01

    With regard to the high cost of the Electronic Health Record (EHR), in recent years the use of new technologies, in particular cloud computing, has increased. The purpose of this study was to review systematically the studies conducted in the field of cloud computing. The present study was a systematic review conducted in 2017. Search was performed in the Scopus, Web of Sciences, IEEE, Pub Med and Google Scholar databases by combination keywords. From the 431 article that selected at the first, after applying the inclusion and exclusion criteria, 27 articles were selected for surveyed. Data gathering was done by a self-made check list and was analyzed by content analysis method. The finding of this study showed that cloud computing is a very widespread technology. It includes domains such as cost, security and privacy, scalability, mutual performance and interoperability, implementation platform and independence of Cloud Computing, ability to search and exploration, reducing errors and improving the quality, structure, flexibility and sharing ability. It will be effective for electronic health record. According to the findings of the present study, higher capabilities of cloud computing are useful in implementing EHR in a variety of contexts. It also provides wide opportunities for managers, analysts and providers of health information systems. Considering the advantages and domains of cloud computing in the establishment of HER, it is recommended to use this technology.

  19. Capabilities and Advantages of Cloud Computing in the Implementation of Electronic Health Record

    PubMed Central

    Ahmadi, Maryam; Aslani, Nasim

    2018-01-01

    Background: With regard to the high cost of the Electronic Health Record (EHR), in recent years the use of new technologies, in particular cloud computing, has increased. The purpose of this study was to review systematically the studies conducted in the field of cloud computing. Methods: The present study was a systematic review conducted in 2017. Search was performed in the Scopus, Web of Sciences, IEEE, Pub Med and Google Scholar databases by combination keywords. From the 431 article that selected at the first, after applying the inclusion and exclusion criteria, 27 articles were selected for surveyed. Data gathering was done by a self-made check list and was analyzed by content analysis method. Results: The finding of this study showed that cloud computing is a very widespread technology. It includes domains such as cost, security and privacy, scalability, mutual performance and interoperability, implementation platform and independence of Cloud Computing, ability to search and exploration, reducing errors and improving the quality, structure, flexibility and sharing ability. It will be effective for electronic health record. Conclusion: According to the findings of the present study, higher capabilities of cloud computing are useful in implementing EHR in a variety of contexts. It also provides wide opportunities for managers, analysts and providers of health information systems. Considering the advantages and domains of cloud computing in the establishment of HER, it is recommended to use this technology. PMID:29719309

  20. Cloud Environment Automation: from infrastructure deployment to application monitoring

    NASA Astrophysics Data System (ADS)

    Aiftimiei, C.; Costantini, A.; Bucchi, R.; Italiano, A.; Michelotto, D.; Panella, M.; Pergolesi, M.; Saletta, M.; Traldi, S.; Vistoli, C.; Zizzi, G.; Salomoni, D.

    2017-10-01

    The potential offered by the cloud paradigm is often limited by technical issues, rules and regulations. In particular, the activities related to the design and deployment of the Infrastructure as a Service (IaaS) cloud layer can be difficult to apply and time-consuming for the infrastructure maintainers. In this paper the research activity, carried out during the Open City Platform (OCP) research project [1], aimed at designing and developing an automatic tool for cloud-based IaaS deployment is presented. Open City Platform is an industrial research project funded by the Italian Ministry of University and Research (MIUR), started in 2014. It intends to research, develop and test new technological solutions open, interoperable and usable on-demand in the field of Cloud Computing, along with new sustainable organizational models that can be deployed for and adopted by the Public Administrations (PA). The presented work and the related outcomes are aimed at simplifying the deployment and maintenance of a complete IaaS cloud-based infrastructure.

  1. Bioinformatics on the cloud computing platform Azure.

    PubMed

    Shanahan, Hugh P; Owen, Anne M; Harrison, Andrew P

    2014-01-01

    We discuss the applicability of the Microsoft cloud computing platform, Azure, for bioinformatics. We focus on the usability of the resource rather than its performance. We provide an example of how R can be used on Azure to analyse a large amount of microarray expression data deposited at the public database ArrayExpress. We provide a walk through to demonstrate explicitly how Azure can be used to perform these analyses in Appendix S1 and we offer a comparison with a local computation. We note that the use of the Platform as a Service (PaaS) offering of Azure can represent a steep learning curve for bioinformatics developers who will usually have a Linux and scripting language background. On the other hand, the presence of an additional set of libraries makes it easier to deploy software in a parallel (scalable) fashion and explicitly manage such a production run with only a few hundred lines of code, most of which can be incorporated from a template. We propose that this environment is best suited for running stable bioinformatics software by users not involved with its development.

  2. Bioinformatics on the Cloud Computing Platform Azure

    PubMed Central

    Shanahan, Hugh P.; Owen, Anne M.; Harrison, Andrew P.

    2014-01-01

    We discuss the applicability of the Microsoft cloud computing platform, Azure, for bioinformatics. We focus on the usability of the resource rather than its performance. We provide an example of how R can be used on Azure to analyse a large amount of microarray expression data deposited at the public database ArrayExpress. We provide a walk through to demonstrate explicitly how Azure can be used to perform these analyses in Appendix S1 and we offer a comparison with a local computation. We note that the use of the Platform as a Service (PaaS) offering of Azure can represent a steep learning curve for bioinformatics developers who will usually have a Linux and scripting language background. On the other hand, the presence of an additional set of libraries makes it easier to deploy software in a parallel (scalable) fashion and explicitly manage such a production run with only a few hundred lines of code, most of which can be incorporated from a template. We propose that this environment is best suited for running stable bioinformatics software by users not involved with its development. PMID:25050811

  3. Heads in the Cloud: A Primer on Neuroimaging Applications of High Performance Computing.

    PubMed

    Shatil, Anwar S; Younas, Sohail; Pourreza, Hossein; Figley, Chase R

    2015-01-01

    With larger data sets and more sophisticated analyses, it is becoming increasingly common for neuroimaging researchers to push (or exceed) the limitations of standalone computer workstations. Nonetheless, although high-performance computing platforms such as clusters, grids and clouds are already in routine use by a small handful of neuroimaging researchers to increase their storage and/or computational power, the adoption of such resources by the broader neuroimaging community remains relatively uncommon. Therefore, the goal of the current manuscript is to: 1) inform prospective users about the similarities and differences between computing clusters, grids and clouds; 2) highlight their main advantages; 3) discuss when it may (and may not) be advisable to use them; 4) review some of their potential problems and barriers to access; and finally 5) give a few practical suggestions for how interested new users can start analyzing their neuroimaging data using cloud resources. Although the aim of cloud computing is to hide most of the complexity of the infrastructure management from end-users, we recognize that this can still be an intimidating area for cognitive neuroscientists, psychologists, neurologists, radiologists, and other neuroimaging researchers lacking a strong computational background. Therefore, with this in mind, we have aimed to provide a basic introduction to cloud computing in general (including some of the basic terminology, computer architectures, infrastructure and service models, etc.), a practical overview of the benefits and drawbacks, and a specific focus on how cloud resources can be used for various neuroimaging applications.

  4. Heads in the Cloud: A Primer on Neuroimaging Applications of High Performance Computing

    PubMed Central

    Shatil, Anwar S.; Younas, Sohail; Pourreza, Hossein; Figley, Chase R.

    2015-01-01

    With larger data sets and more sophisticated analyses, it is becoming increasingly common for neuroimaging researchers to push (or exceed) the limitations of standalone computer workstations. Nonetheless, although high-performance computing platforms such as clusters, grids and clouds are already in routine use by a small handful of neuroimaging researchers to increase their storage and/or computational power, the adoption of such resources by the broader neuroimaging community remains relatively uncommon. Therefore, the goal of the current manuscript is to: 1) inform prospective users about the similarities and differences between computing clusters, grids and clouds; 2) highlight their main advantages; 3) discuss when it may (and may not) be advisable to use them; 4) review some of their potential problems and barriers to access; and finally 5) give a few practical suggestions for how interested new users can start analyzing their neuroimaging data using cloud resources. Although the aim of cloud computing is to hide most of the complexity of the infrastructure management from end-users, we recognize that this can still be an intimidating area for cognitive neuroscientists, psychologists, neurologists, radiologists, and other neuroimaging researchers lacking a strong computational background. Therefore, with this in mind, we have aimed to provide a basic introduction to cloud computing in general (including some of the basic terminology, computer architectures, infrastructure and service models, etc.), a practical overview of the benefits and drawbacks, and a specific focus on how cloud resources can be used for various neuroimaging applications. PMID:27279746

  5. A cloud platform for remote diagnosis of breast cancer in mammography by fusion of machine and human intelligence

    NASA Astrophysics Data System (ADS)

    Jiang, Guodong; Fan, Ming; Li, Lihua

    2016-03-01

    Mammography is the gold standard for breast cancer screening, reducing mortality by about 30%. The application of a computer-aided detection (CAD) system to assist a single radiologist is important to further improve mammographic sensitivity for breast cancer detection. In this study, a design and realization of the prototype for remote diagnosis system in mammography based on cloud platform were proposed. To build this system, technologies were utilized including medical image information construction, cloud infrastructure and human-machine diagnosis model. Specifically, on one hand, web platform for remote diagnosis was established by J2EE web technology. Moreover, background design was realized through Hadoop open-source framework. On the other hand, storage system was built up with Hadoop distributed file system (HDFS) technology which enables users to easily develop and run on massive data application, and give full play to the advantages of cloud computing which is characterized by high efficiency, scalability and low cost. In addition, the CAD system was realized through MapReduce frame. The diagnosis module in this system implemented the algorithms of fusion of machine and human intelligence. Specifically, we combined results of diagnoses from doctors' experience and traditional CAD by using the man-machine intelligent fusion model based on Alpha-Integration and multi-agent algorithm. Finally, the applications on different levels of this system in the platform were also discussed. This diagnosis system will have great importance for the balanced health resource, lower medical expense and improvement of accuracy of diagnosis in basic medical institutes.

  6. Identification of Program Signatures from Cloud Computing System Telemetry Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nichols, Nicole M.; Greaves, Mark T.; Smith, William P.

    Malicious cloud computing activity can take many forms, including running unauthorized programs in a virtual environment. Detection of these malicious activities while preserving the privacy of the user is an important research challenge. Prior work has shown the potential viability of using cloud service billing metrics as a mechanism for proxy identification of malicious programs. Previously this novel detection method has been evaluated in a synthetic and isolated computational environment. In this paper we demonstrate the ability of billing metrics to identify programs, in an active cloud computing environment, including multiple virtual machines running on the same hypervisor. The openmore » source cloud computing platform OpenStack, is used for private cloud management at Pacific Northwest National Laboratory. OpenStack provides a billing tool (Ceilometer) to collect system telemetry measurements. We identify four different programs running on four virtual machines under the same cloud user account. Programs were identified with up to 95% accuracy. This accuracy is dependent on the distinctiveness of telemetry measurements for the specific programs we tested. Future work will examine the scalability of this approach for a larger selection of programs to better understand the uniqueness needed to identify a program. Additionally, future work should address the separation of signatures when multiple programs are running on the same virtual machine.« less

  7. Impact of office productivity cloud computing on energy consumption and greenhouse gas emissions.

    PubMed

    Williams, Daniel R; Tang, Yinshan

    2013-05-07

    Cloud computing is usually regarded as being energy efficient and thus emitting less greenhouse gases (GHG) than traditional forms of computing. When the energy consumption of Microsoft's cloud computing Office 365 (O365) and traditional Office 2010 (O2010) software suites were tested and modeled, some cloud services were found to consume more energy than the traditional form. The developed model in this research took into consideration the energy consumption at the three main stages of data transmission; data center, network, and end user device. Comparable products from each suite were selected and activities were defined for each product to represent a different computing type. Microsoft provided highly confidential data for the data center stage, while the networking and user device stages were measured directly. A new measurement and software apportionment approach was defined and utilized allowing the power consumption of cloud services to be directly measured for the user device stage. Results indicated that cloud computing is more energy efficient for Excel and Outlook which consumed less energy and emitted less GHG than the standalone counterpart. The power consumption of the cloud based Outlook (8%) and Excel (17%) was lower than their traditional counterparts. However, the power consumption of the cloud version of Word was 17% higher than its traditional equivalent. A third mixed access method was also measured for Word which emitted 5% more GHG than the traditional version. It is evident that cloud computing may not provide a unified way forward to reduce energy consumption and GHG. Direct conversion from the standalone package into the cloud provision platform can now consider energy and GHG emissions at the software development and cloud service design stage using the methods described in this research.

  8. A service brokering and recommendation mechanism for better selecting cloud services.

    PubMed

    Gui, Zhipeng; Yang, Chaowei; Xia, Jizhe; Huang, Qunying; Liu, Kai; Li, Zhenlong; Yu, Manzhu; Sun, Min; Zhou, Nanyin; Jin, Baoxuan

    2014-01-01

    Cloud computing is becoming the new generation computing infrastructure, and many cloud vendors provide different types of cloud services. How to choose the best cloud services for specific applications is very challenging. Addressing this challenge requires balancing multiple factors, such as business demands, technologies, policies and preferences in addition to the computing requirements. This paper recommends a mechanism for selecting the best public cloud service at the levels of Infrastructure as a Service (IaaS) and Platform as a Service (PaaS). A systematic framework and associated workflow include cloud service filtration, solution generation, evaluation, and selection of public cloud services. Specifically, we propose the following: a hierarchical information model for integrating heterogeneous cloud information from different providers and a corresponding cloud information collecting mechanism; a cloud service classification model for categorizing and filtering cloud services and an application requirement schema for providing rules for creating application-specific configuration solutions; and a preference-aware solution evaluation mode for evaluating and recommending solutions according to the preferences of application providers. To test the proposed framework and methodologies, a cloud service advisory tool prototype was developed after which relevant experiments were conducted. The results show that the proposed system collects/updates/records the cloud information from multiple mainstream public cloud services in real-time, generates feasible cloud configuration solutions according to user specifications and acceptable cost predication, assesses solutions from multiple aspects (e.g., computing capability, potential cost and Service Level Agreement, SLA) and offers rational recommendations based on user preferences and practical cloud provisioning; and visually presents and compares solutions through an interactive web Graphical User Interface (GUI).

  9. Geo-spatial Service and Application based on National E-government Network Platform and Cloud

    NASA Astrophysics Data System (ADS)

    Meng, X.; Deng, Y.; Li, H.; Yao, L.; Shi, J.

    2014-04-01

    With the acceleration of China's informatization process, our party and government take a substantive stride in advancing development and application of digital technology, which promotes the evolution of e-government and its informatization. Meanwhile, as a service mode based on innovative resources, cloud computing may connect huge pools together to provide a variety of IT services, and has become one relatively mature technical pattern with further studies and massive practical applications. Based on cloud computing technology and national e-government network platform, "National Natural Resources and Geospatial Database (NRGD)" project integrated and transformed natural resources and geospatial information dispersed in various sectors and regions, established logically unified and physically dispersed fundamental database and developed national integrated information database system supporting main e-government applications. Cross-sector e-government applications and services are realized to provide long-term, stable and standardized natural resources and geospatial fundamental information products and services for national egovernment and public users.

  10. Patient-Centered e-Health Record over the Cloud.

    PubMed

    Koumaditis, Konstantinos; Themistocleous, Marinos; Vassilacopoulos, George; Prentza, Andrianna; Kyriazis, Dimosthenis; Malamateniou, Flora; Maglaveras, Nicos; Chouvarda, Ioanna; Mourouzis, Alexandros

    2014-01-01

    The purpose of this paper is to introduce the Patient-Centered e-Health (PCEH) conceptual aspects alongside a multidisciplinary project that combines state-of-the-art technologies like cloud computing. The project, by combining several aspects of PCEH, such as: (a) electronic Personal Healthcare Record (e-PHR), (b) homecare telemedicine technologies, (c) e-prescribing, e-referral, e-learning, with advanced technologies like cloud computing and Service Oriented Architecture (SOA), will lead to an innovative integrated e-health platform of many benefits to the society, the economy, the industry, and the research community. To achieve this, a consortium of experts, both from industry (two companies, one hospital and one healthcare organization) and academia (three universities), was set to investigate, analyse, design, build and test the new platform. This paper provides insights to the PCEH concept and to the current stage of the project. In doing so, we aim at increasing the awareness of this important endeavor and sharing the lessons learned so far throughout our work.

  11. Advances in the TRIDEC Cloud

    NASA Astrophysics Data System (ADS)

    Hammitzsch, Martin; Spazier, Johannes; Reißland, Sven

    2016-04-01

    The TRIDEC Cloud is a platform that merges several complementary cloud-based services for instant tsunami propagation calculations and automated background computation with graphics processing units (GPU), for web-mapping of hazard specific geospatial data, and for serving relevant functionality to handle, share, and communicate threat specific information in a collaborative and distributed environment. The platform offers a modern web-based graphical user interface so that operators in warning centres and stakeholders of other involved parties (e.g. CPAs, ministries) just need a standard web browser to access a full-fledged early warning and information system with unique interactive features such as Cloud Messages and Shared Maps. Furthermore, the TRIDEC Cloud can be accessed in different modes, e.g. the monitoring mode, which provides important functionality required to act in a real event, and the exercise-and-training mode, which enables training and exercises with virtual scenarios re-played by a scenario player. The software system architecture and open interfaces facilitate global coverage so that the system is applicable for any region in the world and allow the integration of different sensor systems as well as the integration of other hazard types and use cases different to tsunami early warning. Current advances of the TRIDEC Cloud platform will be summarized in this presentation.

  12. A sustainability model based on cloud infrastructures for core and downstream Copernicus services

    NASA Astrophysics Data System (ADS)

    Manunta, Michele; Calò, Fabiana; De Luca, Claudio; Elefante, Stefano; Farres, Jordi; Guzzetti, Fausto; Imperatore, Pasquale; Lanari, Riccardo; Lengert, Wolfgang; Zinno, Ivana; Casu, Francesco

    2014-05-01

    The incoming Sentinel missions have been designed to be the first remote sensing satellite system devoted to operational services. In particular, the Synthetic Aperture Radar (SAR) Sentinel-1 sensor, dedicated to globally acquire over land in the interferometric mode, guarantees an unprecedented capability to investigate and monitor the Earth surface deformations related to natural and man-made hazards. Thanks to the global coverage strategy and 12-day revisit time, jointly with the free and open access data policy, such a system will allow an extensive application of Differential Interferometric SAR (DInSAR) techniques. In such a framework, European Commission has been funding several projects through the GMES and Copernicus programs, aimed at preparing the user community to the operational and extensive use of Sentinel-1 products for risk mitigation and management purposes. Among them, the FP7-DORIS, an advanced GMES downstream service coordinated by Italian National Council of Research (CNR), is based on the fully exploitation of advanced DInSAR products in landslides and subsidence contexts. In particular, the DORIS project (www.doris-project.eu) has developed innovative scientific techniques and methodologies to support Civil Protection Authorities (CPA) during the pre-event, event, and post-event phases of the risk management cycle. Nonetheless, the huge data stream expected from the Sentinel-1 satellite may jeopardize the effective use of such data in emergency response and security scenarios. This potential bottleneck can be properly overcome through the development of modern infrastructures, able to efficiently provide computing resources as well as advanced services for big data management, processing and dissemination. In this framework, CNR and ESA have tightened up a cooperation to foster the use of GRID and cloud computing platforms for remote sensing data processing, and to make available to a large audience advanced and innovative tools for DInSAR products generation and exploitation. In particular, CNR is porting the multi-temporal DInSAR technique referred to as Small Baseline Subset (SBAS) into the ESA G-POD (Grid Processing On Demand) and CIOP (Cloud Computing Operational Pilot) platforms (Elefante et al., 2013) within the SuperSites Exploitation Platform (SSEP) project, which aim is contributing to the development of an ecosystem for big geo-data processing and dissemination. This work focuses on presenting the main results that have been achieved by the DORIS project concerning the use of advanced DInSAR products for supporting CPA during the risk management cycle. Furthermore, based on the DORIS experience, a sustainability model for Core and Downstream Copernicus services based on the effective exploitation of cloud platforms is proposed. In this framework, remote sensing community, both service providers and users, can significantly benefit from the Helix Nebula-The Science Cloud initiative, created by European scientific institutions, agencies, SMEs and enterprises to pave the way for the development and exploitation of a cloud computing infrastructure for science. REFERENCES Elefante, S., Imperatore, P. , Zinno, I., M. Manunta, E. Mathot, F. Brito, J. Farres, W. Lengert, R. Lanari, F. Casu, 2013, "SBAS-DINSAR Time series generation on cloud computing platforms". IEEE IGARSS Conference, Melbourne (AU), July 2013.

  13. Integration of hybrid wireless networks in cloud services oriented enterprise information systems

    NASA Astrophysics Data System (ADS)

    Li, Shancang; Xu, Lida; Wang, Xinheng; Wang, Jue

    2012-05-01

    This article presents a hybrid wireless network integration scheme in cloud services-based enterprise information systems (EISs). With the emerging hybrid wireless networks and cloud computing technologies, it is necessary to develop a scheme that can seamlessly integrate these new technologies into existing EISs. By combining the hybrid wireless networks and computing in EIS, a new framework is proposed, which includes frontend layer, middle layer and backend layers connected to IP EISs. Based on a collaborative architecture, cloud services management framework and process diagram are presented. As a key feature, the proposed approach integrates access control functionalities within the hybrid framework that provide users with filtered views on available cloud services based on cloud service access requirements and user security credentials. In future work, we will implement the proposed framework over SwanMesh platform by integrating the UPnP standard into an enterprise information system.

  14. Research on cloud-based remote measurement and analysis system

    NASA Astrophysics Data System (ADS)

    Gao, Zhiqiang; He, Lingsong; Su, Wei; Wang, Can; Zhang, Changfan

    2015-02-01

    The promising potential of cloud computing and its convergence with technologies such as cloud storage, cloud push, mobile computing allows for creation and delivery of newer type of cloud service. Combined with the thought of cloud computing, this paper presents a cloud-based remote measurement and analysis system. This system mainly consists of three parts: signal acquisition client, web server deployed on the cloud service, and remote client. This system is a special website developed using asp.net and Flex RIA technology, which solves the selective contradiction between two monitoring modes, B/S and C/S. This platform supplies customer condition monitoring and data analysis service by Internet, which was deployed on the cloud server. Signal acquisition device is responsible for data (sensor data, audio, video, etc.) collection and pushes the monitoring data to the cloud storage database regularly. Data acquisition equipment in this system is only conditioned with the function of data collection and network function such as smartphone and smart sensor. This system's scale can adjust dynamically according to the amount of applications and users, so it won't cause waste of resources. As a representative case study, we developed a prototype system based on Ali cloud service using the rotor test rig as the research object. Experimental results demonstrate that the proposed system architecture is feasible.

  15. TethysCluster: A comprehensive approach for harnessing cloud resources for hydrologic modeling

    NASA Astrophysics Data System (ADS)

    Nelson, J.; Jones, N.; Ames, D. P.

    2015-12-01

    Advances in water resources modeling are improving the information that can be supplied to support decisions affecting the safety and sustainability of society. However, as water resources models become more sophisticated and data-intensive they require more computational power to run. Purchasing and maintaining the computing facilities needed to support certain modeling tasks has been cost-prohibitive for many organizations. With the advent of the cloud, the computing resources needed to address this challenge are now available and cost-effective, yet there still remains a significant technical barrier to leverage these resources. This barrier inhibits many decision makers and even trained engineers from taking advantage of the best science and tools available. Here we present the Python tools TethysCluster and CondorPy, that have been developed to lower the barrier to model computation in the cloud by providing (1) programmatic access to dynamically scalable computing resources, (2) a batch scheduling system to queue and dispatch the jobs to the computing resources, (3) data management for job inputs and outputs, and (4) the ability to dynamically create, submit, and monitor computing jobs. These Python tools leverage the open source, computing-resource management, and job management software, HTCondor, to offer a flexible and scalable distributed-computing environment. While TethysCluster and CondorPy can be used independently to provision computing resources and perform large modeling tasks, they have also been integrated into Tethys Platform, a development platform for water resources web apps, to enable computing support for modeling workflows and decision-support systems deployed as web apps.

  16. Distributed Processing of Sentinel-2 Products using the BIGEARTH Platform

    NASA Astrophysics Data System (ADS)

    Bacu, Victor; Stefanut, Teodor; Nandra, Constantin; Mihon, Danut; Gorgan, Dorian

    2017-04-01

    The constellation of observational satellites orbiting around Earth is constantly increasing, providing more data that need to be processed in order to extract meaningful information and knowledge from it. Sentinel-2 satellites, part of the Copernicus Earth Observation program, aim to be used in agriculture, forestry and many other land management applications. ESA's SNAP toolbox can be used to process data gathered by Sentinel-2 satellites but is limited to the resources provided by a stand-alone computer. In this paper we present a cloud based software platform that makes use of this toolbox together with other remote sensing software applications to process Sentinel-2 products. The BIGEARTH software platform [1] offers an integrated solution for processing Earth Observation data coming from different sources (such as satellites or on-site sensors). The flow of processing is defined as a chain of tasks based on the WorDeL description language [2]. Each task could rely on a different software technology (such as Grass GIS and ESA's SNAP) in order to process the input data. One important feature of the BIGEARTH platform comes from this possibility of interconnection and integration, throughout the same flow of processing, of the various well known software technologies. All this integration is transparent from the user perspective. The proposed platform extends the SNAP capabilities by enabling specialists to easily scale the processing over distributed architectures, according to their specific needs and resources. The software platform [3] can be used in multiple configurations. In the basic one the software platform runs as a standalone application inside a virtual machine. Obviously in this case the computational resources are limited but it will give an overview of the functionalities of the software platform, and also the possibility to define the flow of processing and later on to execute it on a more complex infrastructure. The most complex and robust configuration is based on cloud computing and allows the installation on a private or public cloud infrastructure. In this configuration, the processing resources can be dynamically allocated and the execution time can be considerably improved by the available virtual resources and the number of parallelizable sequences in the processing flow. The presentation highlights the benefits and issues of the proposed solution by analyzing some significant experimental use cases. Main references for further information: [1] BigEarth project, http://cgis.utcluj.ro/projects/bigearth [2] Constantin Nandra, Dorian Gorgan: "Defining Earth data batch processing tasks by means of a flexible workflow description language", ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci., III-4, 59-66, (2016). [3] Victor Bacu, Teodor Stefanut, Dorian Gorgan, "Adaptive Processing of Earth Observation Data on Cloud Infrastructures Based on Workflow Description", Proceedings of the Intelligent Computer Communication and Processing (ICCP), IEEE-Press, pp.444-454, (2015).

  17. cOSPREY: A Cloud-Based Distributed Algorithm for Large-Scale Computational Protein Design

    PubMed Central

    Pan, Yuchao; Dong, Yuxi; Zhou, Jingtian; Hallen, Mark; Donald, Bruce R.; Xu, Wei

    2016-01-01

    Abstract Finding the global minimum energy conformation (GMEC) of a huge combinatorial search space is the key challenge in computational protein design (CPD) problems. Traditional algorithms lack a scalable and efficient distributed design scheme, preventing researchers from taking full advantage of current cloud infrastructures. We design cloud OSPREY (cOSPREY), an extension to a widely used protein design software OSPREY, to allow the original design framework to scale to the commercial cloud infrastructures. We propose several novel designs to integrate both algorithm and system optimizations, such as GMEC-specific pruning, state search partitioning, asynchronous algorithm state sharing, and fault tolerance. We evaluate cOSPREY on three different cloud platforms using different technologies and show that it can solve a number of large-scale protein design problems that have not been possible with previous approaches. PMID:27154509

  18. Scalable and responsive event processing in the cloud

    PubMed Central

    Suresh, Visalakshmi; Ezhilchelvan, Paul; Watson, Paul

    2013-01-01

    Event processing involves continuous evaluation of queries over streams of events. Response-time optimization is traditionally done over a fixed set of nodes and/or by using metrics measured at query-operator levels. Cloud computing makes it easy to acquire and release computing nodes as required. Leveraging this flexibility, we propose a novel, queueing-theory-based approach for meeting specified response-time targets against fluctuating event arrival rates by drawing only the necessary amount of computing resources from a cloud platform. In the proposed approach, the entire processing engine of a distinct query is modelled as an atomic unit for predicting response times. Several such units hosted on a single node are modelled as a multiple class M/G/1 system. These aspects eliminate intrusive, low-level performance measurements at run-time, and also offer portability and scalability. Using model-based predictions, cloud resources are efficiently used to meet response-time targets. The efficacy of the approach is demonstrated through cloud-based experiments. PMID:23230164

  19. cryoem-cloud-tools: A software platform to deploy and manage cryo-EM jobs in the cloud.

    PubMed

    Cianfrocco, Michael A; Lahiri, Indrajit; DiMaio, Frank; Leschziner, Andres E

    2018-06-01

    Access to streamlined computational resources remains a significant bottleneck for new users of cryo-electron microscopy (cryo-EM). To address this, we have developed tools that will submit cryo-EM analysis routines and atomic model building jobs directly to Amazon Web Services (AWS) from a local computer or laptop. These new software tools ("cryoem-cloud-tools") have incorporated optimal data movement, security, and cost-saving strategies, giving novice users access to complex cryo-EM data processing pipelines. Integrating these tools into the RELION processing pipeline and graphical user interface we determined a 2.2 Å structure of ß-galactosidase in ∼55 hours on AWS. We implemented a similar strategy to submit Rosetta atomic model building and refinement to AWS. These software tools dramatically reduce the barrier for entry of new users to cloud computing for cryo-EM and are freely available at cryoem-tools.cloud. Copyright © 2018. Published by Elsevier Inc.

  20. OpenNEX, a private-public partnership in support of the national climate assessment

    NASA Astrophysics Data System (ADS)

    Nemani, R. R.; Wang, W.; Michaelis, A.; Votava, P.; Ganguly, S.

    2016-12-01

    The NASA Earth Exchange (NEX) is a collaborative computing platform that has been developed with the objective of bringing scientists together with the software tools, massive global datasets, and supercomputing resources necessary to accelerate research in Earth systems science and global change. NEX is funded as an enabling tool for sustaining the national climate assessment. Over the past five years, researchers have used the NEX platform and produced a number of data sets highly relevant to the National Climate Assessment. These include high-resolution climate projections using different downscaling techniques and trends in historical climate from satellite data. To enable a broader community in exploiting the above datasets, the NEX team partnered with public cloud providers to create the OpenNEX platform. OpenNEX provides ready access to NEX data holdings on a number of public cloud platforms along with pertinent analysis tools and workflows in the form of Machine Images and Docker Containers, lectures and tutorials by experts. We will showcase some of the applications of OpenNEX data and tools by the community on Amazon Web Services, Google Cloud and the NEX Sandbox.

  1. Fast Semantic Segmentation of 3d Point Clouds with Strongly Varying Density

    NASA Astrophysics Data System (ADS)

    Hackel, Timo; Wegner, Jan D.; Schindler, Konrad

    2016-06-01

    We describe an effective and efficient method for point-wise semantic classification of 3D point clouds. The method can handle unstructured and inhomogeneous point clouds such as those derived from static terrestrial LiDAR or photogammetric reconstruction; and it is computationally efficient, making it possible to process point clouds with many millions of points in a matter of minutes. The key issue, both to cope with strong variations in point density and to bring down computation time, turns out to be careful handling of neighborhood relations. By choosing appropriate definitions of a point's (multi-scale) neighborhood, we obtain a feature set that is both expressive and fast to compute. We evaluate our classification method both on benchmark data from a mobile mapping platform and on a variety of large, terrestrial laser scans with greatly varying point density. The proposed feature set outperforms the state of the art with respect to per-point classification accuracy, while at the same time being much faster to compute.

  2. The GridEcon Platform: A Business Scenario Testbed for Commercial Cloud Services

    NASA Astrophysics Data System (ADS)

    Risch, Marcel; Altmann, Jörn; Guo, Li; Fleming, Alan; Courcoubetis, Costas

    Within this paper, we present the GridEcon Platform, a testbed for designing and evaluating economics-aware services in a commercial Cloud computing setting. The Platform is based on the idea that the exact working of such services is difficult to predict in the context of a market and, therefore, an environment for evaluating its behavior in an emulated market is needed. To identify the components of the GridEcon Platform, a number of economics-aware services and their interactions have been envisioned. The two most important components of the platform are the Marketplace and the Workflow Engine. The Workflow Engine allows the simple composition of a market environment by describing the service interactions between economics-aware services. The Marketplace allows trading goods using different market mechanisms. The capabilities of these components of the GridEcon Platform in conjunction with the economics-aware services are described in this paper in detail. The validation of an implemented market mechanism and a capacity planning service using the GridEcon Platform also demonstrated the usefulness of the GridEcon Platform.

  3. Avoidable Software Procurements

    DTIC Science & Technology

    2012-09-01

    software license, software usage, ELA, Software as a Service , SaaS , Software Asset...PaaS Platform as a Service SaaS Software as a Service SAM Software Asset Management SMS System Management Server SEWP Solutions for Enterprise Wide...delivery of full Cloud Services , we will see the transition of the Cloud Computing service model from Iaas to SaaS , or Software as a Service . Software

  4. Evaluating virtual hosted desktops for graphics-intensive astronomy

    NASA Astrophysics Data System (ADS)

    Meade, B. F.; Fluke, C. J.

    2018-04-01

    Visualisation of data is critical to understanding astronomical phenomena. Today, many instruments produce datasets that are too big to be downloaded to a local computer, yet many of the visualisation tools used by astronomers are deployed only on desktop computers. Cloud computing is increasingly used to provide a computation and simulation platform in astronomy, but it also offers great potential as a visualisation platform. Virtual hosted desktops, with graphics processing unit (GPU) acceleration, allow interactive, graphics-intensive desktop applications to operate co-located with astronomy datasets stored in remote data centres. By combining benchmarking and user experience testing, with a cohort of 20 astronomers, we investigate the viability of replacing physical desktop computers with virtual hosted desktops. In our work, we compare two Apple MacBook computers (one old and one new, representing hardware and opposite ends of the useful lifetime) with two virtual hosted desktops: one commercial (Amazon Web Services) and one in a private research cloud (the Australian NeCTAR Research Cloud). For two-dimensional image-based tasks and graphics-intensive three-dimensional operations - typical of astronomy visualisation workflows - we found that benchmarks do not necessarily provide the best indication of performance. When compared to typical laptop computers, virtual hosted desktops can provide a better user experience, even with lower performing graphics cards. We also found that virtual hosted desktops are equally simple to use, provide greater flexibility in choice of configuration, and may actually be a more cost-effective option for typical usage profiles.

  5. The ISB Cancer Genomics Cloud: A Flexible Cloud-Based Platform for Cancer Genomics Research.

    PubMed

    Reynolds, Sheila M; Miller, Michael; Lee, Phyliss; Leinonen, Kalle; Paquette, Suzanne M; Rodebaugh, Zack; Hahn, Abigail; Gibbs, David L; Slagel, Joseph; Longabaugh, William J; Dhankani, Varsha; Reyes, Madelyn; Pihl, Todd; Backus, Mark; Bookman, Matthew; Deflaux, Nicole; Bingham, Jonathan; Pot, David; Shmulevich, Ilya

    2017-11-01

    The ISB Cancer Genomics Cloud (ISB-CGC) is one of three pilot projects funded by the National Cancer Institute to explore new approaches to computing on large cancer datasets in a cloud environment. With a focus on Data as a Service, the ISB-CGC offers multiple avenues for accessing and analyzing The Cancer Genome Atlas, TARGET, and other important references such as GENCODE and COSMIC using the Google Cloud Platform. The open approach allows researchers to choose approaches best suited to the task at hand: from analyzing terabytes of data using complex workflows to developing new analysis methods in common languages such as Python, R, and SQL; to using an interactive web application to create synthetic patient cohorts and to explore the wealth of available genomic data. Links to resources and documentation can be found at www.isb-cgc.org Cancer Res; 77(21); e7-10. ©2017 AACR . ©2017 American Association for Cancer Research.

  6. Implementation and performance test of cloud platform based on Hadoop

    NASA Astrophysics Data System (ADS)

    Xu, Jingxian; Guo, Jianhong; Ren, Chunlan

    2018-01-01

    Hadoop, as an open source project for the Apache foundation, is a distributed computing framework that deals with large amounts of data and has been widely used in the Internet industry. Therefore, it is meaningful to study the implementation of Hadoop platform and the performance of test platform. The purpose of this subject is to study the method of building Hadoop platform and to study the performance of test platform. This paper presents a method to implement Hadoop platform and a test platform performance method. Experimental results show that the proposed test performance method is effective and it can detect the performance of Hadoop platform.

  7. A Service Brokering and Recommendation Mechanism for Better Selecting Cloud Services

    PubMed Central

    Gui, Zhipeng; Yang, Chaowei; Xia, Jizhe; Huang, Qunying; Liu, Kai; Li, Zhenlong; Yu, Manzhu; Sun, Min; Zhou, Nanyin; Jin, Baoxuan

    2014-01-01

    Cloud computing is becoming the new generation computing infrastructure, and many cloud vendors provide different types of cloud services. How to choose the best cloud services for specific applications is very challenging. Addressing this challenge requires balancing multiple factors, such as business demands, technologies, policies and preferences in addition to the computing requirements. This paper recommends a mechanism for selecting the best public cloud service at the levels of Infrastructure as a Service (IaaS) and Platform as a Service (PaaS). A systematic framework and associated workflow include cloud service filtration, solution generation, evaluation, and selection of public cloud services. Specifically, we propose the following: a hierarchical information model for integrating heterogeneous cloud information from different providers and a corresponding cloud information collecting mechanism; a cloud service classification model for categorizing and filtering cloud services and an application requirement schema for providing rules for creating application-specific configuration solutions; and a preference-aware solution evaluation mode for evaluating and recommending solutions according to the preferences of application providers. To test the proposed framework and methodologies, a cloud service advisory tool prototype was developed after which relevant experiments were conducted. The results show that the proposed system collects/updates/records the cloud information from multiple mainstream public cloud services in real-time, generates feasible cloud configuration solutions according to user specifications and acceptable cost predication, assesses solutions from multiple aspects (e.g., computing capability, potential cost and Service Level Agreement, SLA) and offers rational recommendations based on user preferences and practical cloud provisioning; and visually presents and compares solutions through an interactive web Graphical User Interface (GUI). PMID:25170937

  8. BigDebug: Debugging Primitives for Interactive Big Data Processing in Spark.

    PubMed

    Gulzar, Muhammad Ali; Interlandi, Matteo; Yoo, Seunghyun; Tetali, Sai Deep; Condie, Tyson; Millstein, Todd; Kim, Miryung

    2016-05-01

    Developers use cloud computing platforms to process a large quantity of data in parallel when developing big data analytics. Debugging the massive parallel computations that run in today's data-centers is time consuming and error-prone. To address this challenge, we design a set of interactive, real-time debugging primitives for big data processing in Apache Spark, the next generation data-intensive scalable cloud computing platform. This requires re-thinking the notion of step-through debugging in a traditional debugger such as gdb, because pausing the entire computation across distributed worker nodes causes significant delay and naively inspecting millions of records using a watchpoint is too time consuming for an end user. First, BIGDEBUG's simulated breakpoints and on-demand watchpoints allow users to selectively examine distributed, intermediate data on the cloud with little overhead. Second, a user can also pinpoint a crash-inducing record and selectively resume relevant sub-computations after a quick fix. Third, a user can determine the root causes of errors (or delays) at the level of individual records through a fine-grained data provenance capability. Our evaluation shows that BIGDEBUG scales to terabytes and its record-level tracing incurs less than 25% overhead on average. It determines crash culprits orders of magnitude more accurately and provides up to 100% time saving compared to the baseline replay debugger. The results show that BIGDEBUG supports debugging at interactive speeds with minimal performance impact.

  9. Cloud-based simulations on Google Exacycle reveal ligand modulation of GPCR activation pathways

    NASA Astrophysics Data System (ADS)

    Kohlhoff, Kai J.; Shukla, Diwakar; Lawrenz, Morgan; Bowman, Gregory R.; Konerding, David E.; Belov, Dan; Altman, Russ B.; Pande, Vijay S.

    2014-01-01

    Simulations can provide tremendous insight into the atomistic details of biological mechanisms, but micro- to millisecond timescales are historically only accessible on dedicated supercomputers. We demonstrate that cloud computing is a viable alternative that brings long-timescale processes within reach of a broader community. We used Google's Exacycle cloud-computing platform to simulate two milliseconds of dynamics of a major drug target, the G-protein-coupled receptor β2AR. Markov state models aggregate independent simulations into a single statistical model that is validated by previous computational and experimental results. Moreover, our models provide an atomistic description of the activation of a G-protein-coupled receptor and reveal multiple activation pathways. Agonists and inverse agonists interact differentially with these pathways, with profound implications for drug design.

  10. Toward real-time Monte Carlo simulation using a commercial cloud computing infrastructure.

    PubMed

    Wang, Henry; Ma, Yunzhi; Pratx, Guillem; Xing, Lei

    2011-09-07

    Monte Carlo (MC) methods are the gold standard for modeling photon and electron transport in a heterogeneous medium; however, their computational cost prohibits their routine use in the clinic. Cloud computing, wherein computing resources are allocated on-demand from a third party, is a new approach for high performance computing and is implemented to perform ultra-fast MC calculation in radiation therapy. We deployed the EGS5 MC package in a commercial cloud environment. Launched from a single local computer with Internet access, a Python script allocates a remote virtual cluster. A handshaking protocol designates master and worker nodes. The EGS5 binaries and the simulation data are initially loaded onto the master node. The simulation is then distributed among independent worker nodes via the message passing interface, and the results aggregated on the local computer for display and data analysis. The described approach is evaluated for pencil beams and broad beams of high-energy electrons and photons. The output of cloud-based MC simulation is identical to that produced by single-threaded implementation. For 1 million electrons, a simulation that takes 2.58 h on a local computer can be executed in 3.3 min on the cloud with 100 nodes, a 47× speed-up. Simulation time scales inversely with the number of parallel nodes. The parallelization overhead is also negligible for large simulations. Cloud computing represents one of the most important recent advances in supercomputing technology and provides a promising platform for substantially improved MC simulation. In addition to the significant speed up, cloud computing builds a layer of abstraction for high performance parallel computing, which may change the way dose calculations are performed and radiation treatment plans are completed.

  11. A Cloud-Based Internet of Things Platform for Ambient Assisted Living

    PubMed Central

    Cubo, Javier; Nieto, Adrián; Pimentel, Ernesto

    2014-01-01

    A common feature of ambient intelligence is that many objects are inter-connected and act in unison, which is also a challenge in the Internet of Things. There has been a shift in research towards integrating both concepts, considering the Internet of Things as representing the future of computing and communications. However, the efficient combination and management of heterogeneous things or devices in the ambient intelligence domain is still a tedious task, and it presents crucial challenges. Therefore, to appropriately manage the inter-connection of diverse devices in these systems requires: (1) specifying and efficiently implementing the devices (e.g., as services); (2) handling and verifying their heterogeneity and composition; and (3) standardizing and managing their data, so as to tackle large numbers of systems together, avoiding standalone applications on local servers. To overcome these challenges, this paper proposes a platform to manage the integration and behavior-aware orchestration of heterogeneous devices as services, stored and accessed via the cloud, with the following contributions: (i) we describe a lightweight model to specify the behavior of devices, to determine the order of the sequence of exchanged messages during the composition of devices; (ii) we define a common architecture using a service-oriented standard environment, to integrate heterogeneous devices by means of their interfaces, via a gateway, and to orchestrate them according to their behavior; (iii) we design a framework based on cloud computing technology, connecting the gateway in charge of acquiring the data from the devices with a cloud platform, to remotely access and monitor the data at run-time and react to emergency situations; and (iv) we implement and generate a novel cloud-based IoT platform of behavior-aware devices as services for ambient intelligence systems, validating the whole approach in real scenarios related to a specific ambient assisted living application. PMID:25093343

  12. A cloud-based Internet of Things platform for ambient assisted living.

    PubMed

    Cubo, Javier; Nieto, Adrián; Pimentel, Ernesto

    2014-08-04

    A common feature of ambient intelligence is that many objects are inter-connected and act in unison, which is also a challenge in the Internet of Things. There has been a shift in research towards integrating both concepts, considering the Internet of Things as representing the future of computing and communications. However, the efficient combination and management of heterogeneous things or devices in the ambient intelligence domain is still a tedious task, and it presents crucial challenges. Therefore, to appropriately manage the inter-connection of diverse devices in these systems requires: (1) specifying and efficiently implementing the devices (e.g., as services); (2) handling and verifying their heterogeneity and composition; and (3) standardizing and managing their data, so as to tackle large numbers of systems together, avoiding standalone applications on local servers. To overcome these challenges, this paper proposes a platform to manage the integration and behavior-aware orchestration of heterogeneous devices as services, stored and accessed via the cloud, with the following contributions: (i) we describe a lightweight model to specify the behavior of devices, to determine the order of the sequence of exchanged messages during the composition of devices; (ii) we define a common architecture using a service-oriented standard environment, to integrate heterogeneous devices by means of their interfaces, via a gateway, and to orchestrate them according to their behavior; (iii) we design a framework based on cloud computing technology, connecting the gateway in charge of acquiring the data from the devices with a cloud platform, to remotely access and monitor the data at run-time and react to emergency situations; and (iv) we implement and generate a novel cloud-based IoT platform of behavior-aware devices as services for ambient intelligence systems, validating the whole approach in real scenarios related to a specific ambient assisted living application.

  13. Real-time WAMI streaming target tracking in fog

    NASA Astrophysics Data System (ADS)

    Chen, Yu; Blasch, Erik; Chen, Ning; Deng, Anna; Ling, Haibin; Chen, Genshe

    2016-05-01

    Real-time information fusion based on WAMI (Wide-Area Motion Imagery), FMV (Full Motion Video), and Text data is highly desired for many mission critical emergency or security applications. Cloud Computing has been considered promising to achieve big data integration from multi-modal sources. In many mission critical tasks, however, powerful Cloud technology cannot satisfy the tight latency tolerance as the servers are allocated far from the sensing platform, actually there is no guaranteed connection in the emergency situations. Therefore, data processing, information fusion, and decision making are required to be executed on-site (i.e., near the data collection). Fog Computing, a recently proposed extension and complement for Cloud Computing, enables computing on-site without outsourcing jobs to a remote Cloud. In this work, we have investigated the feasibility of processing streaming WAMI in the Fog for real-time, online, uninterrupted target tracking. Using a single target tracking algorithm, we studied the performance of a Fog Computing prototype. The experimental results are very encouraging that validated the effectiveness of our Fog approach to achieve real-time frame rates.

  14. Distributed MRI reconstruction using Gadgetron-based cloud computing.

    PubMed

    Xue, Hui; Inati, Souheil; Sørensen, Thomas Sangild; Kellman, Peter; Hansen, Michael S

    2015-03-01

    To expand the open source Gadgetron reconstruction framework to support distributed computing and to demonstrate that a multinode version of the Gadgetron can be used to provide nonlinear reconstruction with clinically acceptable latency. The Gadgetron framework was extended with new software components that enable an arbitrary number of Gadgetron instances to collaborate on a reconstruction task. This cloud-enabled version of the Gadgetron was deployed on three different distributed computing platforms ranging from a heterogeneous collection of commodity computers to the commercial Amazon Elastic Compute Cloud. The Gadgetron cloud was used to provide nonlinear, compressed sensing reconstruction on a clinical scanner with low reconstruction latency (eg, cardiac and neuroimaging applications). The proposed setup was able to handle acquisition and 11 -SPIRiT reconstruction of nine high temporal resolution real-time, cardiac short axis cine acquisitions, covering the ventricles for functional evaluation, in under 1 min. A three-dimensional high-resolution brain acquisition with 1 mm(3) isotropic pixel size was acquired and reconstructed with nonlinear reconstruction in less than 5 min. A distributed computing enabled Gadgetron provides a scalable way to improve reconstruction performance using commodity cluster computing. Nonlinear, compressed sensing reconstruction can be deployed clinically with low image reconstruction latency. © 2014 Wiley Periodicals, Inc.

  15. Grids, virtualization, and clouds at Fermilab

    DOE PAGES

    Timm, S.; Chadwick, K.; Garzoglio, G.; ...

    2014-06-11

    Fermilab supports a scientific program that includes experiments and scientists located across the globe. To better serve this community, in 2004, the (then) Computing Division undertook the strategy of placing all of the High Throughput Computing (HTC) resources in a Campus Grid known as FermiGrid, supported by common shared services. In 2007, the FermiGrid Services group deployed a service infrastructure that utilized Xen virtualization, LVS network routing and MySQL circular replication to deliver highly available services that offered significant performance, reliability and serviceability improvements. This deployment was further enhanced through the deployment of a distributed redundant network core architecture andmore » the physical distribution of the systems that host the virtual machines across multiple buildings on the Fermilab Campus. In 2010, building on the experience pioneered by FermiGrid in delivering production services in a virtual infrastructure, the Computing Sector commissioned the FermiCloud, General Physics Computing Facility and Virtual Services projects to serve as platforms for support of scientific computing (FermiCloud 6 GPCF) and core computing (Virtual Services). Lastly, this work will present the evolution of the Fermilab Campus Grid, Virtualization and Cloud Computing infrastructure together with plans for the future.« less

  16. Grids, virtualization, and clouds at Fermilab

    NASA Astrophysics Data System (ADS)

    Timm, S.; Chadwick, K.; Garzoglio, G.; Noh, S.

    2014-06-01

    Fermilab supports a scientific program that includes experiments and scientists located across the globe. To better serve this community, in 2004, the (then) Computing Division undertook the strategy of placing all of the High Throughput Computing (HTC) resources in a Campus Grid known as FermiGrid, supported by common shared services. In 2007, the FermiGrid Services group deployed a service infrastructure that utilized Xen virtualization, LVS network routing and MySQL circular replication to deliver highly available services that offered significant performance, reliability and serviceability improvements. This deployment was further enhanced through the deployment of a distributed redundant network core architecture and the physical distribution of the systems that host the virtual machines across multiple buildings on the Fermilab Campus. In 2010, building on the experience pioneered by FermiGrid in delivering production services in a virtual infrastructure, the Computing Sector commissioned the FermiCloud, General Physics Computing Facility and Virtual Services projects to serve as platforms for support of scientific computing (FermiCloud 6 GPCF) and core computing (Virtual Services). This work will present the evolution of the Fermilab Campus Grid, Virtualization and Cloud Computing infrastructure together with plans for the future.

  17. Cloud computing method for dynamically scaling a process across physical machine boundaries

    DOEpatents

    Gillen, Robert E.; Patton, Robert M.; Potok, Thomas E.; Rojas, Carlos C.

    2014-09-02

    A cloud computing platform includes first device having a graph or tree structure with a node which receives data. The data is processed by the node or communicated to a child node for processing. A first node in the graph or tree structure determines the reconfiguration of a portion of the graph or tree structure on a second device. The reconfiguration may include moving a second node and some or all of its descendant nodes. The second and descendant nodes may be copied to the second device.

  18. IoT-based flood embankments monitoring system

    NASA Astrophysics Data System (ADS)

    Michta, E.; Szulim, R.; Sojka-Piotrowska, A.; Piotrowski, K.

    2017-08-01

    In the paper a concept of flood embankments monitoring system based on using Internet of Things approach and Cloud Computing technologies will be presented. The proposed system consists of sensors, IoT nodes, Gateways and Cloud based services. Nodes communicates with the sensors measuring certain physical parameters describing the state of the embankments and communicates with the Gateways. Gateways are specialized active devices responsible for direct communication with the nodes, collecting sensor data, preprocess the data, applying local rules and communicate with the Cloud Services using communication API delivered by cloud services providers. Architecture of all of the system components will be proposed consisting IoT devices functionalities description, their communication model, software modules and services bases on using a public cloud computing platform like Microsoft Azure will be proposed. The most important aspects of maintaining the communication in a secure way will be shown.

  19. Cloud-based opportunities in scientific computing: insights from processing Suomi National Polar-Orbiting Partnership (S-NPP) Direct Broadcast data

    NASA Astrophysics Data System (ADS)

    Evans, J. D.; Hao, W.; Chettri, S.

    2013-12-01

    The cloud is proving to be a uniquely promising platform for scientific computing. Our experience with processing satellite data using Amazon Web Services highlights several opportunities for enhanced performance, flexibility, and cost effectiveness in the cloud relative to traditional computing -- for example: - Direct readout from a polar-orbiting satellite such as the Suomi National Polar-Orbiting Partnership (S-NPP) requires bursts of processing a few times a day, separated by quiet periods when the satellite is out of receiving range. In the cloud, by starting and stopping virtual machines in minutes, we can marshal significant computing resources quickly when needed, but not pay for them when not needed. To take advantage of this capability, we are automating a data-driven approach to the management of cloud computing resources, in which new data availability triggers the creation of new virtual machines (of variable size and processing power) which last only until the processing workflow is complete. - 'Spot instances' are virtual machines that run as long as one's asking price is higher than the provider's variable spot price. Spot instances can greatly reduce the cost of computing -- for software systems that are engineered to withstand unpredictable interruptions in service (as occurs when a spot price exceeds the asking price). We are implementing an approach to workflow management that allows data processing workflows to resume with minimal delays after temporary spot price spikes. This will allow systems to take full advantage of variably-priced 'utility computing.' - Thanks to virtual machine images, we can easily launch multiple, identical machines differentiated only by 'user data' containing individualized instructions (e.g., to fetch particular datasets or to perform certain workflows or algorithms) This is particularly useful when (as is the case with S-NPP data) we need to launch many very similar machines to process an unpredictable number of data files concurrently. Our experience shows the viability and flexibility of this approach to workflow management for scientific data processing. - Finally, cloud computing is a promising platform for distributed volunteer ('interstitial') computing, via mechanisms such as the Berkeley Open Infrastructure for Network Computing (BOINC) popularized with the SETI@Home project and others such as ClimatePrediction.net and NASA's Climate@Home. Interstitial computing faces significant challenges as commodity computing shifts from (always on) desktop computers towards smartphones and tablets (untethered and running on scarce battery power); but cloud computing offers significant slack capacity. This capacity includes virtual machines with unused RAM or underused CPUs; virtual storage volumes allocated (& paid for) but not full; and virtual machines that are paid up for the current hour but whose work is complete. We are devising ways to facilitate the reuse of these resources (i.e., cloud-based interstitial computing) for satellite data processing and related analyses. We will present our findings and research directions on these and related topics.

  20. Cloud@Home: A New Enhanced Computing Paradigm

    NASA Astrophysics Data System (ADS)

    Distefano, Salvatore; Cunsolo, Vincenzo D.; Puliafito, Antonio; Scarpa, Marco

    Cloud computing is a distributed computing paradigm that mixes aspects of Grid computing, ("… hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities" (Foster, 2002)) Internet Computing ("…a computing platform geographically distributed across the Internet" (Milenkovic et al., 2003)), Utility computing ("a collection of technologies and business practices that enables computing to be delivered seamlessly and reliably across multiple computers, ... available as needed and billed according to usage, much like water and electricity are today" (Ross & Westerman, 2004)) Autonomic computing ("computing systems that can manage themselves given high-level objectives from administrators" (Kephart & Chess, 2003)), Edge computing ("… provides a generic template facility for any type of application to spread its execution across a dedicated grid, balancing the load …" Davis, Parikh, & Weihl, 2004) and Green computing (a new frontier of Ethical computing1 starting from the assumption that in next future energy costs will be related to the environment pollution).

  1. Using virtual machine monitors to overcome the challenges of monitoring and managing virtualized cloud infrastructures

    NASA Astrophysics Data System (ADS)

    Bamiah, Mervat Adib; Brohi, Sarfraz Nawaz; Chuprat, Suriayati

    2012-01-01

    Virtualization is one of the hottest research topics nowadays. Several academic researchers and developers from IT industry are designing approaches for solving security and manageability issues of Virtual Machines (VMs) residing on virtualized cloud infrastructures. Moving the application from a physical to a virtual platform increases the efficiency, flexibility and reduces management cost as well as effort. Cloud computing is adopting the paradigm of virtualization, using this technique, memory, CPU and computational power is provided to clients' VMs by utilizing the underlying physical hardware. Beside these advantages there are few challenges faced by adopting virtualization such as management of VMs and network traffic, unexpected additional cost and resource allocation. Virtual Machine Monitor (VMM) or hypervisor is the tool used by cloud providers to manage the VMs on cloud. There are several heterogeneous hypervisors provided by various vendors that include VMware, Hyper-V, Xen and Kernel Virtual Machine (KVM). Considering the challenge of VM management, this paper describes several techniques to monitor and manage virtualized cloud infrastructures.

  2. Large-scale high-throughput computer-aided discovery of advanced materials using cloud computing

    NASA Astrophysics Data System (ADS)

    Bazhirov, Timur; Mohammadi, Mohammad; Ding, Kevin; Barabash, Sergey

    Recent advances in cloud computing made it possible to access large-scale computational resources completely on-demand in a rapid and efficient manner. When combined with high fidelity simulations, they serve as an alternative pathway to enable computational discovery and design of new materials through large-scale high-throughput screening. Here, we present a case study for a cloud platform implemented at Exabyte Inc. We perform calculations to screen lightweight ternary alloys for thermodynamic stability. Due to the lack of experimental data for most such systems, we rely on theoretical approaches based on first-principle pseudopotential density functional theory. We calculate the formation energies for a set of ternary compounds approximated by special quasirandom structures. During an example run we were able to scale to 10,656 CPUs within 7 minutes from the start, and obtain results for 296 compounds within 38 hours. The results indicate that the ultimate formation enthalpy of ternary systems can be negative for some of lightweight alloys, including Li and Mg compounds. We conclude that compared to traditional capital-intensive approach that requires in on-premises hardware resources, cloud computing is agile and cost-effective, yet scalable and delivers similar performance.

  3. Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses

    PubMed Central

    Liu, Bo; Madduri, Ravi K; Sotomayor, Borja; Chard, Kyle; Lacinski, Lukasz; Dave, Utpal J; Li, Jianqiang; Liu, Chunchen; Foster, Ian T

    2014-01-01

    Due to the upcoming data deluge of genome data, the need for storing and processing large-scale genome data, easy access to biomedical analyses tools, efficient data sharing and retrieval has presented significant challenges. The variability in data volume results in variable computing and storage requirements, therefore biomedical researchers are pursuing more reliable, dynamic and convenient methods for conducting sequencing analyses. This paper proposes a Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses, which enables reliable and highly scalable execution of sequencing analyses workflows in a fully automated manner. Our platform extends the existing Galaxy workflow system by adding data management capabilities for transferring large quantities of data efficiently and reliably (via Globus Transfer), domain-specific analyses tools preconfigured for immediate use by researchers (via user-specific tools integration), automatic deployment on Cloud for on-demand resource allocation and pay-as-you-go pricing (via Globus Provision), a Cloud provisioning tool for auto-scaling (via HTCondor scheduler), and the support for validating the correctness of workflows (via semantic verification tools). Two bioinformatics workflow use cases as well as performance evaluation are presented to validate the feasibility of the proposed approach. PMID:24462600

  4. Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses.

    PubMed

    Liu, Bo; Madduri, Ravi K; Sotomayor, Borja; Chard, Kyle; Lacinski, Lukasz; Dave, Utpal J; Li, Jianqiang; Liu, Chunchen; Foster, Ian T

    2014-06-01

    Due to the upcoming data deluge of genome data, the need for storing and processing large-scale genome data, easy access to biomedical analyses tools, efficient data sharing and retrieval has presented significant challenges. The variability in data volume results in variable computing and storage requirements, therefore biomedical researchers are pursuing more reliable, dynamic and convenient methods for conducting sequencing analyses. This paper proposes a Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses, which enables reliable and highly scalable execution of sequencing analyses workflows in a fully automated manner. Our platform extends the existing Galaxy workflow system by adding data management capabilities for transferring large quantities of data efficiently and reliably (via Globus Transfer), domain-specific analyses tools preconfigured for immediate use by researchers (via user-specific tools integration), automatic deployment on Cloud for on-demand resource allocation and pay-as-you-go pricing (via Globus Provision), a Cloud provisioning tool for auto-scaling (via HTCondor scheduler), and the support for validating the correctness of workflows (via semantic verification tools). Two bioinformatics workflow use cases as well as performance evaluation are presented to validate the feasibility of the proposed approach. Copyright © 2014 Elsevier Inc. All rights reserved.

  5. On-demand Simulation of Atmospheric Transport Processes on the AlpEnDAC Cloud

    NASA Astrophysics Data System (ADS)

    Hachinger, S.; Harsch, C.; Meyer-Arnek, J.; Frank, A.; Heller, H.; Giemsa, E.

    2016-12-01

    The "Alpine Environmental Data Analysis Centre" (AlpEnDAC) develops a data-analysis platform for high-altitude research facilities within the "Virtual Alpine Observatory" project (VAO). This platform, with its web portal, will support use cases going much beyond data management: On user request, the data are augmented with "on-demand" simulation results, such as air-parcel trajectories for tracing down the source of pollutants when they appear in high concentration. The respective back-end mechanism uses the Compute Cloud of the Leibniz Supercomputing Centre (LRZ) to transparently calculate results requested by the user, as far as they have not yet been stored in AlpEnDAC. The queuing-system operation model common in supercomputing is replaced by a model in which Virtual Machines (VMs) on the cloud are automatically created/destroyed, providing the necessary computing power immediately on demand. From a security point of view, this allows to perform simulations in a sandbox defined by the VM configuration, without direct access to a computing cluster. Within few minutes, the user receives conveniently visualized results. The AlpEnDAC infrastructure is distributed among two participating institutes [front-end at German Aerospace Centre (DLR), simulation back-end at LRZ], requiring an efficient mechanism for synchronization of measured and augmented data. We discuss our iRODS-based solution for these data-management tasks as well as the general AlpEnDAC framework. Our cloud-based offerings aim at making scientific computing for our users much more convenient and flexible than it has been, and to allow scientists without a broad background in scientific computing to benefit from complex numerical simulations.

  6. Opportunities and challenges provided by cloud repositories for bioinformatics-enabled drug discovery.

    PubMed

    Dalpé, Gratien; Joly, Yann

    2014-09-01

    Healthcare-related bioinformatics databases are increasingly offering the possibility to maintain, organize, and distribute DNA sequencing data. Different national and international institutions are currently hosting such databases that offer researchers website platforms where they can obtain sequencing data on which they can perform different types of analysis. Until recently, this process remained mostly one-dimensional, with most analysis concentrated on a limited amount of data. However, newer genome sequencing technology is producing a huge amount of data that current computer facilities are unable to handle. An alternative approach has been to start adopting cloud computing services for combining the information embedded in genomic and model system biology data, patient healthcare records, and clinical trials' data. In this new technological paradigm, researchers use virtual space and computing power from existing commercial or not-for-profit cloud service providers to access, store, and analyze data via different application programming interfaces. Cloud services are an alternative to the need of larger data storage; however, they raise different ethical, legal, and social issues. The purpose of this Commentary is to summarize how cloud computing can contribute to bioinformatics-based drug discovery and to highlight some of the outstanding legal, ethical, and social issues that are inherent in the use of cloud services. © 2014 Wiley Periodicals, Inc.

  7. Towards Large-area Field-scale Operational Evapotranspiration for Water Use Mapping

    NASA Astrophysics Data System (ADS)

    Senay, G. B.; Friedrichs, M.; Morton, C.; Huntington, J. L.; Verdin, J.

    2017-12-01

    Field-scale evapotranspiration (ET) estimates are needed for improving surface and groundwater use and water budget studies. Ideally, field-scale ET estimates would be at regional to national levels and cover long time periods. As a result of large data storage and computational requirements associated with processing field-scale satellite imagery such as Landsat, numerous challenges remain to develop operational ET estimates over large areas for detailed water use and availability studies. However, the combination of new science, data availability, and cloud computing technology is enabling unprecedented capabilities for ET mapping. To demonstrate this capability, we used Google's Earth Engine cloud computing platform to create nationwide annual ET estimates with 30-meter resolution Landsat ( 16,000 images) and gridded weather data using the Operational Simplified Surface Energy Balance (SSEBop) model in support of the National Water Census, a USGS research program designed to build decision support capacity for water management agencies and other natural resource managers. By leveraging Google's Earth Engine Application Programming Interface (API) and developing software in a collaborative, open-platform environment, we rapidly advance from research towards applications for large-area field-scale ET mapping. Cloud computing of the Landsat image archive combined with other satellite, climate, and weather data, is creating never imagined opportunities for assessing ET model behavior and uncertainty, and ultimately providing the ability for more robust operational monitoring and assessment of water use at field-scales.

  8. GC31G-1182: Opennex, a Private-Public Partnership in Support of the National Climate Assessment

    NASA Technical Reports Server (NTRS)

    Nemani, Ramakrishna R.; Wang, Weile; Michaelis, Andrew; Votava, Petr; Ganguly, Sangram

    2016-01-01

    The NASA Earth Exchange (NEX) is a collaborative computing platform that has been developed with the objective of bringing scientists together with the software tools, massive global datasets, and supercomputing resources necessary to accelerate research in Earth systems science and global change. NEX is funded as an enabling tool for sustaining the national climate assessment. Over the past five years, researchers have used the NEX platform and produced a number of data sets highly relevant to the National Climate Assessment. These include high-resolution climate projections using different downscaling techniques and trends in historical climate from satellite data. To enable a broader community in exploiting the above datasets, the NEX team partnered with public cloud providers to create the OpenNEX platform. OpenNEX provides ready access to NEX data holdings on a number of public cloud platforms along with pertinent analysis tools and workflows in the form of Machine Images and Docker Containers, lectures and tutorials by experts. We will showcase some of the applications of OpenNEX data and tools by the community on Amazon Web Services, Google Cloud and the NEX Sandbox.

  9. Helix Nebula: Enabling federation of existing data infrastructures and data services to an overarching cross-domain e-infrastructure

    NASA Astrophysics Data System (ADS)

    Lengert, Wolfgang; Farres, Jordi; Lanari, Riccardo; Casu, Francesco; Manunta, Michele; Lassalle-Balier, Gerard

    2014-05-01

    Helix Nebula has established a growing public private partnership of more than 30 commercial cloud providers, SMEs, and publicly funded research organisations and e-infrastructures. The Helix Nebula strategy is to establish a federated cloud service across Europe. Three high-profile flagships, sponsored by CERN (high energy physics), EMBL (life sciences) and ESA/DLR/CNES/CNR (earth science), have been deployed and extensively tested within this federated environment. The commitments behind these initial flagships have created a critical mass that attracts suppliers and users to the initiative, to work together towards an "Information as a Service" market place. Significant progress in implementing the following 4 programmatic goals (as outlined in the strategic Plan Ref.1) has been achieved: × Goal #1 Establish a Cloud Computing Infrastructure for the European Research Area (ERA) serving as a platform for innovation and evolution of the overall infrastructure. × Goal #2 Identify and adopt suitable policies for trust, security and privacy on a European-level can be provided by the European Cloud Computing framework and infrastructure. × Goal #3 Create a light-weight governance structure for the future European Cloud Computing Infrastructure that involves all the stakeholders and can evolve over time as the infrastructure, services and user-base grows. × Goal #4 Define a funding scheme involving the three stake-holder groups (service suppliers, users, EC and national funding agencies) into a Public-Private-Partnership model to implement a Cloud Computing Infrastructure that delivers a sustainable business environment adhering to European level policies. Now in 2014 a first version of this generic cross-domain e-infrastructure is ready to go into operations building on federation of European industry and contributors (data, tools, knowledge, ...). This presentation describes how Helix Nebula is being used in the domain of earth science focusing on geohazards. The so called "Supersite Exploitation Platform" (SSEP) provides scientists an overarching federated e-infrastructure with a very fast access to (i) large volume of data (EO/non-space data), (ii) computing resources (e.g. hybrid cloud/grid), (iii) processing software (e.g. toolboxes, RTMs, retrieval baselines, visualization routines), and (iv) general platform capabilities (e.g. user management and access control, accounting, information portal, collaborative tools, social networks etc.). In this federation each data provider remains in full control of the implementation of its data policy. This presentation outlines the Architecture (technical and services) supporting very heterogeneous science domains as well as the procedures for new-comers to join the Helix Nebula Market Place. Ref.1 http://cds.cern.ch/record/1374172/files/CERN-OPEN-2011-036.pdf

  10. Empirical Evaluation of Conservative and Optimistic Discrete Event Execution on Cloud and VM Platforms

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yoginath, Srikanth B; Perumalla, Kalyan S

    2013-01-01

    Virtual machine (VM) technologies, especially those offered via Cloud platforms, present new dimensions with respect to performance and cost in executing parallel discrete event simulation (PDES) applications. Due to the introduction of overall cost as a metric, the choice of the highest-end computing configuration is no longer the most economical one. Moreover, runtime dynamics unique to VM platforms introduce new performance characteristics, and the variety of possible VM configurations give rise to a range of choices for hosting a PDES run. Here, an empirical study of these issues is undertaken to guide an understanding of the dynamics, trends and trade-offsmore » in executing PDES on VM/Cloud platforms. Performance results and cost measures are obtained from actual execution of a range of scenarios in two PDES benchmark applications on the Amazon Cloud offerings and on a high-end VM host machine. The data reveals interesting insights into the new VM-PDES dynamics that come into play and also leads to counter-intuitive guidelines with respect to choosing the best and second-best configurations when overall cost of execution is considered. In particular, it is found that choosing the highest-end VM configuration guarantees neither the best runtime nor the least cost. Interestingly, choosing a (suitably scaled) low-end VM configuration provides the least overall cost without adversely affecting the total runtime.« less

  11. Migrating To The Cloud: Preparing The USMC CDET For MCEITS

    DTIC Science & Technology

    2016-03-01

    Service SAAR System Authorization Access Request SaaS Software as a... Service (IaaS), Platform as a Service (PaaS), Software as a Service ( SaaS ), and Data as a Service (DaaS) (Takai, 2012). A closer examination of each...8 3. Software as a Service NIST described SaaS as a model of cloud computing where the service provider offers its customers fee-based access

  12. Inexpensive and Highly Reproducible Cloud-Based Variant Calling of 2,535 Human Genomes

    PubMed Central

    Shringarpure, Suyash S.; Carroll, Andrew; De La Vega, Francisco M.; Bustamante, Carlos D.

    2015-01-01

    Population scale sequencing of whole human genomes is becoming economically feasible; however, data management and analysis remains a formidable challenge for many research groups. Large sequencing studies, like the 1000 Genomes Project, have improved our understanding of human demography and the effect of rare genetic variation in disease. Variant calling on datasets of hundreds or thousands of genomes is time-consuming, expensive, and not easily reproducible given the myriad components of a variant calling pipeline. Here, we describe a cloud-based pipeline for joint variant calling in large samples using the Real Time Genomics population caller. We deployed the population caller on the Amazon cloud with the DNAnexus platform in order to achieve low-cost variant calling. Using our pipeline, we were able to identify 68.3 million variants in 2,535 samples from Phase 3 of the 1000 Genomes Project. By performing the variant calling in a parallel manner, the data was processed within 5 days at a compute cost of $7.33 per sample (a total cost of $18,590 for completed jobs and $21,805 for all jobs). Analysis of cost dependence and running time on the data size suggests that, given near linear scalability, cloud computing can be a cheap and efficient platform for analyzing even larger sequencing studies in the future. PMID:26110529

  13. Helix Nebula and CERN: A Symbiotic approach to exploiting commercial clouds

    NASA Astrophysics Data System (ADS)

    Barreiro Megino, Fernando H.; Jones, Robert; Kucharczyk, Katarzyna; Medrano Llamas, Ramón; van der Ster, Daniel

    2014-06-01

    The recent paradigm shift toward cloud computing in IT, and general interest in "Big Data" in particular, have demonstrated that the computing requirements of HEP are no longer globally unique. Indeed, the CERN IT department and LHC experiments have already made significant R&D investments in delivering and exploiting cloud computing resources. While a number of technical evaluations of interesting commercial offerings from global IT enterprises have been performed by various physics labs, further technical, security, sociological, and legal issues need to be address before their large-scale adoption by the research community can be envisaged. Helix Nebula - the Science Cloud is an initiative that explores these questions by joining the forces of three European research institutes (CERN, ESA and EMBL) with leading European commercial IT enterprises. The goals of Helix Nebula are to establish a cloud platform federating multiple commercial cloud providers, along with new business models, which can sustain the cloud marketplace for years to come. This contribution will summarize the participation of CERN in Helix Nebula. We will explain CERN's flagship use-case and the model used to integrate several cloud providers with an LHC experiment's workload management system. During the first proof of concept, this project contributed over 40.000 CPU-days of Monte Carlo production throughput to the ATLAS experiment with marginal manpower required. CERN's experience, together with that of ESA and EMBL, is providing a great insight into the cloud computing industry and highlighted several challenges that are being tackled in order to ease the export of the scientific workloads to the cloud environments.

  14. BigDebug: Debugging Primitives for Interactive Big Data Processing in Spark

    PubMed Central

    Gulzar, Muhammad Ali; Interlandi, Matteo; Yoo, Seunghyun; Tetali, Sai Deep; Condie, Tyson; Millstein, Todd; Kim, Miryung

    2016-01-01

    Developers use cloud computing platforms to process a large quantity of data in parallel when developing big data analytics. Debugging the massive parallel computations that run in today’s data-centers is time consuming and error-prone. To address this challenge, we design a set of interactive, real-time debugging primitives for big data processing in Apache Spark, the next generation data-intensive scalable cloud computing platform. This requires re-thinking the notion of step-through debugging in a traditional debugger such as gdb, because pausing the entire computation across distributed worker nodes causes significant delay and naively inspecting millions of records using a watchpoint is too time consuming for an end user. First, BIGDEBUG’s simulated breakpoints and on-demand watchpoints allow users to selectively examine distributed, intermediate data on the cloud with little overhead. Second, a user can also pinpoint a crash-inducing record and selectively resume relevant sub-computations after a quick fix. Third, a user can determine the root causes of errors (or delays) at the level of individual records through a fine-grained data provenance capability. Our evaluation shows that BIGDEBUG scales to terabytes and its record-level tracing incurs less than 25% overhead on average. It determines crash culprits orders of magnitude more accurately and provides up to 100% time saving compared to the baseline replay debugger. The results show that BIGDEBUG supports debugging at interactive speeds with minimal performance impact. PMID:27390389

  15. SU-E-T-222: Computational Optimization of Monte Carlo Simulation On 4D Treatment Planning Using the Cloud Computing Technology

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chow, J

    Purpose: This study evaluated the efficiency of 4D lung radiation treatment planning using Monte Carlo simulation on the cloud. The EGSnrc Monte Carlo code was used in dose calculation on the 4D-CT image set. Methods: 4D lung radiation treatment plan was created by the DOSCTP linked to the cloud, based on the Amazon elastic compute cloud platform. Dose calculation was carried out by Monte Carlo simulation on the 4D-CT image set on the cloud, and results were sent to the FFD4D image deformation program for dose reconstruction. The dependence of computing time for treatment plan on the number of computemore » node was optimized with variations of the number of CT image set in the breathing cycle and dose reconstruction time of the FFD4D. Results: It is found that the dependence of computing time on the number of compute node was affected by the diminishing return of the number of node used in Monte Carlo simulation. Moreover, the performance of the 4D treatment planning could be optimized by using smaller than 10 compute nodes on the cloud. The effects of the number of image set and dose reconstruction time on the dependence of computing time on the number of node were not significant, as more than 15 compute nodes were used in Monte Carlo simulations. Conclusion: The issue of long computing time in 4D treatment plan, requiring Monte Carlo dose calculations in all CT image sets in the breathing cycle, can be solved using the cloud computing technology. It is concluded that the optimized number of compute node selected in simulation should be between 5 and 15, as the dependence of computing time on the number of node is significant.« less

  16. Automatic Atlas Based Electron Density and Structure Contouring for MRI-based Prostate Radiation Therapy on the Cloud

    NASA Astrophysics Data System (ADS)

    Dowling, J. A.; Burdett, N.; Greer, P. B.; Sun, J.; Parker, J.; Pichler, P.; Stanwell, P.; Chandra, S.; Rivest-Hénault, D.; Ghose, S.; Salvado, O.; Fripp, J.

    2014-03-01

    Our group have been developing methods for MRI-alone prostate cancer radiation therapy treatment planning. To assist with clinical validation of the workflow we are investigating a cloud platform solution for research purposes. Benefits of cloud computing can include increased scalability, performance and extensibility while reducing total cost of ownership. In this paper we demonstrate the generation of DICOM-RT directories containing an automatic average atlas based electron density image and fast pelvic organ contouring from whole pelvis MR scans.

  17. GenomeVIP: a cloud platform for genomic variant discovery and interpretation

    PubMed Central

    Mashl, R. Jay; Scott, Adam D.; Huang, Kuan-lin; Wyczalkowski, Matthew A.; Yoon, Christopher J.; Niu, Beifang; DeNardo, Erin; Yellapantula, Venkata D.; Handsaker, Robert E.; Chen, Ken; Koboldt, Daniel C.; Ye, Kai; Fenyö, David; Raphael, Benjamin J.; Wendl, Michael C.; Ding, Li

    2017-01-01

    Identifying genomic variants is a fundamental first step toward the understanding of the role of inherited and acquired variation in disease. The accelerating growth in the corpus of sequencing data that underpins such analysis is making the data-download bottleneck more evident, placing substantial burdens on the research community to keep pace. As a result, the search for alternative approaches to the traditional “download and analyze” paradigm on local computing resources has led to a rapidly growing demand for cloud-computing solutions for genomics analysis. Here, we introduce the Genome Variant Investigation Platform (GenomeVIP), an open-source framework for performing genomics variant discovery and annotation using cloud- or local high-performance computing infrastructure. GenomeVIP orchestrates the analysis of whole-genome and exome sequence data using a set of robust and popular task-specific tools, including VarScan, GATK, Pindel, BreakDancer, Strelka, and Genome STRiP, through a web interface. GenomeVIP has been used for genomic analysis in large-data projects such as the TCGA PanCanAtlas and in other projects, such as the ICGC Pilots, CPTAC, ICGC-TCGA DREAM Challenges, and the 1000 Genomes SV Project. Here, we demonstrate GenomeVIP's ability to provide high-confidence annotated somatic, germline, and de novo variants of potential biological significance using publicly available data sets. PMID:28522612

  18. Atlas2 Cloud: a framework for personal genome analysis in the cloud

    PubMed Central

    2012-01-01

    Background Until recently, sequencing has primarily been carried out in large genome centers which have invested heavily in developing the computational infrastructure that enables genomic sequence analysis. The recent advancements in next generation sequencing (NGS) have led to a wide dissemination of sequencing technologies and data, to highly diverse research groups. It is expected that clinical sequencing will become part of diagnostic routines shortly. However, limited accessibility to computational infrastructure and high quality bioinformatic tools, and the demand for personnel skilled in data analysis and interpretation remains a serious bottleneck. To this end, the cloud computing and Software-as-a-Service (SaaS) technologies can help address these issues. Results We successfully enabled the Atlas2 Cloud pipeline for personal genome analysis on two different cloud service platforms: a community cloud via the Genboree Workbench, and a commercial cloud via the Amazon Web Services using Software-as-a-Service model. We report a case study of personal genome analysis using our Atlas2 Genboree pipeline. We also outline a detailed cost structure for running Atlas2 Amazon on whole exome capture data, providing cost projections in terms of storage, compute and I/O when running Atlas2 Amazon on a large data set. Conclusions We find that providing a web interface and an optimized pipeline clearly facilitates usage of cloud computing for personal genome analysis, but for it to be routinely used for large scale projects there needs to be a paradigm shift in the way we develop tools, in standard operating procedures, and in funding mechanisms. PMID:23134663

  19. Atlas2 Cloud: a framework for personal genome analysis in the cloud.

    PubMed

    Evani, Uday S; Challis, Danny; Yu, Jin; Jackson, Andrew R; Paithankar, Sameer; Bainbridge, Matthew N; Jakkamsetti, Adinarayana; Pham, Peter; Coarfa, Cristian; Milosavljevic, Aleksandar; Yu, Fuli

    2012-01-01

    Until recently, sequencing has primarily been carried out in large genome centers which have invested heavily in developing the computational infrastructure that enables genomic sequence analysis. The recent advancements in next generation sequencing (NGS) have led to a wide dissemination of sequencing technologies and data, to highly diverse research groups. It is expected that clinical sequencing will become part of diagnostic routines shortly. However, limited accessibility to computational infrastructure and high quality bioinformatic tools, and the demand for personnel skilled in data analysis and interpretation remains a serious bottleneck. To this end, the cloud computing and Software-as-a-Service (SaaS) technologies can help address these issues. We successfully enabled the Atlas2 Cloud pipeline for personal genome analysis on two different cloud service platforms: a community cloud via the Genboree Workbench, and a commercial cloud via the Amazon Web Services using Software-as-a-Service model. We report a case study of personal genome analysis using our Atlas2 Genboree pipeline. We also outline a detailed cost structure for running Atlas2 Amazon on whole exome capture data, providing cost projections in terms of storage, compute and I/O when running Atlas2 Amazon on a large data set. We find that providing a web interface and an optimized pipeline clearly facilitates usage of cloud computing for personal genome analysis, but for it to be routinely used for large scale projects there needs to be a paradigm shift in the way we develop tools, in standard operating procedures, and in funding mechanisms.

  20. MarDRe: efficient MapReduce-based removal of duplicate DNA reads in the cloud.

    PubMed

    Expósito, Roberto R; Veiga, Jorge; González-Domínguez, Jorge; Touriño, Juan

    2017-09-01

    This article presents MarDRe, a de novo cloud-ready duplicate and near-duplicate removal tool that can process single- and paired-end reads from FASTQ/FASTA datasets. MarDRe takes advantage of the widely adopted MapReduce programming model to fully exploit Big Data technologies on cloud-based infrastructures. Written in Java to maximize cross-platform compatibility, MarDRe is built upon the open-source Apache Hadoop project, the most popular distributed computing framework for scalable Big Data processing. On a 16-node cluster deployed on the Amazon EC2 cloud platform, MarDRe is up to 8.52 times faster than a representative state-of-the-art tool. Source code in Java and Hadoop as well as a user's guide are freely available under the GNU GPLv3 license at http://mardre.des.udc.es . rreye@udc.es. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  1. Trust Model to Enhance Security and Interoperability of Cloud Environment

    NASA Astrophysics Data System (ADS)

    Li, Wenjuan; Ping, Lingdi

    Trust is one of the most important means to improve security and enable interoperability of current heterogeneous independent cloud platforms. This paper first analyzed several trust models used in large and distributed environment and then introduced a novel cloud trust model to solve security issues in cross-clouds environment in which cloud customer can choose different providers' services and resources in heterogeneous domains can cooperate. The model is domain-based. It divides one cloud provider's resource nodes into the same domain and sets trust agent. It distinguishes two different roles cloud customer and cloud server and designs different strategies for them. In our model, trust recommendation is treated as one type of cloud services just like computation or storage. The model achieves both identity authentication and behavior authentication. The results of emulation experiments show that the proposed model can efficiently and safely construct trust relationship in cross-clouds environment.

  2. Application research of Ganglia in Hadoop monitoring and management

    NASA Astrophysics Data System (ADS)

    Li, Gang; Ding, Jing; Zhou, Lixia; Yang, Yi; Liu, Lei; Wang, Xiaolei

    2017-03-01

    There are many applications of Hadoop System in the field of large data, cloud computing. The test bench of storage and application in seismic network at Earthquake Administration of Tianjin use with Hadoop system, which is used the open source software of Ganglia to operate and monitor. This paper reviews the function, installation and configuration process, application effect of operating and monitoring in Hadoop system of the Ganglia system. It briefly introduces the idea and effect of Nagios software monitoring Hadoop system. It is valuable for the industry in the monitoring system of cloud computing platform.

  3. AstroCloud, a Cyber-Infrastructure for Astronomy Research: Overview

    NASA Astrophysics Data System (ADS)

    Cui, C.; Yu, C.; Xiao, J.; He, B.; Li, C.; Fan, D.; Wang, C.; Hong, Z.; Li, S.; Mi, L.; Wan, W.; Cao, Z.; Wang, J.; Yin, S.; Fan, Y.; Wang, J.

    2015-09-01

    AstroCloud is a cyber-Infrastructure for Astronomy Research initiated by Chinese Virtual Observatory (China-VO) under funding support from NDRC (National Development and Reform commission) and CAS (Chinese Academy of Sciences). Tasks such as proposal submission, proposal peer-review, data archiving, data quality control, data release and open access, Cloud based data processing and analyzing, will be all supported on the platform. It will act as a full lifecycle management system for astronomical data and telescopes. Achievements from international Virtual Observatories and Cloud Computing are adopted heavily. In this paper, backgrounds of the project, key features of the system, and latest progresses are introduced.

  4. Provenance based data integrity checking and verification in cloud environments

    PubMed Central

    Haq, Inam Ul; Jan, Bilal; Khan, Fakhri Alam; Ahmad, Awais

    2017-01-01

    Cloud computing is a recent tendency in IT that moves computing and data away from desktop and hand-held devices into large scale processing hubs and data centers respectively. It has been proposed as an effective solution for data outsourcing and on demand computing to control the rising cost of IT setups and management in enterprises. However, with Cloud platforms user’s data is moved into remotely located storages such that users lose control over their data. This unique feature of the Cloud is facing many security and privacy challenges which need to be clearly understood and resolved. One of the important concerns that needs to be addressed is to provide the proof of data integrity, i.e., correctness of the user’s data stored in the Cloud storage. The data in Clouds is physically not accessible to the users. Therefore, a mechanism is required where users can check if the integrity of their valuable data is maintained or compromised. For this purpose some methods are proposed like mirroring, checksumming and using third party auditors amongst others. However, these methods use extra storage space by maintaining multiple copies of data or the presence of a third party verifier is required. In this paper, we address the problem of proving data integrity in Cloud computing by proposing a scheme through which users are able to check the integrity of their data stored in Clouds. In addition, users can track the violation of data integrity if occurred. For this purpose, we utilize a relatively new concept in the Cloud computing called “Data Provenance”. Our scheme is capable to reduce the need of any third party services, additional hardware support and the replication of data items on client side for integrity checking. PMID:28545151

  5. Provenance based data integrity checking and verification in cloud environments.

    PubMed

    Imran, Muhammad; Hlavacs, Helmut; Haq, Inam Ul; Jan, Bilal; Khan, Fakhri Alam; Ahmad, Awais

    2017-01-01

    Cloud computing is a recent tendency in IT that moves computing and data away from desktop and hand-held devices into large scale processing hubs and data centers respectively. It has been proposed as an effective solution for data outsourcing and on demand computing to control the rising cost of IT setups and management in enterprises. However, with Cloud platforms user's data is moved into remotely located storages such that users lose control over their data. This unique feature of the Cloud is facing many security and privacy challenges which need to be clearly understood and resolved. One of the important concerns that needs to be addressed is to provide the proof of data integrity, i.e., correctness of the user's data stored in the Cloud storage. The data in Clouds is physically not accessible to the users. Therefore, a mechanism is required where users can check if the integrity of their valuable data is maintained or compromised. For this purpose some methods are proposed like mirroring, checksumming and using third party auditors amongst others. However, these methods use extra storage space by maintaining multiple copies of data or the presence of a third party verifier is required. In this paper, we address the problem of proving data integrity in Cloud computing by proposing a scheme through which users are able to check the integrity of their data stored in Clouds. In addition, users can track the violation of data integrity if occurred. For this purpose, we utilize a relatively new concept in the Cloud computing called "Data Provenance". Our scheme is capable to reduce the need of any third party services, additional hardware support and the replication of data items on client side for integrity checking.

  6. Space Situational Awareness Data Processing Scalability Utilizing Google Cloud Services

    NASA Astrophysics Data System (ADS)

    Greenly, D.; Duncan, M.; Wysack, J.; Flores, F.

    Space Situational Awareness (SSA) is a fundamental and critical component of current space operations. The term SSA encompasses the awareness, understanding and predictability of all objects in space. As the population of orbital space objects and debris increases, the number of collision avoidance maneuvers grows and prompts the need for accurate and timely process measures. The SSA mission continually evolves to near real-time assessment and analysis demanding the need for higher processing capabilities. By conventional methods, meeting these demands requires the integration of new hardware to keep pace with the growing complexity of maneuver planning algorithms. SpaceNav has implemented a highly scalable architecture that will track satellites and debris by utilizing powerful virtual machines on the Google Cloud Platform. SpaceNav algorithms for processing CDMs outpace conventional means. A robust processing environment for tracking data, collision avoidance maneuvers and various other aspects of SSA can be created and deleted on demand. Migrating SpaceNav tools and algorithms into the Google Cloud Platform will be discussed and the trials and tribulations involved. Information will be shared on how and why certain cloud products were used as well as integration techniques that were implemented. Key items to be presented are: 1.Scientific algorithms and SpaceNav tools integrated into a scalable architecture a) Maneuver Planning b) Parallel Processing c) Monte Carlo Simulations d) Optimization Algorithms e) SW Application Development/Integration into the Google Cloud Platform 2. Compute Engine Processing a) Application Engine Automated Processing b) Performance testing and Performance Scalability c) Cloud MySQL databases and Database Scalability d) Cloud Data Storage e) Redundancy and Availability

  7. Seqcrawler: biological data indexing and browsing platform.

    PubMed

    Sallou, Olivier; Bretaudeau, Anthony; Roult, Aurelien

    2012-07-24

    Seqcrawler takes its roots in software like SRS or Lucegene. It provides an indexing platform to ease the search of data and meta-data in biological banks and it can scale to face the current flow of data. While many biological bank search tools are available on the Internet, mainly provided by large organizations to search their data, there is a lack of free and open source solutions to browse one's own set of data with a flexible query system and able to scale from a single computer to a cloud system. A personal index platform will help labs and bioinformaticians to search their meta-data but also to build a larger information system with custom subsets of data. The software is scalable from a single computer to a cloud-based infrastructure. It has been successfully tested in a private cloud with 3 index shards (pieces of index) hosting ~400 millions of sequence information (whole GenBank, UniProt, PDB and others) for a total size of 600 GB in a fault tolerant architecture (high-availability). It has also been successfully integrated with software to add extra meta-data from blast results to enhance users' result analysis. Seqcrawler provides a complete open source search and store solution for labs or platforms needing to manage large amount of data/meta-data with a flexible and customizable web interface. All components (search engine, visualization and data storage), though independent, share a common and coherent data system that can be queried with a simple HTTP interface. The solution scales easily and can also provide a high availability infrastructure.

  8. A Lightweight Remote Parallel Visualization Platform for Interactive Massive Time-varying Climate Data Analysis

    NASA Astrophysics Data System (ADS)

    Li, J.; Zhang, T.; Huang, Q.; Liu, Q.

    2014-12-01

    Today's climate datasets are featured with large volume, high degree of spatiotemporal complexity and evolving fast overtime. As visualizing large volume distributed climate datasets is computationally intensive, traditional desktop based visualization applications fail to handle the computational intensity. Recently, scientists have developed remote visualization techniques to address the computational issue. Remote visualization techniques usually leverage server-side parallel computing capabilities to perform visualization tasks and deliver visualization results to clients through network. In this research, we aim to build a remote parallel visualization platform for visualizing and analyzing massive climate data. Our visualization platform was built based on Paraview, which is one of the most popular open source remote visualization and analysis applications. To further enhance the scalability and stability of the platform, we have employed cloud computing techniques to support the deployment of the platform. In this platform, all climate datasets are regular grid data which are stored in NetCDF format. Three types of data access methods are supported in the platform: accessing remote datasets provided by OpenDAP servers, accessing datasets hosted on the web visualization server and accessing local datasets. Despite different data access methods, all visualization tasks are completed at the server side to reduce the workload of clients. As a proof of concept, we have implemented a set of scientific visualization methods to show the feasibility of the platform. Preliminary results indicate that the framework can address the computation limitation of desktop based visualization applications.

  9. Flexible Description and Adaptive Processing of Earth Observation Data through the BigEarth Platform

    NASA Astrophysics Data System (ADS)

    Gorgan, Dorian; Bacu, Victor; Stefanut, Teodor; Nandra, Cosmin; Mihon, Danut

    2016-04-01

    The Earth Observation data repositories extending periodically by several terabytes become a critical issue for organizations. The management of the storage capacity of such big datasets, accessing policy, data protection, searching, and complex processing require high costs that impose efficient solutions to balance the cost and value of data. Data can create value only when it is used, and the data protection has to be oriented toward allowing innovation that sometimes depends on creative people, which achieve unexpected valuable results through a flexible and adaptive manner. The users need to describe and experiment themselves different complex algorithms through analytics in order to valorize data. The analytics uses descriptive and predictive models to gain valuable knowledge and information from data analysis. Possible solutions for advanced processing of big Earth Observation data are given by the HPC platforms such as cloud. With platforms becoming more complex and heterogeneous, the developing of applications is even harder and the efficient mapping of these applications to a suitable and optimum platform, working on huge distributed data repositories, is challenging and complex as well, even by using specialized software services. From the user point of view, an optimum environment gives acceptable execution times, offers a high level of usability by hiding the complexity of computing infrastructure, and supports an open accessibility and control to application entities and functionality. The BigEarth platform [1] supports the entire flow of flexible description of processing by basic operators and adaptive execution over cloud infrastructure [2]. The basic modules of the pipeline such as the KEOPS [3] set of basic operators, the WorDeL language [4], the Planner for sequential and parallel processing, and the Executor through virtual machines, are detailed as the main components of the BigEarth platform [5]. The presentation exemplifies the development of some Earth Observation oriented applications based on flexible description of processing, and adaptive and portable execution over Cloud infrastructure. Main references for further information: [1] BigEarth project, http://cgis.utcluj.ro/projects/bigearth [2] Gorgan, D., "Flexible and Adaptive Processing of Earth Observation Data over High Performance Computation Architectures", International Conference and Exhibition Satellite 2015, August 17-19, Houston, Texas, USA. [3] Mihon, D., Bacu, V., Colceriu, V., Gorgan, D., "Modeling of Earth Observation Use Cases through the KEOPS System", Proceedings of the Intelligent Computer Communication and Processing (ICCP), IEEE-Press, pp. 455-460, (2015). [4] Nandra, C., Gorgan, D., "Workflow Description Language for Defining Big Earth Data Processing Tasks", Proceedings of the Intelligent Computer Communication and Processing (ICCP), IEEE-Press, pp. 461-468, (2015). [5] Bacu, V., Stefan, T., Gorgan, D., "Adaptive Processing of Earth Observation Data on Cloud Infrastructures Based on Workflow Description", Proceedings of the Intelligent Computer Communication and Processing (ICCP), IEEE-Press, pp.444-454, (2015).

  10. Toward real-time Monte Carlo simulation using a commercial cloud computing infrastructure

    NASA Astrophysics Data System (ADS)

    Wang, Henry; Ma, Yunzhi; Pratx, Guillem; Xing, Lei

    2011-09-01

    Monte Carlo (MC) methods are the gold standard for modeling photon and electron transport in a heterogeneous medium; however, their computational cost prohibits their routine use in the clinic. Cloud computing, wherein computing resources are allocated on-demand from a third party, is a new approach for high performance computing and is implemented to perform ultra-fast MC calculation in radiation therapy. We deployed the EGS5 MC package in a commercial cloud environment. Launched from a single local computer with Internet access, a Python script allocates a remote virtual cluster. A handshaking protocol designates master and worker nodes. The EGS5 binaries and the simulation data are initially loaded onto the master node. The simulation is then distributed among independent worker nodes via the message passing interface, and the results aggregated on the local computer for display and data analysis. The described approach is evaluated for pencil beams and broad beams of high-energy electrons and photons. The output of cloud-based MC simulation is identical to that produced by single-threaded implementation. For 1 million electrons, a simulation that takes 2.58 h on a local computer can be executed in 3.3 min on the cloud with 100 nodes, a 47× speed-up. Simulation time scales inversely with the number of parallel nodes. The parallelization overhead is also negligible for large simulations. Cloud computing represents one of the most important recent advances in supercomputing technology and provides a promising platform for substantially improved MC simulation. In addition to the significant speed up, cloud computing builds a layer of abstraction for high performance parallel computing, which may change the way dose calculations are performed and radiation treatment plans are completed. This work was presented in part at the 2010 Annual Meeting of the American Association of Physicists in Medicine (AAPM), Philadelphia, PA.

  11. The StratusLab cloud distribution: Use-cases and support for scientific applications

    NASA Astrophysics Data System (ADS)

    Floros, E.

    2012-04-01

    The StratusLab project is integrating an open cloud software distribution that enables organizations to setup and provide their own private or public IaaS (Infrastructure as a Service) computing clouds. StratusLab distribution capitalizes on popular infrastructure virtualization solutions like KVM, the OpenNebula virtual machine manager, Claudia service manager and SlipStream deployment platform, which are further enhanced and expanded with additional components developed within the project. The StratusLab distribution covers the core aspects of a cloud IaaS architecture, namely Computing (life-cycle management of virtual machines), Storage, Appliance management and Networking. The resulting software stack provides a packaged turn-key solution for deploying cloud computing services. The cloud computing infrastructures deployed using StratusLab can support a wide range of scientific and business use cases. Grid computing has been the primary use case pursued by the project and for this reason the initial priority has been the support for the deployment and operation of fully virtualized production-level grid sites; a goal that has already been achieved by operating such a site as part of EGI's (European Grid Initiative) pan-european grid infrastructure. In this area the project is currently working to provide non-trivial capabilities like elastic and autonomic management of grid site resources. Although grid computing has been the motivating paradigm, StratusLab's cloud distribution can support a wider range of use cases. Towards this direction, we have developed and currently provide support for setting up general purpose computing solutions like Hadoop, MPI and Torque clusters. For what concerns scientific applications the project is collaborating closely with the Bioinformatics community in order to prepare VM appliances and deploy optimized services for bioinformatics applications. In a similar manner additional scientific disciplines like Earth Science can take advantage of StratusLab cloud solutions. Interested users are welcomed to join StratusLab's user community by getting access to the reference cloud services deployed by the project and offered to the public.

  12. GIFT-Cloud: A data sharing and collaboration platform for medical imaging research.

    PubMed

    Doel, Tom; Shakir, Dzhoshkun I; Pratt, Rosalind; Aertsen, Michael; Moggridge, James; Bellon, Erwin; David, Anna L; Deprest, Jan; Vercauteren, Tom; Ourselin, Sébastien

    2017-02-01

    Clinical imaging data are essential for developing research software for computer-aided diagnosis, treatment planning and image-guided surgery, yet existing systems are poorly suited for data sharing between healthcare and academia: research systems rarely provide an integrated approach for data exchange with clinicians; hospital systems are focused towards clinical patient care with limited access for external researchers; and safe haven environments are not well suited to algorithm development. We have established GIFT-Cloud, a data and medical image sharing platform, to meet the needs of GIFT-Surg, an international research collaboration that is developing novel imaging methods for fetal surgery. GIFT-Cloud also has general applicability to other areas of imaging research. GIFT-Cloud builds upon well-established cross-platform technologies. The Server provides secure anonymised data storage, direct web-based data access and a REST API for integrating external software. The Uploader provides automated on-site anonymisation, encryption and data upload. Gateways provide a seamless process for uploading medical data from clinical systems to the research server. GIFT-Cloud has been implemented in a multi-centre study for fetal medicine research. We present a case study of placental segmentation for pre-operative surgical planning, showing how GIFT-Cloud underpins the research and integrates with the clinical workflow. GIFT-Cloud simplifies the transfer of imaging data from clinical to research institutions, facilitating the development and validation of medical research software and the sharing of results back to the clinical partners. GIFT-Cloud supports collaboration between multiple healthcare and research institutions while satisfying the demands of patient confidentiality, data security and data ownership. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.

  13. Adventures in Private Cloud: Balancing Cost and Capability at the CloudSat Data Processing Center

    NASA Astrophysics Data System (ADS)

    Partain, P.; Finley, S.; Fluke, J.; Haynes, J. M.; Cronk, H. Q.; Miller, S. D.

    2016-12-01

    Since the beginning of the CloudSat Mission in 2006, The CloudSat Data Processing Center (DPC) at the Cooperative Institute for Research in the Atmosphere (CIRA) has been ingesting data from the satellite and other A-Train sensors, producing data products, and distributing them to researchers around the world. The computing infrastructure was specifically designed to fulfill the requirements as specified at the beginning of what nominally was a two-year mission. The environment consisted of servers dedicated to specific processing tasks in a rigid workflow to generate the required products. To the benefit of science and with credit to the mission engineers, CloudSat has lasted well beyond its planned lifetime and is still collecting data ten years later. Over that period requirements of the data processing system have greatly expanded and opportunities for providing value-added services have presented themselves. But while demands on the system have increased, the initial design allowed for very little expansion in terms of scalability and flexibility. The design did change to include virtual machine processing nodes and distributed workflows but infrastructure management was still a time consuming task when system modification was required to run new tests or implement new processes. To address the scalability, flexibility, and manageability of the system Cloud computing methods and technologies are now being employed. The use of a public cloud like Amazon Elastic Compute Cloud or Google Compute Engine was considered but, among other issues, data transfer and storage cost becomes a problem especially when demand fluctuates as a result of reprocessing and the introduction of new products and services. Instead, the existing system was converted to an on premises private Cloud using the OpenStack computing platform and Ceph software defined storage to reap the benefits of the Cloud computing paradigm. This work details the decisions that were made, the benefits that have been realized, the difficulties that were encountered and issues that still exist.

  14. Mapping land cover change over continental Africa using Landsat and Google Earth Engine cloud computing.

    PubMed

    Midekisa, Alemayehu; Holl, Felix; Savory, David J; Andrade-Pacheco, Ricardo; Gething, Peter W; Bennett, Adam; Sturrock, Hugh J W

    2017-01-01

    Quantifying and monitoring the spatial and temporal dynamics of the global land cover is critical for better understanding many of the Earth's land surface processes. However, the lack of regularly updated, continental-scale, and high spatial resolution (30 m) land cover data limit our ability to better understand the spatial extent and the temporal dynamics of land surface changes. Despite the free availability of high spatial resolution Landsat satellite data, continental-scale land cover mapping using high resolution Landsat satellite data was not feasible until now due to the need for high-performance computing to store, process, and analyze this large volume of high resolution satellite data. In this study, we present an approach to quantify continental land cover and impervious surface changes over a long period of time (15 years) using high resolution Landsat satellite observations and Google Earth Engine cloud computing platform. The approach applied here to overcome the computational challenges of handling big earth observation data by using cloud computing can help scientists and practitioners who lack high-performance computational resources.

  15. Mapping land cover change over continental Africa using Landsat and Google Earth Engine cloud computing

    PubMed Central

    Holl, Felix; Savory, David J.; Andrade-Pacheco, Ricardo; Gething, Peter W.; Bennett, Adam; Sturrock, Hugh J. W.

    2017-01-01

    Quantifying and monitoring the spatial and temporal dynamics of the global land cover is critical for better understanding many of the Earth’s land surface processes. However, the lack of regularly updated, continental-scale, and high spatial resolution (30 m) land cover data limit our ability to better understand the spatial extent and the temporal dynamics of land surface changes. Despite the free availability of high spatial resolution Landsat satellite data, continental-scale land cover mapping using high resolution Landsat satellite data was not feasible until now due to the need for high-performance computing to store, process, and analyze this large volume of high resolution satellite data. In this study, we present an approach to quantify continental land cover and impervious surface changes over a long period of time (15 years) using high resolution Landsat satellite observations and Google Earth Engine cloud computing platform. The approach applied here to overcome the computational challenges of handling big earth observation data by using cloud computing can help scientists and practitioners who lack high-performance computational resources. PMID:28953943

  16. A Comprehensive Review on Adaptability of Network Forensics Frameworks for Mobile Cloud Computing

    PubMed Central

    Abdul Wahab, Ainuddin Wahid; Han, Qi; Bin Abdul Rahman, Zulkanain

    2014-01-01

    Network forensics enables investigation and identification of network attacks through the retrieved digital content. The proliferation of smartphones and the cost-effective universal data access through cloud has made Mobile Cloud Computing (MCC) a congenital target for network attacks. However, confines in carrying out forensics in MCC is interrelated with the autonomous cloud hosting companies and their policies for restricted access to the digital content in the back-end cloud platforms. It implies that existing Network Forensic Frameworks (NFFs) have limited impact in the MCC paradigm. To this end, we qualitatively analyze the adaptability of existing NFFs when applied to the MCC. Explicitly, the fundamental mechanisms of NFFs are highlighted and then analyzed using the most relevant parameters. A classification is proposed to help understand the anatomy of existing NFFs. Subsequently, a comparison is given that explores the functional similarities and deviations among NFFs. The paper concludes by discussing research challenges for progressive network forensics in MCC. PMID:25097880

  17. A comprehensive review on adaptability of network forensics frameworks for mobile cloud computing.

    PubMed

    Khan, Suleman; Shiraz, Muhammad; Wahab, Ainuddin Wahid Abdul; Gani, Abdullah; Han, Qi; Rahman, Zulkanain Bin Abdul

    2014-01-01

    Network forensics enables investigation and identification of network attacks through the retrieved digital content. The proliferation of smartphones and the cost-effective universal data access through cloud has made Mobile Cloud Computing (MCC) a congenital target for network attacks. However, confines in carrying out forensics in MCC is interrelated with the autonomous cloud hosting companies and their policies for restricted access to the digital content in the back-end cloud platforms. It implies that existing Network Forensic Frameworks (NFFs) have limited impact in the MCC paradigm. To this end, we qualitatively analyze the adaptability of existing NFFs when applied to the MCC. Explicitly, the fundamental mechanisms of NFFs are highlighted and then analyzed using the most relevant parameters. A classification is proposed to help understand the anatomy of existing NFFs. Subsequently, a comparison is given that explores the functional similarities and deviations among NFFs. The paper concludes by discussing research challenges for progressive network forensics in MCC.

  18. Development of a cloud-based system for remote monitoring of a PVT panel

    NASA Astrophysics Data System (ADS)

    Saraiva, Luis; Alcaso, Adérito; Vieira, Paulo; Ramos, Carlos Figueiredo; Cardoso, Antonio Marques

    2016-10-01

    The paper presents a monitoring system developed for an energy conversion system based on the sun and known as thermophotovoltaic panel (PVT). The project was implemented using two embedded microcontrollers platforms (arduino Leonardo and arduino yún), wireless transmission systems (WI-FI and XBEE) and net computing ,commonly known as cloud (Google cloud). The main objective of the project is to provide remote access and real-time data monitoring (like: electrical current, electrical voltage, input fluid temperature, output fluid temperature, backward fluid temperature, up PV glass temperature, down PV glass temperature, ambient temperature, solar radiation, wind speed, wind direction and fluid mass flow). This project demonstrates the feasibility of using inexpensive microcontroller's platforms and free internet service in theWeb, to support the remote study of renewable energy systems, eliminating the acquisition of dedicated systems typically more expensive and limited in the kind of processing proposed.

  19. Cloud-Based Smart Health Monitoring System for Automatic Cardiovascular and Fall Risk Assessment in Hypertensive Patients.

    PubMed

    Melillo, P; Orrico, A; Scala, P; Crispino, F; Pecchia, L

    2015-10-01

    The aim of this paper is to describe the design and the preliminary validation of a platform developed to collect and automatically analyze biomedical signals for risk assessment of vascular events and falls in hypertensive patients. This m-health platform, based on cloud computing, was designed to be flexible, extensible, and transparent, and to provide proactive remote monitoring via data-mining functionalities. A retrospective study was conducted to train and test the platform. The developed system was able to predict a future vascular event within the next 12 months with an accuracy rate of 84 % and to identify fallers with an accuracy rate of 72 %. In an ongoing prospective trial, almost all the recruited patients accepted favorably the system with a limited rate of inadherences causing data losses (<20 %). The developed platform supported clinical decision by processing tele-monitored data and providing quick and accurate risk assessment of vascular events and falls.

  20. An adaptive process-based cloud infrastructure for space situational awareness applications

    NASA Astrophysics Data System (ADS)

    Liu, Bingwei; Chen, Yu; Shen, Dan; Chen, Genshe; Pham, Khanh; Blasch, Erik; Rubin, Bruce

    2014-06-01

    Space situational awareness (SSA) and defense space control capabilities are top priorities for groups that own or operate man-made spacecraft. Also, with the growing amount of space debris, there is an increase in demand for contextual understanding that necessitates the capability of collecting and processing a vast amount sensor data. Cloud computing, which features scalable and flexible storage and computing services, has been recognized as an ideal candidate that can meet the large data contextual challenges as needed by SSA. Cloud computing consists of physical service providers and middleware virtual machines together with infrastructure, platform, and software as service (IaaS, PaaS, SaaS) models. However, the typical Virtual Machine (VM) abstraction is on a per operating systems basis, which is at too low-level and limits the flexibility of a mission application architecture. In responding to this technical challenge, a novel adaptive process based cloud infrastructure for SSA applications is proposed in this paper. In addition, the details for the design rationale and a prototype is further examined. The SSA Cloud (SSAC) conceptual capability will potentially support space situation monitoring and tracking, object identification, and threat assessment. Lastly, the benefits of a more granular and flexible cloud computing resources allocation are illustrated for data processing and implementation considerations within a representative SSA system environment. We show that the container-based virtualization performs better than hypervisor-based virtualization technology in an SSA scenario.

  1. CloudDOE: a user-friendly tool for deploying Hadoop clouds and analyzing high-throughput sequencing data with MapReduce.

    PubMed

    Chung, Wei-Chun; Chen, Chien-Chih; Ho, Jan-Ming; Lin, Chung-Yen; Hsu, Wen-Lian; Wang, Yu-Chun; Lee, D T; Lai, Feipei; Huang, Chih-Wei; Chang, Yu-Jung

    2014-01-01

    Explosive growth of next-generation sequencing data has resulted in ultra-large-scale data sets and ensuing computational problems. Cloud computing provides an on-demand and scalable environment for large-scale data analysis. Using a MapReduce framework, data and workload can be distributed via a network to computers in the cloud to substantially reduce computational latency. Hadoop/MapReduce has been successfully adopted in bioinformatics for genome assembly, mapping reads to genomes, and finding single nucleotide polymorphisms. Major cloud providers offer Hadoop cloud services to their users. However, it remains technically challenging to deploy a Hadoop cloud for those who prefer to run MapReduce programs in a cluster without built-in Hadoop/MapReduce. We present CloudDOE, a platform-independent software package implemented in Java. CloudDOE encapsulates technical details behind a user-friendly graphical interface, thus liberating scientists from having to perform complicated operational procedures. Users are guided through the user interface to deploy a Hadoop cloud within in-house computing environments and to run applications specifically targeted for bioinformatics, including CloudBurst, CloudBrush, and CloudRS. One may also use CloudDOE on top of a public cloud. CloudDOE consists of three wizards, i.e., Deploy, Operate, and Extend wizards. Deploy wizard is designed to aid the system administrator to deploy a Hadoop cloud. It installs Java runtime environment version 1.6 and Hadoop version 0.20.203, and initiates the service automatically. Operate wizard allows the user to run a MapReduce application on the dashboard list. To extend the dashboard list, the administrator may install a new MapReduce application using Extend wizard. CloudDOE is a user-friendly tool for deploying a Hadoop cloud. Its smart wizards substantially reduce the complexity and costs of deployment, execution, enhancement, and management. Interested users may collaborate to improve the source code of CloudDOE to further incorporate more MapReduce bioinformatics tools into CloudDOE and support next-generation big data open source tools, e.g., Hadoop BigTop and Spark. CloudDOE is distributed under Apache License 2.0 and is freely available at http://clouddoe.iis.sinica.edu.tw/.

  2. CloudDOE: A User-Friendly Tool for Deploying Hadoop Clouds and Analyzing High-Throughput Sequencing Data with MapReduce

    PubMed Central

    Chung, Wei-Chun; Chen, Chien-Chih; Ho, Jan-Ming; Lin, Chung-Yen; Hsu, Wen-Lian; Wang, Yu-Chun; Lee, D. T.; Lai, Feipei; Huang, Chih-Wei; Chang, Yu-Jung

    2014-01-01

    Background Explosive growth of next-generation sequencing data has resulted in ultra-large-scale data sets and ensuing computational problems. Cloud computing provides an on-demand and scalable environment for large-scale data analysis. Using a MapReduce framework, data and workload can be distributed via a network to computers in the cloud to substantially reduce computational latency. Hadoop/MapReduce has been successfully adopted in bioinformatics for genome assembly, mapping reads to genomes, and finding single nucleotide polymorphisms. Major cloud providers offer Hadoop cloud services to their users. However, it remains technically challenging to deploy a Hadoop cloud for those who prefer to run MapReduce programs in a cluster without built-in Hadoop/MapReduce. Results We present CloudDOE, a platform-independent software package implemented in Java. CloudDOE encapsulates technical details behind a user-friendly graphical interface, thus liberating scientists from having to perform complicated operational procedures. Users are guided through the user interface to deploy a Hadoop cloud within in-house computing environments and to run applications specifically targeted for bioinformatics, including CloudBurst, CloudBrush, and CloudRS. One may also use CloudDOE on top of a public cloud. CloudDOE consists of three wizards, i.e., Deploy, Operate, and Extend wizards. Deploy wizard is designed to aid the system administrator to deploy a Hadoop cloud. It installs Java runtime environment version 1.6 and Hadoop version 0.20.203, and initiates the service automatically. Operate wizard allows the user to run a MapReduce application on the dashboard list. To extend the dashboard list, the administrator may install a new MapReduce application using Extend wizard. Conclusions CloudDOE is a user-friendly tool for deploying a Hadoop cloud. Its smart wizards substantially reduce the complexity and costs of deployment, execution, enhancement, and management. Interested users may collaborate to improve the source code of CloudDOE to further incorporate more MapReduce bioinformatics tools into CloudDOE and support next-generation big data open source tools, e.g., Hadoop BigTop and Spark. Availability: CloudDOE is distributed under Apache License 2.0 and is freely available at http://clouddoe.iis.sinica.edu.tw/. PMID:24897343

  3. Open Reading Frame Phylogenetic Analysis on the Cloud

    PubMed Central

    2013-01-01

    Phylogenetic analysis has become essential in researching the evolutionary relationships between viruses. These relationships are depicted on phylogenetic trees, in which viruses are grouped based on sequence similarity. Viral evolutionary relationships are identified from open reading frames rather than from complete sequences. Recently, cloud computing has become popular for developing internet-based bioinformatics tools. Biocloud is an efficient, scalable, and robust bioinformatics computing service. In this paper, we propose a cloud-based open reading frame phylogenetic analysis service. The proposed service integrates the Hadoop framework, virtualization technology, and phylogenetic analysis methods to provide a high-availability, large-scale bioservice. In a case study, we analyze the phylogenetic relationships among Norovirus. Evolutionary relationships are elucidated by aligning different open reading frame sequences. The proposed platform correctly identifies the evolutionary relationships between members of Norovirus. PMID:23671843

  4. phpMs: A PHP-Based Mass Spectrometry Utilities Library.

    PubMed

    Collins, Andrew; Jones, Andrew R

    2018-03-02

    The recent establishment of cloud computing, high-throughput networking, and more versatile web standards and browsers has led to a renewed interest in web-based applications. While traditionally big data has been the domain of optimized desktop and server applications, it is now possible to store vast amounts of data and perform the necessary calculations offsite in cloud storage and computing providers, with the results visualized in a high-quality cross-platform interface via a web browser. There are number of emerging platforms for cloud-based mass spectrometry data analysis; however, there is limited pre-existing code accessible to web developers, especially for those that are constrained to a shared hosting environment where Java and C applications are often forbidden from use by the hosting provider. To remedy this, we provide an open-source mass spectrometry library for one of the most commonly used web development languages, PHP. Our new library, phpMs, provides objects for storing and manipulating spectra and identification data as well as utilities for file reading, file writing, calculations, peptide fragmentation, and protein digestion as well as a software interface for controlling search engines. We provide a working demonstration of some of the capabilities at http://pgb.liv.ac.uk/phpMs .

  5. Change Detection of Mobile LIDAR Data Using Cloud Computing

    NASA Astrophysics Data System (ADS)

    Liu, Kun; Boehm, Jan; Alis, Christian

    2016-06-01

    Change detection has long been a challenging problem although a lot of research has been conducted in different fields such as remote sensing and photogrammetry, computer vision, and robotics. In this paper, we blend voxel grid and Apache Spark together to propose an efficient method to address the problem in the context of big data. Voxel grid is a regular geometry representation consisting of the voxels with the same size, which fairly suites parallel computation. Apache Spark is a popular distributed parallel computing platform which allows fault tolerance and memory cache. These features can significantly enhance the performance of Apache Spark and results in an efficient and robust implementation. In our experiments, both synthetic and real point cloud data are employed to demonstrate the quality of our method.

  6. The big data processing platform for intelligent agriculture

    NASA Astrophysics Data System (ADS)

    Huang, Jintao; Zhang, Lichen

    2017-08-01

    Big data technology is another popular technology after the Internet of Things and cloud computing. Big data is widely used in many fields such as social platform, e-commerce, and financial analysis and so on. Intelligent agriculture in the course of the operation will produce large amounts of data of complex structure, fully mining the value of these data for the development of agriculture will be very meaningful. This paper proposes an intelligent data processing platform based on Storm and Cassandra to realize the storage and management of big data of intelligent agriculture.

  7. TRIDEC Cloud - a Web-based Platform for Tsunami Early Warning tested with NEAMWave14 Scenarios

    NASA Astrophysics Data System (ADS)

    Hammitzsch, Martin; Spazier, Johannes; Reißland, Sven; Necmioglu, Ocal; Comoglu, Mustafa; Ozer Sozdinler, Ceren; Carrilho, Fernando; Wächter, Joachim

    2015-04-01

    In times of cloud computing and ubiquitous computing the use of concepts and paradigms introduced by information and communications technology (ICT) have to be considered even for early warning systems (EWS). Based on the experiences and the knowledge gained in research projects new technologies are exploited to implement a cloud-based and web-based platform - the TRIDEC Cloud - to open up new prospects for EWS. The platform in its current version addresses tsunami early warning and mitigation. It merges several complementary external and in-house cloud-based services for instant tsunami propagation calculations and automated background computation with graphics processing units (GPU), for web-mapping of hazard specific geospatial data, and for serving relevant functionality to handle, share, and communicate threat specific information in a collaborative and distributed environment. The TRIDEC Cloud can be accessed in two different modes, the monitoring mode and the exercise-and-training mode. The monitoring mode provides important functionality required to act in a real event. So far, the monitoring mode integrates historic and real-time sea level data and latest earthquake information. The integration of sources is supported by a simple and secure interface. The exercise and training mode enables training and exercises with virtual scenarios. This mode disconnects real world systems and connects with a virtual environment that receives virtual earthquake information and virtual sea level data re-played by a scenario player. Thus operators and other stakeholders are able to train skills and prepare for real events and large exercises. The GFZ German Research Centre for Geosciences (GFZ), the Kandilli Observatory and Earthquake Research Institute (KOERI), and the Portuguese Institute for the Sea and Atmosphere (IPMA) have used the opportunity provided by NEAMWave14 to test the TRIDEC Cloud as a collaborative activity based on previous partnership and commitments at the European scale. The TRIDEC Cloud has not been involved officially in Part B of the NEAMWave14 scenarios. However, the scenarios have been used by GFZ, KOERI, and IPMA for testing in exercise runs on October 27-28, 2014. Additionally, the Greek NEAMWave14 scenario has been tested in an exercise run by GFZ only on October 29, 2014 (see ICG/NEAMTWS-XI/13). The exercise runs demonstrated that operators in warning centres and stakeholders of other involved parties just need a standard web browser to access a full-fledged TEWS. The integration of GPU accelerated tsunami simulation computations have been an integral part to foster early warning with on-demand tsunami predictions based on actual source parameters. Thus tsunami travel times, estimated times of arrival and estimated wave heights are available immediately for visualization and for further analysis and processing. The generation of warning messages is based on internationally agreed message structures and includes static and dynamic information based on earthquake information, instant computations of tsunami simulations, and actual measurements. Generated messages are served for review, modification, and addressing in one simple form for dissemination via Cloud Messages, Shared Maps, e-mail, FTP/GTS, SMS, and FAX. Cloud Messages and Shared Maps are complementary channels and integrate interactive event and simulation data. Thus recipients are enabled to interact dynamically with a map and diagrams beyond traditional text information.

  8. An Interactive Web-Based Analysis Framework for Remote Sensing Cloud Computing

    NASA Astrophysics Data System (ADS)

    Wang, X. Z.; Zhang, H. M.; Zhao, J. H.; Lin, Q. H.; Zhou, Y. C.; Li, J. H.

    2015-07-01

    Spatiotemporal data, especially remote sensing data, are widely used in ecological, geographical, agriculture, and military research and applications. With the development of remote sensing technology, more and more remote sensing data are accumulated and stored in the cloud. An effective way for cloud users to access and analyse these massive spatiotemporal data in the web clients becomes an urgent issue. In this paper, we proposed a new scalable, interactive and web-based cloud computing solution for massive remote sensing data analysis. We build a spatiotemporal analysis platform to provide the end-user with a safe and convenient way to access massive remote sensing data stored in the cloud. The lightweight cloud storage system used to store public data and users' private data is constructed based on open source distributed file system. In it, massive remote sensing data are stored as public data, while the intermediate and input data are stored as private data. The elastic, scalable, and flexible cloud computing environment is built using Docker, which is a technology of open-source lightweight cloud computing container in the Linux operating system. In the Docker container, open-source software such as IPython, NumPy, GDAL, and Grass GIS etc., are deployed. Users can write scripts in the IPython Notebook web page through the web browser to process data, and the scripts will be submitted to IPython kernel to be executed. By comparing the performance of remote sensing data analysis tasks executed in Docker container, KVM virtual machines and physical machines respectively, we can conclude that the cloud computing environment built by Docker makes the greatest use of the host system resources, and can handle more concurrent spatial-temporal computing tasks. Docker technology provides resource isolation mechanism in aspects of IO, CPU, and memory etc., which offers security guarantee when processing remote sensing data in the IPython Notebook. Users can write complex data processing code on the web directly, so they can design their own data processing algorithm.

  9. Simulation Platform: a cloud-based online simulation environment.

    PubMed

    Yamazaki, Tadashi; Ikeno, Hidetoshi; Okumura, Yoshihiro; Satoh, Shunji; Kamiyama, Yoshimi; Hirata, Yutaka; Inagaki, Keiichiro; Ishihara, Akito; Kannon, Takayuki; Usui, Shiro

    2011-09-01

    For multi-scale and multi-modal neural modeling, it is needed to handle multiple neural models described at different levels seamlessly. Database technology will become more important for these studies, specifically for downloading and handling the neural models seamlessly and effortlessly. To date, conventional neuroinformatics databases have solely been designed to archive model files, but the databases should provide a chance for users to validate the models before downloading them. In this paper, we report our on-going project to develop a cloud-based web service for online simulation called "Simulation Platform". Simulation Platform is a cloud of virtual machines running GNU/Linux. On a virtual machine, various software including developer tools such as compilers and libraries, popular neural simulators such as GENESIS, NEURON and NEST, and scientific software such as Gnuplot, R and Octave, are pre-installed. When a user posts a request, a virtual machine is assigned to the user, and the simulation starts on that machine. The user remotely accesses to the machine through a web browser and carries out the simulation, without the need to install any software but a web browser on the user's own computer. Therefore, Simulation Platform is expected to eliminate impediments to handle multiple neural models that require multiple software. Copyright © 2011 Elsevier Ltd. All rights reserved.

  10. Reprint of: Simulation Platform: a cloud-based online simulation environment.

    PubMed

    Yamazaki, Tadashi; Ikeno, Hidetoshi; Okumura, Yoshihiro; Satoh, Shunji; Kamiyama, Yoshimi; Hirata, Yutaka; Inagaki, Keiichiro; Ishihara, Akito; Kannon, Takayuki; Usui, Shiro

    2011-11-01

    For multi-scale and multi-modal neural modeling, it is needed to handle multiple neural models described at different levels seamlessly. Database technology will become more important for these studies, specifically for downloading and handling the neural models seamlessly and effortlessly. To date, conventional neuroinformatics databases have solely been designed to archive model files, but the databases should provide a chance for users to validate the models before downloading them. In this paper, we report our on-going project to develop a cloud-based web service for online simulation called "Simulation Platform". Simulation Platform is a cloud of virtual machines running GNU/Linux. On a virtual machine, various software including developer tools such as compilers and libraries, popular neural simulators such as GENESIS, NEURON and NEST, and scientific software such as Gnuplot, R and Octave, are pre-installed. When a user posts a request, a virtual machine is assigned to the user, and the simulation starts on that machine. The user remotely accesses to the machine through a web browser and carries out the simulation, without the need to install any software but a web browser on the user's own computer. Therefore, Simulation Platform is expected to eliminate impediments to handle multiple neural models that require multiple software. Copyright © 2011 Elsevier Ltd. All rights reserved.

  11. A case study of tuning MapReduce for efficient Bioinformatics in the cloud

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shi, Lizhen; Wang, Zhong; Yu, Weikuan

    The combination of the Hadoop MapReduce programming model and cloud computing allows biological scientists to analyze next-generation sequencing (NGS) data in a timely and cost-effective manner. Cloud computing platforms remove the burden of IT facility procurement and management from end users and provide ease of access to Hadoop clusters. However, biological scientists are still expected to choose appropriate Hadoop parameters for running their jobs. More importantly, the available Hadoop tuning guidelines are either obsolete or too general to capture the particular characteristics of bioinformatics applications. In this paper, we aim to minimize the cloud computing cost spent on bioinformatics datamore » analysis by optimizing the extracted significant Hadoop parameters. When using MapReduce-based bioinformatics tools in the cloud, the default settings often lead to resource underutilization and wasteful expenses. We choose k-mer counting, a representative application used in a large number of NGS data analysis tools, as our study case. Experimental results show that, with the fine-tuned parameters, we achieve a total of 4× speedup compared with the original performance (using the default settings). Finally, this paper presents an exemplary case for tuning MapReduce-based bioinformatics applications in the cloud, and documents the key parameters that could lead to significant performance benefits.« less

  12. Big data processing in the cloud - Challenges and platforms

    NASA Astrophysics Data System (ADS)

    Zhelev, Svetoslav; Rozeva, Anna

    2017-12-01

    Choosing the appropriate architecture and technologies for a big data project is a difficult task, which requires extensive knowledge in both the problem domain and in the big data landscape. The paper analyzes the main big data architectures and the most widely implemented technologies used for processing and persisting big data. Clouds provide for dynamic resource scaling, which makes them a natural fit for big data applications. Basic cloud computing service models are presented. Two architectures for processing big data are discussed, Lambda and Kappa architectures. Technologies for big data persistence are presented and analyzed. Stream processing as the most important and difficult to manage is outlined. The paper highlights main advantages of cloud and potential problems.

  13. Efficient operating system level virtualization techniques for cloud resources

    NASA Astrophysics Data System (ADS)

    Ansu, R.; Samiksha; Anju, S.; Singh, K. John

    2017-11-01

    Cloud computing is an advancing technology which provides the servcies of Infrastructure, Platform and Software. Virtualization and Computer utility are the keys of Cloud computing. The numbers of cloud users are increasing day by day. So it is the need of the hour to make resources available on demand to satisfy user requirements. The technique in which resources namely storage, processing power, memory and network or I/O are abstracted is known as Virtualization. For executing the operating systems various virtualization techniques are available. They are: Full System Virtualization and Para Virtualization. In Full Virtualization, the whole architecture of hardware is duplicated virtually. No modifications are required in Guest OS as the OS deals with the VM hypervisor directly. In Para Virtualization, modifications of OS is required to run in parallel with other OS. For the Guest OS to access the hardware, the host OS must provide a Virtual Machine Interface. OS virtualization has many advantages such as migrating applications transparently, consolidation of server, online maintenance of OS and providing security. This paper briefs both the virtualization techniques and discusses the issues in OS level virtualization.

  14. The cloud paradigm applied to e-Health.

    PubMed

    Vilaplana, Jordi; Solsona, Francesc; Abella; Filgueira, Rosa; Rius, Josep

    2013-03-14

    Cloud computing is a new paradigm that is changing how enterprises, institutions and people understand, perceive and use current software systems. With this paradigm, the organizations have no need to maintain their own servers, nor host their own software. Instead, everything is moved to the cloud and provided on demand, saving energy, physical space and technical staff. Cloud-based system architectures provide many advantages in terms of scalability, maintainability and massive data processing. We present the design of an e-health cloud system, modelled by an M/M/m queue with QoS capabilities, i.e. maximum waiting time of requests. Detailed results for the model formed by a Jackson network of two M/M/m queues from the queueing theory perspective are presented. These results show a significant performance improvement when the number of servers increases. Platform scalability becomes a critical issue since we aim to provide the system with high Quality of Service (QoS). In this paper we define an architecture capable of adapting itself to different diseases and growing numbers of patients. This platform could be applied to the medical field to greatly enhance the results of those therapies that have an important psychological component, such as addictions and chronic diseases.

  15. Benchmarking undedicated cloud computing providers for analysis of genomic datasets.

    PubMed

    Yazar, Seyhan; Gooden, George E C; Mackey, David A; Hewitt, Alex W

    2014-01-01

    A major bottleneck in biological discovery is now emerging at the computational level. Cloud computing offers a dynamic means whereby small and medium-sized laboratories can rapidly adjust their computational capacity. We benchmarked two established cloud computing services, Amazon Web Services Elastic MapReduce (EMR) on Amazon EC2 instances and Google Compute Engine (GCE), using publicly available genomic datasets (E.coli CC102 strain and a Han Chinese male genome) and a standard bioinformatic pipeline on a Hadoop-based platform. Wall-clock time for complete assembly differed by 52.9% (95% CI: 27.5-78.2) for E.coli and 53.5% (95% CI: 34.4-72.6) for human genome, with GCE being more efficient than EMR. The cost of running this experiment on EMR and GCE differed significantly, with the costs on EMR being 257.3% (95% CI: 211.5-303.1) and 173.9% (95% CI: 134.6-213.1) more expensive for E.coli and human assemblies respectively. Thus, GCE was found to outperform EMR both in terms of cost and wall-clock time. Our findings confirm that cloud computing is an efficient and potentially cost-effective alternative for analysis of large genomic datasets. In addition to releasing our cost-effectiveness comparison, we present available ready-to-use scripts for establishing Hadoop instances with Ganglia monitoring on EC2 or GCE.

  16. Benchmarking Undedicated Cloud Computing Providers for Analysis of Genomic Datasets

    PubMed Central

    Yazar, Seyhan; Gooden, George E. C.; Mackey, David A.; Hewitt, Alex W.

    2014-01-01

    A major bottleneck in biological discovery is now emerging at the computational level. Cloud computing offers a dynamic means whereby small and medium-sized laboratories can rapidly adjust their computational capacity. We benchmarked two established cloud computing services, Amazon Web Services Elastic MapReduce (EMR) on Amazon EC2 instances and Google Compute Engine (GCE), using publicly available genomic datasets (E.coli CC102 strain and a Han Chinese male genome) and a standard bioinformatic pipeline on a Hadoop-based platform. Wall-clock time for complete assembly differed by 52.9% (95% CI: 27.5–78.2) for E.coli and 53.5% (95% CI: 34.4–72.6) for human genome, with GCE being more efficient than EMR. The cost of running this experiment on EMR and GCE differed significantly, with the costs on EMR being 257.3% (95% CI: 211.5–303.1) and 173.9% (95% CI: 134.6–213.1) more expensive for E.coli and human assemblies respectively. Thus, GCE was found to outperform EMR both in terms of cost and wall-clock time. Our findings confirm that cloud computing is an efficient and potentially cost-effective alternative for analysis of large genomic datasets. In addition to releasing our cost-effectiveness comparison, we present available ready-to-use scripts for establishing Hadoop instances with Ganglia monitoring on EC2 or GCE. PMID:25247298

  17. CloudNeo: a cloud pipeline for identifying patient-specific tumor neoantigens.

    PubMed

    Bais, Preeti; Namburi, Sandeep; Gatti, Daniel M; Zhang, Xinyu; Chuang, Jeffrey H

    2017-10-01

    We present CloudNeo, a cloud-based computational workflow for identifying patient-specific tumor neoantigens from next generation sequencing data. Tumor-specific mutant peptides can be detected by the immune system through their interactions with the human leukocyte antigen complex, and neoantigen presence has recently been shown to correlate with anti T-cell immunity and efficacy of checkpoint inhibitor therapy. However computing capabilities to identify neoantigens from genomic sequencing data are a limiting factor for understanding their role. This challenge has grown as cancer datasets become increasingly abundant, making them cumbersome to store and analyze on local servers. Our cloud-based pipeline provides scalable computation capabilities for neoantigen identification while eliminating the need to invest in local infrastructure for data transfer, storage or compute. The pipeline is a Common Workflow Language (CWL) implementation of human leukocyte antigen (HLA) typing using Polysolver or HLAminer combined with custom scripts for mutant peptide identification and NetMHCpan for neoantigen prediction. We have demonstrated the efficacy of these pipelines on Amazon cloud instances through the Seven Bridges Genomics implementation of the NCI Cancer Genomics Cloud, which provides graphical interfaces for running and editing, infrastructure for workflow sharing and version tracking, and access to TCGA data. The CWL implementation is at: https://github.com/TheJacksonLaboratory/CloudNeo. For users who have obtained licenses for all internal software, integrated versions in CWL and on the Seven Bridges Cancer Genomics Cloud platform (https://cgc.sbgenomics.com/, recommended version) can be obtained by contacting the authors. jeff.chuang@jax.org. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.

  18. Cancer Diagnosis Epigenomics Scientific Workflow Scheduling in the Cloud Computing Environment Using an Improved PSO Algorithm

    PubMed

    N, Sadhasivam; R, Balamurugan; M, Pandi

    2018-01-27

    Objective: Epigenetic modifications involving DNA methylation and histone statud are responsible for the stable maintenance of cellular phenotypes. Abnormalities may be causally involved in cancer development and therefore could have diagnostic potential. The field of epigenomics refers to all epigenetic modifications implicated in control of gene expression, with a focus on better understanding of human biology in both normal and pathological states. Epigenomics scientific workflow is essentially a data processing pipeline to automate the execution of various genome sequencing operations or tasks. Cloud platform is a popular computing platform for deploying large scale epigenomics scientific workflow. Its dynamic environment provides various resources to scientific users on a pay-per-use billing model. Scheduling epigenomics scientific workflow tasks is a complicated problem in cloud platform. We here focused on application of an improved particle swam optimization (IPSO) algorithm for this purpose. Methods: The IPSO algorithm was applied to find suitable resources and allocate epigenomics tasks so that the total cost was minimized for detection of epigenetic abnormalities of potential application for cancer diagnosis. Result: The results showed that IPSO based task to resource mapping reduced total cost by 6.83 percent as compared to the traditional PSO algorithm. Conclusion: The results for various cancer diagnosis tasks showed that IPSO based task to resource mapping can achieve better costs when compared to PSO based mapping for epigenomics scientific application workflow. Creative Commons Attribution License

  19. Implementation of a Big Data Accessing and Processing Platform for Medical Records in Cloud.

    PubMed

    Yang, Chao-Tung; Liu, Jung-Chun; Chen, Shuo-Tsung; Lu, Hsin-Wen

    2017-08-18

    Big Data analysis has become a key factor of being innovative and competitive. Along with population growth worldwide and the trend aging of population in developed countries, the rate of the national medical care usage has been increasing. Due to the fact that individual medical data are usually scattered in different institutions and their data formats are varied, to integrate those data that continue increasing is challenging. In order to have scalable load capacity for these data platforms, we must build them in good platform architecture. Some issues must be considered in order to use the cloud computing to quickly integrate big medical data into database for easy analyzing, searching, and filtering big data to obtain valuable information.This work builds a cloud storage system with HBase of Hadoop for storing and analyzing big data of medical records and improves the performance of importing data into database. The data of medical records are stored in HBase database platform for big data analysis. This system performs distributed computing on medical records data processing through Hadoop MapReduce programming, and to provide functions, including keyword search, data filtering, and basic statistics for HBase database. This system uses the Put with the single-threaded method and the CompleteBulkload mechanism to import medical data. From the experimental results, we find that when the file size is less than 300MB, the Put with single-threaded method is used and when the file size is larger than 300MB, the CompleteBulkload mechanism is used to improve the performance of data import into database. This system provides a web interface that allows users to search data, filter out meaningful information through the web, and analyze and convert data in suitable forms that will be helpful for medical staff and institutions.

  20. Planetary-Scale Geospatial Data Analysis Techniques in Google's Earth Engine Platform (Invited)

    NASA Astrophysics Data System (ADS)

    Hancher, M.

    2013-12-01

    Geoscientists have more and more access to new tools for large-scale computing. With any tool, some tasks are easy and other tasks hard. It is natural to look to new computing platforms to increase the scale and efficiency of existing techniques, but there is a more exiting opportunity to discover and develop a new vocabulary of fundamental analysis idioms that are made easy and effective by these new tools. Google's Earth Engine platform is a cloud computing environment for earth data analysis that combines a public data catalog with a large-scale computational facility optimized for parallel processing of geospatial data. The data catalog includes a nearly complete archive of scenes from Landsat 4, 5, 7, and 8 that have been processed by the USGS, as well as a wide variety of other remotely-sensed and ancillary data products. Earth Engine supports a just-in-time computation model that enables real-time preview during algorithm development and debugging as well as during experimental data analysis and open-ended data exploration. Data processing operations are performed in parallel across many computers in Google's datacenters. The platform automatically handles many traditionally-onerous data management tasks, such as data format conversion, reprojection, resampling, and associating image metadata with pixel data. Early applications of Earth Engine have included the development of Google's global cloud-free fifteen-meter base map and global multi-decadal time-lapse animations, as well as numerous large and small experimental analyses by scientists from a range of academic, government, and non-governmental institutions, working in a wide variety of application areas including forestry, agriculture, urban mapping, and species habitat modeling. Patterns in the successes and failures of these early efforts have begun to emerge, sketching the outlines of a new set of simple and effective approaches to geospatial data analysis.

  1. TomoMiner and TomoMinerCloud: A software platform for large-scale subtomogram structural analysis

    PubMed Central

    Frazier, Zachary; Xu, Min; Alber, Frank

    2017-01-01

    SUMMARY Cryo-electron tomography (cryoET) captures the 3D electron density distribution of macromolecular complexes in close to native state. With the rapid advance of cryoET acquisition technologies, it is possible to generate large numbers (>100,000) of subtomograms, each containing a macromolecular complex. Often, these subtomograms represent a heterogeneous sample due to variations in structure and composition of a complex in situ form or because particles are a mixture of different complexes. In this case subtomograms must be classified. However, classification of large numbers of subtomograms is a time-intensive task and often a limiting bottleneck. This paper introduces an open source software platform, TomoMiner, for large-scale subtomogram classification, template matching, subtomogram averaging, and alignment. Its scalable and robust parallel processing allows efficient classification of tens to hundreds of thousands of subtomograms. Additionally, TomoMiner provides a pre-configured TomoMinerCloud computing service permitting users without sufficient computing resources instant access to TomoMiners high-performance features. PMID:28552576

  2. SU-D-BRD-01: Cloud-Based Radiation Treatment Planning: Performance Evaluation of Dose Calculation and Plan Optimization

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Na, Y; Kapp, D; Kim, Y

    2014-06-01

    Purpose: To report the first experience on the development of a cloud-based treatment planning system and investigate the performance improvement of dose calculation and treatment plan optimization of the cloud computing platform. Methods: A cloud computing-based radiation treatment planning system (cc-TPS) was developed for clinical treatment planning. Three de-identified clinical head and neck, lung, and prostate cases were used to evaluate the cloud computing platform. The de-identified clinical data were encrypted with 256-bit Advanced Encryption Standard (AES) algorithm. VMAT and IMRT plans were generated for the three de-identified clinical cases to determine the quality of the treatment plans and computationalmore » efficiency. All plans generated from the cc-TPS were compared to those obtained with the PC-based TPS (pc-TPS). The performance evaluation of the cc-TPS was quantified as the speedup factors for Monte Carlo (MC) dose calculations and large-scale plan optimizations, as well as the performance ratios (PRs) of the amount of performance improvement compared to the pc-TPS. Results: Speedup factors were improved up to 14.0-fold dependent on the clinical cases and plan types. The computation times for VMAT and IMRT plans with the cc-TPS were reduced by 91.1% and 89.4%, respectively, on average of the clinical cases compared to those with pc-TPS. The PRs were mostly better for VMAT plans (1.0 ≤ PRs ≤ 10.6 for the head and neck case, 1.2 ≤ PRs ≤ 13.3 for lung case, and 1.0 ≤ PRs ≤ 10.3 for prostate cancer cases) than for IMRT plans. The isodose curves of plans on both cc-TPS and pc-TPS were identical for each of the clinical cases. Conclusion: A cloud-based treatment planning has been setup and our results demonstrate the computation efficiency of treatment planning with the cc-TPS can be dramatically improved while maintaining the same plan quality to that obtained with the pc-TPS. This work was supported in part by the National Cancer Institute (1R01 CA133474) and by Leading Foreign Research Institute Recruitment Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT and Future Planning (MSIP) (Grant No.2009-00420)« less

  3. Analysis of scalability of high-performance 3D image processing platform for virtual colonoscopy

    NASA Astrophysics Data System (ADS)

    Yoshida, Hiroyuki; Wu, Yin; Cai, Wenli

    2014-03-01

    One of the key challenges in three-dimensional (3D) medical imaging is to enable the fast turn-around time, which is often required for interactive or real-time response. This inevitably requires not only high computational power but also high memory bandwidth due to the massive amount of data that need to be processed. For this purpose, we previously developed a software platform for high-performance 3D medical image processing, called HPC 3D-MIP platform, which employs increasingly available and affordable commodity computing systems such as the multicore, cluster, and cloud computing systems. To achieve scalable high-performance computing, the platform employed size-adaptive, distributable block volumes as a core data structure for efficient parallelization of a wide range of 3D-MIP algorithms, supported task scheduling for efficient load distribution and balancing, and consisted of a layered parallel software libraries that allow image processing applications to share the common functionalities. We evaluated the performance of the HPC 3D-MIP platform by applying it to computationally intensive processes in virtual colonoscopy. Experimental results showed a 12-fold performance improvement on a workstation with 12-core CPUs over the original sequential implementation of the processes, indicating the efficiency of the platform. Analysis of performance scalability based on the Amdahl's law for symmetric multicore chips showed the potential of a high performance scalability of the HPC 3DMIP platform when a larger number of cores is available.

  4. Space Science Cloud: a Virtual Space Science Research Platform Based on Cloud Model

    NASA Astrophysics Data System (ADS)

    Hu, Xiaoyan; Tong, Jizhou; Zou, Ziming

    Through independent and co-operational science missions, Strategic Pioneer Program (SPP) on Space Science, the new initiative of space science program in China which was approved by CAS and implemented by National Space Science Center (NSSC), dedicates to seek new discoveries and new breakthroughs in space science, thus deepen the understanding of universe and planet earth. In the framework of this program, in order to support the operations of space science missions and satisfy the demand of related research activities for e-Science, NSSC is developing a virtual space science research platform based on cloud model, namely the Space Science Cloud (SSC). In order to support mission demonstration, SSC integrates interactive satellite orbit design tool, satellite structure and payloads layout design tool, payload observation coverage analysis tool, etc., to help scientists analyze and verify space science mission designs. Another important function of SSC is supporting the mission operations, which runs through the space satellite data pipelines. Mission operators can acquire and process observation data, then distribute the data products to other systems or issue the data and archives with the services of SSC. In addition, SSC provides useful data, tools and models for space researchers. Several databases in the field of space science are integrated and an efficient retrieve system is developing. Common tools for data visualization, deep processing (e.g., smoothing and filtering tools), analysis (e.g., FFT analysis tool and minimum variance analysis tool) and mining (e.g., proton event correlation analysis tool) are also integrated to help the researchers to better utilize the data. The space weather models on SSC include magnetic storm forecast model, multi-station middle and upper atmospheric climate model, solar energetic particle propagation model and so on. All the services above-mentioned are based on the e-Science infrastructures of CAS e.g. cloud storage and cloud computing. SSC provides its users with self-service storage and computing resources at the same time.At present, the prototyping of SSC is underway and the platform is expected to be put into trial operation in August 2014. We hope that as SSC develops, our vision of Digital Space may come true someday.

  5. Evaluation of wind field statistics near and inside clouds using a coherent Doppler lidar

    NASA Astrophysics Data System (ADS)

    Lottman, Brian Todd

    1998-09-01

    This work proposes advanced techniques for measuring the spatial wind field statistics near and inside clouds using a vertically pointing solid state coherent Doppler lidar on a fixed ground based platform. The coherent Doppler lidar is an ideal instrument for high spatial and temporal resolution velocity estimates. The basic parameters of lidar are discussed, including a complete statistical description of the Doppler lidar signal. This description is extended to cases with simple functional forms for aerosol backscatter and velocity. An estimate for the mean velocity over a sensing volume is produced by estimating the mean spectra. There are many traditional spectral estimators, which are useful for conditions with slowly varying velocity and backscatter. A new class of estimators (novel) is introduced that produces reliable velocity estimates for conditions with large variations in aerosol backscatter and velocity with range, such as cloud conditions. Performance of traditional and novel estimators is computed for a variety of deterministic atmospheric conditions using computer simulated data. Wind field statistics are produced for actual data for a cloud deck, and for multi- layer clouds. Unique results include detection of possible spectral signatures for rain, estimates for the structure function inside a cloud deck, reliable velocity estimation techniques near and inside thin clouds, and estimates for simple wind field statistics between cloud layers.

  6. Ice particle morphology and microphysical properties of cirrus clouds inferred from combined CALIOP-IIR measurements

    NASA Astrophysics Data System (ADS)

    Saito, Masanori; Iwabuchi, Hironobu; Yang, Ping; Tang, Guanglin; King, Michael D.; Sekiguchi, Miho

    2017-04-01

    Ice particle morphology and microphysical properties of cirrus clouds are essential for assessing radiative forcing associated with these clouds. We develop an optimal estimation-based algorithm to infer cirrus cloud optical thickness (COT), cloud effective radius (CER), plate fraction including quasi-horizontally oriented plates (HOPs), and the degree of surface roughness from the Cloud Aerosol Lidar with Orthogonal Polarization (CALIOP) and the Infrared Imaging Radiometer (IIR) on the Cloud Aerosol Lidar and Infrared Pathfinder Satellite Observation (CALIPSO) platform. A simple but realistic ice particle model is used, and the relevant bulk optical properties are computed using state-of-the-art light-scattering computational capabilities. Rigorous estimation of uncertainties related to surface properties, atmospheric gases, and cloud heterogeneity is performed. The results based on the present method show that COTs are quite consistent with other satellite products and CERs essentially agree with the other counterparts. A 1 month global analysis for April 2007, in which CALIPSO off-nadir angle is 0.3°, shows that the HOP has significant temperature-dependence and is critical to the lidar ratio when cloud temperature is warmer than -40°C. The lidar ratio is calculated from the bulk optical properties based on the inferred parameters, showing robust temperature dependence. The median lidar ratio of cirrus clouds is 27-31 sr over the globe.

  7. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lingerfelt, Eric J; Endeve, Eirik; Hui, Yawei

    Improvements in scientific instrumentation allow imaging at mesoscopic to atomic length scales, many spectroscopic modes, and now--with the rise of multimodal acquisition systems and the associated processing capability--the era of multidimensional, informationally dense data sets has arrived. Technical issues in these combinatorial scientific fields are exacerbated by computational challenges best summarized as a necessity for drastic improvement in the capability to transfer, store, and analyze large volumes of data. The Bellerophon Environment for Analysis of Materials (BEAM) platform provides material scientists the capability to directly leverage the integrated computational and analytical power of High Performance Computing (HPC) to perform scalablemore » data analysis and simulation and manage uploaded data files via an intuitive, cross-platform client user interface. This framework delivers authenticated, "push-button" execution of complex user workflows that deploy data analysis algorithms and computational simulations utilizing compute-and-data cloud infrastructures and HPC environments like Titan at the Oak Ridge Leadershp Computing Facility (OLCF).« less

  8. cisPath: an R/Bioconductor package for cloud users for visualization and management of functional protein interaction networks.

    PubMed

    Wang, Likun; Yang, Luhe; Peng, Zuohan; Lu, Dan; Jin, Yan; McNutt, Michael; Yin, Yuxin

    2015-01-01

    With the burgeoning development of cloud technology and services, there are an increasing number of users who prefer cloud to run their applications. All software and associated data are hosted on the cloud, allowing users to access them via a web browser from any computer, anywhere. This paper presents cisPath, an R/Bioconductor package deployed on cloud servers for client users to visualize, manage, and share functional protein interaction networks. With this R package, users can easily integrate downloaded protein-protein interaction information from different online databases with private data to construct new and personalized interaction networks. Additional functions allow users to generate specific networks based on private databases. Since the results produced with the use of this package are in the form of web pages, cloud users can easily view and edit the network graphs via the browser, using a mouse or touch screen, without the need to download them to a local computer. This package can also be installed and run on a local desktop computer. Depending on user preference, results can be publicized or shared by uploading to a web server or cloud driver, allowing other users to directly access results via a web browser. This package can be installed and run on a variety of platforms. Since all network views are shown in web pages, such package is particularly useful for cloud users. The easy installation and operation is an attractive quality for R beginners and users with no previous experience with cloud services.

  9. cisPath: an R/Bioconductor package for cloud users for visualization and management of functional protein interaction networks

    PubMed Central

    2015-01-01

    Background With the burgeoning development of cloud technology and services, there are an increasing number of users who prefer cloud to run their applications. All software and associated data are hosted on the cloud, allowing users to access them via a web browser from any computer, anywhere. This paper presents cisPath, an R/Bioconductor package deployed on cloud servers for client users to visualize, manage, and share functional protein interaction networks. Results With this R package, users can easily integrate downloaded protein-protein interaction information from different online databases with private data to construct new and personalized interaction networks. Additional functions allow users to generate specific networks based on private databases. Since the results produced with the use of this package are in the form of web pages, cloud users can easily view and edit the network graphs via the browser, using a mouse or touch screen, without the need to download them to a local computer. This package can also be installed and run on a local desktop computer. Depending on user preference, results can be publicized or shared by uploading to a web server or cloud driver, allowing other users to directly access results via a web browser. Conclusions This package can be installed and run on a variety of platforms. Since all network views are shown in web pages, such package is particularly useful for cloud users. The easy installation and operation is an attractive quality for R beginners and users with no previous experience with cloud services. PMID:25708840

  10. A case study in open source innovation: developing the Tidepool Platform for interoperability in type 1 diabetes management.

    PubMed

    Neinstein, Aaron; Wong, Jenise; Look, Howard; Arbiter, Brandon; Quirk, Kent; McCanne, Steve; Sun, Yao; Blum, Michael; Adi, Saleh

    2016-03-01

    Develop a device-agnostic cloud platform to host diabetes device data and catalyze an ecosystem of software innovation for type 1 diabetes (T1D) management. An interdisciplinary team decided to establish a nonprofit company, Tidepool, and build open-source software. Through a user-centered design process, the authors created a software platform, the Tidepool Platform, to upload and host T1D device data in an integrated, device-agnostic fashion, as well as an application ("app"), Blip, to visualize the data. Tidepool's software utilizes the principles of modular components, modern web design including REST APIs and JavaScript, cloud computing, agile development methodology, and robust privacy and security. By consolidating the currently scattered and siloed T1D device data ecosystem into one open platform, Tidepool can improve access to the data and enable new possibilities and efficiencies in T1D clinical care and research. The Tidepool Platform decouples diabetes apps from diabetes devices, allowing software developers to build innovative apps without requiring them to design a unique back-end (e.g., database and security) or unique ways of ingesting device data. It allows people with T1D to choose to use any preferred app regardless of which device(s) they use. The authors believe that the Tidepool Platform can solve two current problems in the T1D device landscape: 1) limited access to T1D device data and 2) poor interoperability of data from different devices. If proven effective, Tidepool's open source, cloud model for health data interoperability is applicable to other healthcare use cases. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association.

  11. A case study in open source innovation: developing the Tidepool Platform for interoperability in type 1 diabetes management

    PubMed Central

    Wong, Jenise; Look, Howard; Arbiter, Brandon; Quirk, Kent; McCanne, Steve; Sun, Yao; Blum, Michael; Adi, Saleh

    2016-01-01

    Objective Develop a device-agnostic cloud platform to host diabetes device data and catalyze an ecosystem of software innovation for type 1 diabetes (T1D) management. Materials and Methods An interdisciplinary team decided to establish a nonprofit company, Tidepool, and build open-source software. Results Through a user-centered design process, the authors created a software platform, the Tidepool Platform, to upload and host T1D device data in an integrated, device-agnostic fashion, as well as an application (“app”), Blip, to visualize the data. Tidepool’s software utilizes the principles of modular components, modern web design including REST APIs and JavaScript, cloud computing, agile development methodology, and robust privacy and security. Discussion By consolidating the currently scattered and siloed T1D device data ecosystem into one open platform, Tidepool can improve access to the data and enable new possibilities and efficiencies in T1D clinical care and research. The Tidepool Platform decouples diabetes apps from diabetes devices, allowing software developers to build innovative apps without requiring them to design a unique back-end (e.g., database and security) or unique ways of ingesting device data. It allows people with T1D to choose to use any preferred app regardless of which device(s) they use. Conclusion The authors believe that the Tidepool Platform can solve two current problems in the T1D device landscape: 1) limited access to T1D device data and 2) poor interoperability of data from different devices. If proven effective, Tidepool’s open source, cloud model for health data interoperability is applicable to other healthcare use cases. PMID:26338218

  12. A protect solution for data security in mobile cloud storage

    NASA Astrophysics Data System (ADS)

    Yu, Xiaojun; Wen, Qiaoyan

    2013-03-01

    It is popular to access the cloud storage by mobile devices. However, this application suffer data security risk, especial the data leakage and privacy violate problem. This risk exists not only in cloud storage system, but also in mobile client platform. To reduce the security risk, this paper proposed a new security solution. It makes full use of the searchable encryption and trusted computing technology. Given the performance limit of the mobile devices, it proposes the trusted proxy based protection architecture. The design basic idea, deploy model and key flows are detailed. The analysis from the security and performance shows the advantage.

  13. A cloud-based data network approach for translational cancer research.

    PubMed

    Xing, Wei; Tsoumakos, Dimitrios; Ghanem, Moustafa

    2015-01-01

    We develop a new model and associated technology for constructing and managing self-organizing data to support translational cancer research studies. We employ a semantic content network approach to address the challenges of managing cancer research data. Such data is heterogeneous, large, decentralized, growing and continually being updated. Moreover, the data originates from different information sources that may be partially overlapping, creating redundancies as well as contradictions and inconsistencies. Building on the advantages of elasticity of cloud computing, we deploy the cancer data networks on top of the CELAR Cloud platform to enable more effective processing and analysis of Big cancer data.

  14. Open chemistry: RESTful web APIs, JSON, NWChem and the modern web application.

    PubMed

    Hanwell, Marcus D; de Jong, Wibe A; Harris, Christopher J

    2017-10-30

    An end-to-end platform for chemical science research has been developed that integrates data from computational and experimental approaches through a modern web-based interface. The platform offers an interactive visualization and analytics environment that functions well on mobile, laptop and desktop devices. It offers pragmatic solutions to ensure that large and complex data sets are more accessible. Existing desktop applications/frameworks were extended to integrate with high-performance computing resources, and offer command-line tools to automate interaction-connecting distributed teams to this software platform on their own terms. The platform was developed openly, and all source code hosted on the GitHub platform with automated deployment possible using Ansible coupled with standard Ubuntu-based machine images deployed to cloud machines. The platform is designed to enable teams to reap the benefits of the connected web-going beyond what conventional search and analytics platforms offer in this area. It also has the goal of offering federated instances, that can be customized to the sites/research performed. Data gets stored using JSON, extending upon previous approaches using XML, building structures that support computational chemistry calculations. These structures were developed to make it easy to process data across different languages, and send data to a JavaScript-based web client.

  15. Proposed Use of the NASA Ames Nebula Cloud Computing Platform for Numerical Weather Prediction and the Distribution of High Resolution Satellite Imagery

    NASA Technical Reports Server (NTRS)

    Limaye, Ashutosh S.; Molthan, Andrew L.; Srikishen, Jayanthi

    2010-01-01

    The development of the Nebula Cloud Computing Platform at NASA Ames Research Center provides an open-source solution for the deployment of scalable computing and storage capabilities relevant to the execution of real-time weather forecasts and the distribution of high resolution satellite data to the operational weather community. Two projects at Marshall Space Flight Center may benefit from use of the Nebula system. The NASA Short-term Prediction Research and Transition (SPoRT) Center facilitates the use of unique NASA satellite data and research capabilities in the operational weather community by providing datasets relevant to numerical weather prediction, and satellite data sets useful in weather analysis. SERVIR provides satellite data products for decision support, emphasizing environmental threats such as wildfires, floods, landslides, and other hazards, with interests in numerical weather prediction in support of disaster response. The Weather Research and Forecast (WRF) model Environmental Modeling System (WRF-EMS) has been configured for Nebula cloud computing use via the creation of a disk image and deployment of repeated instances. Given the available infrastructure within Nebula and the "infrastructure as a service" concept, the system appears well-suited for the rapid deployment of additional forecast models over different domains, in response to real-time research applications or disaster response. Future investigations into Nebula capabilities will focus on the development of a web mapping server and load balancing configuration to support the distribution of high resolution satellite data sets to users within the National Weather Service and international partners of SERVIR.

  16. Secure Skyline Queries on Cloud Platform.

    PubMed

    Liu, Jinfei; Yang, Juncheng; Xiong, Li; Pei, Jian

    2017-04-01

    Outsourcing data and computation to cloud server provides a cost-effective way to support large scale data storage and query processing. However, due to security and privacy concerns, sensitive data (e.g., medical records) need to be protected from the cloud server and other unauthorized users. One approach is to outsource encrypted data to the cloud server and have the cloud server perform query processing on the encrypted data only. It remains a challenging task to support various queries over encrypted data in a secure and efficient way such that the cloud server does not gain any knowledge about the data, query, and query result. In this paper, we study the problem of secure skyline queries over encrypted data. The skyline query is particularly important for multi-criteria decision making but also presents significant challenges due to its complex computations. We propose a fully secure skyline query protocol on data encrypted using semantically-secure encryption. As a key subroutine, we present a new secure dominance protocol, which can be also used as a building block for other queries. Finally, we provide both serial and parallelized implementations and empirically study the protocols in terms of efficiency and scalability under different parameter settings, verifying the feasibility of our proposed solutions.

  17. The NCI High Performance Computing (HPC) and High Performance Data (HPD) Platform to Support the Analysis of Petascale Environmental Data Collections

    NASA Astrophysics Data System (ADS)

    Evans, B. J. K.; Pugh, T.; Wyborn, L. A.; Porter, D.; Allen, C.; Smillie, J.; Antony, J.; Trenham, C.; Evans, B. J.; Beckett, D.; Erwin, T.; King, E.; Hodge, J.; Woodcock, R.; Fraser, R.; Lescinsky, D. T.

    2014-12-01

    The National Computational Infrastructure (NCI) has co-located a priority set of national data assets within a HPC research platform. This powerful in-situ computational platform has been created to help serve and analyse the massive amounts of data across the spectrum of environmental collections - in particular the climate, observational data and geoscientific domains. This paper examines the infrastructure, innovation and opportunity for this significant research platform. NCI currently manages nationally significant data collections (10+ PB) categorised as 1) earth system sciences, climate and weather model data assets and products, 2) earth and marine observations and products, 3) geosciences, 4) terrestrial ecosystem, 5) water management and hydrology, and 6) astronomy, social science and biosciences. The data is largely sourced from the NCI partners (who include the custodians of many of the national scientific records), major research communities, and collaborating overseas organisations. By co-locating these large valuable data assets, new opportunities have arisen by harmonising the data collections, making a powerful transdisciplinary research platformThe data is accessible within an integrated HPC-HPD environment - a 1.2 PFlop supercomputer (Raijin), a HPC class 3000 core OpenStack cloud system and several highly connected large scale and high-bandwidth Lustre filesystems. New scientific software, cloud-scale techniques, server-side visualisation and data services have been harnessed and integrated into the platform, so that analysis is performed seamlessly across the traditional boundaries of the underlying data domains. Characterisation of the techniques along with performance profiling ensures scalability of each software component, all of which can either be enhanced or replaced through future improvements. A Development-to-Operations (DevOps) framework has also been implemented to manage the scale of the software complexity alone. This ensures that software is both upgradable and maintainable, and can be readily reused with complexly integrated systems and become part of the growing global trusted community tools for cross-disciplinary research.

  18. Range wise busy checking 2-way imbalanced algorithm for cloudlet allocation in cloud environment

    NASA Astrophysics Data System (ADS)

    Alanzy, Mohammed; Latip, Rohaya; Muhammed, Abdullah

    2018-05-01

    Cloud computing considers as a new business paradigm and a popular platform over the last few years. Many organizations, agencies, and departments consider responsible tasks time and tasks needed to be accomplished as soon as possible. These agencies counter IT issues due to the massive arise of data, applications, and solution scopes. Currently, the main issue related with the cloud is the way of making the environment of the cloud computing more qualified, and this way needs a competent allocation strategy of the cloudlet, Thus, there are huge number of studies conducted with regards to this matter that sought to assign the cloudlets to VMs or resources by variety of strategies. In this paper we have proposed range wise busy checking 2-way imbalanced Algorithm in cloud computing. Compare to other methods, it decreases the completion time to finish tasks’ execution, it is considered the fundamental part to enhance the system performance such as the makespan. This algorithm was simulated using Cloudsim to give more opportunity to the higher VM speed to accommodate more Cloudlets in its local queue without considering the threshold balance condition. The simulation result shows that the average makespan time is lesser compare to the previous cloudlet allocation strategy.

  19. eTRIKS platform: Conception and operation of a highly scalable cloud-based platform for translational research and applications development.

    PubMed

    Bussery, Justin; Denis, Leslie-Alexandre; Guillon, Benjamin; Liu, Pengfeï; Marchetti, Gino; Rahal, Ghita

    2018-04-01

    We describe the genesis, design and evolution of a computing platform designed and built to improve the success rate of biomedical translational research. The eTRIKS project platform was developed with the aim of building a platform that can securely host heterogeneous types of data and provide an optimal environment to run tranSMART analytical applications. Many types of data can now be hosted, including multi-OMICS data, preclinical laboratory data and clinical information, including longitudinal data sets. During the last two years, the platform has matured into a robust translational research knowledge management system that is able to host other data mining applications and support the development of new analytical tools. Copyright © 2018 Elsevier Ltd. All rights reserved.

  20. Large-scale parallel genome assembler over cloud computing environment.

    PubMed

    Das, Arghya Kusum; Koppa, Praveen Kumar; Goswami, Sayan; Platania, Richard; Park, Seung-Jong

    2017-06-01

    The size of high throughput DNA sequencing data has already reached the terabyte scale. To manage this huge volume of data, many downstream sequencing applications started using locality-based computing over different cloud infrastructures to take advantage of elastic (pay as you go) resources at a lower cost. However, the locality-based programming model (e.g. MapReduce) is relatively new. Consequently, developing scalable data-intensive bioinformatics applications using this model and understanding the hardware environment that these applications require for good performance, both require further research. In this paper, we present a de Bruijn graph oriented Parallel Giraph-based Genome Assembler (GiGA), as well as the hardware platform required for its optimal performance. GiGA uses the power of Hadoop (MapReduce) and Giraph (large-scale graph analysis) to achieve high scalability over hundreds of compute nodes by collocating the computation and data. GiGA achieves significantly higher scalability with competitive assembly quality compared to contemporary parallel assemblers (e.g. ABySS and Contrail) over traditional HPC cluster. Moreover, we show that the performance of GiGA is significantly improved by using an SSD-based private cloud infrastructure over traditional HPC cluster. We observe that the performance of GiGA on 256 cores of this SSD-based cloud infrastructure closely matches that of 512 cores of traditional HPC cluster.

  1. Machine Learning for Flood Prediction in Google Earth Engine

    NASA Astrophysics Data System (ADS)

    Kuhn, C.; Tellman, B.; Max, S. A.; Schwarz, B.

    2015-12-01

    With the increasing availability of high-resolution satellite imagery, dynamic flood mapping in near real time is becoming a reachable goal for decision-makers. This talk describes a newly developed framework for predicting biophysical flood vulnerability using public data, cloud computing and machine learning. Our objective is to define an approach to flood inundation modeling using statistical learning methods deployed in a cloud-based computing platform. Traditionally, static flood extent maps grounded in physically based hydrologic models can require hours of human expertise to construct at significant financial cost. In addition, desktop modeling software and limited local server storage can impose restraints on the size and resolution of input datasets. Data-driven, cloud-based processing holds promise for predictive watershed modeling at a wide range of spatio-temporal scales. However, these benefits come with constraints. In particular, parallel computing limits a modeler's ability to simulate the flow of water across a landscape, rendering traditional routing algorithms unusable in this platform. Our project pushes these limits by testing the performance of two machine learning algorithms, Support Vector Machine (SVM) and Random Forests, at predicting flood extent. Constructed in Google Earth Engine, the model mines a suite of publicly available satellite imagery layers to use as algorithm inputs. Results are cross-validated using MODIS-based flood maps created using the Dartmouth Flood Observatory detection algorithm. Model uncertainty highlights the difficulty of deploying unbalanced training data sets based on rare extreme events.

  2. Real-time video streaming in mobile cloud over heterogeneous wireless networks

    NASA Astrophysics Data System (ADS)

    Abdallah-Saleh, Saleh; Wang, Qi; Grecos, Christos

    2012-06-01

    Recently, the concept of Mobile Cloud Computing (MCC) has been proposed to offload the resource requirements in computational capabilities, storage and security from mobile devices into the cloud. Internet video applications such as real-time streaming are expected to be ubiquitously deployed and supported over the cloud for mobile users, who typically encounter a range of wireless networks of diverse radio access technologies during their roaming. However, real-time video streaming for mobile cloud users across heterogeneous wireless networks presents multiple challenges. The network-layer quality of service (QoS) provision to support high-quality mobile video delivery in this demanding scenario remains an open research question, and this in turn affects the application-level visual quality and impedes mobile users' perceived quality of experience (QoE). In this paper, we devise a framework to support real-time video streaming in this new mobile video networking paradigm and evaluate the performance of the proposed framework empirically through a lab-based yet realistic testing platform. One particular issue we focus on is the effect of users' mobility on the QoS of video streaming over the cloud. We design and implement a hybrid platform comprising of a test-bed and an emulator, on which our concept of mobile cloud computing, video streaming and heterogeneous wireless networks are implemented and integrated to allow the testing of our framework. As representative heterogeneous wireless networks, the popular WLAN (Wi-Fi) and MAN (WiMAX) networks are incorporated in order to evaluate effects of handovers between these different radio access technologies. The H.264/AVC (Advanced Video Coding) standard is employed for real-time video streaming from a server to mobile users (client nodes) in the networks. Mobility support is introduced to enable continuous streaming experience for a mobile user across the heterogeneous wireless network. Real-time video stream packets are captured for analytical purposes on the mobile user node. Experimental results are obtained and analysed. Future work is identified towards further improvement of the current design and implementation. With this new mobile video networking concept and paradigm implemented and evaluated, results and observations obtained from this study would form the basis of a more in-depth, comprehensive understanding of various challenges and opportunities in supporting high-quality real-time video streaming in mobile cloud over heterogeneous wireless networks.

  3. The cloud paradigm applied to e-Health

    PubMed Central

    2013-01-01

    Background Cloud computing is a new paradigm that is changing how enterprises, institutions and people understand, perceive and use current software systems. With this paradigm, the organizations have no need to maintain their own servers, nor host their own software. Instead, everything is moved to the cloud and provided on demand, saving energy, physical space and technical staff. Cloud-based system architectures provide many advantages in terms of scalability, maintainability and massive data processing. Methods We present the design of an e-health cloud system, modelled by an M/M/m queue with QoS capabilities, i.e. maximum waiting time of requests. Results Detailed results for the model formed by a Jackson network of two M/M/m queues from the queueing theory perspective are presented. These results show a significant performance improvement when the number of servers increases. Conclusions Platform scalability becomes a critical issue since we aim to provide the system with high Quality of Service (QoS). In this paper we define an architecture capable of adapting itself to different diseases and growing numbers of patients. This platform could be applied to the medical field to greatly enhance the results of those therapies that have an important psychological component, such as addictions and chronic diseases. PMID:23496912

  4. SenSyF Experience on Integration of EO Services in a Generic, Cloud-Based EO Exploitation Platform

    NASA Astrophysics Data System (ADS)

    Almeida, Nuno; Catarino, Nuno; Gutierrez, Antonio; Grosso, Nuno; Andrade, Joao; Caumont, Herve; Goncalves, Pedro; Villa, Guillermo; Mangin, Antoine; Serra, Romain; Johnsen, Harald; Grydeland, Tom; Emsley, Stephen; Jauch, Eduardo; Moreno, Jose; Ruiz, Antonio

    2016-08-01

    SenSyF is a cloud-based data processing framework for EO- based services. It has been pioneer in addressing Big Data issues from the Earth Observation point of view, and is a precursor of several of the technologies and methodologies that will be deployed in ESA's Thematic Exploitation Platforms and other related systems.The SenSyF system focuses on developing fully automated data management, together with access to a processing and exploitation framework, including Earth Observation specific tools. SenSyF is both a development and validation platform for data intensive applications using Earth Observation data. With SenSyF, scientific, institutional or commercial institutions developing EO- based applications and services can take advantage of distributed computational and storage resources, tailored for applications dependent on big Earth Observation data, and without resorting to deep infrastructure and technological investments.This paper describes the integration process and the experience gathered from different EO Service providers during the project.

  5. Cloud-Based Speech Technology for Assistive Technology Applications (CloudCAST).

    PubMed

    Cunningham, Stuart; Green, Phil; Christensen, Heidi; Atria, José Joaquín; Coy, André; Malavasi, Massimiliano; Desideri, Lorenzo; Rudzicz, Frank

    2017-01-01

    The CloudCAST platform provides a series of speech recognition services that can be integrated into assistive technology applications. The platform and the services provided by the public API are described. Several exemplar applications have been developed to demonstrate the platform to potential developers and users.

  6. Bringing Legacy Visualization Software to Modern Computing Devices via Application Streaming

    NASA Astrophysics Data System (ADS)

    Fisher, Ward

    2014-05-01

    Planning software compatibility across forthcoming generations of computing platforms is a problem commonly encountered in software engineering and development. While this problem can affect any class of software, data analysis and visualization programs are particularly vulnerable. This is due in part to their inherent dependency on specialized hardware and computing environments. A number of strategies and tools have been designed to aid software engineers with this task. While generally embraced by developers at 'traditional' software companies, these methodologies are often dismissed by the scientific software community as unwieldy, inefficient and unnecessary. As a result, many important and storied scientific software packages can struggle to adapt to a new computing environment; for example, one in which much work is carried out on sub-laptop devices (such as tablets and smartphones). Rewriting these packages for a new platform often requires significant investment in terms of development time and developer expertise. In many cases, porting older software to modern devices is neither practical nor possible. As a result, replacement software must be developed from scratch, wasting resources better spent on other projects. Enabled largely by the rapid rise and adoption of cloud computing platforms, 'Application Streaming' technologies allow legacy visualization and analysis software to be operated wholly from a client device (be it laptop, tablet or smartphone) while retaining full functionality and interactivity. It mitigates much of the developer effort required by other more traditional methods while simultaneously reducing the time it takes to bring the software to a new platform. This work will provide an overview of Application Streaming and how it compares against other technologies which allow scientific visualization software to be executed from a remote computer. We will discuss the functionality and limitations of existing application streaming frameworks and how a developer might prepare their software for application streaming. We will also examine the secondary benefits realized by moving legacy software to the cloud. Finally, we will examine the process by which a legacy Java application, the Integrated Data Viewer (IDV), is to be adapted for tablet computing via Application Streaming.

  7. Cloud Computing-based Platform for Drought Decision-Making using Remote Sensing and Modeling Products: Preliminary Results for Brazil

    NASA Astrophysics Data System (ADS)

    Vivoni, E.; Mascaro, G.; Shupe, J. W.; Hiatt, C.; Potter, C. S.; Miller, R. L.; Stanley, J.; Abraham, T.; Castilla-Rubio, J.

    2012-12-01

    Droughts and their hydrological consequences are a major threat to food security throughout the world. In arid and semiarid regions dependent on irrigated agriculture, prolonged droughts lead to significant and recurring economic and social losses. In this contribution, we present preliminary results on integrating a set of multi-resolution drought indices into a cloud computing-based visualization platform. We focused our initial efforts on Brazil due to a severe, on-going drought in a large agricultural area in the northeastern part of the country. The online platform includes drought products developed from: (1) a MODIS-based water stress index (WSI) based on inferences from normalized difference vegetation index and land surface temperature fields, (2) a volumetric water content (VWC) index obtained from application of the NASA CASA model, and (3) a set of AVHRR-based vegetation health indices obtained from NOAA/NESDIS. The drought indices are also presented in terms of anomalies with respect to a baseline period. Since our main objective is to engage stakeholders and decision-makers in Brazil, we incorporated other relevant geospatial data into the platform, including irrigation areas, dams and reservoirs, administrative units and annual climate information. We will also present a set of use cases developed to help stakeholders explore, query and provide feedback that allowed fine-tuning of the drought product delivery, presentation and analysis tools. Finally, we discuss potential next steps in development of the online platform, including applications at finer resolutions in specific basins and at a coarser global scale.

  8. Towards real-time photon Monte Carlo dose calculation in the cloud

    NASA Astrophysics Data System (ADS)

    Ziegenhein, Peter; Kozin, Igor N.; Kamerling, Cornelis Ph; Oelfke, Uwe

    2017-06-01

    Near real-time application of Monte Carlo (MC) dose calculation in clinic and research is hindered by the long computational runtimes of established software. Currently, fast MC software solutions are available utilising accelerators such as graphical processing units (GPUs) or clusters based on central processing units (CPUs). Both platforms are expensive in terms of purchase costs and maintenance and, in case of the GPU, provide only limited scalability. In this work we propose a cloud-based MC solution, which offers high scalability of accurate photon dose calculations. The MC simulations run on a private virtual supercomputer that is formed in the cloud. Computational resources can be provisioned dynamically at low cost without upfront investment in expensive hardware. A client-server software solution has been developed which controls the simulations and transports data to and from the cloud efficiently and securely. The client application integrates seamlessly into a treatment planning system. It runs the MC simulation workflow automatically and securely exchanges simulation data with the server side application that controls the virtual supercomputer. Advanced encryption standards were used to add an additional security layer, which encrypts and decrypts patient data on-the-fly at the processor register level. We could show that our cloud-based MC framework enables near real-time dose computation. It delivers excellent linear scaling for high-resolution datasets with absolute runtimes of 1.1 seconds to 10.9 seconds for simulating a clinical prostate and liver case up to 1% statistical uncertainty. The computation runtimes include the transportation of data to and from the cloud as well as process scheduling and synchronisation overhead. Cloud-based MC simulations offer a fast, affordable and easily accessible alternative for near real-time accurate dose calculations to currently used GPU or cluster solutions.

  9. Towards real-time photon Monte Carlo dose calculation in the cloud.

    PubMed

    Ziegenhein, Peter; Kozin, Igor N; Kamerling, Cornelis Ph; Oelfke, Uwe

    2017-06-07

    Near real-time application of Monte Carlo (MC) dose calculation in clinic and research is hindered by the long computational runtimes of established software. Currently, fast MC software solutions are available utilising accelerators such as graphical processing units (GPUs) or clusters based on central processing units (CPUs). Both platforms are expensive in terms of purchase costs and maintenance and, in case of the GPU, provide only limited scalability. In this work we propose a cloud-based MC solution, which offers high scalability of accurate photon dose calculations. The MC simulations run on a private virtual supercomputer that is formed in the cloud. Computational resources can be provisioned dynamically at low cost without upfront investment in expensive hardware. A client-server software solution has been developed which controls the simulations and transports data to and from the cloud efficiently and securely. The client application integrates seamlessly into a treatment planning system. It runs the MC simulation workflow automatically and securely exchanges simulation data with the server side application that controls the virtual supercomputer. Advanced encryption standards were used to add an additional security layer, which encrypts and decrypts patient data on-the-fly at the processor register level. We could show that our cloud-based MC framework enables near real-time dose computation. It delivers excellent linear scaling for high-resolution datasets with absolute runtimes of 1.1 seconds to 10.9 seconds for simulating a clinical prostate and liver case up to 1% statistical uncertainty. The computation runtimes include the transportation of data to and from the cloud as well as process scheduling and synchronisation overhead. Cloud-based MC simulations offer a fast, affordable and easily accessible alternative for near real-time accurate dose calculations to currently used GPU or cluster solutions.

  10. Open chemistry: RESTful web APIs, JSON, NWChem and the modern web application

    DOE PAGES

    Hanwell, Marcus D.; de Jong, Wibe A.; Harris, Christopher J.

    2017-10-30

    An end-to-end platform for chemical science research has been developed that integrates data from computational and experimental approaches through a modern web-based interface. The platform offers an interactive visualization and analytics environment that functions well on mobile, laptop and desktop devices. It offers pragmatic solutions to ensure that large and complex data sets are more accessible. Existing desktop applications/frameworks were extended to integrate with high-performance computing resources, and offer command-line tools to automate interaction - connecting distributed teams to this software platform on their own terms. The platform was developed openly, and all source code hosted on the GitHub platformmore » with automated deployment possible using Ansible coupled with standard Ubuntu-based machine images deployed to cloud machines. The platform is designed to enable teams to reap the benefits of the connected web - going beyond what conventional search and analytics platforms offer in this area. It also has the goal of offering federated instances, that can be customized to the sites/research performed. Data gets stored using JSON, extending upon previous approaches using XML, building structures that support computational chemistry calculations. These structures were developed to make it easy to process data across different languages, and send data to a JavaScript-based web client.« less

  11. Open chemistry: RESTful web APIs, JSON, NWChem and the modern web application

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hanwell, Marcus D.; de Jong, Wibe A.; Harris, Christopher J.

    An end-to-end platform for chemical science research has been developed that integrates data from computational and experimental approaches through a modern web-based interface. The platform offers an interactive visualization and analytics environment that functions well on mobile, laptop and desktop devices. It offers pragmatic solutions to ensure that large and complex data sets are more accessible. Existing desktop applications/frameworks were extended to integrate with high-performance computing resources, and offer command-line tools to automate interaction - connecting distributed teams to this software platform on their own terms. The platform was developed openly, and all source code hosted on the GitHub platformmore » with automated deployment possible using Ansible coupled with standard Ubuntu-based machine images deployed to cloud machines. The platform is designed to enable teams to reap the benefits of the connected web - going beyond what conventional search and analytics platforms offer in this area. It also has the goal of offering federated instances, that can be customized to the sites/research performed. Data gets stored using JSON, extending upon previous approaches using XML, building structures that support computational chemistry calculations. These structures were developed to make it easy to process data across different languages, and send data to a JavaScript-based web client.« less

  12. Hermes: Seamless delivery of containerized bioinformatics workflows in hybrid cloud (HTC) environments

    NASA Astrophysics Data System (ADS)

    Kintsakis, Athanassios M.; Psomopoulos, Fotis E.; Symeonidis, Andreas L.; Mitkas, Pericles A.

    Hermes introduces a new "describe once, run anywhere" paradigm for the execution of bioinformatics workflows in hybrid cloud environments. It combines the traditional features of parallelization-enabled workflow management systems and of distributed computing platforms in a container-based approach. It offers seamless deployment, overcoming the burden of setting up and configuring the software and network requirements. Most importantly, Hermes fosters the reproducibility of scientific workflows by supporting standardization of the software execution environment, thus leading to consistent scientific workflow results and accelerating scientific output.

  13. Analysis of the new health management based on health internet of things and cloud computing

    NASA Astrophysics Data System (ADS)

    Liu, Shaogang

    2018-05-01

    With the development and application of Internet of things and cloud technology in the medical field, it provides a higher level of exploration space for human health management. By analyzing the Internet of things technology and cloud technology, this paper studies a new form of health management system which conforms to the current social and technical level, and explores its system architecture, system characteristics and application. The new health management platform for networking and cloud can achieve the real-time monitoring and prediction of human health through a variety of sensors and wireless networks based on information and can be transmitted to the monitoring system, and then through the software analysis model, and gives the targeted prevention and treatment measures, to achieve real-time, intelligent health management.

  14. Portable Map-Reduce Utility for MIT SuperCloud Environment

    DTIC Science & Technology

    2015-09-17

    Reuther, A. Rosa, C. Yee, “Driving Big Data With Big Compute,” IEEE HPEC, Sep 10-12, 2012, Waltham, MA. [6] Apache Hadoop 1.2.1 Documentation: HDFS... big data architecture, which is designed to address these challenges, is made of the computing resources, scheduler, central storage file system...databases, analytics software and web interfaces [1]. These components are common to many big data and supercomputing systems. The platform is

  15. Reconstructing evolutionary trees in parallel for massive sequences.

    PubMed

    Zou, Quan; Wan, Shixiang; Zeng, Xiangxiang; Ma, Zhanshan Sam

    2017-12-14

    Building the evolutionary trees for massive unaligned DNA sequences is challenging and crucial. However, reconstructing evolutionary tree for ultra-large sequences is hard. Massive multiple sequence alignment is also challenging and time/space consuming. Hadoop and Spark are developed recently, which bring spring light for the classical computational biology problems. In this paper, we tried to solve the multiple sequence alignment and evolutionary reconstruction in parallel. HPTree, which is developed in this paper, can deal with big DNA sequence files quickly. It works well on the >1GB files, and gets better performance than other evolutionary reconstruction tools. Users could use HPTree for reonstructing evolutioanry trees on the computer clusters or cloud platform (eg. Amazon Cloud). HPTree could help on population evolution research and metagenomics analysis. In this paper, we employ the Hadoop and Spark platform and design an evolutionary tree reconstruction software tool for unaligned massive DNA sequences. Clustering and multiple sequence alignment are done in parallel. Neighbour-joining model was employed for the evolutionary tree building. We opened our software together with source codes via http://lab.malab.cn/soft/HPtree/ .

  16. Unidata Cyberinfrastructure in the Cloud

    NASA Astrophysics Data System (ADS)

    Ramamurthy, M. K.; Young, J. W.

    2016-12-01

    Data services, software, and user support are critical components of geosciences cyber-infrastructure to help researchers to advance science. With the maturity of and significant advances in cloud computing, it has recently emerged as an alternative new paradigm for developing and delivering a broad array of services over the Internet. Cloud computing is now mature enough in usability in many areas of science and education, bringing the benefits of virtualized and elastic remote services to infrastructure, software, computation, and data. Cloud environments reduce the amount of time and money spent to procure, install, and maintain new hardware and software, and reduce costs through resource pooling and shared infrastructure. Given the enormous potential of cloud-based services, Unidata has been moving to augment its software, services, data delivery mechanisms to align with the cloud-computing paradigm. To realize the above vision, Unidata has worked toward: * Providing access to many types of data from a cloud (e.g., via the THREDDS Data Server, RAMADDA and EDEX servers); * Deploying data-proximate tools to easily process, analyze, and visualize those data in a cloud environment cloud for consumption by any one, by any device, from anywhere, at any time; * Developing and providing a range of pre-configured and well-integrated tools and services that can be deployed by any university in their own private or public cloud settings. Specifically, Unidata has developed Docker for "containerized applications", making them easy to deploy. Docker helps to create "disposable" installs and eliminates many configuration challenges. Containerized applications include tools for data transport, access, analysis, and visualization: THREDDS Data Server, Integrated Data Viewer, GEMPAK, Local Data Manager, RAMADDA Data Server, and Python tools; * Leveraging Jupyter as a central platform and hub with its powerful set of interlinking tools to connect interactively data servers, Python scientific libraries, scripts, and workflows; * Exploring end-to-end modeling and prediction capabilities in the cloud; * Partnering with NOAA and public cloud vendors (e.g., Amazon and OCC) on the NOAA Big Data Project to harness their capabilities and resources for the benefit of the academic community.

  17. Generation of Classical DInSAR and PSI Ground Motion Maps on a Cloud Thematic Platform

    NASA Astrophysics Data System (ADS)

    Mora, Oscar; Ordoqui, Patrick; Romero, Laia

    2016-08-01

    This paper presents the experience of ALTAMIRA INFORMATION uploading InSAR (Synthetic Aperture Radar Interferometry) services in the Geohazard Exploitation Platform (GEP), supported by ESA. Two different processing chains are presented jointly with ground motion maps obtained from the cloud computing, one being DIAPASON for classical DInSAR and SPN (Stable Point Network) for PSI (Persistent Scatterer Interferometry) processing. The product obtained from DIAPASON is the interferometric phase related to ground motion (phase fringes from a SAR pair). SPN provides motion data (mean velocity and time series) on high-quality pixels from a stack of SAR images. DIAPASON is already implemented, and SPN is under development to be exploited with historical data coming from ERS-1/2 and ENVISAT satellites, and current acquisitions of SENTINEL-1 in SLC and TOPSAR modes.

  18. Scientific Data Storage for Cloud Computing

    NASA Astrophysics Data System (ADS)

    Readey, J.

    2014-12-01

    Traditionally data storage used for geophysical software systems has centered on file-based systems and libraries such as NetCDF and HDF5. In contrast cloud based infrastructure providers such as Amazon AWS, Microsoft Azure, and the Google Cloud Platform generally provide storage technologies based on an object based storage service (for large binary objects) complemented by a database service (for small objects that can be represented as key-value pairs). These systems have been shown to be highly scalable, reliable, and cost effective. We will discuss a proposed system that leverages these cloud-based storage technologies to provide an API-compatible library for traditional NetCDF and HDF5 applications. This system will enable cloud storage suitable for geophysical applications that can scale up to petabytes of data and thousands of users. We'll also cover other advantages of this system such as enhanced metadata search.

  19. Design of the smart scenic spot service platform

    NASA Astrophysics Data System (ADS)

    Yin, Min; Wang, Shi-tai

    2015-12-01

    With the deepening of the smart city construction, the model "smart+" is rapidly developing. Guilin, the international tourism metropolis fast constructing need smart tourism technology support. This paper studied the smart scenic spot service object and its requirements. And then constructed the smart service platform of the scenic spot application of 3S technology (Geographic Information System (GIS), Remote Sensing (RS) and Global Navigation Satellite System (GNSS)) and the Internet of things, cloud computing. Based on Guilin Seven-star Park scenic area as an object, this paper designed the Seven-star smart scenic spot service platform framework. The application of this platform will improve the tourists' visiting experience, make the tourism management more scientifically and standardly, increase tourism enterprises operating earnings.

  20. Secure Skyline Queries on Cloud Platform

    PubMed Central

    Liu, Jinfei; Yang, Juncheng; Xiong, Li; Pei, Jian

    2017-01-01

    Outsourcing data and computation to cloud server provides a cost-effective way to support large scale data storage and query processing. However, due to security and privacy concerns, sensitive data (e.g., medical records) need to be protected from the cloud server and other unauthorized users. One approach is to outsource encrypted data to the cloud server and have the cloud server perform query processing on the encrypted data only. It remains a challenging task to support various queries over encrypted data in a secure and efficient way such that the cloud server does not gain any knowledge about the data, query, and query result. In this paper, we study the problem of secure skyline queries over encrypted data. The skyline query is particularly important for multi-criteria decision making but also presents significant challenges due to its complex computations. We propose a fully secure skyline query protocol on data encrypted using semantically-secure encryption. As a key subroutine, we present a new secure dominance protocol, which can be also used as a building block for other queries. Finally, we provide both serial and parallelized implementations and empirically study the protocols in terms of efficiency and scalability under different parameter settings, verifying the feasibility of our proposed solutions. PMID:28883710

  1. Load balancing prediction method of cloud storage based on analytic hierarchy process and hybrid hierarchical genetic algorithm.

    PubMed

    Zhou, Xiuze; Lin, Fan; Yang, Lvqing; Nie, Jing; Tan, Qian; Zeng, Wenhua; Zhang, Nian

    2016-01-01

    With the continuous expansion of the cloud computing platform scale and rapid growth of users and applications, how to efficiently use system resources to improve the overall performance of cloud computing has become a crucial issue. To address this issue, this paper proposes a method that uses an analytic hierarchy process group decision (AHPGD) to evaluate the load state of server nodes. Training was carried out by using a hybrid hierarchical genetic algorithm (HHGA) for optimizing a radial basis function neural network (RBFNN). The AHPGD makes the aggregative indicator of virtual machines in cloud, and become input parameters of predicted RBFNN. Also, this paper proposes a new dynamic load balancing scheduling algorithm combined with a weighted round-robin algorithm, which uses the predictive periodical load value of nodes based on AHPPGD and RBFNN optimized by HHGA, then calculates the corresponding weight values of nodes and makes constant updates. Meanwhile, it keeps the advantages and avoids the shortcomings of static weighted round-robin algorithm.

  2. The iPlant Collaborative: Cyberinfrastructure for Enabling Data to Discovery for the Life Sciences.

    PubMed

    Merchant, Nirav; Lyons, Eric; Goff, Stephen; Vaughn, Matthew; Ware, Doreen; Micklos, David; Antin, Parker

    2016-01-01

    The iPlant Collaborative provides life science research communities access to comprehensive, scalable, and cohesive computational infrastructure for data management; identity management; collaboration tools; and cloud, high-performance, high-throughput computing. iPlant provides training, learning material, and best practice resources to help all researchers make the best use of their data, expand their computational skill set, and effectively manage their data and computation when working as distributed teams. iPlant's platform permits researchers to easily deposit and share their data and deploy new computational tools and analysis workflows, allowing the broader community to easily use and reuse those data and computational analyses.

  3. SU-E-T-314: The Application of Cloud Computing in Pencil Beam Scanning Proton Therapy Monte Carlo Simulation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wang, Z; Gao, M

    Purpose: Monte Carlo simulation plays an important role for proton Pencil Beam Scanning (PBS) technique. However, MC simulation demands high computing power and is limited to few large proton centers that can afford a computer cluster. We study the feasibility of utilizing cloud computing in the MC simulation of PBS beams. Methods: A GATE/GEANT4 based MC simulation software was installed on a commercial cloud computing virtual machine (Linux 64-bits, Amazon EC2). Single spot Integral Depth Dose (IDD) curves and in-air transverse profiles were used to tune the source parameters to simulate an IBA machine. With the use of StarCluster softwaremore » developed at MIT, a Linux cluster with 2–100 nodes can be conveniently launched in the cloud. A proton PBS plan was then exported to the cloud where the MC simulation was run. Results: The simulated PBS plan has a field size of 10×10cm{sup 2}, 20cm range, 10cm modulation, and contains over 10,000 beam spots. EC2 instance type m1.medium was selected considering the CPU/memory requirement and 40 instances were used to form a Linux cluster. To minimize cost, master node was created with on-demand instance and worker nodes were created with spot-instance. The hourly cost for the 40-node cluster was $0.63 and the projected cost for a 100-node cluster was $1.41. Ten million events were simulated to plot PDD and profile, with each job containing 500k events. The simulation completed within 1 hour and an overall statistical uncertainty of < 2% was achieved. Good agreement between MC simulation and measurement was observed. Conclusion: Cloud computing is a cost-effective and easy to maintain platform to run proton PBS MC simulation. When proton MC packages such as GATE and TOPAS are combined with cloud computing, it will greatly facilitate the pursuing of PBS MC studies, especially for newly established proton centers or individual researchers.« less

  4. A novel approach for discovering condition-specific correlations of gene expressions within biological pathways by using cloud computing technology.

    PubMed

    Chang, Tzu-Hao; Wu, Shih-Lin; Wang, Wei-Jen; Horng, Jorng-Tzong; Chang, Cheng-Wei

    2014-01-01

    Microarrays are widely used to assess gene expressions. Most microarray studies focus primarily on identifying differential gene expressions between conditions (e.g., cancer versus normal cells), for discovering the major factors that cause diseases. Because previous studies have not identified the correlations of differential gene expression between conditions, crucial but abnormal regulations that cause diseases might have been disregarded. This paper proposes an approach for discovering the condition-specific correlations of gene expressions within biological pathways. Because analyzing gene expression correlations is time consuming, an Apache Hadoop cloud computing platform was implemented. Three microarray data sets of breast cancer were collected from the Gene Expression Omnibus, and pathway information from the Kyoto Encyclopedia of Genes and Genomes was applied for discovering meaningful biological correlations. The results showed that adopting the Hadoop platform considerably decreased the computation time. Several correlations of differential gene expressions were discovered between the relapse and nonrelapse breast cancer samples, and most of them were involved in cancer regulation and cancer-related pathways. The results showed that breast cancer recurrence might be highly associated with the abnormal regulations of these gene pairs, rather than with their individual expression levels. The proposed method was computationally efficient and reliable, and stable results were obtained when different data sets were used. The proposed method is effective in identifying meaningful biological regulation patterns between conditions.

  5. Open Science in the Cloud: Towards a Universal Platform for Scientific and Statistical Computing

    NASA Astrophysics Data System (ADS)

    Chine, Karim

    The UK, through the e-Science program, the US through the NSF-funded cyber infrastructure and the European Union through the ICT Calls aimed to provide "the technological solution to the problem of efficiently connecting data, computers, and people with the goal of enabling derivation of novel scientific theories and knowledge".1 The Grid (Foster, 2002; Foster; Kesselman, Nick, & Tuecke, 2002), foreseen as a major accelerator of discovery, didn't meet the expectations it had excited at its beginnings and was not adopted by the broad population of research professionals. The Grid is a good tool for particle physicists and it has allowed them to tackle the tremendous computational challenges inherent to their field. However, as a technology and paradigm for delivering computing on demand, it doesn't work and it can't be fixed. On one hand, "the abstractions that Grids expose - to the end-user, to the deployers and to application developers - are inappropriate and they need to be higher level" (Jha, Merzky, & Fox), and on the other hand, academic Grids are inherently economically unsustainable. They can't compete with a service outsourced to the Industry whose quality and price would be driven by market forces. The virtualization technologies and their corollary, the Infrastructure-as-a-Service (IaaS) style cloud, hold the promise to enable what the Grid failed to deliver: a sustainable environment for computational sciences that would lower the barriers for accessing federated computational resources, software tools and data; enable collaboration and resources sharing and provide the building blocks of a ubiquitous platform for traceable and reproducible computational research.

  6. Benchmarking gate-based quantum computers

    NASA Astrophysics Data System (ADS)

    Michielsen, Kristel; Nocon, Madita; Willsch, Dennis; Jin, Fengping; Lippert, Thomas; De Raedt, Hans

    2017-11-01

    With the advent of public access to small gate-based quantum processors, it becomes necessary to develop a benchmarking methodology such that independent researchers can validate the operation of these processors. We explore the usefulness of a number of simple quantum circuits as benchmarks for gate-based quantum computing devices and show that circuits performing identity operations are very simple, scalable and sensitive to gate errors and are therefore very well suited for this task. We illustrate the procedure by presenting benchmark results for the IBM Quantum Experience, a cloud-based platform for gate-based quantum computing.

  7. Millstone: software for multiplex microbial genome analysis and engineering

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Goodman, Daniel B.; Kuznetsov, Gleb; Lajoie, Marc J.

    Inexpensive DNA sequencing and advances in genome editing have made computational analysis a major rate-limiting step in adaptive laboratory evolution and microbial genome engineering. Here, we describe Millstone, a web-based platform that automates genotype comparison and visualization for projects with up to hundreds of genomic samples. To enable iterative genome engineering, Millstone allows users to design oligonucleotide libraries and create successive versions of reference genomes. Millstone is open source and easily deployable to a cloud platform, local cluster, or desktop, making it a scalable solution for any lab.

  8. Millstone: software for multiplex microbial genome analysis and engineering.

    PubMed

    Goodman, Daniel B; Kuznetsov, Gleb; Lajoie, Marc J; Ahern, Brian W; Napolitano, Michael G; Chen, Kevin Y; Chen, Changping; Church, George M

    2017-05-25

    Inexpensive DNA sequencing and advances in genome editing have made computational analysis a major rate-limiting step in adaptive laboratory evolution and microbial genome engineering. We describe Millstone, a web-based platform that automates genotype comparison and visualization for projects with up to hundreds of genomic samples. To enable iterative genome engineering, Millstone allows users to design oligonucleotide libraries and create successive versions of reference genomes. Millstone is open source and easily deployable to a cloud platform, local cluster, or desktop, making it a scalable solution for any lab.

  9. Millstone: software for multiplex microbial genome analysis and engineering

    DOE PAGES

    Goodman, Daniel B.; Kuznetsov, Gleb; Lajoie, Marc J.; ...

    2017-05-25

    Inexpensive DNA sequencing and advances in genome editing have made computational analysis a major rate-limiting step in adaptive laboratory evolution and microbial genome engineering. Here, we describe Millstone, a web-based platform that automates genotype comparison and visualization for projects with up to hundreds of genomic samples. To enable iterative genome engineering, Millstone allows users to design oligonucleotide libraries and create successive versions of reference genomes. Millstone is open source and easily deployable to a cloud platform, local cluster, or desktop, making it a scalable solution for any lab.

  10. Farm Management Support on Cloud Computing Platform: A System for Cropland Monitoring Using Multi-Source Remotely Sensed Data

    NASA Astrophysics Data System (ADS)

    Coburn, C. A.; Qin, Y.; Zhang, J.; Staenz, K.

    2015-12-01

    Food security is one of the most pressing issues facing humankind. Recent estimates predict that over one billion people don't have enough food to meet their basic nutritional needs. The ability of remote sensing tools to monitor and model crop production and predict crop yield is essential for providing governments and farmers with vital information to ensure food security. Google Earth Engine (GEE) is a cloud computing platform, which integrates storage and processing algorithms for massive remotely sensed imagery and vector data sets. By providing the capabilities of storing and analyzing the data sets, it provides an ideal platform for the development of advanced analytic tools for extracting key variables used in regional and national food security systems. With the high performance computing and storing capabilities of GEE, a cloud-computing based system for near real-time crop land monitoring was developed using multi-source remotely sensed data over large areas. The system is able to process and visualize the MODIS time series NDVI profile in conjunction with Landsat 8 image segmentation for crop monitoring. With multi-temporal Landsat 8 imagery, the crop fields are extracted using the image segmentation algorithm developed by Baatz et al.[1]. The MODIS time series NDVI data are modeled by TIMESAT [2], a software package developed for analyzing time series of satellite data. The seasonality of MODIS time series data, for example, the start date of the growing season, length of growing season, and NDVI peak at a field-level are obtained for evaluating the crop-growth conditions. The system fuses MODIS time series NDVI data and Landsat 8 imagery to provide information of near real-time crop-growth conditions through the visualization of MODIS NDVI time series and comparison of multi-year NDVI profiles. Stakeholders, i.e., farmers and government officers, are able to obtain crop-growth information at crop-field level online. This unique utilization of GEE in combination with advanced analytic and extraction techniques provides a vital remote sensing tool for decision makers and scientists with a high-degree of flexibility to adapt to different uses.

  11. Public health practice course using Google Plus.

    PubMed

    Wu, Ting-Ting; Sung, Tien-Wen

    2014-03-01

    In recent years, mobile device-assisted clinical education has become popular among nursing school students. The introduction of mobile devices saves manpower and reduces errors while enhancing nursing students' professional knowledge and skills. To respond to the demands of various learning strategies and to maintain existing systems of education, the concept of Cloud Learning is gradually being introduced to instructional environments. Cloud computing facilitates learning that is personalized, diverse, and virtual. This study involved assessing the advantages of mobile devices and Cloud Learning in a public health practice course, in which Google+ was used as the learning platform, integrating various application tools. Users could save and access data by using any wireless Internet device. The platform was student centered and based on resource sharing and collaborative learning. With the assistance of highly flexible and convenient technology, certain obstacles in traditional practice training can be resolved. Our findings showed that the students who adopted Google+ were learned more effectively compared with those who were limited to traditional learning systems. Most students and the nurse educator expressed a positive attitude toward and were satisfied with the innovative learning method.

  12. Design and performance of the virtualization platform for offline computing on the ATLAS TDAQ Farm

    NASA Astrophysics Data System (ADS)

    Ballestrero, S.; Batraneanu, S. M.; Brasolin, F.; Contescu, C.; Di Girolamo, A.; Lee, C. J.; Pozo Astigarraga, M. E.; Scannicchio, D. A.; Twomey, M. S.; Zaytsev, A.

    2014-06-01

    With the LHC collider at CERN currently going through the period of Long Shutdown 1 there is an opportunity to use the computing resources of the experiments' large trigger farms for other data processing activities. In the case of the ATLAS experiment, the TDAQ farm, consisting of more than 1500 compute nodes, is suitable for running Monte Carlo (MC) production jobs that are mostly CPU and not I/O bound. This contribution gives a thorough review of the design and deployment of a virtualized platform running on this computing resource and of its use to run large groups of CernVM based virtual machines operating as a single CERN-P1 WLCG site. This platform has been designed to guarantee the security and the usability of the ATLAS private network, and to minimize interference with TDAQ's usage of the farm. Openstack has been chosen to provide a cloud management layer. The experience gained in the last 3.5 months shows that the use of the TDAQ farm for the MC simulation contributes to the ATLAS data processing at the level of a large Tier-1 WLCG site, despite the opportunistic nature of the underlying computing resources being used.

  13. A cloud-based system for automatic glaucoma screening.

    PubMed

    Fengshou Yin; Damon Wing Kee Wong; Ying Quan; Ai Ping Yow; Ngan Meng Tan; Gopalakrishnan, Kavitha; Beng Hai Lee; Yanwu Xu; Zhuo Zhang; Jun Cheng; Jiang Liu

    2015-08-01

    In recent years, there has been increasing interest in the use of automatic computer-based systems for the detection of eye diseases including glaucoma. However, these systems are usually standalone software with basic functions only, limiting their usage in a large scale. In this paper, we introduce an online cloud-based system for automatic glaucoma screening through the use of medical image-based pattern classification technologies. It is designed in a hybrid cloud pattern to offer both accessibility and enhanced security. Raw data including patient's medical condition and fundus image, and resultant medical reports are collected and distributed through the public cloud tier. In the private cloud tier, automatic analysis and assessment of colour retinal fundus images are performed. The ubiquitous anywhere access nature of the system through the cloud platform facilitates a more efficient and cost-effective means of glaucoma screening, allowing the disease to be detected earlier and enabling early intervention for more efficient intervention and disease management.

  14. Improvement of DHRA-DMDC Physical Access Software DBIDS Using Cloud Computing Technology: A Case Study

    DTIC Science & Technology

    2012-06-01

    technology originally developed on the Java platform. The Hibernate framework supports rapid development of a data access layer without requiring a...31 viii 2. Hibernate ................................................................................ 31 3. Database Design...protect from security threats; o Easy aggregate management operations via file tags; 2. Hibernate We recommend using Hibernate technology for object

  15. Technology in the Classroom: Transitioning Lab and Design to an All-Digital Workflow

    ERIC Educational Resources Information Center

    Anastasio, Daniel; Suresh, Aravind; Burkey, Daniel D.

    2013-01-01

    Mobile platforms and cloud computing allow collaborative sharing and submission of work in ways not feasible until recently. In this article, we detail how we took a writing-intensive laboratory and made the submission, reading, grading, and returning of student work online and paper-free, while maintaining familiar elements of pen-and-paper…

  16. Ed Tech Experts Choose Top Tools

    ERIC Educational Resources Information Center

    Demski, Jennifer

    2010-01-01

    Ten years into the new century, people are still trying to find the web 2.0 tools that best facilitate collaboration--one of the fundamentals of 21st century learning. As the number of tools continues to grow, and fuzzy terms like "cloud computing," "hashtags," and "synchronous live platforms" are introduced into the lexicon daily, even the most…

  17. Towards a 3d Based Platform for Cultural Heritage Site Survey and Virtual Exploration

    NASA Astrophysics Data System (ADS)

    Seinturier, J.; Riedinger, C.; Mahiddine, A.; Peloso, D.; Boï, J.-M.; Merad, D.; Drap, P.

    2013-07-01

    This paper present a 3D platform that enables to make both cultural heritage site survey and its virtual exploration. It provides a single and easy way to use framework for merging multi scaled 3D measurements based on photogrammetry, documentation produced by experts and the knowledge of involved domains leaving the experts able to extract and choose the relevant information to produce the final survey. Taking into account the interpretation of the real world during the process of archaeological surveys is in fact the main goal of a survey. New advances in photogrammetry and the capability to produce dense 3D point clouds do not solve the problem of surveys. New opportunities for 3D representation are now available and we must to use them and find new ways to link geometry and knowledge. The new platform is able to efficiently manage and process large 3D data (points set, meshes) thanks to the implementation of space partition methods coming from the state of the art such as octrees and kd-trees and thus can interact with dense point clouds (thousands to millions of points) in real time. The semantisation of raw 3D data relies on geometric algorithms such as geodetic path computation, surface extraction from dense points cloud and geometrical primitive optimization. The platform provide an interface that enables expert to describe geometric representations of interesting objects like ashlar blocs, stratigraphic units or generic items (contour, lines, … ) directly onto the 3D representation of the site and without explicit links to underlying algorithms. The platform provide two ways for describing geometric representation. If oriented photographs are available, the expert can draw geometry on a photograph and the system computes its 3D representation by projection on the underlying mesh or the points cloud. If photographs are not available or if the expert wants to only use the 3D representation then he can simply draw objects shape on it. When 3D representations of objects of a surveyed site are extracted from the mesh, the link with domain related documentation is done by means of a set of forms designed by experts. Information from these forms are linked with geometry such that documentation can be attached to the viewed objects. Additional semantisation methods related to specific domains have been added to the platform. Beyond realistic rendering of surveyed site, the platform embeds non photorealistic rendering (NPR) algorithms. These algorithms enable to dynamically illustrate objects of interest that are related to knowledge with specific styles. The whole platform is implemented with a Java framework and relies on an actual and effective 3D engine that make available latest rendering methods. We illustrate this work on various photogrammetric survey, in medieval archaeology with the Shawbak castle in Jordan and in underwater archaeology on different marine sites.

  18. Scalable, High-performance 3D Imaging Software Platform: System Architecture and Application to Virtual Colonoscopy

    PubMed Central

    Yoshida, Hiroyuki; Wu, Yin; Cai, Wenli; Brett, Bevin

    2013-01-01

    One of the key challenges in three-dimensional (3D) medical imaging is to enable the fast turn-around time, which is often required for interactive or real-time response. This inevitably requires not only high computational power but also high memory bandwidth due to the massive amount of data that need to be processed. In this work, we have developed a software platform that is designed to support high-performance 3D medical image processing for a wide range of applications using increasingly available and affordable commodity computing systems: multi-core, clusters, and cloud computing systems. To achieve scalable, high-performance computing, our platform (1) employs size-adaptive, distributable block volumes as a core data structure for efficient parallelization of a wide range of 3D image processing algorithms; (2) supports task scheduling for efficient load distribution and balancing; and (3) consists of a layered parallel software libraries that allow a wide range of medical applications to share the same functionalities. We evaluated the performance of our platform by applying it to an electronic cleansing system in virtual colonoscopy, with initial experimental results showing a 10 times performance improvement on an 8-core workstation over the original sequential implementation of the system. PMID:23366803

  19. Large-Scale, Parallel, Multi-Sensor Atmospheric Data Fusion Using Cloud Computing

    NASA Astrophysics Data System (ADS)

    Wilson, B. D.; Manipon, G.; Hua, H.; Fetzer, E. J.

    2013-12-01

    NASA's Earth Observing System (EOS) is an ambitious facility for studying global climate change. The mandate now is to combine measurements from the instruments on the 'A-Train' platforms (AIRS, AMSR-E, MODIS, MISR, MLS, and CloudSat) and other Earth probes to enable large-scale studies of climate change over decades. Moving to multi-sensor, long-duration analyses of important climate variables presents serious challenges for large-scale data mining and fusion. For example, one might want to compare temperature and water vapor retrievals from one instrument (AIRS) to another (MODIS), and to a model (MERRA), stratify the comparisons using a classification of the 'cloud scenes' from CloudSat, and repeat the entire analysis over 10 years of data. To efficiently assemble such datasets, we are utilizing Elastic Computing in the Cloud and parallel map/reduce-based algorithms. However, these problems are Data Intensive computing so the data transfer times and storage costs (for caching) are key issues. SciReduce is a Hadoop-like parallel analysis system, programmed in parallel python, that is designed from the ground up for Earth science. SciReduce executes inside VMWare images and scales to any number of nodes in the Cloud. Unlike Hadoop, SciReduce operates on bundles of named numeric arrays, which can be passed in memory or serialized to disk in netCDF4 or HDF5. Figure 1 shows the architecture of the full computational system, with SciReduce at the core. Multi-year datasets are automatically 'sharded' by time and space across a cluster of nodes so that years of data (millions of files) can be processed in a massively parallel way. Input variables (arrays) are pulled on-demand into the Cloud using OPeNDAP URLs or other subsetting services, thereby minimizing the size of the cached input and intermediate datasets. We are using SciReduce to automate the production of multiple versions of a ten-year A-Train water vapor climatology under a NASA MEASURES grant. We will present the architecture of SciReduce, describe the achieved 'clock time' speedups in fusing datasets on our own compute nodes and in the public Cloud, and discuss the Cloud cost tradeoffs for storage, compute, and data transfer. We will also present a concept/prototype for staging NASA's A-Train Atmospheric datasets (Levels 2 & 3) in the Amazon Cloud so that any number of compute jobs can be executed 'near' the multi-sensor data. Given such a system, multi-sensor climate studies over 10-20 years of data could be performed in an efficient way, with the researcher paying only his own Cloud compute bill. SciReduce Architecture

  20. Remote sensing image segmentation based on Hadoop cloud platform

    NASA Astrophysics Data System (ADS)

    Li, Jie; Zhu, Lingling; Cao, Fubin

    2018-01-01

    To solve the problem that the remote sensing image segmentation speed is slow and the real-time performance is poor, this paper studies the method of remote sensing image segmentation based on Hadoop platform. On the basis of analyzing the structural characteristics of Hadoop cloud platform and its component MapReduce programming, this paper proposes a method of image segmentation based on the combination of OpenCV and Hadoop cloud platform. Firstly, the MapReduce image processing model of Hadoop cloud platform is designed, the input and output of image are customized and the segmentation method of the data file is rewritten. Then the Mean Shift image segmentation algorithm is implemented. Finally, this paper makes a segmentation experiment on remote sensing image, and uses MATLAB to realize the Mean Shift image segmentation algorithm to compare the same image segmentation experiment. The experimental results show that under the premise of ensuring good effect, the segmentation rate of remote sensing image segmentation based on Hadoop cloud Platform has been greatly improved compared with the single MATLAB image segmentation, and there is a great improvement in the effectiveness of image segmentation.

  1. Development of a Cloud Resolving Model for Heterogeneous Supercomputers

    NASA Astrophysics Data System (ADS)

    Sreepathi, S.; Norman, M. R.; Pal, A.; Hannah, W.; Ponder, C.

    2017-12-01

    A cloud resolving climate model is needed to reduce major systematic errors in climate simulations due to structural uncertainty in numerical treatments of convection - such as convective storm systems. This research describes the porting effort to enable SAM (System for Atmosphere Modeling) cloud resolving model on heterogeneous supercomputers using GPUs (Graphical Processing Units). We have isolated a standalone configuration of SAM that is targeted to be integrated into the DOE ACME (Accelerated Climate Modeling for Energy) Earth System model. We have identified key computational kernels from the model and offloaded them to a GPU using the OpenACC programming model. Furthermore, we are investigating various optimization strategies intended to enhance GPU utilization including loop fusion/fission, coalesced data access and loop refactoring to a higher abstraction level. We will present early performance results, lessons learned as well as optimization strategies. The computational platform used in this study is the Summitdev system, an early testbed that is one generation removed from Summit, the next leadership class supercomputer at Oak Ridge National Laboratory. The system contains 54 nodes wherein each node has 2 IBM POWER8 CPUs and 4 NVIDIA Tesla P100 GPUs. This work is part of a larger project, ACME-MMF component of the U.S. Department of Energy(DOE) Exascale Computing Project. The ACME-MMF approach addresses structural uncertainty in cloud processes by replacing traditional parameterizations with cloud resolving "superparameterization" within each grid cell of global climate model. Super-parameterization dramatically increases arithmetic intensity, making the MMF approach an ideal strategy to achieve good performance on emerging exascale computing architectures. The goal of the project is to integrate superparameterization into ACME, and explore its full potential to scientifically and computationally advance climate simulation and prediction.

  2. Large-Scale, Multi-Sensor Atmospheric Data Fusion Using Hybrid Cloud Computing

    NASA Astrophysics Data System (ADS)

    Wilson, B. D.; Manipon, G.; Hua, H.; Fetzer, E. J.

    2015-12-01

    NASA's Earth Observing System (EOS) is an ambitious facility for studying global climate change. The mandate now is to combine measurements from the instruments on the "A-Train" platforms (AIRS, MODIS, MLS, and CloudSat) and other Earth probes to enable large-scale studies of climate change over decades. Moving to multi-sensor, long-duration presents serious challenges for large-scale data mining and fusion. For example, one might want to compare temperature and water vapor retrievals from one instrument (AIRS) to another (MODIS), and to a model (ECMWF), stratify the comparisons using a classification of the "cloud scenes" from CloudSat, and repeat the entire analysis over 10 years of data. HySDS is a Hybrid-Cloud Science Data System that has been developed and applied under NASA AIST, MEaSUREs, and ACCESS grants. HySDS uses the SciFlow workflow engine to partition analysis workflows into parallel tasks (e.g. segmenting by time or space) that are pushed into a durable job queue. The tasks are "pulled" from the queue by worker Virtual Machines (VM's) and executed in an on-premise Cloud (Eucalyptus or OpenStack) or at Amazon in the public Cloud or govCloud. In this way, years of data (millions of files) can be processed in a massively parallel way. Input variables (arrays) are pulled on-demand into the Cloud using OPeNDAP URLs or other subsetting services, thereby minimizing the size of the transferred data. We are using HySDS to automate the production of multiple versions of a ten-year A-Train water vapor climatology under a MEASURES grant. We will present the architecture of HySDS, describe the achieved "clock time" speedups in fusing datasets on our own nodes and in the Amazon Cloud, and discuss the Cloud cost tradeoffs for storage, compute, and data transfer. Our system demonstrates how one can pull A-Train variables (Levels 2 & 3) on-demand into the Amazon Cloud, and cache only those variables that are heavily used, so that any number of compute jobs can be executed "near" the multi-sensor data. Decade-long, multi-sensor studies can be performed without pre-staging data, with the researcher paying only his own Cloud compute bill.

  3. Research on the application of wisdom technology in smart city

    NASA Astrophysics Data System (ADS)

    Li, Juntao; Ma, Shuai; Gu, Weihua; Chen, Weiyi

    2015-12-01

    This paper first analyzes the concept of smart technology, the relationship between wisdom technology and smart city, and discusses the practical application of IOT(Internet of things) in smart city to explore a better way to realize smart city; then Introduces the basic concepts of cloud computing and smart city, and explains the relationship between the two; Discusses five advantages of cloud computing that applies to smart city construction: a unified and highly efficient, large-scale infrastructure software and hardware management, service scheduling and resource management, security control and management, energy conservation and management platform layer, and to promote modern practical significance of the development of services, promoting regional social and economic development faster. Finally, a brief description of the wisdom technology and smart city management is presented.

  4. Biomedical Informatics on the Cloud: A Treasure Hunt for Advancing Cardiovascular Medicine.

    PubMed

    Ping, Peipei; Hermjakob, Henning; Polson, Jennifer S; Benos, Panagiotis V; Wang, Wei

    2018-04-27

    In the digital age of cardiovascular medicine, the rate of biomedical discovery can be greatly accelerated by the guidance and resources required to unearth potential collections of knowledge. A unified computational platform leverages metadata to not only provide direction but also empower researchers to mine a wealth of biomedical information and forge novel mechanistic insights. This review takes the opportunity to present an overview of the cloud-based computational environment, including the functional roles of metadata, the architecture schema of indexing and search, and the practical scenarios of machine learning-supported molecular signature extraction. By introducing several established resources and state-of-the-art workflows, we share with our readers a broadly defined informatics framework to phenotype cardiovascular health and disease. © 2018 American Heart Association, Inc.

  5. Software Defined Networking challenges and future direction: A case study of implementing SDN features on OpenStack private cloud

    NASA Astrophysics Data System (ADS)

    Shamugam, Veeramani; Murray, I.; Leong, J. A.; Sidhu, Amandeep S.

    2016-03-01

    Cloud computing provides services on demand instantly, such as access to network infrastructure consisting of computing hardware, operating systems, network storage, database and applications. Network usage and demands are growing at a very fast rate and to meet the current requirements, there is a need for automatic infrastructure scaling. Traditional networks are difficult to automate because of the distributed nature of their decision making process for switching or routing which are collocated on the same device. Managing complex environments using traditional networks is time-consuming and expensive, especially in the case of generating virtual machines, migration and network configuration. To mitigate the challenges, network operations require efficient, flexible, agile and scalable software defined networks (SDN). This paper discuss various issues in SDN and suggests how to mitigate the network management related issues. A private cloud prototype test bed was setup to implement the SDN on the OpenStack platform to test and evaluate the various network performances provided by the various configurations.

  6. Cloud Computing Services for Seismic Networks

    NASA Astrophysics Data System (ADS)

    Olson, Michael

    This thesis describes a compositional framework for developing situation awareness applications: applications that provide ongoing information about a user's changing environment. The thesis describes how the framework is used to develop a situation awareness application for earthquakes. The applications are implemented as Cloud computing services connected to sensors and actuators. The architecture and design of the Cloud services are described and measurements of performance metrics are provided. The thesis includes results of experiments on earthquake monitoring conducted over a year. The applications developed by the framework are (1) the CSN---the Community Seismic Network---which uses relatively low-cost sensors deployed by members of the community, and (2) SAF---the Situation Awareness Framework---which integrates data from multiple sources, including the CSN, CISN---the California Integrated Seismic Network, a network consisting of high-quality seismometers deployed carefully by professionals in the CISN organization and spread across Southern California---and prototypes of multi-sensor platforms that include carbon monoxide, methane, dust and radiation sensors.

  7. Unified Geophysical Cloud Platform (UGCP) for Seismic Monitoring and other Geophysical Applications.

    NASA Astrophysics Data System (ADS)

    Synytsky, R.; Starovoit, Y. O.; Henadiy, S.; Lobzakov, V.; Kolesnikov, L.

    2016-12-01

    We present Unified Geophysical Cloud Platform (UGCP) or UniGeoCloud as an innovative approach for geophysical data processing in the Cloud environment with the ability to run any type of data processing software in isolated environment within the single Cloud platform. We've developed a simple and quick method of several open-source widely known software seismic packages (SeisComp3, Earthworm, Geotool, MSNoise) installation which does not require knowledge of system administration, configuration, OS compatibility issues etc. and other often annoying details preventing time wasting for system configuration work. Installation process is simplified as "mouse click" on selected software package from the Cloud market place. The main objective of the developed capability was the software tools conception with which users are able to design and install quickly their own highly reliable and highly available virtual IT-infrastructure for the organization of seismic (and in future other geophysical) data processing for either research or monitoring purposes. These tools provide access to any seismic station data available in open IP configuration from the different networks affiliated with different Institutions and Organizations. It allows also setting up your own network as you desire by selecting either regionally deployed stations or the worldwide global network based on stations selection form the global map. The processing software and products and research results could be easily monitored from everywhere using variety of user's devices form desk top computers to IT gadgets. Currents efforts of the development team are directed to achieve Scalability, Reliability and Sustainability (SRS) of proposed solutions allowing any user to run their applications with the confidence of no data loss and no failure of the monitoring or research software components. The system is suitable for quick rollout of NDC-in-Box software package developed for State Signatories and aimed for promotion of data processing collected by the IMS Network.

  8. Automatic energy expenditure measurement for health science.

    PubMed

    Catal, Cagatay; Akbulut, Akhan

    2018-04-01

    It is crucial to predict the human energy expenditure in any sports activity and health science application accurately to investigate the impact of the activity. However, measurement of the real energy expenditure is not a trivial task and involves complex steps. The objective of this work is to improve the performance of existing estimation models of energy expenditure by using machine learning algorithms and several data from different sensors and provide this estimation service in a cloud-based platform. In this study, we used input data such as breathe rate, and hearth rate from three sensors. Inputs are received from a web form and sent to the web service which applies a regression model on Azure cloud platform. During the experiments, we assessed several machine learning models based on regression methods. Our experimental results showed that our novel model which applies Boosted Decision Tree Regression in conjunction with the median aggregation technique provides the best result among other five regression algorithms. This cloud-based energy expenditure system which uses a web service showed that cloud computing technology is a great opportunity to develop estimation systems and the new model which applies Boosted Decision Tree Regression with the median aggregation provides remarkable results. Copyright © 2018 Elsevier B.V. All rights reserved.

  9. A quantitative assessment of the Hadoop framework for analyzing massively parallel DNA sequencing data.

    PubMed

    Siretskiy, Alexey; Sundqvist, Tore; Voznesenskiy, Mikhail; Spjuth, Ola

    2015-01-01

    New high-throughput technologies, such as massively parallel sequencing, have transformed the life sciences into a data-intensive field. The most common e-infrastructure for analyzing this data consists of batch systems that are based on high-performance computing resources; however, the bioinformatics software that is built on this platform does not scale well in the general case. Recently, the Hadoop platform has emerged as an interesting option to address the challenges of increasingly large datasets with distributed storage, distributed processing, built-in data locality, fault tolerance, and an appealing programming methodology. In this work we introduce metrics and report on a quantitative comparison between Hadoop and a single node of conventional high-performance computing resources for the tasks of short read mapping and variant calling. We calculate efficiency as a function of data size and observe that the Hadoop platform is more efficient for biologically relevant data sizes in terms of computing hours for both split and un-split data files. We also quantify the advantages of the data locality provided by Hadoop for NGS problems, and show that a classical architecture with network-attached storage will not scale when computing resources increase in numbers. Measurements were performed using ten datasets of different sizes, up to 100 gigabases, using the pipeline implemented in Crossbow. To make a fair comparison, we implemented an improved preprocessor for Hadoop with better performance for splittable data files. For improved usability, we implemented a graphical user interface for Crossbow in a private cloud environment using the CloudGene platform. All of the code and data in this study are freely available as open source in public repositories. From our experiments we can conclude that the improved Hadoop pipeline scales better than the same pipeline on high-performance computing resources, we also conclude that Hadoop is an economically viable option for the common data sizes that are currently used in massively parallel sequencing. Given that datasets are expected to increase over time, Hadoop is a framework that we envision will have an increasingly important role in future biological data analysis.

  10. An interactive computer lab of the galvanic cell for students in biochemistry.

    PubMed

    Ahlstrand, Emma; Buetti-Dinh, Antoine; Friedman, Ran

    2018-01-01

    We describe an interactive module that can be used to teach basic concepts in electrochemistry and thermodynamics to first year natural science students. The module is used together with an experimental laboratory and improves the students' understanding of thermodynamic quantities such as Δ r G, Δ r H, and Δ r S that are calculated but not directly measured in the lab. We also discuss how new technologies can substitute some parts of experimental chemistry courses, and improve accessibility to course material. Cloud computing platforms such as CoCalc facilitate the distribution of computer codes and allow students to access and apply interactive course tools beyond the course's scope. Despite some limitations imposed by cloud computing, the students appreciated the approach and the enhanced opportunities to discuss study questions with their classmates and instructor as facilitated by the interactive tools. © 2017 by The International Union of Biochemistry and Molecular Biology, 46(1):58-65, 2018. © 2017 The International Union of Biochemistry and Molecular Biology.

  11. Estimation of the cloud transmittance from radiometric measurements at the ground level

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Costa, Dario; Mares, Oana, E-mail: mareshoana@yahoo.com

    2014-11-24

    The extinction of solar radiation due to the clouds is more significant than due to any other atmospheric constituent, but it is always difficult to be modeled because of the random distribution of clouds on the sky. Moreover, the transmittance of a layer of clouds is in a very complex relation with their type and depth. A method for estimating cloud transmittance was proposed in Paulescu et al. (Energ. Convers. Manage, 75 690–697, 2014). The approach is based on the hypothesis that the structure of the cloud covering the sun at a time moment does not change significantly in amore » short time interval (several minutes). Thus, the cloud transmittance can be calculated as the estimated coefficient of a simple linear regression for the computed versus measured solar irradiance in a time interval Δt. The aim of this paper is to optimize the length of the time interval Δt. Radiometric data measured on the Solar Platform of the West University of Timisoara during 2010 at a frequency of 1/15 seconds are used in this study.« less

  12. Estimation of the cloud transmittance from radiometric measurements at the ground level

    NASA Astrophysics Data System (ADS)

    Costa, Dario; Mares, Oana

    2014-11-01

    The extinction of solar radiation due to the clouds is more significant than due to any other atmospheric constituent, but it is always difficult to be modeled because of the random distribution of clouds on the sky. Moreover, the transmittance of a layer of clouds is in a very complex relation with their type and depth. A method for estimating cloud transmittance was proposed in Paulescu et al. (Energ. Convers. Manage, 75 690-697, 2014). The approach is based on the hypothesis that the structure of the cloud covering the sun at a time moment does not change significantly in a short time interval (several minutes). Thus, the cloud transmittance can be calculated as the estimated coefficient of a simple linear regression for the computed versus measured solar irradiance in a time interval Δt. The aim of this paper is to optimize the length of the time interval Δt. Radiometric data measured on the Solar Platform of the West University of Timisoara during 2010 at a frequency of 1/15 seconds are used in this study.

  13. Solar-Terrestrial and Astronomical Research Network (STAR-Network) - A Meaningful Practice of New Cyberinfrastructure on Space Science

    NASA Astrophysics Data System (ADS)

    Hu, X.; Zou, Z.

    2017-12-01

    For the next decades, comprehensive big data application environment is the dominant direction of cyberinfrastructure development on space science. To make the concept of such BIG cyberinfrastructure (e.g. Digital Space) a reality, these aspects of capability should be focused on and integrated, which includes science data system, digital space engine, big data application (tools and models) and the IT infrastructure. In the past few years, CAS Chinese Space Science Data Center (CSSDC) has made a helpful attempt in this direction. A cloud-enabled virtual research platform on space science, called Solar-Terrestrial and Astronomical Research Network (STAR-Network), has been developed to serve the full lifecycle of space science missions and research activities. It integrated a wide range of disciplinary and interdisciplinary resources, to provide science-problem-oriented data retrieval and query service, collaborative mission demonstration service, mission operation supporting service, space weather computing and Analysis service and other self-help service. This platform is supported by persistent infrastructure, including cloud storage, cloud computing, supercomputing and so on. Different variety of resource are interconnected: the science data can be displayed on the browser by visualization tools, the data analysis tools and physical models can be drived by the applicable science data, the computing results can be saved on the cloud, for example. So far, STAR-Network has served a series of space science mission in China, involving Strategic Pioneer Program on Space Science (this program has invested some space science satellite as DAMPE, HXMT, QUESS, and more satellite will be launched around 2020) and Meridian Space Weather Monitor Project. Scientists have obtained some new findings by using the science data from these missions with STAR-Network's contribution. We are confident that STAR-Network is an exciting practice of new cyberinfrastructure architecture on space science.

  14. Research on Visualization of Ground Laser Radar Data Based on Osg

    NASA Astrophysics Data System (ADS)

    Huang, H.; Hu, C.; Zhang, F.; Xue, H.

    2018-04-01

    Three-dimensional (3D) laser scanning is a new advanced technology integrating light, machine, electricity, and computer technologies. It can conduct 3D scanning to the whole shape and form of space objects with high precision. With this technology, you can directly collect the point cloud data of a ground object and create the structure of it for rendering. People use excellent 3D rendering engine to optimize and display the 3D model in order to meet the higher requirements of real time realism rendering and the complexity of the scene. OpenSceneGraph (OSG) is an open source 3D graphics engine. Compared with the current mainstream 3D rendering engine, OSG is practical, economical, and easy to expand. Therefore, OSG is widely used in the fields of virtual simulation, virtual reality, science and engineering visualization. In this paper, a dynamic and interactive ground LiDAR data visualization platform is constructed based on the OSG and the cross-platform C++ application development framework Qt. In view of the point cloud data of .txt format and the triangulation network data file of .obj format, the functions of 3D laser point cloud and triangulation network data display are realized. It is proved by experiments that the platform is of strong practical value as it is easy to operate and provides good interaction.

  15. The AppScale Cloud Platform

    PubMed Central

    Krintz, Chandra

    2013-01-01

    AppScale is an open source distributed software system that implements a cloud platform as a service (PaaS). AppScale makes cloud applications easy to deploy and scale over disparate cloud fabrics, implementing a set of APIs and architecture that also makes apps portable across the services they employ. AppScale is API-compatible with Google App Engine (GAE) and thus executes GAE applications on-premise or over other cloud infrastructures, without modification. PMID:23828721

  16. The iPlant Collaborative: Cyberinfrastructure for Enabling Data to Discovery for the Life Sciences

    PubMed Central

    Merchant, Nirav; Lyons, Eric; Goff, Stephen; Vaughn, Matthew; Ware, Doreen; Micklos, David; Antin, Parker

    2016-01-01

    The iPlant Collaborative provides life science research communities access to comprehensive, scalable, and cohesive computational infrastructure for data management; identity management; collaboration tools; and cloud, high-performance, high-throughput computing. iPlant provides training, learning material, and best practice resources to help all researchers make the best use of their data, expand their computational skill set, and effectively manage their data and computation when working as distributed teams. iPlant’s platform permits researchers to easily deposit and share their data and deploy new computational tools and analysis workflows, allowing the broader community to easily use and reuse those data and computational analyses. PMID:26752627

  17. A New Approach to Integrate Internet-of-Things and Software-as-a-Service Model for Logistic Systems: A Case Study

    PubMed Central

    Chen, Shang-Liang; Chen, Yun-Yao; Hsu, Chiang

    2014-01-01

    Cloud computing is changing the ways software is developed and managed in enterprises, which is changing the way of doing business in that dynamically scalable and virtualized resources are regarded as services over the Internet. Traditional manufacturing systems such as supply chain management (SCM), customer relationship management (CRM), and enterprise resource planning (ERP) are often developed case by case. However, effective collaboration between different systems, platforms, programming languages, and interfaces has been suggested by researchers. In cloud-computing-based systems, distributed resources are encapsulated into cloud services and centrally managed, which allows high automation, flexibility, fast provision, and ease of integration at low cost. The integration between physical resources and cloud services can be improved by combining Internet of things (IoT) technology and Software-as-a-Service (SaaS) technology. This study proposes a new approach for developing cloud-based manufacturing systems based on a four-layer SaaS model. There are three main contributions of this paper: (1) enterprises can develop their own cloud-based logistic management information systems based on the approach proposed in this paper; (2) a case study based on literature reviews with experimental results is proposed to verify that the system performance is remarkable; (3) challenges encountered and feedback collected from T Company in the case study are discussed in this paper for the purpose of enterprise deployment. PMID:24686728

  18. A new approach to integrate Internet-of-things and software-as-a-service model for logistic systems: a case study.

    PubMed

    Chen, Shang-Liang; Chen, Yun-Yao; Hsu, Chiang

    2014-03-28

    Cloud computing is changing the ways software is developed and managed in enterprises, which is changing the way of doing business in that dynamically scalable and virtualized resources are regarded as services over the Internet. Traditional manufacturing systems such as supply chain management (SCM), customer relationship management (CRM), and enterprise resource planning (ERP) are often developed case by case. However, effective collaboration between different systems, platforms, programming languages, and interfaces has been suggested by researchers. In cloud-computing-based systems, distributed resources are encapsulated into cloud services and centrally managed, which allows high automation, flexibility, fast provision, and ease of integration at low cost. The integration between physical resources and cloud services can be improved by combining Internet of things (IoT) technology and Software-as-a-Service (SaaS) technology. This study proposes a new approach for developing cloud-based manufacturing systems based on a four-layer SaaS model. There are three main contributions of this paper: (1) enterprises can develop their own cloud-based logistic management information systems based on the approach proposed in this paper; (2) a case study based on literature reviews with experimental results is proposed to verify that the system performance is remarkable; (3) challenges encountered and feedback collected from T Company in the case study are discussed in this paper for the purpose of enterprise deployment.

  19. Implementation of Web-Based Education in Egypt through Cloud Computing Technologies and Its Effect on Higher Education

    ERIC Educational Resources Information Center

    El-Seoud, M. Samir Abou; El-Sofany, Hosam F.; Taj-Eddin, Islam A. T. F.; Nosseir, Ann; El-Khouly, Mahmoud M.

    2013-01-01

    The information technology educational programs at most universities in Egypt face many obstacles that can be overcome using technology enhanced learning. An open source Moodle eLearning platform has been implemented at many public and private universities in Egypt, as an aid to deliver e-content and to provide the institution with various…

  20. Utilization of Cloud Computing in Education and Research to the Attainment of Millennium Development Goals and Vision 2030 in Kenya

    ERIC Educational Resources Information Center

    Waga, Duncan; Makori, Esther; Rabah, Kefa

    2014-01-01

    Kenya Educational and Research fraternity has highly qualified human resource capacity with globally gained experiences. However each research entity works in disparity due to the absence of a common digital platform while educational units don't even have the basic infrastructure. For sustainability of Education and research progression,…

  1. Towards a Service-Oriented Enterprise: The Design of a Cloud Business Integration Platform in a Medium-Sized Manufacturing Enterprise

    ERIC Educational Resources Information Center

    Stamas, Paul J.

    2013-01-01

    This case study research followed the two-year transition of a medium-sized manufacturing firm towards a service-oriented enterprise. A service-oriented enterprise is an emerging architecture of the firm that leverages the paradigm of services computing to integrate the capabilities of the firm with the complementary competencies of business…

  2. A hybrid computational strategy to address WGS variant analysis in >5000 samples.

    PubMed

    Huang, Zhuoyi; Rustagi, Navin; Veeraraghavan, Narayanan; Carroll, Andrew; Gibbs, Richard; Boerwinkle, Eric; Venkata, Manjunath Gorentla; Yu, Fuli

    2016-09-10

    The decreasing costs of sequencing are driving the need for cost effective and real time variant calling of whole genome sequencing data. The scale of these projects are far beyond the capacity of typical computing resources available with most research labs. Other infrastructures like the cloud AWS environment and supercomputers also have limitations due to which large scale joint variant calling becomes infeasible, and infrastructure specific variant calling strategies either fail to scale up to large datasets or abandon joint calling strategies. We present a high throughput framework including multiple variant callers for single nucleotide variant (SNV) calling, which leverages hybrid computing infrastructure consisting of cloud AWS, supercomputers and local high performance computing infrastructures. We present a novel binning approach for large scale joint variant calling and imputation which can scale up to over 10,000 samples while producing SNV callsets with high sensitivity and specificity. As a proof of principle, we present results of analysis on Cohorts for Heart And Aging Research in Genomic Epidemiology (CHARGE) WGS freeze 3 dataset in which joint calling, imputation and phasing of over 5300 whole genome samples was produced in under 6 weeks using four state-of-the-art callers. The callers used were SNPTools, GATK-HaplotypeCaller, GATK-UnifiedGenotyper and GotCloud. We used Amazon AWS, a 4000-core in-house cluster at Baylor College of Medicine, IBM power PC Blue BioU at Rice and Rhea at Oak Ridge National Laboratory (ORNL) for the computation. AWS was used for joint calling of 180 TB of BAM files, and ORNL and Rice supercomputers were used for the imputation and phasing step. All other steps were carried out on the local compute cluster. The entire operation used 5.2 million core hours and only transferred a total of 6 TB of data across the platforms. Even with increasing sizes of whole genome datasets, ensemble joint calling of SNVs for low coverage data can be accomplished in a scalable, cost effective and fast manner by using heterogeneous computing platforms without compromising on the quality of variants.

  3. Large-Scale, Parallel, Multi-Sensor Atmospheric Data Fusion Using Cloud Computing

    NASA Astrophysics Data System (ADS)

    Wilson, B. D.; Manipon, G.; Hua, H.; Fetzer, E.

    2013-05-01

    NASA's Earth Observing System (EOS) is an ambitious facility for studying global climate change. The mandate now is to combine measurements from the instruments on the "A-Train" platforms (AIRS, AMSR-E, MODIS, MISR, MLS, and CloudSat) and other Earth probes to enable large-scale studies of climate change over decades. Moving to multi-sensor, long-duration analyses of important climate variables presents serious challenges for large-scale data mining and fusion. For example, one might want to compare temperature and water vapor retrievals from one instrument (AIRS) to another (MODIS), and to a model (ECMWF), stratify the comparisons using a classification of the "cloud scenes" from CloudSat, and repeat the entire analysis over 10 years of data. To efficiently assemble such datasets, we are utilizing Elastic Computing in the Cloud and parallel map/reduce-based algorithms. However, these problems are Data Intensive computing so the data transfer times and storage costs (for caching) are key issues. SciReduce is a Hadoop-like parallel analysis system, programmed in parallel python, that is designed from the ground up for Earth science. SciReduce executes inside VMWare images and scales to any number of nodes in the Cloud. Unlike Hadoop, SciReduce operates on bundles of named numeric arrays, which can be passed in memory or serialized to disk in netCDF4 or HDF5. Figure 1 shows the architecture of the full computational system, with SciReduce at the core. Multi-year datasets are automatically "sharded" by time and space across a cluster of nodes so that years of data (millions of files) can be processed in a massively parallel way. Input variables (arrays) are pulled on-demand into the Cloud using OPeNDAP URLs or other subsetting services, thereby minimizing the size of the cached input and intermediate datasets. We are using SciReduce to automate the production of multiple versions of a ten-year A-Train water vapor climatology under a NASA MEASURES grant. We will present the architecture of SciReduce, describe the achieved "clock time" speedups in fusing datasets on our own nodes and in the Cloud, and discuss the Cloud cost tradeoffs for storage, compute, and data transfer. We will also present a concept/prototype for staging NASA's A-Train Atmospheric datasets (Levels 2 & 3) in the Amazon Cloud so that any number of compute jobs can be executed "near" the multi-sensor data. Given such a system, multi-sensor climate studies over 10-20 years of data could be performed in an efficient way, with the researcher paying only his own Cloud compute bill.; Figure 1 -- Architecture.

  4. Developing cloud applications using the e-Science Central platform.

    PubMed

    Hiden, Hugo; Woodman, Simon; Watson, Paul; Cala, Jacek

    2013-01-28

    This paper describes the e-Science Central (e-SC) cloud data processing system and its application to a number of e-Science projects. e-SC provides both software as a service (SaaS) and platform as a service for scientific data management, analysis and collaboration. It is a portable system and can be deployed on both private (e.g. Eucalyptus) and public clouds (Amazon AWS and Microsoft Windows Azure). The SaaS application allows scientists to upload data, edit and run workflows and share results in the cloud, using only a Web browser. It is underpinned by a scalable cloud platform consisting of a set of components designed to support the needs of scientists. The platform is exposed to developers so that they can easily upload their own analysis services into the system and make these available to other users. A representational state transfer-based application programming interface (API) is also provided so that external applications can leverage the platform's functionality, making it easier to build scalable, secure cloud-based applications. This paper describes the design of e-SC, its API and its use in three different case studies: spectral data visualization, medical data capture and analysis, and chemical property prediction.

  5. Developing cloud applications using the e-Science Central platform

    PubMed Central

    Hiden, Hugo; Woodman, Simon; Watson, Paul; Cala, Jacek

    2013-01-01

    This paper describes the e-Science Central (e-SC) cloud data processing system and its application to a number of e-Science projects. e-SC provides both software as a service (SaaS) and platform as a service for scientific data management, analysis and collaboration. It is a portable system and can be deployed on both private (e.g. Eucalyptus) and public clouds (Amazon AWS and Microsoft Windows Azure). The SaaS application allows scientists to upload data, edit and run workflows and share results in the cloud, using only a Web browser. It is underpinned by a scalable cloud platform consisting of a set of components designed to support the needs of scientists. The platform is exposed to developers so that they can easily upload their own analysis services into the system and make these available to other users. A representational state transfer-based application programming interface (API) is also provided so that external applications can leverage the platform's functionality, making it easier to build scalable, secure cloud-based applications. This paper describes the design of e-SC, its API and its use in three different case studies: spectral data visualization, medical data capture and analysis, and chemical property prediction. PMID:23230161

  6. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Garzoglio, Gabriele

    The Fermilab Grid and Cloud Computing Department and the KISTI Global Science experimental Data hub Center are working on a multi-year Collaborative Research and Development Agreement.With the knowledge developed in the first year on how to provision and manage a federation of virtual machines through Cloud management systems. In this second year, we expanded the work on provisioning and federation, increasing both scale and diversity of solutions, and we started to build on-demand services on the established fabric, introducing the paradigm of Platform as a Service to assist with the execution of scientific workflows. We have enabled scientific workflows ofmore » stakeholders to run on multiple cloud resources at the scale of 1,000 concurrent machines. The demonstrations have been in the areas of (a) Virtual Infrastructure Automation and Provisioning, (b) Interoperability and Federation of Cloud Resources, and (c) On-demand Services for ScientificWorkflows.« less

  7. Health Information System in a Cloud Computing Context.

    PubMed

    Sadoughi, Farahnaz; Erfannia, Leila

    2017-01-01

    Healthcare as a worldwide industry is experiencing a period of growth based on health information technology. The capabilities of cloud systems make it as an option to develop eHealth goals. The main objectives of the present study was to evaluate the advantages and limitations of health information systems implementation in a cloud-computing context that was conducted as a systematic review in 2016. Science direct, Scopus, Web of science, IEEE, PubMed and Google scholar were searched according study criteria. Among 308 articles initially found, 21 articles were entered in the final analysis. All the studies had considered cloud computing as a positive tool to help advance health technology, but none had insisted too much on its limitations and threats. Electronic health record systems have been mostly studied in the fields of implementation, designing, and presentation of models and prototypes. According to this research, the main advantages of cloud-based health information systems could be categorized into the following groups: economic benefits and advantages of information management. The main limitations of the implementation of cloud-based health information systems could be categorized into the 4 groups of security, legal, technical, and human restrictions. Compared to earlier studies, the present research had the advantage of dealing with the issue of health information systems in a cloud platform. The high frequency of studies conducted on the implementation of cloud-based health information systems revealed health industry interest in the application of this technology. Security was a subject discussed in most studies due to health information sensitivity. In this investigation, some mechanisms and solutions were discussed concerning the mentioned systems, which would provide a suitable area for future scientific research on this issue. The limitations and solutions discussed in this systematic study would help healthcare managers and decision-makers take better and more efficient advantages of this technology and make better planning to adopt cloud-based health information systems.

  8. Production experience with the ATLAS Event Service

    NASA Astrophysics Data System (ADS)

    Benjamin, D.; Calafiura, P.; Childers, T.; De, K.; Guan, W.; Maeno, T.; Nilsson, P.; Tsulaia, V.; Van Gemmeren, P.; Wenaus, T.; ATLAS Collaboration

    2017-10-01

    The ATLAS Event Service (AES) has been designed and implemented for efficient running of ATLAS production workflows on a variety of computing platforms, ranging from conventional Grid sites to opportunistic, often short-lived resources, such as spot market commercial clouds, supercomputers and volunteer computing. The Event Service architecture allows real time delivery of fine grained workloads to running payload applications which process dispatched events or event ranges and immediately stream the outputs to highly scalable Object Stores. Thanks to its agile and flexible architecture the AES is currently being used by grid sites for assigning low priority workloads to otherwise idle computing resources; similarly harvesting HPC resources in an efficient back-fill mode; and massively scaling out to the 50-100k concurrent core level on the Amazon spot market to efficiently utilize those transient resources for peak production needs. Platform ports in development include ATLAS@Home (BOINC) and the Google Compute Engine, and a growing number of HPC platforms. After briefly reviewing the concept and the architecture of the Event Service, we will report the status and experience gained in AES commissioning and production operations on supercomputers, and our plans for extending ES application beyond Geant4 simulation to other workflows, such as reconstruction and data analysis.

  9. The COMET Sleep Research Platform.

    PubMed

    Nichols, Deborah A; DeSalvo, Steven; Miller, Richard A; Jónsson, Darrell; Griffin, Kara S; Hyde, Pamela R; Walsh, James K; Kushida, Clete A

    2014-01-01

    The Comparative Outcomes Management with Electronic Data Technology (COMET) platform is extensible and designed for facilitating multicenter electronic clinical research. Our research goals were the following: (1) to conduct a comparative effectiveness trial (CET) for two obstructive sleep apnea treatments-positive airway pressure versus oral appliance therapy; and (2) to establish a new electronic network infrastructure that would support this study and other clinical research studies. The COMET platform was created to satisfy the needs of CET with a focus on creating a platform that provides comprehensive toolsets, multisite collaboration, and end-to-end data management. The platform also provides medical researchers the ability to visualize and interpret data using business intelligence (BI) tools. COMET is a research platform that is scalable and extensible, and which, in a future version, can accommodate big data sets and enable efficient and effective research across multiple studies and medical specialties. The COMET platform components were designed for an eventual move to a cloud computing infrastructure that enhances sustainability, overall cost effectiveness, and return on investment.

  10. The COMET Sleep Research Platform

    PubMed Central

    Nichols, Deborah A.; DeSalvo, Steven; Miller, Richard A.; Jónsson, Darrell; Griffin, Kara S.; Hyde, Pamela R.; Walsh, James K.; Kushida, Clete A.

    2014-01-01

    Introduction: The Comparative Outcomes Management with Electronic Data Technology (COMET) platform is extensible and designed for facilitating multicenter electronic clinical research. Background: Our research goals were the following: (1) to conduct a comparative effectiveness trial (CET) for two obstructive sleep apnea treatments—positive airway pressure versus oral appliance therapy; and (2) to establish a new electronic network infrastructure that would support this study and other clinical research studies. Discussion: The COMET platform was created to satisfy the needs of CET with a focus on creating a platform that provides comprehensive toolsets, multisite collaboration, and end-to-end data management. The platform also provides medical researchers the ability to visualize and interpret data using business intelligence (BI) tools. Conclusion: COMET is a research platform that is scalable and extensible, and which, in a future version, can accommodate big data sets and enable efficient and effective research across multiple studies and medical specialties. The COMET platform components were designed for an eventual move to a cloud computing infrastructure that enhances sustainability, overall cost effectiveness, and return on investment. PMID:25848590

  11. Analysis of the Security and Privacy Requirements of Cloud-Based Electronic Health Records Systems

    PubMed Central

    Fernández, Gonzalo; López-Coronado, Miguel

    2013-01-01

    Background The Cloud Computing paradigm offers eHealth systems the opportunity to enhance the features and functionality that they offer. However, moving patients’ medical information to the Cloud implies several risks in terms of the security and privacy of sensitive health records. In this paper, the risks of hosting Electronic Health Records (EHRs) on the servers of third-party Cloud service providers are reviewed. To protect the confidentiality of patient information and facilitate the process, some suggestions for health care providers are made. Moreover, security issues that Cloud service providers should address in their platforms are considered. Objective To show that, before moving patient health records to the Cloud, security and privacy concerns must be considered by both health care providers and Cloud service providers. Security requirements of a generic Cloud service provider are analyzed. Methods To study the latest in Cloud-based computing solutions, bibliographic material was obtained mainly from Medline sources. Furthermore, direct contact was made with several Cloud service providers. Results Some of the security issues that should be considered by both Cloud service providers and their health care customers are role-based access, network security mechanisms, data encryption, digital signatures, and access monitoring. Furthermore, to guarantee the safety of the information and comply with privacy policies, the Cloud service provider must be compliant with various certifications and third-party requirements, such as SAS70 Type II, PCI DSS Level 1, ISO 27001, and the US Federal Information Security Management Act (FISMA). Conclusions Storing sensitive information such as EHRs in the Cloud means that precautions must be taken to ensure the safety and confidentiality of the data. A relationship built on trust with the Cloud service provider is essential to ensure a transparent process. Cloud service providers must make certain that all security mechanisms are in place to avoid unauthorized access and data breaches. Patients must be kept informed about how their data are being managed. PMID:23965254

  12. Analysis of the security and privacy requirements of cloud-based electronic health records systems.

    PubMed

    Rodrigues, Joel J P C; de la Torre, Isabel; Fernández, Gonzalo; López-Coronado, Miguel

    2013-08-21

    The Cloud Computing paradigm offers eHealth systems the opportunity to enhance the features and functionality that they offer. However, moving patients' medical information to the Cloud implies several risks in terms of the security and privacy of sensitive health records. In this paper, the risks of hosting Electronic Health Records (EHRs) on the servers of third-party Cloud service providers are reviewed. To protect the confidentiality of patient information and facilitate the process, some suggestions for health care providers are made. Moreover, security issues that Cloud service providers should address in their platforms are considered. To show that, before moving patient health records to the Cloud, security and privacy concerns must be considered by both health care providers and Cloud service providers. Security requirements of a generic Cloud service provider are analyzed. To study the latest in Cloud-based computing solutions, bibliographic material was obtained mainly from Medline sources. Furthermore, direct contact was made with several Cloud service providers. Some of the security issues that should be considered by both Cloud service providers and their health care customers are role-based access, network security mechanisms, data encryption, digital signatures, and access monitoring. Furthermore, to guarantee the safety of the information and comply with privacy policies, the Cloud service provider must be compliant with various certifications and third-party requirements, such as SAS70 Type II, PCI DSS Level 1, ISO 27001, and the US Federal Information Security Management Act (FISMA). Storing sensitive information such as EHRs in the Cloud means that precautions must be taken to ensure the safety and confidentiality of the data. A relationship built on trust with the Cloud service provider is essential to ensure a transparent process. Cloud service providers must make certain that all security mechanisms are in place to avoid unauthorized access and data breaches. Patients must be kept informed about how their data are being managed.

  13. Opportunity and Challenges for Migrating Big Data Analytics in Cloud

    NASA Astrophysics Data System (ADS)

    Amitkumar Manekar, S.; Pradeepini, G., Dr.

    2017-08-01

    Big Data Analytics is a big word now days. As per demanding and more scalable process data generation capabilities, data acquisition and storage become a crucial issue. Cloud storage is a majorly usable platform; the technology will become crucial to executives handling data powered by analytics. Now a day’s trend towards “big data-as-a-service” is talked everywhere. On one hand, cloud-based big data analytics exactly tackle in progress issues of scale, speed, and cost. But researchers working to solve security and other real-time problem of big data migration on cloud based platform. This article specially focused on finding possible ways to migrate big data to cloud. Technology which support coherent data migration and possibility of doing big data analytics on cloud platform is demanding in natute for new era of growth. This article also gives information about available technology and techniques for migration of big data in cloud.

  14. Exploiting Parallel R in the Cloud with SPRINT

    PubMed Central

    Piotrowski, M.; McGilvary, G.A.; Sloan, T. M.; Mewissen, M.; Lloyd, A.D.; Forster, T.; Mitchell, L.; Ghazal, P.; Hill, J.

    2012-01-01

    Background Advances in DNA Microarray devices and next-generation massively parallel DNA sequencing platforms have led to an exponential growth in data availability but the arising opportunities require adequate computing resources. High Performance Computing (HPC) in the Cloud offers an affordable way of meeting this need. Objectives Bioconductor, a popular tool for high-throughput genomic data analysis, is distributed as add-on modules for the R statistical programming language but R has no native capabilities for exploiting multi-processor architectures. SPRINT is an R package that enables easy access to HPC for genomics researchers. This paper investigates: setting up and running SPRINT-enabled genomic analyses on Amazon’s Elastic Compute Cloud (EC2), the advantages of submitting applications to EC2 from different parts of the world and, if resource underutilization can improve application performance. Methods The SPRINT parallel implementations of correlation, permutation testing, partitioning around medoids and the multi-purpose papply have been benchmarked on data sets of various size on Amazon EC2. Jobs have been submitted from both the UK and Thailand to investigate monetary differences. Results It is possible to obtain good, scalable performance but the level of improvement is dependent upon the nature of algorithm. Resource underutilization can further improve the time to result. End-user’s location impacts on costs due to factors such as local taxation. Conclusions: Although not designed to satisfy HPC requirements, Amazon EC2 and cloud computing in general provides an interesting alternative and provides new possibilities for smaller organisations with limited funds. PMID:23223611

  15. Exploiting parallel R in the cloud with SPRINT.

    PubMed

    Piotrowski, M; McGilvary, G A; Sloan, T M; Mewissen, M; Lloyd, A D; Forster, T; Mitchell, L; Ghazal, P; Hill, J

    2013-01-01

    Advances in DNA Microarray devices and next-generation massively parallel DNA sequencing platforms have led to an exponential growth in data availability but the arising opportunities require adequate computing resources. High Performance Computing (HPC) in the Cloud offers an affordable way of meeting this need. Bioconductor, a popular tool for high-throughput genomic data analysis, is distributed as add-on modules for the R statistical programming language but R has no native capabilities for exploiting multi-processor architectures. SPRINT is an R package that enables easy access to HPC for genomics researchers. This paper investigates: setting up and running SPRINT-enabled genomic analyses on Amazon's Elastic Compute Cloud (EC2), the advantages of submitting applications to EC2 from different parts of the world and, if resource underutilization can improve application performance. The SPRINT parallel implementations of correlation, permutation testing, partitioning around medoids and the multi-purpose papply have been benchmarked on data sets of various size on Amazon EC2. Jobs have been submitted from both the UK and Thailand to investigate monetary differences. It is possible to obtain good, scalable performance but the level of improvement is dependent upon the nature of the algorithm. Resource underutilization can further improve the time to result. End-user's location impacts on costs due to factors such as local taxation. Although not designed to satisfy HPC requirements, Amazon EC2 and cloud computing in general provides an interesting alternative and provides new possibilities for smaller organisations with limited funds.

  16. Mobile, Cloud, and Big Data Computing: Contributions, Challenges, and New Directions in Telecardiology

    PubMed Central

    Hsieh, Jui-Chien; Li, Ai-Hsien; Yang, Chung-Chi

    2013-01-01

    Many studies have indicated that computing technology can enable off-site cardiologists to read patients’ electrocardiograph (ECG), echocardiography (ECHO), and relevant images via smart phones during pre-hospital, in-hospital, and post-hospital teleconsultation, which not only identifies emergency cases in need of immediate treatment, but also prevents the unnecessary re-hospitalizations. Meanwhile, several studies have combined cloud computing and mobile computing to facilitate better storage, delivery, retrieval, and management of medical files for telecardiology. In the future, the aggregated ECG and images from hospitals worldwide will become big data, which should be used to develop an e-consultation program helping on-site practitioners deliver appropriate treatment. With information technology, real-time tele-consultation and tele-diagnosis of ECG and images can be practiced via an e-platform for clinical, research, and educational purposes. While being devoted to promote the application of information technology onto telecardiology, we need to resolve several issues: (1) data confidentiality in the cloud, (2) data interoperability among hospitals, and (3) network latency and accessibility. If these challenges are overcome, tele-consultation will be ubiquitous, easy to perform, inexpensive, and beneficial. Most importantly, these services will increase global collaboration and advance clinical practice, education, and scientific research in cardiology. PMID:24232290

  17. Mobile, cloud, and big data computing: contributions, challenges, and new directions in telecardiology.

    PubMed

    Hsieh, Jui-Chien; Li, Ai-Hsien; Yang, Chung-Chi

    2013-11-13

    Many studies have indicated that computing technology can enable off-site cardiologists to read patients' electrocardiograph (ECG), echocardiography (ECHO), and relevant images via smart phones during pre-hospital, in-hospital, and post-hospital teleconsultation, which not only identifies emergency cases in need of immediate treatment, but also prevents the unnecessary re-hospitalizations. Meanwhile, several studies have combined cloud computing and mobile computing to facilitate better storage, delivery, retrieval, and management of medical files for telecardiology. In the future, the aggregated ECG and images from hospitals worldwide will become big data, which should be used to develop an e-consultation program helping on-site practitioners deliver appropriate treatment. With information technology, real-time tele-consultation and tele-diagnosis of ECG and images can be practiced via an e-platform for clinical, research, and educational purposes. While being devoted to promote the application of information technology onto telecardiology, we need to resolve several issues: (1) data confidentiality in the cloud, (2) data interoperability among hospitals, and (3) network latency and accessibility. If these challenges are overcome, tele-consultation will be ubiquitous, easy to perform, inexpensive, and beneficial. Most importantly, these services will increase global collaboration and advance clinical practice, education, and scientific research in cardiology.

  18. Report on the Second ARM Mobile Facility (AMF2) Roll, Pitch, and Heave (RPH) Stabilization Platform: Design and Evaluation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Coulter, Richard L.; Martin, Timothy J.

    One of the primary objectives of the U.S. Department of Energy’s Atmospheric Radiation Measurement (ARM) Climate Research Facility’s second Mobile Facility (AMF2) is to obtain reliable measurements of solar, surface, and atmospheric radiation, as well as cloud and atmospheric properties, from ocean-going vessels. To ensure that these climatic measurements are representative and accurate, many AMF2 instrument systems are designed to collect data in a zenith orientation. A pillar of the AMF2 strategy in this effort is the use of a stable platform. The purpose of the platform is to 1) mitigate vessel motion for instruments that require a truly verticalmore » orientation and keep them pointed in the zenith direction, and 2) allow for accurate positioning for viewing or shading of the sensors from direct sunlight. Numerous ARM instruments fall into these categories, but perhaps the most important are the vertically pointing cloud radars, for which vertical motions are a critical parameter. During the design and construction phase of AMF2, an inexpensive stable platform was purchased to perform the stabilization tasks for some of these instruments. The first table compensated for roll, pitch, and yaw (RPY) and was reported upon in a previous technical report (Kafle and Coulter, 2012). Subsequently, a second table was purchased specifically for operation with the Marine W-band cloud radar (MWACR). Computer programs originally developed for RPY were modified to communicate with the new platform controller and with an inertial measurements platform that measures true ship motion components (roll, pitch, yaw, surge, sway, and heave). This platform could not be tested dynamically for RPY because of time constraints requiring its deployment aboard the container ship Horizon Spirit in September 2013. Hence the initial motion tests were conducted on the initial cruise. Subsequent cruises provided additional test results. The platform, as tested, meets all the design and performance criteria established for its use. This is a report of the results of those efforts and the critical points in moving forward« less

  19. Development of a High Resolution Weather Forecast Model for Mesoamerica Using the NASA Nebula Cloud Computing Environment

    NASA Technical Reports Server (NTRS)

    Molthan, Andrew L.; Case, Jonathan L.; Venner, Jason; Moreno-Madrinan, Max. J.; Delgado, Francisco

    2012-01-01

    Over the past two years, scientists in the Earth Science Office at NASA fs Marshall Space Flight Center (MSFC) have explored opportunities to apply cloud computing concepts to support near real ]time weather forecast modeling via the Weather Research and Forecasting (WRF) model. Collaborators at NASA fs Short ]term Prediction Research and Transition (SPoRT) Center and the SERVIR project at Marshall Space Flight Center have established a framework that provides high resolution, daily weather forecasts over Mesoamerica through use of the NASA Nebula Cloud Computing Platform at Ames Research Center. Supported by experts at Ames, staff at SPoRT and SERVIR have established daily forecasts complete with web graphics and a user interface that allows SERVIR partners access to high resolution depictions of weather in the next 48 hours, useful for monitoring and mitigating meteorological hazards such as thunderstorms, heavy precipitation, and tropical weather that can lead to other disasters such as flooding and landslides. This presentation will describe the framework for establishing and providing WRF forecasts, example applications of output provided via the SERVIR web portal, and early results of forecast model verification against available surface ] and satellite ]based observations.

  20. Performance implications from sizing a VM on multi-core systems: A Data analytic application s view

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lim, Seung-Hwan; Horey, James L; Begoli, Edmon

    In this paper, we present a quantitative performance analysis of data analytics applications running on multi-core virtual machines. Such environments form the core of cloud computing. In addition, data analytics applications, such as Cassandra and Hadoop, are becoming increasingly popular on cloud computing platforms. This convergence necessitates a better understanding of the performance and cost implications of such hybrid systems. For example, the very rst step in hosting applications in virtualized environments, requires the user to con gure the number of virtual processors and the size of memory. To understand performance implications of this step, we benchmarked three Yahoo Cloudmore » Serving Benchmark (YCSB) workloads in a virtualized multi-core environment. Our measurements indicate that the performance of Cassandra for YCSB workloads does not heavily depend on the processing capacity of a system, while the size of the data set is critical to performance relative to allocated memory. We also identi ed a strong relationship between the running time of workloads and various hardware events (last level cache loads, misses, and CPU migrations). From this analysis, we provide several suggestions to improve the performance of data analytics applications running on cloud computing environments.« less

  1. Development of a High Resolution Weather Forecast Model for Mesoamerica Using the NASA Nebula Cloud Computing Environment

    NASA Astrophysics Data System (ADS)

    Molthan, A.; Case, J.; Venner, J.; Moreno-Madriñán, M. J.; Delgado, F.

    2012-12-01

    Over the past two years, scientists in the Earth Science Office at NASA's Marshall Space Flight Center (MSFC) have explored opportunities to apply cloud computing concepts to support near real-time weather forecast modeling via the Weather Research and Forecasting (WRF) model. Collaborators at NASA's Short-term Prediction Research and Transition (SPoRT) Center and the SERVIR project at Marshall Space Flight Center have established a framework that provides high resolution, daily weather forecasts over Mesoamerica through use of the NASA Nebula Cloud Computing Platform at Ames Research Center. Supported by experts at Ames, staff at SPoRT and SERVIR have established daily forecasts complete with web graphics and a user interface that allows SERVIR partners access to high resolution depictions of weather in the next 48 hours, useful for monitoring and mitigating meteorological hazards such as thunderstorms, heavy precipitation, and tropical weather that can lead to other disasters such as flooding and landslides. This presentation will describe the framework for establishing and providing WRF forecasts, example applications of output provided via the SERVIR web portal, and early results of forecast model verification against available surface- and satellite-based observations.

  2. The JASMIN Cloud: specialised and hybrid to meet the needs of the Environmental Sciences Community

    NASA Astrophysics Data System (ADS)

    Kershaw, Philip; Lawrence, Bryan; Churchill, Jonathan; Pritchard, Matt

    2014-05-01

    Cloud computing provides enormous opportunities for the research community. The large public cloud providers provide near-limitless scaling capability. However, adapting Cloud to scientific workloads is not without its problems. The commodity nature of the public cloud infrastructure can be at odds with the specialist requirements of the research community. Issues such as trust, ownership of data, WAN bandwidth and costing models make additional barriers to more widespread adoption. Alongside the application of public cloud for scientific applications, a number of private cloud initiatives are underway in the research community of which the JASMIN Cloud is one example. Here, cloud service models are being effectively super-imposed over more established services such as data centres, compute cluster facilities and Grids. These have the potential to deliver the specialist infrastructure needed for the science community coupled with the benefits of a Cloud service model. The JASMIN facility based at the Rutherford Appleton Laboratory was established in 2012 to support the data analysis requirements of the climate and Earth Observation community. In its first year of operation, the 5PB of available storage capacity was filled and the hosted compute capability used extensively. JASMIN has modelled the concept of a centralised large-volume data analysis facility. Key characteristics have enabled success: peta-scale fast disk connected via low latency networks to compute resources and the use of virtualisation for effective management of the resources for a range of users. A second phase is now underway funded through NERC's (Natural Environment Research Council) Big Data initiative. This will see significant expansion to the resources available with a doubling of disk-based storage to 12PB and an increase of compute capacity by a factor of ten to over 3000 processing cores. This expansion is accompanied by a broadening in the scope for JASMIN, as a service available to the entire UK environmental science community. Experience with the first phase demonstrated the range of user needs. A trade-off is needed between access privileges to resources, flexibility of use and security. This has influenced the form and types of service under development for the new phase. JASMIN will deploy a specialised private cloud organised into "Managed" and "Unmanaged" components. In the Managed Cloud, users have direct access to the storage and compute resources for optimal performance but for reasons of security, via a more restrictive PaaS (Platform-as-a-Service) interface. The Unmanaged Cloud is deployed in an isolated part of the network but co-located with the rest of the infrastructure. This enables greater liberty to tenants - full IaaS (Infrastructure-as-a-Service) capability to provision customised infrastructure - whilst at the same time protecting more sensitive parts of the system from direct access using these elevated privileges. The private cloud will be augmented with cloud-bursting capability so that it can exploit the resources available from public clouds, making it effectively a hybrid solution. A single interface will overlay the functionality of both the private cloud and external interfaces to public cloud providers giving users the flexibility to migrate resources between infrastructures as requirements dictate.

  3. Efficient secure-channel free public key encryption with keyword search for EMRs in cloud storage.

    PubMed

    Guo, Lifeng; Yau, Wei-Chuen

    2015-02-01

    Searchable encryption is an important cryptographic primitive that enables privacy-preserving keyword search on encrypted electronic medical records (EMRs) in cloud storage. Efficiency of such searchable encryption in a medical cloud storage system is very crucial as it involves client platforms such as smartphones or tablets that only have constrained computing power and resources. In this paper, we propose an efficient secure-channel free public key encryption with keyword search (SCF-PEKS) scheme that is proven secure in the standard model. We show that our SCF-PEKS scheme is not only secure against chosen keyword and ciphertext attacks (IND-SCF-CKCA), but also secure against keyword guessing attacks (IND-KGA). Furthermore, our proposed scheme is more efficient than other recent SCF-PEKS schemes in the literature.

  4. Implementation of Online Veterinary Hospital on Cloud Platform.

    PubMed

    Chen, Tzer-Shyong; Chen, Tzer-Long; Chung, Yu-Fang; Huang, Yao-Min; Chen, Tao-Chieh; Wang, Huihui; Wei, Wei

    2016-06-01

    Pet markets involve in great commercial possibilities, which boost thriving development of veterinary hospital businesses. The service tends to intensive competition and diversified channel environment. Information technology is integrated for developing the veterinary hospital cloud service platform. The platform contains not only pet medical services but veterinary hospital management and services. In the study, QR Code andcloud technology are applied to establish the veterinary hospital cloud service platform for pet search by labeling a pet's identification with QR Code. This technology can break the restriction on veterinary hospital inspection in different areas and allows veterinary hospitals receiving the medical records and information through the exclusive QR Code for more effective inspection. As an interactive platform, the veterinary hospital cloud service platform allows pet owners gaining the knowledge of pet diseases and healthcare. Moreover, pet owners can enquire and communicate with veterinarians through the platform. Also, veterinary hospitals can periodically send reminders of relevant points and introduce exclusive marketing information with the platform for promoting the service items and establishing individualized marketing. Consequently, veterinary hospitals can increase the profits by information share and create the best solution in such a competitive veterinary market with industry alliance.

  5. HoloGondel: in situ cloud observations on a cable car in the Swiss Alps using a holographic imager

    NASA Astrophysics Data System (ADS)

    Beck, Alexander; Henneberger, Jan; Schöpfer, Sarah; Fugal, Jacob; Lohmann, Ulrike

    2017-02-01

    In situ observations of cloud properties in complex alpine terrain where research aircraft cannot sample are commonly conducted at mountain-top research stations and limited to single-point measurements. The HoloGondel platform overcomes this limitation by using a cable car to obtain vertical profiles of the microphysical and meteorological cloud parameters. The main component of the HoloGondel platform is the HOLographic Imager for Microscopic Objects (HOLIMO 3G), which uses digital in-line holography to image cloud particles. Based on two-dimensional images the microphysical cloud parameters for the size range from small cloud particles to large precipitation particles are obtained for the liquid and ice phase. The low traveling velocity of a cable car on the order of 10 m s-1 allows measurements with high spatial resolution; however, at the same time it leads to an unstable air speed towards the HoloGondel platform. Holographic cloud imagers, which have a sample volume that is independent of the air speed, are therefore well suited for measurements on a cable car. Example measurements of the vertical profiles observed in a liquid cloud and a mixed-phase cloud at the Eggishorn in the Swiss Alps in the winters 2015 and 2016 are presented. The HoloGondel platform reliably observes cloud droplets larger than 6.5 µm, partitions between cloud droplets and ice crystals for a size larger than 25 µm and obtains a statistically significantly size distribution for every 5 m in vertical ascent.

  6. Final Scientific/Technical Report for "Enabling Exascale Hardware and Software Design through Scalable System Virtualization"

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dinda, Peter August

    2015-03-17

    This report describes the activities, findings, and products of the Northwestern University component of the "Enabling Exascale Hardware and Software Design through Scalable System Virtualization" project. The purpose of this project has been to extend the state of the art of systems software for high-end computing (HEC) platforms, and to use systems software to better enable the evaluation of potential future HEC platforms, for example exascale platforms. Such platforms, and their systems software, have the goal of providing scientific computation at new scales, thus enabling new research in the physical sciences and engineering. Over time, the innovations in systems softwaremore » for such platforms also become applicable to more widely used computing clusters, data centers, and clouds. This was a five-institution project, centered on the Palacios virtual machine monitor (VMM) systems software, a project begun at Northwestern, and originally developed in a previous collaboration between Northwestern University and the University of New Mexico. In this project, Northwestern (including via our subcontract to the University of Pittsburgh) contributed to the continued development of Palacios, along with other team members. We took the leadership role in (1) continued extension of support for emerging Intel and AMD hardware, (2) integration and performance enhancement of overlay networking, (3) connectivity with architectural simulation, (4) binary translation, and (5) support for modern Non-Uniform Memory Access (NUMA) hosts and guests. We also took a supporting role in support for specialized hardware for I/O virtualization, profiling, configurability, and integration with configuration tools. The efforts we led (1-5) were largely successful and executed as expected, with code and papers resulting from them. The project demonstrated the feasibility of a virtualization layer for HEC computing, similar to such layers for cloud or datacenter computing. For effort (3), although a prototype connecting Palacios with the GEM5 architectural simulator was demonstrated, our conclusion was that such a platform was less useful for design space exploration than anticipated due to inherent complexity of the connection between the instruction set architecture level and the microarchitectural level. For effort (4), we found that a code injection approach proved to be more fruitful. The results of our efforts are publicly available in the open source Palacios codebase and published papers, all of which are available from the project web site, v3vee.org. Palacios is currently one of the two codebases (the other being Sandia’s Kitten lightweight kernel) that underlies the node operating system for the DOE Hobbes Project, one of two projects tasked with building a systems software prototype for the national exascale computing effort.« less

  7. Cloud hosting of the IPython Notebook to Provide Collaborative Research Environments for Big Data Analysis

    NASA Astrophysics Data System (ADS)

    Kershaw, Philip; Lawrence, Bryan; Gomez-Dans, Jose; Holt, John

    2015-04-01

    We explore how the popular IPython Notebook computing system can be hosted on a cloud platform to provide a flexible virtual research hosting environment for Earth Observation data processing and analysis and how this approach can be expanded more broadly into a generic SaaS (Software as a Service) offering for the environmental sciences. OPTIRAD (OPTImisation environment for joint retrieval of multi-sensor RADiances) is a project funded by the European Space Agency to develop a collaborative research environment for Data Assimilation of Earth Observation products for land surface applications. Data Assimilation provides a powerful means to combine multiple sources of data and derive new products for this application domain. To be most effective, it requires close collaboration between specialists in this field, land surface modellers and end users of data generated. A goal of OPTIRAD then is to develop a collaborative research environment to engender shared working. Another significant challenge is that of data volume and complexity. Study of land surface requires high spatial and temporal resolutions, a relatively large number of variables and the application of algorithms which are computationally expensive. These problems can be addressed with the application of parallel processing techniques on specialist compute clusters. However, scientific users are often deterred by the time investment required to port their codes to these environments. Even when successfully achieved, it may be difficult to readily change or update. This runs counter to the scientific process of continuous experimentation, analysis and validation. The IPython Notebook provides users with a web-based interface to multiple interactive shells for the Python programming language. Code, documentation and graphical content can be saved and shared making it directly applicable to OPTIRAD's requirements for a shared working environment. Given the web interface it can be readily made into a hosted service with Wakari and Microsoft Azure being notable examples. Cloud-hosting of the Notebook allows the same familiar Python interface to be retained but backed by Cloud Computing attributes of scalability, elasticity and resource pooling. This combination makes it a powerful solution to address the needs of long-tail science users of Big Data: an intuitive interactive interface with which to access powerful compute resources. IPython Notebook can be hosted as a single user desktop environment but the recent development by the IPython community of JupyterHub enables it to be run as a multi-user hosting environment. In addition, IPython.parallel allows the exposition of parallel compute infrastructure through a Python interface. Applying these technologies in combination, a collaborative research environment has been developed for OPTIRAD on the UK JASMIN/CEMS facility's private cloud (http://jasmin.ac.uk). Based on this experience, a generic virtualised solution is under development suitable for use by the wider environmental science community - on both JASMIN and portable to third party cloud platforms.

  8. GeoChronos: An On-line Collaborative Platform for Earth Observation Scientists

    NASA Astrophysics Data System (ADS)

    Gamon, J. A.; Kiddle, C.; Curry, R.; Markatchev, N.; Zonta-Pastorello, G., Jr.; Rivard, B.; Sanchez-Azofeifa, G. A.; Simmonds, R.; Tan, T.

    2009-12-01

    Recent advances in cyberinfrastructure are offering new solutions to the growing challenges of managing and sharing large data volumes. Web 2.0 and social networking technologies, provide the means for scientists to collaborate and share information more effectively. Cloud computing technologies can provide scientists with transparent and on-demand access to applications served over the Internet in a dynamic and scalable manner. Semantic Web technologies allow for data to be linked together in a manner understandable by machines, enabling greater automation. Combining all of these technologies together can enable the creation of very powerful platforms. GeoChronos (http://geochronos.org/), part of a CANARIE Network Enabled Platforms project, is an online collaborative platform that incorporates these technologies to enable members of the earth observation science community to share data and scientific applications and to collaborate more effectively. The GeoChronos portal is built on an open source social networking platform called Elgg. Elgg provides a full set of social networking functionalities similar to Facebook including blogs, tags, media/document sharing, wikis, friends/contacts, groups, discussions, message boards, calendars, status, activity feeds and more. An underlying cloud computing infrastructure enables scientists to access dynamically provisioned applications via the portal for visualizing and analyzing data. Users are able to access and run the applications from any computer that has a Web browser and Internet connectivity and do not need to manage and maintain the applications themselves. Semantic Web Technologies, such as the Resource Description Framework (RDF) are being employed for relating and linking together spectral, satellite, meteorological and other data. Social networking functionality plays an integral part in facilitating the sharing of data and applications. Examples of recent GeoChronos users during the early testing phase have included the IAI International Wireless Sensor Networking Summer School at the University of Alberta, and the IAI Tropi-Dry community. Current GeoChronos activities include the development of a web-based spectral library and related analytical and visualization tools, in collaboration with members of the SpecNet community. The GeoChronos portal will be open to all members of the earth observation science community when the project nears completion at the end of 2010.

  9. RAP: RNA-Seq Analysis Pipeline, a new cloud-based NGS web application

    PubMed Central

    2015-01-01

    Background The study of RNA has been dramatically improved by the introduction of Next Generation Sequencing platforms allowing massive and cheap sequencing of selected RNA fractions, also providing information on strand orientation (RNA-Seq). The complexity of transcriptomes and of their regulative pathways make RNA-Seq one of most complex field of NGS applications, addressing several aspects of the expression process (e.g. identification and quantification of expressed genes and transcripts, alternative splicing and polyadenylation, fusion genes and trans-splicing, post-transcriptional events, etc.). Moreover, the huge volume of data generated by NGS platforms introduces unprecedented computational and technological challenges to efficiently analyze and store sequence data and results. Methods In order to provide researchers with an effective and friendly resource for analyzing RNA-Seq data, we present here RAP (RNA-Seq Analysis Pipeline), a cloud computing web application implementing a complete but modular analysis workflow. This pipeline integrates both state-of-the-art bioinformatics tools for RNA-Seq analysis and in-house developed scripts to offer to the user a comprehensive strategy for data analysis. RAP is able to perform quality checks (adopting FastQC and NGS QC Toolkit), identify and quantify expressed genes and transcripts (with Tophat, Cufflinks and HTSeq), detect alternative splicing events (using SpliceTrap) and chimeric transcripts (with ChimeraScan). This pipeline is also able to identify splicing junctions and constitutive or alternative polyadenylation sites (implementing custom analysis modules) and call for statistically significant differences in genes and transcripts expression, splicing pattern and polyadenylation site usage (using Cuffdiff2 and DESeq). Results Through a user friendly web interface, the RAP workflow can be suitably customized by the user and it is automatically executed on our cloud computing environment. This strategy allows to access to bioinformatics tools and computational resources without specific bioinformatics and IT skills. RAP provides a set of tabular and graphical results that can be helpful to browse, filter and export analyzed data, according to the user needs. PMID:26046471

  10. Large-Scale, Multi-Sensor Atmospheric Data Fusion Using Hybrid Cloud Computing

    NASA Astrophysics Data System (ADS)

    Wilson, Brian; Manipon, Gerald; Hua, Hook; Fetzer, Eric

    2014-05-01

    NASA's Earth Observing System (EOS) is an ambitious facility for studying global climate change. The mandate now is to combine measurements from the instruments on the "A-Train" platforms (AIRS, AMSR-E, MODIS, MISR, MLS, and CloudSat) and other Earth probes to enable large-scale studies of climate change over decades. Moving to multi-sensor, long-duration analyses of important climate variables presents serious challenges for large-scale data mining and fusion. For example, one might want to compare temperature and water vapor retrievals from one instrument (AIRS) to another (MODIS), and to a model (ECMWF), stratify the comparisons using a classification of the "cloud scenes" from CloudSat, and repeat the entire analysis over 10 years of data. To efficiently assemble such datasets, we are utilizing Elastic Computing in the Cloud and parallel map-reduce-based algorithms. However, these problems are Data Intensive computing so the data transfer times and storage costs (for caching) are key issues. SciReduce is a Hadoop-like parallel analysis system, programmed in parallel python, that is designed from the ground up for Earth science. SciReduce executes inside VMWare images and scales to any number of nodes in a hybrid Cloud (private eucalyptus & public Amazon). Unlike Hadoop, SciReduce operates on bundles of named numeric arrays, which can be passed in memory or serialized to disk in netCDF4 or HDF5. Multi-year datasets are automatically "sharded" by time and space across a cluster of nodes so that years of data (millions of files) can be processed in a massively parallel way. Input variables (arrays) are pulled on-demand into the Cloud using OPeNDAP URLs or other subsetting services, thereby minimizing the size of the cached input and intermediate datasets. We are using SciReduce to automate the production of multiple versions of a ten-year A-Train water vapor climatology under a NASA MEASURES grant. We will present the architecture of SciReduce, describe the achieved "clock time" speedups in fusing datasets on our own nodes and in the Cloud, and discuss the Cloud cost tradeoffs for storage, compute, and data transfer. We will also present a concept and prototype for staging NASA's A-Train Atmospheric datasets (Levels 2 & 3) in the Amazon Cloud so that any number of compute jobs can be executed "near" the multi-sensor data. Given such a system, multi-sensor climate studies over 10-20 years of data could be perform

  11. A Cloud Architecture for Teleradiology-as-a-Service.

    PubMed

    Melício Monteiro, Eriksson J; Costa, Carlos; Oliveira, José L

    2016-05-17

    Telemedicine has been promoted by healthcare professionals as an efficient way to obtain remote assistance from specialised centres, to get a second opinion about complex diagnosis or even to share knowledge among practitioners. The current economic restrictions in many countries are increasing the demand for these solutions even more, in order to optimize processes and reduce costs. However, despite some technological solutions already in place, their adoption has been hindered by the lack of usability, especially in the set-up process. In this article we propose a telemedicine platform that relies on a cloud computing infrastructure and social media principles to simplify the creation of dynamic user-based groups, opening up opportunities for the establishment of teleradiology trust domains. The collaborative platform is provided as a Software-as-a-Service solution, supporting real time and asynchronous collaboration between users. To evaluate the solution, we have deployed the platform in a private cloud infrastructure. The system is made up of three main components - the collaborative framework, the Medical Management Information System (MMIS) and the HTML5 (Hyper Text Markup Language) Web client application - connected by a message-oriented middleware. The solution allows physicians to create easily dynamic network groups for synchronous or asynchronous cooperation. The network created improves dataflow between colleagues and also knowledge sharing and cooperation through social media tools. The platform was implemented and it has already been used in two distinct scenarios: teaching of radiology and tele-reporting. Collaborative systems can simplify the establishment of telemedicine expert groups with tools that enable physicians to improve their clinical practice. Streamlining the usage of this kind of systems through the adoption of Web technologies that are common in social media will increase the quality of current solutions, facilitating the sharing of clinical information, medical imaging studies and patient diagnostics among collaborators.

  12. Support for Taverna workflows in the VPH-Share cloud platform.

    PubMed

    Kasztelnik, Marek; Coto, Ernesto; Bubak, Marian; Malawski, Maciej; Nowakowski, Piotr; Arenas, Juan; Saglimbeni, Alfredo; Testi, Debora; Frangi, Alejandro F

    2017-07-01

    To address the increasing need for collaborative endeavours within the Virtual Physiological Human (VPH) community, the VPH-Share collaborative cloud platform allows researchers to expose and share sequences of complex biomedical processing tasks in the form of computational workflows. The Taverna Workflow System is a very popular tool for orchestrating complex biomedical & bioinformatics processing tasks in the VPH community. This paper describes the VPH-Share components that support the building and execution of Taverna workflows, and explains how they interact with other VPH-Share components to improve the capabilities of the VPH-Share platform. Taverna workflow support is delivered by the Atmosphere cloud management platform and the VPH-Share Taverna plugin. These components are explained in detail, along with the two main procedures that were developed to enable this seamless integration: workflow composition and execution. 1) Seamless integration of VPH-Share with other components and systems. 2) Extended range of different tools for workflows. 3) Successful integration of scientific workflows from other VPH projects. 4) Execution speed improvement for medical applications. The presented workflow integration provides VPH-Share users with a wide range of different possibilities to compose and execute workflows, such as desktop or online composition, online batch execution, multithreading, remote execution, etc. The specific advantages of each supported tool are presented, as are the roles of Atmosphere and the VPH-Share plugin within the VPH-Share project. The combination of the VPH-Share plugin and Atmosphere engenders the VPH-Share infrastructure with far more flexible, powerful and usable capabilities for the VPH-Share community. As both components can continue to evolve and improve independently, we acknowledge that further improvements are still to be developed and will be described. Copyright © 2017 Elsevier B.V. All rights reserved.

  13. The EOS CERES Global Cloud Mask

    NASA Technical Reports Server (NTRS)

    Berendes, T. A.; Welch, R. M.; Trepte, Q.; Schaaf, C.; Baum, B. A.

    1996-01-01

    To detect long-term climate trends, it is essential to produce long-term and consistent data sets from a variety of different satellite platforms. With current global cloud climatology data sets, such as the International Satellite Cloud Climatology Experiment (ISCCP) or CLAVR (Clouds from Advanced Very High Resolution Radiometer), one of the first processing steps is to determine whether an imager pixel is obstructed between the satellite and the surface, i.e., determine a cloud 'mask.' A cloud mask is essential to studies monitoring changes over ocean, land, or snow-covered surfaces. As part of the Earth Observing System (EOS) program, a series of platforms will be flown beginning in 1997 with the Tropical Rainfall Measurement Mission (TRMM) and subsequently the EOS-AM and EOS-PM platforms in following years. The cloud imager on TRMM is the Visible/Infrared Sensor (VIRS), while the Moderate Resolution Imaging Spectroradiometer (MODIS) is the imager on the EOS platforms. To be useful for long term studies, a cloud masking algorithm should produce consistent results between existing (AVHRR) data, and future VIRS and MODIS data. The present work outlines both existing and proposed approaches to detecting cloud using multispectral narrowband radiance data. Clouds generally are characterized by higher albedos and lower temperatures than the underlying surface. However, there are numerous conditions when this characterization is inappropriate, most notably over snow and ice of the cloud types, cirrus, stratocumulus and cumulus are the most difficult to detect. Other problems arise when analyzing data from sun-glint areas over oceans or lakes over deserts or over regions containing numerous fires and smoke. The cloud mask effort builds upon operational experience of several groups that will now be discussed.

  14. Generic Module for Collecting Data in Smart Cities

    NASA Astrophysics Data System (ADS)

    Martinez, A.; Ramirez, F.; Estrada, H.; Torres, L. A.

    2017-09-01

    The Future Internet brings new technologies to the common life of people, such as Internet of Things, Cloud Computing or Big Data. All this technologies have change the way people communicate and also the way the devices interact with the context, giving rise to new paradigms, as the case of smart cities. Currently, the mobile devices represent one of main sources of information for new applications that take into account the user context, such as apps for mobility, health, of security. Several platforms have been proposed that consider the development of Future Internet applications, however, no generic modules can be found that implement the collection of context data from smartphones. In this research work we present a generic module to collect data from different sensors of the mobile devices and also to send, in a standard manner, this data to the Open FIWARE Cloud to be stored or analyzed by software tools. The proposed module enables the human-as-a-sensor approach for FIWARE Platform.

  15. iSPHERE - A New Approach to Collaborative Research and Cloud Computing

    NASA Astrophysics Data System (ADS)

    Al-Ubaidi, T.; Khodachenko, M. L.; Kallio, E. J.; Harry, A.; Alexeev, I. I.; Vázquez-Poletti, J. L.; Enke, H.; Magin, T.; Mair, M.; Scherf, M.; Poedts, S.; De Causmaecker, P.; Heynderickx, D.; Congedo, P.; Manolescu, I.; Esser, B.; Webb, S.; Ruja, C.

    2015-10-01

    The project iSPHERE (integrated Scientific Platform for HEterogeneous Research and Engineering) that has been proposed for Horizon 2020 (EINFRA-9- 2015, [1]) aims at creating a next generation Virtual Research Environment (VRE) that embraces existing and emerging technologies and standards in order to provide a versatile platform for scientific investigations and collaboration. The presentation will introduce the large project consortium, provide a comprehensive overview of iSPHERE's basic concepts and approaches and outline general user requirements that the VRE will strive to satisfy. An overview of the envisioned architecture will be given, focusing on the adapted Service Bus concept, i.e. the "Scientific Service Bus" as it is called in iSPHERE. The bus will act as a central hub for all communication and user access, and will be implemented in the course of the project. The agile approach [2] that has been chosen for detailed elaboration and documentation of user requirements, as well as for the actual implementation of the system, will be outlined and its motivation and basic structure will be discussed. The presentation will show which user communities will benefit and which concrete problems, scientific investigations are facing today, will be tackled by the system. Another focus of the presentation is iSPHERE's seamless integration of cloud computing resources and how these will benefit scientific modeling teams by providing a reliable and web based environment for cloud based model execution, storage of results, and comparison with measurements, including fully web based tools for data mining, analysis and visualization. Also the envisioned creation of a dedicated data model for experimental plasma physics will be discussed. It will be shown why the Scientific Service Bus provides an ideal basis to integrate a number of data models and communication protocols and to provide mechanisms for data exchange across multiple and even multidisciplinary platforms.

  16. Parallel processing optimization strategy based on MapReduce model in cloud storage environment

    NASA Astrophysics Data System (ADS)

    Cui, Jianming; Liu, Jiayi; Li, Qiuyan

    2017-05-01

    Currently, a large number of documents in the cloud storage process employed the way of packaging after receiving all the packets. From the local transmitter this stored procedure to the server, packing and unpacking will consume a lot of time, and the transmission efficiency is low as well. A new parallel processing algorithm is proposed to optimize the transmission mode. According to the operation machine graphs model work, using MPI technology parallel execution Mapper and Reducer mechanism. It is good to use MPI technology to implement Mapper and Reducer parallel mechanism. After the simulation experiment of Hadoop cloud computing platform, this algorithm can not only accelerate the file transfer rate, but also shorten the waiting time of the Reducer mechanism. It will break through traditional sequential transmission constraints and reduce the storage coupling to improve the transmission efficiency.

  17. [Research and Implementation of Vital Signs Monitoring System Based on Cloud Platform].

    PubMed

    Yu, Man; Tan, Anzu; Huang, Jianqi

    2018-05-30

    Through analyzing the existing problems in the current mode, the vital signs monitoring information system based on cloud platform is designed and developed. The system's aim is to assist nurse carry out vital signs nursing work effectively and accurately. The system collects, uploads and analyzes patient's vital signs data by PDA which connecting medical inspection equipments. Clinical application proved that the system can effectively improve the quality and efficiency of medical care and may reduce medical expenses. It is alse an important practice result to build a medical cloud platform.

  18. The HSP, the QCN, and the Dragon: Developing inquiry-based QCN instructional modules in Taiwan

    NASA Astrophysics Data System (ADS)

    Chen, K. H.; Liang, W.; Chang, C.; Yen, E.; Lin, C.; Lin, G.

    2012-12-01

    High Scope Program (HSP) is a long-term project funded by NSC in Taiwan since 2006. It is designed to elevate the quality of science education by means of incorporating emerging science and technology into the traditional curricula in senior high schools. Quake-Catcher Network (QCN), a distributed computing project initiated by Stanford University and UC Riverside, encourages the volunteers to install the low-cost, novel sensors at home and school to build a seismic network. To meet both needs, we have developed a model curriculum that introduces QCN, earthquake science, and cloud computing into high school classrooms. Through professional development workshops, Taiwan cloud-based earthquake science learning platform, and QCN club on Facebook, we have worked closely with Lan-Yang Girl's Senior High School teachers' team to design workable teaching plans through a practical operation of seismic monitoring at home or school. However, some obstacles to learning appear including QCN installation/maintain problems, high self-noise of the sensor, difficulty of introducing earthquake sciences for high school teachers. The challenges of QCN outreach in Taiwan bring out our future plans: (1) development of easy, frequently updated, physics-based QCN-experiments for high school teachers, and (2) design of an interactive learning platform with social networking function for students.

  19. An integrated system for land resources supervision based on the IoT and cloud computing

    NASA Astrophysics Data System (ADS)

    Fang, Shifeng; Zhu, Yunqiang; Xu, Lida; Zhang, Jinqu; Zhou, Peiji; Luo, Kan; Yang, Jie

    2017-01-01

    Integrated information systems are important safeguards for the utilisation and development of land resources. Information technologies, including the Internet of Things (IoT) and cloud computing, are inevitable requirements for the quality and efficiency of land resources supervision tasks. In this study, an economical and highly efficient supervision system for land resources has been established based on IoT and cloud computing technologies; a novel online and offline integrated system with synchronised internal and field data that includes the entire process of 'discovering breaches, analysing problems, verifying fieldwork and investigating cases' was constructed. The system integrates key technologies, such as the automatic extraction of high-precision information based on remote sensing, semantic ontology-based technology to excavate and discriminate public sentiment on the Internet that is related to illegal incidents, high-performance parallel computing based on MapReduce, uniform storing and compressing (bitwise) technology, global positioning system data communication and data synchronisation mode, intelligent recognition and four-level ('device, transfer, system and data') safety control technology. The integrated system based on a 'One Map' platform has been officially implemented by the Department of Land and Resources of Guizhou Province, China, and was found to significantly increase the efficiency and level of land resources supervision. The system promoted the overall development of informatisation in fields related to land resource management.

  20. Monitor weather conditions for cloud seeding control. [Colorado River Basin

    NASA Technical Reports Server (NTRS)

    Kahan, A. M. (Principal Investigator)

    1973-01-01

    The author has identified the following significant results. The near real-time DCS platform data transfer to the time-share compare is a working reality. Six stations are now being automatically monitored and displayed with a system delay of 3 to 8 hours from time of data transmission to time of data accessibility on the computer. The DCS platform system has proven itself a valuable tool for near real-time monitoring of mountain precipitation. Data from Wolf Creek Pass were an important input in making the decision when to suspend seeding operations to avoid exceeding suspension criteria in that area. The DCS platforms, as deployed in this investigation, have proven themselves to be reliable weather resistant systems for winter mountain environments in the southern Colorado mountains.

  1. A Truthful Incentive Mechanism for Online Recruitment in Mobile Crowd Sensing System.

    PubMed

    Chen, Xiao; Liu, Min; Zhou, Yaqin; Li, Zhongcheng; Chen, Shuang; He, Xiangnan

    2017-01-01

    We investigate emerging mobile crowd sensing (MCS) systems, in which new cloud-based platforms sequentially allocate homogenous sensing jobs to dynamically-arriving users with uncertain service qualities. Given that human beings are selfish in nature, it is crucial yet challenging to design an efficient and truthful incentive mechanism to encourage users to participate. To address the challenge, we propose a novel truthful online auction mechanism that can efficiently learn to make irreversible online decisions on winner selections for new MCS systems without requiring previous knowledge of users. Moreover, we theoretically prove that our incentive possesses truthfulness, individual rationality and computational efficiency. Extensive simulation results under both real and synthetic traces demonstrate that our incentive mechanism can reduce the payment of the platform, increase the utility of the platform and social welfare.

  2. Cliff Collapse Hazard from Repeated Multicopter Uav Acquisitions: Return on Experience

    NASA Astrophysics Data System (ADS)

    Dewez, T. J. B.; Leroux, J.; Morelli, S.

    2016-06-01

    Cliff collapse poses a serious hazard to infrastructure and passers-by. Obtaining information such as magnitude-frequency relationship for a specific site is of great help to adapt appropriate mitigation measures. While it is possible to monitor hundreds-of-meter-long cliff sites with ground based techniques (e.g. lidar or photogrammetry), it is both time consuming and scientifically limiting to focus on short cliff sections. In the project SUAVE, we sought to investigate whether an octocopter UAV photogrammetric survey would perform sufficiently well in order to repeatedly survey cliff face geometry and derive rock fall inventories amenable to probabilistic rock fall hazard computation. An experiment was therefore run on a well-studied site of the chalk coast of Normandy, in Mesnil Val, along the English Channel (Northern France). Two campaigns were organized in January and June 2015 which surveyed about 60 ha of coastline, including the 80-m-high cliff face, the chalk platform at its foot, and the hinterland in a matter of 4 hours from start to finish. To conform with UAV regulations, the flight was flown in 3 legs for a total of about 30 minutes in the air. A total of 868 and 1106 photos were respectively shot with a Sony NEX 7 with fixed focal 16mm. Three lines of sight were combined: horizontal shots for cliff face imaging, 45°-oblique views to tie plateau/platform photos with cliff face images, and regular vertical shots. Photogrammetrically derived dense point clouds were produced with Agisoft Photoscan at ultra-high density (median density is 1 point every 1.7cm). Point cloud density proved a critical parameter to reproduce faithfully the chalk face's geometry. Tuning down the density parameter to "high" or "medium", though efficient from a computational point of view, generated artefacts along chalk bed edges (i.e. smoothing the sharp gradient) and ultimately creating ghost volumes when computing cloud to cloud differences. Yet, from a hazard point of view, this is where small rock fall will most likely occur. Absolute orientation of both point clouds proved unsufficient despite the 30 black and white quadrants ground control point DGPS surveyed. Additional ICP was necessary to reach centimeter-level accuracy and segment rock fall scars corresponding to the expected average daily rock fall volume (ca. 0.013 m3).

  3. Applying a cloud computing approach to storage architectures for spacecraft

    NASA Astrophysics Data System (ADS)

    Baldor, Sue A.; Quiroz, Carlos; Wood, Paul

    As sensor technologies, processor speeds, and memory densities increase, spacecraft command, control, processing, and data storage systems have grown in complexity to take advantage of these improvements and expand the possible missions of spacecraft. Spacecraft systems engineers are increasingly looking for novel ways to address this growth in complexity and mitigate associated risks. Looking to conventional computing, many solutions have been executed to solve both the problem of complexity and heterogeneity in systems. In particular, the cloud-based paradigm provides a solution for distributing applications and storage capabilities across multiple platforms. In this paper, we propose utilizing a cloud-like architecture to provide a scalable mechanism for providing mass storage in spacecraft networks that can be reused on multiple spacecraft systems. By presenting a consistent interface to applications and devices that request data to be stored, complex systems designed by multiple organizations may be more readily integrated. Behind the abstraction, the cloud storage capability would manage wear-leveling, power consumption, and other attributes related to the physical memory devices, critical components in any mass storage solution for spacecraft. Our approach employs SpaceWire networks and SpaceWire-capable devices, although the concept could easily be extended to non-heterogeneous networks consisting of multiple spacecraft and potentially the ground segment.

  4. Optical fibre multi-parameter sensing with secure cloud based signal capture and processing

    NASA Astrophysics Data System (ADS)

    Newe, Thomas; O'Connell, Eoin; Meere, Damien; Yuan, Hongwei; Leen, Gabriel; O'Keeffe, Sinead; Lewis, Elfed

    2016-05-01

    Recent advancements in cloud computing technologies in the context of optical and optical fibre based systems are reported. The proliferation of real time and multi-channel based sensor systems represents significant growth in data volume. This coupled with a growing need for security presents many challenges and presents a huge opportunity for an evolutionary step in the widespread application of these sensing technologies. A tiered infrastructural system approach is adopted that is designed to facilitate the delivery of Optical Fibre-based "SENsing as a Service- SENaaS". Within this infrastructure, novel optical sensing platforms, deployed within different environments, are interfaced with a Cloud-based backbone infrastructure which facilitates the secure collection, storage and analysis of real-time data. Feedback systems, which harness this data to affect a change within the monitored location/environment/condition, are also discussed. The cloud based system presented here can also be used with chemical and physical sensors that require real-time data analysis, processing and feedback.

  5. Person detection and tracking with a 360° lidar system

    NASA Astrophysics Data System (ADS)

    Hammer, Marcus; Hebel, Marcus; Arens, Michael

    2017-10-01

    Today it is easily possible to generate dense point clouds of the sensor environment using 360° LiDAR (Light Detection and Ranging) sensors which are available since a number of years. The interpretation of these data is much more challenging. For the automated data evaluation the detection and classification of objects is a fundamental task. Especially in urban scenarios moving objects like persons or vehicles are of particular interest, for instance in automatic collision avoidance, for mobile sensor platforms or surveillance tasks. In literature there are several approaches for automated person detection in point clouds. While most techniques show acceptable results in object detection, the computation time is often crucial. The runtime can be problematic, especially due to the amount of data in the panoramic 360° point clouds. On the other hand, for most applications an object detection and classification in real time is needed. The paper presents a proposal for a fast, real-time capable algorithm for person detection, classification and tracking in panoramic point clouds.

  6. Cytobank: providing an analytics platform for community cytometry data analysis and collaboration.

    PubMed

    Chen, Tiffany J; Kotecha, Nikesh

    2014-01-01

    Cytometry is used extensively in clinical and laboratory settings to diagnose and track cell subsets in blood and tissue. High-throughput, single-cell approaches leveraging cytometry are developed and applied in the computational and systems biology communities by researchers, who seek to improve the diagnosis of human diseases, map the structures of cell signaling networks, and identify new cell types. Data analysis and management present a bottleneck in the flow of knowledge from bench to clinic. Multi-parameter flow and mass cytometry enable identification of signaling profiles of patient cell samples. Currently, this process is manual, requiring hours of work to summarize multi-dimensional data and translate these data for input into other analysis programs. In addition, the increase in the number and size of collaborative cytometry studies as well as the computational complexity of analytical tools require the ability to assemble sufficient and appropriately configured computing capacity on demand. There is a critical need for platforms that can be used by both clinical and basic researchers who routinely rely on cytometry. Recent advances provide a unique opportunity to facilitate collaboration and analysis and management of cytometry data. Specifically, advances in cloud computing and virtualization are enabling efficient use of large computing resources for analysis and backup. An example is Cytobank, a platform that allows researchers to annotate, analyze, and share results along with the underlying single-cell data.

  7. Multi-source Geospatial Data Analysis with Google Earth Engine

    NASA Astrophysics Data System (ADS)

    Erickson, T.

    2014-12-01

    The Google Earth Engine platform is a cloud computing environment for data analysis that combines a public data catalog with a large-scale computational facility optimized for parallel processing of geospatial data. The data catalog is a multi-petabyte archive of georeferenced datasets that include images from Earth observing satellite and airborne sensors (examples: USGS Landsat, NASA MODIS, USDA NAIP), weather and climate datasets, and digital elevation models. Earth Engine supports both a just-in-time computation model that enables real-time preview and debugging during algorithm development for open-ended data exploration, and a batch computation mode for applying algorithms over large spatial and temporal extents. The platform automatically handles many traditionally-onerous data management tasks, such as data format conversion, reprojection, and resampling, which facilitates writing algorithms that combine data from multiple sensors and/or models. Although the primary use of Earth Engine, to date, has been the analysis of large Earth observing satellite datasets, the computational platform is generally applicable to a wide variety of use cases that require large-scale geospatial data analyses. This presentation will focus on how Earth Engine facilitates the analysis of geospatial data streams that originate from multiple separate sources (and often communities) and how it enables collaboration during algorithm development and data exploration. The talk will highlight current projects/analyses that are enabled by this functionality.https://earthengine.google.org

  8. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Duro, Francisco Rodrigo; Blas, Javier Garcia; Isaila, Florin

    The increasing volume of scientific data and the limited scalability and performance of storage systems are currently presenting a significant limitation for the productivity of the scientific workflows running on both high-performance computing (HPC) and cloud platforms. Clearly needed is better integration of storage systems and workflow engines to address this problem. This paper presents and evaluates a novel solution that leverages codesign principles for integrating Hercules—an in-memory data store—with a workflow management system. We consider four main aspects: workflow representation, task scheduling, task placement, and task termination. As a result, the experimental evaluation on both cloud and HPC systemsmore » demonstrates significant performance and scalability improvements over existing state-of-the-art approaches.« less

  9. Many-core computing for space-based stereoscopic imaging

    NASA Astrophysics Data System (ADS)

    McCall, Paul; Torres, Gildo; LeGrand, Keith; Adjouadi, Malek; Liu, Chen; Darling, Jacob; Pernicka, Henry

    The potential benefits of using parallel computing in real-time visual-based satellite proximity operations missions are investigated. Improvements in performance and relative navigation solutions over single thread systems can be achieved through multi- and many-core computing. Stochastic relative orbit determination methods benefit from the higher measurement frequencies, allowing them to more accurately determine the associated statistical properties of the relative orbital elements. More accurate orbit determination can lead to reduced fuel consumption and extended mission capabilities and duration. Inherent to the process of stereoscopic image processing is the difficulty of loading, managing, parsing, and evaluating large amounts of data efficiently, which may result in delays or highly time consuming processes for single (or few) processor systems or platforms. In this research we utilize the Single-Chip Cloud Computer (SCC), a fully programmable 48-core experimental processor, created by Intel Labs as a platform for many-core software research, provided with a high-speed on-chip network for sharing information along with advanced power management technologies and support for message-passing. The results from utilizing the SCC platform for the stereoscopic image processing application are presented in the form of Performance, Power, Energy, and Energy-Delay-Product (EDP) metrics. Also, a comparison between the SCC results and those obtained from executing the same application on a commercial PC are presented, showing the potential benefits of utilizing the SCC in particular, and any many-core platforms in general for real-time processing of visual-based satellite proximity operations missions.

  10. From One Pixel to One Earth: Building a Living Atlas in the Cloud to Analyze and Monitor Global Patterns

    NASA Astrophysics Data System (ADS)

    Moody, D.; Brumby, S. P.; Chartrand, R.; Franco, E.; Keisler, R.; Kelton, T.; Kontgis, C.; Mathis, M.; Raleigh, D.; Rudelis, X.; Skillman, S.; Warren, M. S.; Longbotham, N.

    2016-12-01

    The recent computing performance revolution has driven improvements in sensor, communication, and storage technology. Historical, multi-decadal remote sensing datasets at the petabyte scale are now available in commercial clouds, with new satellite constellations generating petabytes per year of high-resolution imagery with daily global coverage. Cloud computing and storage, combined with recent advances in machine learning and open software, are enabling understanding of the world at an unprecedented scale and detail. We have assembled all available satellite imagery from the USGS Landsat, NASA MODIS, and ESA Sentinel programs, as well as commercial PlanetScope and RapidEye imagery, and have analyzed over 2.8 quadrillion multispectral pixels. We leveraged the commercial cloud to generate a tiled, spatio-temporal mosaic of the Earth for fast iteration and development of new algorithms combining analysis techniques from remote sensing, machine learning, and scalable compute infrastructure. Our data platform enables processing at petabytes per day rates using multi-source data to produce calibrated, georeferenced imagery stacks at desired points in time and space that can be used for pixel level or global scale analysis. We demonstrate our data platform capability by using the European Space Agency's (ESA) published 2006 and 2009 GlobCover 20+ category label maps to train and test a Land Cover Land Use (LCLU) classifier, and generate current self-consistent LCLU maps in Brazil. We train a standard classifier on 2006 GlobCover categories using temporal imagery stacks, and we validate our results on co-registered 2009 Globcover LCLU maps and 2009 imagery. We then extend the derived LCLU model to current imagery stacks to generate an updated, in-season label map. Changes in LCLU labels can now be seamlessly monitored for a given location across the years in order to track, for example, cropland expansion, forest growth, and urban developments. An example of change monitoring is illustrated in the included figure showing rainfed cropland change in the Mato Grosso region of Brazil between 2006 and 2009.

  11. GISpark: A Geospatial Distributed Computing Platform for Spatiotemporal Big Data

    NASA Astrophysics Data System (ADS)

    Wang, S.; Zhong, E.; Wang, E.; Zhong, Y.; Cai, W.; Li, S.; Gao, S.

    2016-12-01

    Geospatial data are growing exponentially because of the proliferation of cost effective and ubiquitous positioning technologies such as global remote-sensing satellites and location-based devices. Analyzing large amounts of geospatial data can provide great value for both industrial and scientific applications. Data- and compute- intensive characteristics inherent in geospatial big data increasingly pose great challenges to technologies of data storing, computing and analyzing. Such challenges require a scalable and efficient architecture that can store, query, analyze, and visualize large-scale spatiotemporal data. Therefore, we developed GISpark - a geospatial distributed computing platform for processing large-scale vector, raster and stream data. GISpark is constructed based on the latest virtualized computing infrastructures and distributed computing architecture. OpenStack and Docker are used to build multi-user hosting cloud computing infrastructure for GISpark. The virtual storage systems such as HDFS, Ceph, MongoDB are combined and adopted for spatiotemporal data storage management. Spark-based algorithm framework is developed for efficient parallel computing. Within this framework, SuperMap GIScript and various open-source GIS libraries can be integrated into GISpark. GISpark can also integrated with scientific computing environment (e.g., Anaconda), interactive computing web applications (e.g., Jupyter notebook), and machine learning tools (e.g., TensorFlow/Orange). The associated geospatial facilities of GISpark in conjunction with the scientific computing environment, exploratory spatial data analysis tools, temporal data management and analysis systems make up a powerful geospatial computing tool. GISpark not only provides spatiotemporal big data processing capacity in the geospatial field, but also provides spatiotemporal computational model and advanced geospatial visualization tools that deals with other domains related with spatial property. We tested the performance of the platform based on taxi trajectory analysis. Results suggested that GISpark achieves excellent run time performance in spatiotemporal big data applications.

  12. Leveraging the Cloud for Robust and Efficient Lunar Image Processing

    NASA Technical Reports Server (NTRS)

    Chang, George; Malhotra, Shan; Wolgast, Paul

    2011-01-01

    The Lunar Mapping and Modeling Project (LMMP) is tasked to aggregate lunar data, from the Apollo era to the latest instruments on the LRO spacecraft, into a central repository accessible by scientists and the general public. A critical function of this task is to provide users with the best solution for browsing the vast amounts of imagery available. The image files LMMP manages range from a few gigabytes to hundreds of gigabytes in size with new data arriving every day. Despite this ever-increasing amount of data, LMMP must make the data readily available in a timely manner for users to view and analyze. This is accomplished by tiling large images into smaller images using Hadoop, a distributed computing software platform implementation of the MapReduce framework, running on a small cluster of machines locally. Additionally, the software is implemented to use Amazon's Elastic Compute Cloud (EC2) facility. We also developed a hybrid solution to serve images to users by leveraging cloud storage using Amazon's Simple Storage Service (S3) for public data while keeping private information on our own data servers. By using Cloud Computing, we improve upon our local solution by reducing the need to manage our own hardware and computing infrastructure, thereby reducing costs. Further, by using a hybrid of local and cloud storage, we are able to provide data to our users more efficiently and securely. 12 This paper examines the use of a distributed approach with Hadoop to tile images, an approach that provides significant improvements in image processing time, from hours to minutes. This paper describes the constraints imposed on the solution and the resulting techniques developed for the hybrid solution of a customized Hadoop infrastructure over local and cloud resources in managing this ever-growing data set. It examines the performance trade-offs of using the more plentiful resources of the cloud, such as those provided by S3, against the bandwidth limitations such use encounters with remote resources. As part of this discussion this paper will outline some of the technologies employed, the reasons for their selection, the resulting performance metrics and the direction the project is headed based upon the demonstrated capabilities thus far.

  13. Analysis of Laogang energy internet and construction of the cloud platform

    NASA Astrophysics Data System (ADS)

    Wang, Selan; Nie, Jianwen; Zhang, Daiyue; Li, Xia; Tai, Jun; Yu, Zhaohui; Lu, Yiqi; Xie, Da

    2018-02-01

    Laogang solid waste recycling base deals with about 70% waste domestic garbage of Shanghai every day. By recycling the garbage, great amount of energy including electricity, heat and gas can be produced. Meanwhile, the base itself consumes much energy as well. Therefore, an energy internet has been designed for the base to analyse the output and usage of the energy so that the energy utilization rate can be enhanced. In addition, a cloud platform has been established basing on the three-layer cloud technology: IaaS, PaaS and SaaS. This cloud platform mainly analysing electricity will judge whether the energy has been used suitably form all sides and furthermore, improve the operation of the whole energy internet in the base.

  14. TERRA REF: Advancing phenomics with high resolution, open access sensor and genomics data

    NASA Astrophysics Data System (ADS)

    LeBauer, D.; Kooper, R.; Burnette, M.; Willis, C.

    2017-12-01

    Automated plant measurement has the potential to improve understanding of genetic and environmental controls on plant traits (phenotypes). The application of sensors and software in the automation of high throughput phenotyping reflects a fundamental shift from labor intensive hand measurements to drone, tractor, and robot mounted sensing platforms. These tools are expected to speed the rate of crop improvement by enabling plant breeders to more accurately select plants with improved yields, resource use efficiency, and stress tolerance. However, there are many challenges facing high throughput phenomics: sensors and platforms are expensive, currently there are few standard methods of data collection and storage, and the analysis of large data sets requires high performance computers and automated, reproducible computing pipelines. To overcome these obstacles and advance the science of high throughput phenomics, the TERRA Phenotyping Reference Platform (TERRA-REF) team is developing an open-access database of high resolution sensor data. TERRA REF is an integrated field and greenhouse phenotyping system that includes: a reference field scanner with fifteen sensors that can generate terrabytes of data each day at mm resolution; UAV, tractor, and fixed field sensing platforms; and an automated controlled-environment scanner. These platforms will enable investigation of diverse sensing modalities, and the investigation of traits under controlled and field environments. It is the goal of TERRA REF to lower the barrier to entry for academic and industry researchers by providing high-resolution data, open source software, and online computing resources. Our project is unique in that all data will be made fully public in November 2018, and is already available to early adopters through the beta-user program. We will describe the datasets and how to use them as well as the databases and computing pipeline and how these can be reused and remixed in other phenomics pipelines. Finally, we will describe the National Data Service workbench, a cloud computing platform that can access the petabyte scale data while supporting reproducible research.

  15. Geospatial Applications on Different Parallel and Distributed Systems in enviroGRIDS Project

    NASA Astrophysics Data System (ADS)

    Rodila, D.; Bacu, V.; Gorgan, D.

    2012-04-01

    The execution of Earth Science applications and services on parallel and distributed systems has become a necessity especially due to the large amounts of Geospatial data these applications require and the large geographical areas they cover. The parallelization of these applications comes to solve important performance issues and can spread from task parallelism to data parallelism as well. Parallel and distributed architectures such as Grid, Cloud, Multicore, etc. seem to offer the necessary functionalities to solve important problems in the Earth Science domain: storing, distribution, management, processing and security of Geospatial data, execution of complex processing through task and data parallelism, etc. A main goal of the FP7-funded project enviroGRIDS (Black Sea Catchment Observation and Assessment System supporting Sustainable Development) [1] is the development of a Spatial Data Infrastructure targeting this catchment region but also the development of standardized and specialized tools for storing, analyzing, processing and visualizing the Geospatial data concerning this area. For achieving these objectives, the enviroGRIDS deals with the execution of different Earth Science applications, such as hydrological models, Geospatial Web services standardized by the Open Geospatial Consortium (OGC) and others, on parallel and distributed architecture to maximize the obtained performance. This presentation analysis the integration and execution of Geospatial applications on different parallel and distributed architectures and the possibility of choosing among these architectures based on application characteristics and user requirements through a specialized component. Versions of the proposed platform have been used in enviroGRIDS project on different use cases such as: the execution of Geospatial Web services both on Web and Grid infrastructures [2] and the execution of SWAT hydrological models both on Grid and Multicore architectures [3]. The current focus is to integrate in the proposed platform the Cloud infrastructure, which is still a paradigm with critical problems to be solved despite the great efforts and investments. Cloud computing comes as a new way of delivering resources while using a large set of old as well as new technologies and tools for providing the necessary functionalities. The main challenges in the Cloud computing, most of them identified also in the Open Cloud Manifesto 2009, address resource management and monitoring, data and application interoperability and portability, security, scalability, software licensing, etc. We propose a platform able to execute different Geospatial applications on different parallel and distributed architectures such as Grid, Cloud, Multicore, etc. with the possibility of choosing among these architectures based on application characteristics and complexity, user requirements, necessary performances, cost support, etc. The execution redirection on a selected architecture is realized through a specialized component and has the purpose of offering a flexible way in achieving the best performances considering the existing restrictions.

  16. A wearable computing platform for developing cloud-based machine learning models for health monitoring applications.

    PubMed

    Patel, Shyamal; McGinnis, Ryan S; Silva, Ikaro; DiCristofaro, Steve; Mahadevan, Nikhil; Jortberg, Elise; Franco, Jaime; Martin, Albert; Lust, Joseph; Raj, Milan; McGrane, Bryan; DePetrillo, Paolo; Aranyosi, A J; Ceruolo, Melissa; Pindado, Jesus; Ghaffari, Roozbeh

    2016-08-01

    Wearable sensors have the potential to enable clinical-grade ambulatory health monitoring outside the clinic. Technological advances have enabled development of devices that can measure vital signs with great precision and significant progress has been made towards extracting clinically meaningful information from these devices in research studies. However, translating measurement accuracies achieved in the controlled settings such as the lab and clinic to unconstrained environments such as the home remains a challenge. In this paper, we present a novel wearable computing platform for unobtrusive collection of labeled datasets and a new paradigm for continuous development, deployment and evaluation of machine learning models to ensure robust model performance as we transition from the lab to home. Using this system, we train activity classification models across two studies and track changes in model performance as we go from constrained to unconstrained settings.

  17. The Combat Cloud: Enabling Multi-Domain Command and Control Across the Range of Military Operations

    DTIC Science & Technology

    2017-03-01

    and joint by their very nature.3 The Combat Cloud architecture will enable MDC2 by increasing the interoperability of existing networks...order to provide operating platforms with a robust architecture that communicates with relevant players, operates at reduced levels of connectivity...responsibility or aircraft platform, and a Combat Cloud architecture helps focus thought toward achieving efficient MDC2 and effects rather than

  18. iRODS-Based Climate Data Services and Virtualization-as-a-Service in the NASA Center for Climate Simulation

    NASA Astrophysics Data System (ADS)

    Schnase, J. L.; Duffy, D. Q.; Tamkin, G. S.; Strong, S.; Ripley, D.; Gill, R.; Sinno, S. S.; Shen, Y.; Carriere, L. E.; Brieger, L.; Moore, R.; Rajasekar, A.; Schroeder, W.; Wan, M.

    2011-12-01

    Scientific data services are becoming an important part of the NASA Center for Climate Simulation's mission. Our technological response to this expanding role is built around the concept of specialized virtual climate data servers, repetitive cloud provisioning, image-based deployment and distribution, and virtualization-as-a-service. A virtual climate data server is an OAIS-compliant, iRODS-based data server designed to support a particular type of scientific data collection. iRODS is data grid middleware that provides policy-based control over collection-building, managing, querying, accessing, and preserving large scientific data sets. We have developed prototype vCDSs to manage NetCDF, HDF, and GeoTIF data products. We use RPM scripts to build vCDS images in our local computing environment, our local Virtual Machine Environment, NASA's Nebula Cloud Services, and Amazon's Elastic Compute Cloud. Once provisioned into these virtualized resources, multiple vCDSs can use iRODS's federation and realized object capabilities to create an integrated ecosystem of data servers that can scale and adapt to changing requirements. This approach enables platform- or software-as-a-service deployment of the vCDSs and allows the NCCS to offer virtualization-as-a-service, a capacity to respond in an agile way to new customer requests for data services, and a path for migrating existing services into the cloud. We have registered MODIS Atmosphere data products in a vCDS that contains 54 million registered files, 630TB of data, and over 300 million metadata values. We are now assembling IPCC AR5 data into a production vCDS that will provide the platform upon which NCCS's Earth System Grid (ESG) node publishes to the extended science community. In this talk, we describe our approach, experiences, lessons learned, and plans for the future.

  19. Radiative transfer model for aerosols at infrared wavelengths for passive remote sensing applications: revisited.

    PubMed

    Ben-David, Avishai; Davidson, Charles E; Embury, Janon F

    2008-11-01

    We introduced a two-dimensional radiative transfer model for aerosols in the thermal infrared [Appl. Opt.45, 6860-6875 (2006)APOPAI0003-693510.1364/AO.45.006860]. In that paper we superimposed two orthogonal plane-parallel layers to compute the radiance due to a two-dimensional (2D) rectangular aerosol cloud. In this paper we revisit the model and correct an error in the interaction of the two layers. We derive new expressions relating to the signal content of the radiance from an aerosol cloud based on the concept of five directional thermal contrasts: four for the 2D diffuse radiance and one for direct radiance along the line of sight. The new expressions give additional insight on the radiative transfer processes within the cloud. Simulations for Bacillus subtilis var. niger (BG) bioaerosol and dustlike kaolin aerosol clouds are compared and contrasted for two geometries: an airborne sensor looking down and a ground-based sensor looking up. Simulation results suggest that aerosol cloud detection from an airborne platform may be more challenging than for a ground-based sensor and that the detection of an aerosol cloud in emission mode (negative direct thermal contrast) is not the same as the detection of an aerosol cloud in absorption mode (positive direct thermal contrast).

  20. Vision 20/20: Automation and advanced computing in clinical radiation oncology.

    PubMed

    Moore, Kevin L; Kagadis, George C; McNutt, Todd R; Moiseenko, Vitali; Mutic, Sasa

    2014-01-01

    This Vision 20/20 paper considers what computational advances are likely to be implemented in clinical radiation oncology in the coming years and how the adoption of these changes might alter the practice of radiotherapy. Four main areas of likely advancement are explored: cloud computing, aggregate data analyses, parallel computation, and automation. As these developments promise both new opportunities and new risks to clinicians and patients alike, the potential benefits are weighed against the hazards associated with each advance, with special considerations regarding patient safety under new computational platforms and methodologies. While the concerns of patient safety are legitimate, the authors contend that progress toward next-generation clinical informatics systems will bring about extremely valuable developments in quality improvement initiatives, clinical efficiency, outcomes analyses, data sharing, and adaptive radiotherapy.

  1. Enhancing data utilization through adoption of cloud-based data architectures (Invited Paper 211869)

    NASA Astrophysics Data System (ADS)

    Kearns, E. J.

    2017-12-01

    A traditional approach to data distribution and utilization of open government data involves continuously moving those data from a central government location to each potential user, who would then utilize them on their local computer systems. An alternate approach would be to bring those users to the open government data, where users would also have access to computing and analytics capabilities that would support data utilization. NOAA's Big Data Project is exploring such an alternate approach through an experimental collaboration with Amazon Web Services, Google Cloud Platform, IBM, Microsoft Azure, and the Open Commons Consortium. As part of this ongoing experiment, NOAA is providing open data of interest which are freely hosted by the Big Data Project Collaborators, who provide a variety of cloud-based services and capabilities to enable utilization by data users. By the terms of the agreement, the Collaborators may charge for those value-added services and processing capacities to recover their costs to freely host the data and to generate profits if so desired. Initial results have shown sustained increases in data utilization from 2 to over 100 times previously-observed access patterns from traditional approaches. Significantly increased utilization speed as compared to the traditional approach has also been observed by NOAA data users who have volunteered their experiences on these cloud-based systems. The potential for implementing and sustaining the alternate cloud-based approach as part of a change in operational data utilization strategies will be discussed.

  2. Genetic Constructor: An Online DNA Design Platform.

    PubMed

    Bates, Maxwell; Lachoff, Joe; Meech, Duncan; Zulkower, Valentin; Moisy, Anaïs; Luo, Yisha; Tekotte, Hille; Franziska Scheitz, Cornelia Johanna; Khilari, Rupal; Mazzoldi, Florencio; Chandran, Deepak; Groban, Eli

    2017-12-15

    Genetic Constructor is a cloud Computer Aided Design (CAD) application developed to support synthetic biologists from design intent through DNA fabrication and experiment iteration. The platform allows users to design, manage, and navigate complex DNA constructs and libraries, using a new visual language that focuses on functional parts abstracted from sequence. Features like combinatorial libraries and automated primer design allow the user to separate design from construction by focusing on functional intent, and design constraints aid iterative refinement of designs. A plugin architecture enables contributions from scientists and coders to leverage existing powerful software and connect to DNA foundries. The software is easily accessible and platform agnostic, free for academics, and available in an open-source community edition. Genetic Constructor seeks to democratize DNA design, manufacture, and access to tools and services from the synthetic biology community.

  3. A Truthful Incentive Mechanism for Online Recruitment in Mobile Crowd Sensing System

    PubMed Central

    Chen, Xiao; Liu, Min; Zhou, Yaqin; Li, Zhongcheng; Chen, Shuang; He, Xiangnan

    2017-01-01

    We investigate emerging mobile crowd sensing (MCS) systems, in which new cloud-based platforms sequentially allocate homogenous sensing jobs to dynamically-arriving users with uncertain service qualities. Given that human beings are selfish in nature, it is crucial yet challenging to design an efficient and truthful incentive mechanism to encourage users to participate. To address the challenge, we propose a novel truthful online auction mechanism that can efficiently learn to make irreversible online decisions on winner selections for new MCS systems without requiring previous knowledge of users. Moreover, we theoretically prove that our incentive possesses truthfulness, individual rationality and computational efficiency. Extensive simulation results under both real and synthetic traces demonstrate that our incentive mechanism can reduce the payment of the platform, increase the utility of the platform and social welfare. PMID:28045441

  4. Architecture Design and Experimental Platform Demonstration of Optical Network based on OpenFlow Protocol

    NASA Astrophysics Data System (ADS)

    Xing, Fangyuan; Wang, Honghuan; Yin, Hongxi; Li, Ming; Luo, Shenzi; Wu, Chenguang

    2016-02-01

    With the extensive application of cloud computing and data centres, as well as the constantly emerging services, the big data with the burst characteristic has brought huge challenges to optical networks. Consequently, the software defined optical network (SDON) that combines optical networks with software defined network (SDN), has attracted much attention. In this paper, an OpenFlow-enabled optical node employed in optical cross-connect (OXC) and reconfigurable optical add/drop multiplexer (ROADM), is proposed. An open source OpenFlow controller is extended on routing strategies. In addition, the experiment platform based on OpenFlow protocol for software defined optical network, is designed. The feasibility and availability of the OpenFlow-enabled optical nodes and the extended OpenFlow controller are validated by the connectivity test, protection switching and load balancing experiments in this test platform.

  5. Calibration of UAS imagery inside and outside of shadows for improved vegetation index computation

    NASA Astrophysics Data System (ADS)

    Bondi, Elizabeth; Salvaggio, Carl; Montanaro, Matthew; Gerace, Aaron D.

    2016-05-01

    Vegetation health and vigor can be assessed with data from multi- and hyperspectral airborne and satellite- borne sensors using index products such as the normalized difference vegetation index (NDVI). Recent advances in unmanned aerial systems (UAS) technology have created the opportunity to access these same image data sets in a more cost effective manner with higher temporal and spatial resolution. Another advantage of these systems includes the ability to gather data in almost any weather condition, including complete cloud cover, when data has not been available before from traditional platforms. The ability to collect in these varied conditions, meteorological and temporal, will present researchers and producers with many new challenges. Particularly, cloud shadows and self-shadowing by vegetation must be taken into consideration in imagery collected from UAS platforms to avoid variation in NDVI due to changes in illumination within a single scene, and between collection flights. A workflow is presented to compensate for variations in vegetation indices due to shadows and variation in illumination levels in high resolution imagery collected from UAS platforms. Other calibration methods that producers may currently be utilizing produce NDVI products that still contain shadow boundaries and variations due to illumination, whereas the final NDVI mosaic from this workflow does not.

  6. Curating Big Data Made Simple: Perspectives from Scientific Communities.

    PubMed

    Sowe, Sulayman K; Zettsu, Koji

    2014-03-01

    The digital universe is exponentially producing an unprecedented volume of data that has brought benefits as well as fundamental challenges for enterprises and scientific communities alike. This trend is inherently exciting for the development and deployment of cloud platforms to support scientific communities curating big data. The excitement stems from the fact that scientists can now access and extract value from the big data corpus, establish relationships between bits and pieces of information from many types of data, and collaborate with a diverse community of researchers from various domains. However, despite these perceived benefits, to date, little attention is focused on the people or communities who are both beneficiaries and, at the same time, producers of big data. The technical challenges posed by big data are as big as understanding the dynamics of communities working with big data, whether scientific or otherwise. Furthermore, the big data era also means that big data platforms for data-intensive research must be designed in such a way that research scientists can easily search and find data for their research, upload and download datasets for onsite/offsite use, perform computations and analysis, share their findings and research experience, and seamlessly collaborate with their colleagues. In this article, we present the architecture and design of a cloud platform that meets some of these requirements, and a big data curation model that describes how a community of earth and environmental scientists is using the platform to curate data. Motivation for developing the platform, lessons learnt in overcoming some challenges associated with supporting scientists to curate big data, and future research directions are also presented.

  7. Analysis and design of energy monitoring platform for smart city

    NASA Astrophysics Data System (ADS)

    Wang, Hong-xia

    2016-09-01

    The development and utilization of energy has greatly promoted the development and progress of human society. It is the basic material foundation for human survival. City running is bound to consume energy inevitably, but it also brings a lot of waste discharge. In order to speed up the process of smart city, improve the efficiency of energy saving and emission reduction work, maintain the green and livable environment, a comprehensive management platform of energy monitoring for government departments is constructed based on cloud computing technology and 3-tier architecture in this paper. It is assumed that the system will provide scientific guidance for the environment management and decision making in smart city.

  8. DSCOVR_EPIC_L2_CLOUD_01

    Atmospheric Science Data Center

    2018-06-20

    ... V1 Level:  L2 Platform:  DEEP SPACE CLIMATE OBSERVATORY Instrument:  Enhanced Polychromatic ... assuming ice phase Cloud Optical Thickness – assuming liquid phase EPIC Cloud Mask Oxygen A-band Cloud Effective Height (in ...

  9. NASA's Integrated Instrument Simulator Suite for Atmospheric Remote Sensing from Spaceborne Platforms (ISSARS) and Its Role for the ACE and GPM Missions

    NASA Technical Reports Server (NTRS)

    Tanelli, Simone; Tao, Wei-Kuo; Hostetler, Chris; Kuo, Kwo-Sen; Matsui, Toshihisa; Jacob, Joseph C.; Niamsuwam, Noppasin; Johnson, Michael P.; Hair, John; Butler, Carolyn; hide

    2011-01-01

    Forward simulation is an indispensable tool for evaluation of precipitation retrieval algorithms as well as for studying snow/ice microphysics and their radiative properties. The main challenge of the implementation arises due to the size of the problem domain. To overcome this hurdle, assumptions need to be made to simplify compiles cloud microphysics. It is important that these assumptions are applied consistently throughout the simulation process. ISSARS addresses this issue by providing a computationally efficient and modular framework that can integrate currently existing models and is also capable of expanding for future development. ISSARS is designed to accommodate the simulation needs of the Aerosol/Clouds/Ecosystems (ACE) mission and the Global Precipitation Measurement (GPM) mission: radars, microwave radiometers, and optical instruments such as lidars and polarimeter. ISSARS's computation is performed in three stages: input reconditioning (IRM), electromagnetic properties (scattering/emission/absorption) calculation (SEAM), and instrument simulation (ISM). The computation is implemented as a web service while its configuration can be accessed through a web-based interface.

  10. Advanced Visualization and Interactive Display Rapid Innovation and Discovery Evaluation Research (VISRIDER) Program Task 6: Point Cloud Visualization Techniques for Desktop and Web Platforms

    DTIC Science & Technology

    2017-04-01

    ADVANCED VISUALIZATION AND INTERACTIVE DISPLAY RAPID INNOVATION AND DISCOVERY EVALUATION RESEARCH (VISRIDER) PROGRAM TASK 6: POINT CLOUD...To) OCT 2013 – SEP 2014 4. TITLE AND SUBTITLE ADVANCED VISUALIZATION AND INTERACTIVE DISPLAY RAPID INNOVATION AND DISCOVERY EVALUATION RESEARCH...various point cloud visualization techniques for viewing large scale LiDAR datasets. Evaluate their potential use for thick client desktop platforms

  11. Cloud manufacturing: from concept to practice

    NASA Astrophysics Data System (ADS)

    Ren, Lei; Zhang, Lin; Tao, Fei; Zhao, Chun; Chai, Xudong; Zhao, Xinpei

    2015-02-01

    The concept of cloud manufacturing is emerging as a new promising manufacturing paradigm, as well as a business model, which is reshaping the service-oriented, highly collaborative, knowledge-intensive and eco-efficient manufacturing industry. However, the basic concepts about cloud manufacturing are still in discussion. Both academia and industry will need to have a commonly accepted definition of cloud manufacturing, as well as further guidance and recommendations on how to develop and implement cloud manufacturing. In this paper, we review some of the research work and clarify some fundamental terminologies in this field. Further, we developed a cloud manufacturing systems which may serve as an application example. From a systematic and practical perspective, the key requirements of cloud manufacturing platforms are investigated, and then we propose a cloud manufacturing platform prototype, MfgCloud. Finally, a public cloud manufacturing system for small- and medium-sized enterprises (SME) is presented. This paper presents a new perspective for cloud manufacturing, as well as a cloud-to-ground solution. The integrated solution proposed in this paper, including the terminology, MfgCloud, and applications, can push forward this new paradigm from concept to practice.

  12. The International Symposium on Grids and Clouds

    NASA Astrophysics Data System (ADS)

    The International Symposium on Grids and Clouds (ISGC) 2012 will be held at Academia Sinica in Taipei from 26 February to 2 March 2012, with co-located events and workshops. The conference is hosted by the Academia Sinica Grid Computing Centre (ASGC). 2012 is the decennium anniversary of the ISGC which over the last decade has tracked the convergence, collaboration and innovation of individual researchers across the Asia Pacific region to a coherent community. With the continuous support and dedication from the delegates, ISGC has provided the primary international distributed computing platform where distinguished researchers and collaboration partners from around the world share their knowledge and experiences. The last decade has seen the wide-scale emergence of e-Infrastructure as a critical asset for the modern e-Scientist. The emergence of large-scale research infrastructures and instruments that has produced a torrent of electronic data is forcing a generational change in the scientific process and the mechanisms used to analyse the resulting data deluge. No longer can the processing of these vast amounts of data and production of relevant scientific results be undertaken by a single scientist. Virtual Research Communities that span organisations around the world, through an integrated digital infrastructure that connects the trust and administrative domains of multiple resource providers, have become critical in supporting these analyses. Topics covered in ISGC 2012 include: High Energy Physics, Biomedicine & Life Sciences, Earth Science, Environmental Changes and Natural Disaster Mitigation, Humanities & Social Sciences, Operations & Management, Middleware & Interoperability, Security and Networking, Infrastructure Clouds & Virtualisation, Business Models & Sustainability, Data Management, Distributed Volunteer & Desktop Grid Computing, High Throughput Computing, and High Performance, Manycore & GPU Computing.

  13. Exploring the factors influencing the cloud computing adoption: a systematic study on cloud migration.

    PubMed

    Rai, Rashmi; Sahoo, Gadadhar; Mehfuz, Shabana

    2015-01-01

    Today, most of the organizations trust on their age old legacy applications, to support their business-critical systems. However, there are several critical concerns, as maintainability and scalability issues, associated with the legacy system. In this background, cloud services offer a more agile and cost effective platform, to support business applications and IT infrastructure. As the adoption of cloud services has been increasing recently and so has been the academic research in cloud migration. However, there is a genuine need of secondary study to further strengthen this research. The primary objective of this paper is to scientifically and systematically identify, categorize and compare the existing research work in the area of legacy to cloud migration. The paper has also endeavored to consolidate the research on Security issues, which is prime factor hindering the adoption of cloud through classifying the studies on secure cloud migration. SLR (Systematic Literature Review) of thirty selected papers, published from 2009 to 2014 was conducted to properly understand the nuances of the security framework. To categorize the selected studies, authors have proposed a conceptual model for cloud migration which has resulted in a resource base of existing solutions for cloud migration. This study concludes that cloud migration research is in seminal stage but simultaneously it is also evolving and maturing, with increasing participation from academics and industry alike. The paper also identifies the need for a secure migration model, which can fortify organization's trust into cloud migration and facilitate necessary tool support to automate the migration process.

  14. Cloud Computing

    DTIC Science & Technology

    2010-04-29

    Cloud Computing   The answer, my friend, is blowing in the wind.   The answer is blowing in the wind. 1Bingue ‐ Cook  Cloud   Computing  STSC 2010... Cloud   Computing  STSC 2010 Objectives • Define the cloud    • Risks of  cloud   computing f l d i• Essence o  c ou  comput ng • Deployed clouds in DoD 3Bingue...Cook  Cloud   Computing  STSC 2010 Definitions of Cloud Computing       Cloud   computing  is a model for enabling  b d d ku

  15. A cloud-based system for measuring radiation treatment plan similarity

    NASA Astrophysics Data System (ADS)

    Andrea, Jennifer

    PURPOSE: Radiation therapy is used to treat cancer using carefully designed plans that maximize the radiation dose delivered to the target and minimize damage to healthy tissue, with the dose administered over multiple occasions. Creating treatment plans is a laborious process and presents an obstacle to more frequent replanning, which remains an unsolved problem. However, in between new plans being created, the patient's anatomy can change due to multiple factors including reduction in tumor size and loss of weight, which results in poorer patient outcomes. Cloud computing is a newer technology that is slowly being used for medical applications with promising results. The objective of this work was to design and build a system that could analyze a database of previously created treatment plans, which are stored with their associated anatomical information in studies, to find the one with the most similar anatomy to a new patient. The analyses would be performed in parallel on the cloud to decrease the computation time of finding this plan. METHODS: The system used SlicerRT, a radiation therapy toolkit for the open-source platform 3D Slicer, for its tools to perform the similarity analysis algorithm. Amazon Web Services was used for the cloud instances on which the analyses were performed, as well as for storage of the radiation therapy studies and messaging between the instances and a master local computer. A module was built in SlicerRT to provide the user with an interface to direct the system on the cloud, as well as to perform other related tasks. RESULTS: The cloud-based system out-performed previous methods of conducting the similarity analyses in terms of time, as it analyzed 100 studies in approximately 13 minutes, and produced the same similarity values as those methods. It also scaled up to larger numbers of studies to analyze in the database with a small increase in computation time of just over 2 minutes. CONCLUSION: This system successfully analyzes a large database of radiation therapy studies and finds the one that is most similar to a new patient, which represents a potential step forward in achieving feasible adaptive radiation therapy replanning.

  16. Developing a Cloud-Based Online Geospatial Information Sharing and Geoprocessing Platform to Facilitate Collaborative Education and Research

    NASA Astrophysics Data System (ADS)

    Yang, Z. L.; Cao, J.; Hu, K.; Gui, Z. P.; Wu, H. Y.; You, L.

    2016-06-01

    Efficient online discovering and applying geospatial information resources (GIRs) is critical in Earth Science domain as while for cross-disciplinary applications. However, to achieve it is challenging due to the heterogeneity, complexity and privacy of online GIRs. In this article, GeoSquare, a collaborative online geospatial information sharing and geoprocessing platform, was developed to tackle this problem. Specifically, (1) GIRs registration and multi-view query functions allow users to publish and discover GIRs more effectively. (2) Online geoprocessing and real-time execution status checking help users process data and conduct analysis without pre-installation of cumbersome professional tools on their own machines. (3) A service chain orchestration function enables domain experts to contribute and share their domain knowledge with community members through workflow modeling. (4) User inventory management allows registered users to collect and manage their own GIRs, monitor their execution status, and track their own geoprocessing histories. Besides, to enhance the flexibility and capacity of GeoSquare, distributed storage and cloud computing technologies are employed. To support interactive teaching and training, GeoSquare adopts the rich internet application (RIA) technology to create user-friendly graphical user interface (GUI). Results show that GeoSquare can integrate and foster collaboration between dispersed GIRs, computing resources and people. Subsequently, educators and researchers can share and exchange resources in an efficient and harmonious way.

  17. Cloud Computing: A model Construct of Real-Time Monitoring for Big Dataset Analytics Using Apache Spark

    NASA Astrophysics Data System (ADS)

    Alkasem, Ameen; Liu, Hongwei; Zuo, Decheng; Algarash, Basheer

    2018-01-01

    The volume of data being collected, analyzed, and stored has exploded in recent years, in particular in relation to the activity on the cloud computing. While large-scale data processing, analysis, storage, and platform model such as cloud computing were previously and currently are increasingly. Today, the major challenge is it address how to monitor and control these massive amounts of data and perform analysis in real-time at scale. The traditional methods and model systems are unable to cope with these quantities of data in real-time. Here we present a new methodology for constructing a model for optimizing the performance of real-time monitoring of big datasets, which includes a machine learning algorithms and Apache Spark Streaming to accomplish fine-grained fault diagnosis and repair of big dataset. As a case study, we use the failure of Virtual Machines (VMs) to start-up. The methodology proposition ensures that the most sensible action is carried out during the procedure of fine-grained monitoring and generates the highest efficacy and cost-saving fault repair through three construction control steps: (I) data collection; (II) analysis engine and (III) decision engine. We found that running this novel methodology can save a considerate amount of time compared to the Hadoop model, without sacrificing the classification accuracy or optimization of performance. The accuracy of the proposed method (92.13%) is an improvement on traditional approaches.

  18. A Chaotic Particle Swarm Optimization-Based Heuristic for Market-Oriented Task-Level Scheduling in Cloud Workflow Systems.

    PubMed

    Li, Xuejun; Xu, Jia; Yang, Yun

    2015-01-01

    Cloud workflow system is a kind of platform service based on cloud computing. It facilitates the automation of workflow applications. Between cloud workflow system and its counterparts, market-oriented business model is one of the most prominent factors. The optimization of task-level scheduling in cloud workflow system is a hot topic. As the scheduling is a NP problem, Ant Colony Optimization (ACO) and Particle Swarm Optimization (PSO) have been proposed to optimize the cost. However, they have the characteristic of premature convergence in optimization process and therefore cannot effectively reduce the cost. To solve these problems, Chaotic Particle Swarm Optimization (CPSO) algorithm with chaotic sequence and adaptive inertia weight factor is applied to present the task-level scheduling. Chaotic sequence with high randomness improves the diversity of solutions, and its regularity assures a good global convergence. Adaptive inertia weight factor depends on the estimate value of cost. It makes the scheduling avoid premature convergence by properly balancing between global and local exploration. The experimental simulation shows that the cost obtained by our scheduling is always lower than the other two representative counterparts.

  19. Design Patterns to Achieve 300x Speedup for Oceanographic Analytics in the Cloud

    NASA Astrophysics Data System (ADS)

    Jacob, J. C.; Greguska, F. R., III; Huang, T.; Quach, N.; Wilson, B. D.

    2017-12-01

    We describe how we achieve super-linear speedup over standard approaches for oceanographic analytics on a cluster computer and the Amazon Web Services (AWS) cloud. NEXUS is an open source platform for big data analytics in the cloud that enables this performance through a combination of horizontally scalable data parallelism with Apache Spark and rapid data search, subset, and retrieval with tiled array storage in cloud-aware NoSQL databases like Solr and Cassandra. NEXUS is the engine behind several public portals at NASA and OceanWorks is a newly funded project for the ocean community that will mature and extend this capability for improved data discovery, subset, quality screening, analysis, matchup of satellite and in situ measurements, and visualization. We review the Python language API for Spark and how to use it to quickly convert existing programs to use Spark to run with cloud-scale parallelism, and discuss strategies to improve performance. We explain how partitioning the data over space, time, or both leads to algorithmic design patterns for Spark analytics that can be applied to many different algorithms. We use NEXUS analytics as examples, including area-averaged time series, time averaged map, and correlation map.

  20. A Chaotic Particle Swarm Optimization-Based Heuristic for Market-Oriented Task-Level Scheduling in Cloud Workflow Systems

    PubMed Central

    Li, Xuejun; Xu, Jia; Yang, Yun

    2015-01-01

    Cloud workflow system is a kind of platform service based on cloud computing. It facilitates the automation of workflow applications. Between cloud workflow system and its counterparts, market-oriented business model is one of the most prominent factors. The optimization of task-level scheduling in cloud workflow system is a hot topic. As the scheduling is a NP problem, Ant Colony Optimization (ACO) and Particle Swarm Optimization (PSO) have been proposed to optimize the cost. However, they have the characteristic of premature convergence in optimization process and therefore cannot effectively reduce the cost. To solve these problems, Chaotic Particle Swarm Optimization (CPSO) algorithm with chaotic sequence and adaptive inertia weight factor is applied to present the task-level scheduling. Chaotic sequence with high randomness improves the diversity of solutions, and its regularity assures a good global convergence. Adaptive inertia weight factor depends on the estimate value of cost. It makes the scheduling avoid premature convergence by properly balancing between global and local exploration. The experimental simulation shows that the cost obtained by our scheduling is always lower than the other two representative counterparts. PMID:26357510

  1. PepArML: A Meta-Search Peptide Identification Platform

    PubMed Central

    Edwards, Nathan J.

    2014-01-01

    The PepArML meta-search peptide identification platform provides a unified search interface to seven search engines; a robust cluster, grid, and cloud computing scheduler for large-scale searches; and an unsupervised, model-free, machine-learning-based result combiner, which selects the best peptide identification for each spectrum, estimates false-discovery rates, and outputs pepXML format identifications. The meta-search platform supports Mascot; Tandem with native, k-score, and s-score scoring; OMSSA; MyriMatch; and InsPecT with MS-GF spectral probability scores — reformatting spectral data and constructing search configurations for each search engine on the fly. The combiner selects the best peptide identification for each spectrum based on search engine results and features that model enzymatic digestion, retention time, precursor isotope clusters, mass accuracy, and proteotypic peptide properties, requiring no prior knowledge of feature utility or weighting. The PepArML meta-search peptide identification platform often identifies 2–3 times more spectra than individual search engines at 10% FDR. PMID:25663956

  2. An Overview of Cloud Computing in Distributed Systems

    NASA Astrophysics Data System (ADS)

    Divakarla, Usha; Kumari, Geetha

    2010-11-01

    Cloud computing is the emerging trend in the field of distributed computing. Cloud computing evolved from grid computing and distributed computing. Cloud plays an important role in huge organizations in maintaining huge data with limited resources. Cloud also helps in resource sharing through some specific virtual machines provided by the cloud service provider. This paper gives an overview of the cloud organization and some of the basic security issues pertaining to the cloud.

  3. Helix Nebula - the Science Cloud: a public-private partnership to build a multidisciplinary cloud platform for data intensive science

    NASA Astrophysics Data System (ADS)

    Jones, Bob; Casu, Francesco

    2013-04-01

    The feasibility of using commercial cloud services for scientific research is of great interest to research organisations such as CERN, ESA and EMBL, to the suppliers of cloud-based services and to the national and European funding agencies. Through the Helix Nebula - the Science Cloud [1] initiative and with the support of the European Commission, these stakeholders are driving a two year pilot-phase during which procurement processes and governance issues for a framework of public/private partnership will be appraised. Three initial flagship use cases from high energy physics, molecular biology and earth-observation are being used to validate the approach, enable a cost-benefit analysis to be undertaken and prepare the next stage of the Science Cloud Strategic Plan [2] to be developed and approved. The power of Helix Nebula lies in a shared set of services for initially 3 very different sciences each supporting a global community and thus building a common e-Science platform. Of particular relevance is the ESA sponsored flagship application SuperSites Exploitation Platform (SSEP [3]) that offers the global geo-hazard community a common platform for the correlation and processing of observation data for supersites monitoring. The US-NSF Earth Cube [4] and Ocean Observatory Initiative [5] (OOI) are taking a similar approach for data intensive science. The work of Helix Nebula and its recent architecture model [6] has shown that is it technically feasible to allow publicly funded infrastructures, such as EGI [7] and GEANT [8], to interoperate with commercial cloud services. Such hybrid systems are in the interest of the existing users of publicly funded infrastructures and funding agencies because they will provide "freedom of choice" over the type of computing resources to be consumed and the manner in which they can be obtained. But to offer such freedom-of choice across a spectrum of suppliers, various issues such as intellectual property, legal responsibility, service quality agreements and related issues need to be addressed. Investigating these issues is one of the goals of the Helix Nebula initiative. The next generation of researchers will put aside the historical categorisation of research as a neatly defined set of disciplines and integrate the data from different sources and instruments into complex models that are as applicable to earth observation or biomedicine as they are to high-energy physics. This aggregation of datasets and development of new models will accelerate scientific development but will only be possible if the issues of data intensive science described above are addressed. The culture of science has the possibility to develop with the availability of Helix Nebula as a "Science Cloud" because: • Large scale datasets from many disciplines will be accessible • Scientists and others will be able to develop and contribute open source tools to expand the set of services available • Collaboration of scientists will take place around the on-demand availability of data, tools and services • Cross-domain research will advance at a faster pace due to the availability of a common platform. References: 1 http://www.helix-nebula.eu/ 2 http://cdsweb.cern.ch/record/1374172/files/CERN-OPEN-2011-036.pdf 3 http://www.helix-nebula.eu/index.php/helix-nebula-use-cases/uc3.html 4 http://www.nsf.gov/geo/earthcube/ 5 http://www.oceanobservatories.org/ 6 http://cdsweb.cern.ch/record/1478364/files/HelixNebula-NOTE-2012-001.pdf 7 http://www.nsf.gov/geo/earthcube/ 8 http://www.geant.net/

  4. JobCenter: an open source, cross-platform, and distributed job queue management system optimized for scalability and versatility.

    PubMed

    Jaschob, Daniel; Riffle, Michael

    2012-07-30

    Laboratories engaged in computational biology or bioinformatics frequently need to run lengthy, multistep, and user-driven computational jobs. Each job can tie up a computer for a few minutes to several days, and many laboratories lack the expertise or resources to build and maintain a dedicated computer cluster. JobCenter is a client-server application and framework for job management and distributed job execution. The client and server components are both written in Java and are cross-platform and relatively easy to install. All communication with the server is client-driven, which allows worker nodes to run anywhere (even behind external firewalls or "in the cloud") and provides inherent load balancing. Adding a worker node to the worker pool is as simple as dropping the JobCenter client files onto any computer and performing basic configuration, which provides tremendous ease-of-use, flexibility, and limitless horizontal scalability. Each worker installation may be independently configured, including the types of jobs it is able to run. Executed jobs may be written in any language and may include multistep workflows. JobCenter is a versatile and scalable distributed job management system that allows laboratories to very efficiently distribute all computational work among available resources. JobCenter is freely available at http://code.google.com/p/jobcenter/.

  5. Analysis on the security of cloud computing

    NASA Astrophysics Data System (ADS)

    He, Zhonglin; He, Yuhua

    2011-02-01

    Cloud computing is a new technology, which is the fusion of computer technology and Internet development. It will lead the revolution of IT and information field. However, in cloud computing data and application software is stored at large data centers, and the management of data and service is not completely trustable, resulting in safety problems, which is the difficult point to improve the quality of cloud service. This paper briefly introduces the concept of cloud computing. Considering the characteristics of cloud computing, it constructs the security architecture of cloud computing. At the same time, with an eye toward the security threats cloud computing faces, several corresponding strategies are provided from the aspect of cloud computing users and service providers.

  6. Ship-based Observations of Turbulence and Stratocumulus Cloud Microphysics in the SE Pacific Ocean from the VOCALS Field Program

    NASA Astrophysics Data System (ADS)

    Fairall, C. W.; Williams, C.; Grachev, A. A.; Brewer, A.; Choukulkar, A.

    2013-12-01

    The VAMOS (VOCALS) field program involved deployment of several measurement systems based on ships, land and aircraft over the SE Pacific Ocean. The NOAA Ship Ronald H. Brown was the primary platform for surface based measurements which included the High Resolution Doppler Lidar (HRDL) and the motion-stabilized 94-GHz cloud Doppler radar (W-band radar). In this paper, the data from the W-band radar will be used to study the turbulent and microphysical structure of the stratocumulus clouds prevalent in the region. The radar data consists of a 3 Hz time series of radar parameters (backscatter coefficient, mean Doppler shift, and Doppler width) at 175 range gates (25-m spacing). Several statistical methods to de-convolve the turbulent velocity and gravitational settling velocity are examined and an optimized algorithm is developed. 20 days of observations are processed to examine in-cloud profiles of mean turbulent statistics (vertical velocity variance, skewness, dissipation rate) in terms of surface fluxes and estimates of entrainment and cloudtop radiative cooling. The clean separation of turbulent and fall velocities will allow us to compute time-averaged drizzle-drop size spectra within and below the cloud that are significantly superior to previous attempts with surface-based marine cloud radar observations.

  7. Satellite Imagery Analysis for Automated Global Food Security Forecasting

    NASA Astrophysics Data System (ADS)

    Moody, D.; Brumby, S. P.; Chartrand, R.; Keisler, R.; Mathis, M.; Beneke, C. M.; Nicholaeff, D.; Skillman, S.; Warren, M. S.; Poehnelt, J.

    2017-12-01

    The recent computing performance revolution has driven improvements in sensor, communication, and storage technology. Multi-decadal remote sensing datasets at the petabyte scale are now available in commercial clouds, with new satellite constellations generating petabytes/year of daily high-resolution global coverage imagery. Cloud computing and storage, combined with recent advances in machine learning, are enabling understanding of the world at a scale and at a level of detail never before feasible. We present results from an ongoing effort to develop satellite imagery analysis tools that aggregate temporal, spatial, and spectral information and that can scale with the high-rate and dimensionality of imagery being collected. We focus on the problem of monitoring food crop productivity across the Middle East and North Africa, and show how an analysis-ready, multi-sensor data platform enables quick prototyping of satellite imagery analysis algorithms, from land use/land cover classification and natural resource mapping, to yearly and monthly vegetative health change trends at the structural field level.

  8. High-performance parallel approaches for three-dimensional light detection and ranging point clouds gridding

    NASA Astrophysics Data System (ADS)

    Rizki, Permata Nur Miftahur; Lee, Heezin; Lee, Minsu; Oh, Sangyoon

    2017-01-01

    With the rapid advance of remote sensing technology, the amount of three-dimensional point-cloud data has increased extraordinarily, requiring faster processing in the construction of digital elevation models. There have been several attempts to accelerate the computation using parallel methods; however, little attention has been given to investigating different approaches for selecting the most suited parallel programming model for a given computing environment. We present our findings and insights identified by implementing three popular high-performance parallel approaches (message passing interface, MapReduce, and GPGPU) on time demanding but accurate kriging interpolation. The performances of the approaches are compared by varying the size of the grid and input data. In our empirical experiment, we demonstrate the significant acceleration by all three approaches compared to a C-implemented sequential-processing method. In addition, we also discuss the pros and cons of each method in terms of usability, complexity infrastructure, and platform limitation to give readers a better understanding of utilizing those parallel approaches for gridding purposes.

  9. Vision 20/20: Automation and advanced computing in clinical radiation oncology

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Moore, Kevin L., E-mail: kevinmoore@ucsd.edu; Moiseenko, Vitali; Kagadis, George C.

    This Vision 20/20 paper considers what computational advances are likely to be implemented in clinical radiation oncology in the coming years and how the adoption of these changes might alter the practice of radiotherapy. Four main areas of likely advancement are explored: cloud computing, aggregate data analyses, parallel computation, and automation. As these developments promise both new opportunities and new risks to clinicians and patients alike, the potential benefits are weighed against the hazards associated with each advance, with special considerations regarding patient safety under new computational platforms and methodologies. While the concerns of patient safety are legitimate, the authorsmore » contend that progress toward next-generation clinical informatics systems will bring about extremely valuable developments in quality improvement initiatives, clinical efficiency, outcomes analyses, data sharing, and adaptive radiotherapy.« less

  10. Vision 20/20: Automation and advanced computing in clinical radiation oncology

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Moore, Kevin L., E-mail: kevinmoore@ucsd.edu; Moiseenko, Vitali; Kagadis, George C.

    2014-01-15

    This Vision 20/20 paper considers what computational advances are likely to be implemented in clinical radiation oncology in the coming years and how the adoption of these changes might alter the practice of radiotherapy. Four main areas of likely advancement are explored: cloud computing, aggregate data analyses, parallel computation, and automation. As these developments promise both new opportunities and new risks to clinicians and patients alike, the potential benefits are weighed against the hazards associated with each advance, with special considerations regarding patient safety under new computational platforms and methodologies. While the concerns of patient safety are legitimate, the authorsmore » contend that progress toward next-generation clinical informatics systems will bring about extremely valuable developments in quality improvement initiatives, clinical efficiency, outcomes analyses, data sharing, and adaptive radiotherapy.« less

  11. Enabling a new Paradigm to Address Big Data and Open Science Challenges

    NASA Astrophysics Data System (ADS)

    Ramamurthy, Mohan; Fisher, Ward

    2017-04-01

    Data are not only the lifeblood of the geosciences but they have become the currency of the modern world in science and society. Rapid advances in computing, communi¬cations, and observational technologies — along with concomitant advances in high-resolution modeling, ensemble and coupled-systems predictions of the Earth system — are revolutionizing nearly every aspect of our field. Modern data volumes from high-resolution ensemble prediction/projection/simulation systems and next-generation remote-sensing systems like hyper-spectral satellite sensors and phased-array radars are staggering. For example, CMIP efforts alone will generate many petabytes of climate projection data for use in assessments of climate change. And NOAA's National Climatic Data Center projects that it will archive over 350 petabytes by 2030. For researchers and educators, this deluge and the increasing complexity of data brings challenges along with the opportunities for discovery and scientific breakthroughs. The potential for big data to transform the geosciences is enormous, but realizing the next frontier depends on effectively managing, analyzing, and exploiting these heterogeneous data sources, extracting knowledge and useful information from heterogeneous data sources in ways that were previously impossible, to enable discoveries and gain new insights. At the same time, there is a growing focus on the area of "Reproducibility or Replicability in Science" that has implications for Open Science. The advent of cloud computing has opened new avenues for not only addressing both big data and Open Science challenges to accelerate scientific discoveries. However, to successfully leverage the enormous potential of cloud technologies, it will require the data providers and the scientific communities to develop new paradigms to enable next-generation workflows and transform the conduct of science. Making data readily available is a necessary but not a sufficient condition. Data providers also need to give scientists an ecosystem that includes data, tools, workflows and other services needed to perform analytics, integration, interpretation, and synthesis - all in the same environment or platform. Instead of moving data to processing systems near users, as is the tradition, the cloud permits one to bring processing, computing, analysis and visualization to data - so called data proximate workbench capabilities, also known as server-side processing. In this talk, I will present the ongoing work at Unidata to facilitate a new paradigm for doing science by offering a suite of tools, resources, and platforms to leverage cloud technologies for addressing both big data and Open Science/reproducibility challenges. That work includes the development and deployment of new protocols for data access and server-side operations and Docker container images of key applications, JupyterHub Python notebook tools, and cloud-based analysis and visualization capability via the CloudIDV tool to enable reproducible workflows and effectively use the accessed data.

  12. A Web platform for the interactive visualization and analysis of the 3D fractal dimension of MRI data.

    PubMed

    Jiménez, J; López, A M; Cruz, J; Esteban, F J; Navas, J; Villoslada, P; Ruiz de Miras, J

    2014-10-01

    This study presents a Web platform (http://3dfd.ujaen.es) for computing and analyzing the 3D fractal dimension (3DFD) from volumetric data in an efficient, visual and interactive way. The Web platform is specially designed for working with magnetic resonance images (MRIs) of the brain. The program estimates the 3DFD by calculating the 3D box-counting of the entire volume of the brain, and also of its 3D skeleton. All of this is done in a graphical, fast and optimized way by using novel technologies like CUDA and WebGL. The usefulness of the Web platform presented is demonstrated by its application in a case study where an analysis and characterization of groups of 3D MR images is performed for three neurodegenerative diseases: Multiple Sclerosis, Intrauterine Growth Restriction and Alzheimer's disease. To the best of our knowledge, this is the first Web platform that allows the users to calculate, visualize, analyze and compare the 3DFD from MRI images in the cloud. Copyright © 2014 Elsevier Inc. All rights reserved.

  13. Future of Department of Defense Cloud Computing Amid Cultural Confusion

    DTIC Science & Technology

    2013-03-01

    enterprise cloud - computing environment and transition to a public cloud service provider. Services have started the development of individual cloud - computing environments...endorsing cloud computing . It addresses related issues in matters of service culture changes and how strategic leaders will dictate the future of cloud ...through data center consolidation and individual Service provided cloud computing .

  14. Monte Carlo verification of radiotherapy treatments with CloudMC.

    PubMed

    Miras, Hector; Jiménez, Rubén; Perales, Álvaro; Terrón, José Antonio; Bertolet, Alejandro; Ortiz, Antonio; Macías, José

    2018-06-27

    A new implementation has been made on CloudMC, a cloud-based platform presented in a previous work, in order to provide services for radiotherapy treatment verification by means of Monte Carlo in a fast, easy and economical way. A description of the architecture of the application and the new developments implemented is presented together with the results of the tests carried out to validate its performance. CloudMC has been developed over Microsoft Azure cloud. It is based on a map/reduce implementation for Monte Carlo calculations distribution over a dynamic cluster of virtual machines in order to reduce calculation time. CloudMC has been updated with new methods to read and process the information related to radiotherapy treatment verification: CT image set, treatment plan, structures and dose distribution files in DICOM format. Some tests have been designed in order to determine, for the different tasks, the most suitable type of virtual machines from those available in Azure. Finally, the performance of Monte Carlo verification in CloudMC is studied through three real cases that involve different treatment techniques, linac models and Monte Carlo codes. Considering computational and economic factors, D1_v2 and G1 virtual machines were selected as the default type for the Worker Roles and the Reducer Role respectively. Calculation times up to 33 min and costs of 16 € were achieved for the verification cases presented when a statistical uncertainty below 2% (2σ) was required. The costs were reduced to 3-6 € when uncertainty requirements are relaxed to 4%. Advantages like high computational power, scalability, easy access and pay-per-usage model, make Monte Carlo cloud-based solutions, like the one presented in this work, an important step forward to solve the long-lived problem of truly introducing the Monte Carlo algorithms in the daily routine of the radiotherapy planning process.

  15. Metabolizing Data in the Cloud.

    PubMed

    Warth, Benedikt; Levin, Nadine; Rinehart, Duane; Teijaro, John; Benton, H Paul; Siuzdak, Gary

    2017-06-01

    Cloud-based bioinformatic platforms address the fundamental demands of creating a flexible scientific environment, facilitating data processing and general accessibility independent of a countries' affluence. These platforms have a multitude of advantages as demonstrated by omics technologies, helping to support both government and scientific mandates of a more open environment. Copyright © 2016 Elsevier Ltd. All rights reserved.

  16. Hydrodynamics and Water Quality forecasting over a Cloud Computing environment: INDIGO-DataCloud

    NASA Astrophysics Data System (ADS)

    Aguilar Gómez, Fernando; de Lucas, Jesús Marco; García, Daniel; Monteoliva, Agustín

    2017-04-01

    Algae Bloom due to eutrophication is an extended problem for water reservoirs and lakes that impacts directly in water quality. It can create a dead zone that lacks enough oxygen to support life and it can also be human harmful, so it must be controlled in water masses for supplying, bathing or other uses. Hydrodynamic and Water Quality modelling can contribute to forecast the status of the water system in order to alert authorities before an algae bloom event occurs. It can be used to predict scenarios and find solutions to reduce the harmful impact of the blooms. High resolution models need to process a big amount of data using a robust enough computing infrastructure. INDIGO-DataCloud (https://www.indigo-datacloud.eu/) is an European Commission funded project that aims at developing a data and computing platform targeting scientific communities, deployable on multiple hardware and provisioned over hybrid (private or public) e-infrastructures. The project addresses the development of solutions for different Case Studies using different Cloud-based alternatives. In the first INDIGO software release, a set of components are ready to manage the deployment of services to perform N number of Delft3D simulations (for calibrating or scenario definition) over a Cloud Computing environment, using the Docker technology: TOSCA requirement description, Docker repository, Orchestrator, AAI (Authorization, Authentication) and OneData (Distributed Storage System). Moreover, the Future Gateway portal based on Liferay, provides an user-friendly interface where the user can configure the simulations. Due to the data approach of INDIGO, the developed solutions can contribute to manage the full data life cycle of a project, thanks to different tools to manage datasets or even metadata. Furthermore, the cloud environment contributes to provide a dynamic, scalable and easy-to-use framework for non-IT experts users. This framework is potentially capable to automatize the processing of forecasting applying periodic tasks. For instance, a user can forecast every month the hydrodynamics and water quality status of a reservoir starting from a base model and supplying new data gathered from the instrumentation or observations. This interactive presentation aims to show the use of INDIGO solutions in a particular forecasting use case and to inspire others in the use of a Cloud framework for their applications.

  17. Cloud Based Web 3d GIS Taiwan Platform

    NASA Astrophysics Data System (ADS)

    Tsai, W.-F.; Chang, J.-Y.; Yan, S. Y.; Chen, B.

    2011-09-01

    This article presents the status of the web 3D GIS platform, which has been developed in the National Applied Research Laboratories. The purpose is to develop a global earth observation 3D GIS platform for applications to disaster monitoring and assessment in Taiwan. For quick response to preliminary and detailed assessment after a natural disaster occurs, the web 3D GIS platform is useful to access, transfer, integrate, display and analyze the multi-scale huge data following the international OGC standard. The framework of cloud service for data warehousing management and efficiency enhancement using VMWare is illustrated in this article.

  18. ABA-Cloud: support for collaborative breath research

    PubMed Central

    Elsayed, Ibrahim; Ludescher, Thomas; King, Julian; Ager, Clemens; Trosin, Michael; Senocak, Uygar; Brezany, Peter; Feilhauer, Thomas; Amann, Anton

    2016-01-01

    This paper introduces the advanced breath analysis (ABA) platform, an innovative scientific research platform for the entire breath research domain. Within the ABA project, we are investigating novel data management concepts and semantic web technologies to document breath analysis studies for the long run as well as to enable their full automatic reproducibility. We propose several concept taxonomies (a hierarchical order of terms from a glossary of terms), which can be seen as a first step toward the definition of conceptualized terms commonly used by the international community of breath researchers. They build the basis for the development of an ontology (a concept from computer science used for communication between machines and/or humans and representation and reuse of knowledge) dedicated to breath research. PMID:23619467

  19. ABA-Cloud: support for collaborative breath research.

    PubMed

    Elsayed, Ibrahim; Ludescher, Thomas; King, Julian; Ager, Clemens; Trosin, Michael; Senocak, Uygar; Brezany, Peter; Feilhauer, Thomas; Amann, Anton

    2013-06-01

    This paper introduces the advanced breath analysis (ABA) platform, an innovative scientific research platform for the entire breath research domain. Within the ABA project, we are investigating novel data management concepts and semantic web technologies to document breath analysis studies for the long run as well as to enable their full automatic reproducibility. We propose several concept taxonomies (a hierarchical order of terms from a glossary of terms), which can be seen as a first step toward the definition of conceptualized terms commonly used by the international community of breath researchers. They build the basis for the development of an ontology (a concept from computer science used for communication between machines and/or humans and representation and reuse of knowledge) dedicated to breath research.

  20. Cloud-based large-scale air traffic flow optimization

    NASA Astrophysics Data System (ADS)

    Cao, Yi

    The ever-increasing traffic demand makes the efficient use of airspace an imperative mission, and this paper presents an effort in response to this call. Firstly, a new aggregate model, called Link Transmission Model (LTM), is proposed, which models the nationwide traffic as a network of flight routes identified by origin-destination pairs. The traversal time of a flight route is assumed to be the mode of distribution of historical flight records, and the mode is estimated by using Kernel Density Estimation. As this simplification abstracts away physical trajectory details, the complexity of modeling is drastically decreased, resulting in efficient traffic forecasting. The predicative capability of LTM is validated against recorded traffic data. Secondly, a nationwide traffic flow optimization problem with airport and en route capacity constraints is formulated based on LTM. The optimization problem aims at alleviating traffic congestions with minimal global delays. This problem is intractable due to millions of variables. A dual decomposition method is applied to decompose the large-scale problem such that the subproblems are solvable. However, the whole problem is still computational expensive to solve since each subproblem is an smaller integer programming problem that pursues integer solutions. Solving an integer programing problem is known to be far more time-consuming than solving its linear relaxation. In addition, sequential execution on a standalone computer leads to linear runtime increase when the problem size increases. To address the computational efficiency problem, a parallel computing framework is designed which accommodates concurrent executions via multithreading programming. The multithreaded version is compared with its monolithic version to show decreased runtime. Finally, an open-source cloud computing framework, Hadoop MapReduce, is employed for better scalability and reliability. This framework is an "off-the-shelf" parallel computing model that can be used for both offline historical traffic data analysis and online traffic flow optimization. It provides an efficient and robust platform for easy deployment and implementation. A small cloud consisting of five workstations was configured and used to demonstrate the advantages of cloud computing in dealing with large-scale parallelizable traffic problems.

  1. A resilient and secure software platform and architecture for distributed spacecraft

    NASA Astrophysics Data System (ADS)

    Otte, William R.; Dubey, Abhishek; Karsai, Gabor

    2014-06-01

    A distributed spacecraft is a cluster of independent satellite modules flying in formation that communicate via ad-hoc wireless networks. This system in space is a cloud platform that facilitates sharing sensors and other computing and communication resources across multiple applications, potentially developed and maintained by different organizations. Effectively, such architecture can realize the functions of monolithic satellites at a reduced cost and with improved adaptivity and robustness. Openness of these architectures pose special challenges because the distributed software platform has to support applications from different security domains and organizations, and where information flows have to be carefully managed and compartmentalized. If the platform is used as a robust shared resource its management, configuration, and resilience becomes a challenge in itself. We have designed and prototyped a distributed software platform for such architectures. The core element of the platform is a new operating system whose services were designed to restrict access to the network and the file system, and to enforce resource management constraints for all non-privileged processes Mixed-criticality applications operating at different security labels are deployed and controlled by a privileged management process that is also pre-configuring all information flows. This paper describes the design and objective of this layer.

  2. Experimental evaluation of a flexible I/O architecture for accelerating workflow engines in ultrascale environments

    DOE PAGES

    Duro, Francisco Rodrigo; Blas, Javier Garcia; Isaila, Florin; ...

    2016-10-06

    The increasing volume of scientific data and the limited scalability and performance of storage systems are currently presenting a significant limitation for the productivity of the scientific workflows running on both high-performance computing (HPC) and cloud platforms. Clearly needed is better integration of storage systems and workflow engines to address this problem. This paper presents and evaluates a novel solution that leverages codesign principles for integrating Hercules—an in-memory data store—with a workflow management system. We consider four main aspects: workflow representation, task scheduling, task placement, and task termination. As a result, the experimental evaluation on both cloud and HPC systemsmore » demonstrates significant performance and scalability improvements over existing state-of-the-art approaches.« less

  3. A new airborne sampler for interstitial particles in ice and liquid clouds

    NASA Astrophysics Data System (ADS)

    Moharreri, A.; Craig, L.; Rogers, D. C.; Brown, M.; Dhaniyala, S.

    2011-12-01

    In-situ measurements of cloud droplets and aerosols using aircraft platforms are required for understanding aerosol-cloud processes and aiding development of improved aerosol-cloud models. A variety of clouds with different temperature ranges and cloud particle sizes/phases must be studied for comprehensive knowledge about the role of aerosols in the formation and evolution of cloud systems under different atmospheric conditions. While representative aerosol measurements are regularly made from aircrafts under clear air conditions, aerosol measurements in clouds are often contaminated by the generation of secondary particles from the high speed impaction of ice particles and liquid droplets on the surfaces of the aircraft probes/inlets. A new interstitial particle sampler, called the blunt-body aerosol sampler (BASE) has been designed and used for aerosol sampling during two recent airborne campaigns using NCAR/NSF C-130 aircraft: PLOWS (2009-2010) and ICE-T (2011). Central to the design of the new interstitial inlet is an upstream blunt body housing that acts to shield/deflect large cloud droplets and ice particles from an aft sampling region. The blunt-body design also ensures that small shatter particles created from the impaction of cloud-droplets on the blunt-body are not present in the aft region where the interstitial inlet is located. Computational fluid dynamics (CFD) simulations along with particle transport modeling and wind tunnel studies have been utilized in different stages of design and development of this inlet. The initial flights tests during the PLOWS campaign showed that the inlet had satisfactory performance only in warm clouds and when large precipitation droplets were absent. In the presence of large droplets and ice, the inlet samples were contaminated with significant shatter artifacts. These initial results were reanalyzed in conjunction with a computational droplet shatter model and the numerical results were used to arrive at an improved sampler design. Analysis of the data from the recent ICE-T campaign with the improved sampler design shows that the modified version of BASE can provide shatter-artifact free sampling of aerosol particles in the presence of ice particles and significantly reduced shatter artifacts in warm clouds. Detailed design and modeling aspects of the sampler will be discussed and the sampler performance in warm and cold clouds will be presented and compared with measurements made using other aerosol inlets flown on the NCAR/NSF C-130 aircraft.

  4. Cloud Computing for radiologists.

    PubMed

    Kharat, Amit T; Safvi, Amjad; Thind, Ss; Singh, Amarjit

    2012-07-01

    Cloud computing is a concept wherein a computer grid is created using the Internet with the sole purpose of utilizing shared resources such as computer software, hardware, on a pay-per-use model. Using Cloud computing, radiology users can efficiently manage multimodality imaging units by using the latest software and hardware without paying huge upfront costs. Cloud computing systems usually work on public, private, hybrid, or community models. Using the various components of a Cloud, such as applications, client, infrastructure, storage, services, and processing power, Cloud computing can help imaging units rapidly scale and descale operations and avoid huge spending on maintenance of costly applications and storage. Cloud computing allows flexibility in imaging. It sets free radiology from the confines of a hospital and creates a virtual mobile office. The downsides to Cloud computing involve security and privacy issues which need to be addressed to ensure the success of Cloud computing in the future.

  5. Cloud Computing for radiologists

    PubMed Central

    Kharat, Amit T; Safvi, Amjad; Thind, SS; Singh, Amarjit

    2012-01-01

    Cloud computing is a concept wherein a computer grid is created using the Internet with the sole purpose of utilizing shared resources such as computer software, hardware, on a pay-per-use model. Using Cloud computing, radiology users can efficiently manage multimodality imaging units by using the latest software and hardware without paying huge upfront costs. Cloud computing systems usually work on public, private, hybrid, or community models. Using the various components of a Cloud, such as applications, client, infrastructure, storage, services, and processing power, Cloud computing can help imaging units rapidly scale and descale operations and avoid huge spending on maintenance of costly applications and storage. Cloud computing allows flexibility in imaging. It sets free radiology from the confines of a hospital and creates a virtual mobile office. The downsides to Cloud computing involve security and privacy issues which need to be addressed to ensure the success of Cloud computing in the future. PMID:23599560

  6. Automatic co-registration of 3D multi-sensor point clouds

    NASA Astrophysics Data System (ADS)

    Persad, Ravi Ancil; Armenakis, Costas

    2017-08-01

    We propose an approach for the automatic coarse alignment of 3D point clouds which have been acquired from various platforms. The method is based on 2D keypoint matching performed on height map images of the point clouds. Initially, a multi-scale wavelet keypoint detector is applied, followed by adaptive non-maxima suppression. A scale, rotation and translation-invariant descriptor is then computed for all keypoints. The descriptor is built using the log-polar mapping of Gabor filter derivatives in combination with the so-called Rapid Transform. In the final step, source and target height map keypoint correspondences are determined using a bi-directional nearest neighbour similarity check, together with a threshold-free modified-RANSAC. Experiments with urban and non-urban scenes are presented and results show scale errors ranging from 0.01 to 0.03, 3D rotation errors in the order of 0.2° to 0.3° and 3D translation errors from 0.09 m to 1.1 m.

  7. Clinician accessible tools for GUI computational models of transcranial electrical stimulation: BONSAI and SPHERES.

    PubMed

    Truong, Dennis Q; Hüber, Mathias; Xie, Xihe; Datta, Abhishek; Rahman, Asif; Parra, Lucas C; Dmochowski, Jacek P; Bikson, Marom

    2014-01-01

    Computational models of brain current flow during transcranial electrical stimulation (tES), including transcranial direct current stimulation (tDCS) and transcranial alternating current stimulation (tACS), are increasingly used to understand and optimize clinical trials. We propose that broad dissemination requires a simple graphical user interface (GUI) software that allows users to explore and design montages in real-time, based on their own clinical/experimental experience and objectives. We introduce two complimentary open-source platforms for this purpose: BONSAI and SPHERES. BONSAI is a web (cloud) based application (available at neuralengr.com/bonsai) that can be accessed through any flash-supported browser interface. SPHERES (available at neuralengr.com/spheres) is a stand-alone GUI application that allow consideration of arbitrary montages on a concentric sphere model by leveraging an analytical solution. These open-source tES modeling platforms are designed go be upgraded and enhanced. Trade-offs between open-access approaches that balance ease of access, speed, and flexibility are discussed. Copyright © 2014 Elsevier Inc. All rights reserved.

  8. Sentinel-1 Interferometry from the Cloud to the Scientist

    NASA Astrophysics Data System (ADS)

    Garron, J.; Stoner, C.; Johnston, A.; Arko, S. A.

    2017-12-01

    Big data problems and solutions are growing in the technological and scientific sectors daily. Cloud computing is a vertically and horizontally scalable solution available now for archiving and processing large volumes of data quickly, without significant on-site computing hardware costs. Be that as it may, the conversion of scientific data processors to these powerful platforms requires not only the proof of concept, but the demonstration of credibility in an operational setting. The Alaska Satellite Facility (ASF) Distributed Active Archive Center (DAAC), in partnership with NASA's Jet Propulsion Laboratory, is exploring the functional architecture of Amazon Web Services cloud computing environment for the processing, distribution and archival of Synthetic Aperture Radar data in preparation for the NASA-ISRO Synthetic Aperture Radar (NISAR) Mission. Leveraging built-in AWS services for logging, monitoring and dashboarding, the GRFN (Getting Ready for NISAR) team has built a scalable processing, distribution and archival system of Sentinel-1 L2 interferograms produced using the ISCE algorithm. This cloud-based functional prototype provides interferograms over selected global land deformation features (volcanoes, land subsidence, seismic zones) and are accessible to scientists via NASA's EarthData Search client and the ASF DAACs primary SAR interface, Vertex, for direct download. The interferograms are produced using nearest-neighbor logic for identifying pairs of granules for interferometric processing, creating deep stacks of BETA products from almost every satellite orbit for scientists to explore. This presentation highlights the functional lessons learned to date from this exercise, including the cost analysis of various data lifecycle policies as implemented through AWS. While demonstrating the architecture choices in support of efficient big science data management, we invite feedback and questions about the process and products from the InSAR community.

  9. Cloud-based calculators for fast and reliable access to NOAA's geomagnetic field models

    NASA Astrophysics Data System (ADS)

    Woods, A.; Nair, M. C.; Boneh, N.; Chulliat, A.

    2017-12-01

    While the Global Positioning System (GPS) provides accurate point locations, it does not provide pointing directions. Therefore, the absolute directional information provided by the Earth's magnetic field is of primary importance for navigation and for the pointing of technical devices such as aircrafts, satellites and lately, mobile phones. The major magnetic sources that affect compass-based navigation are the Earth's core, its magnetized crust and the electric currents in the ionosphere and magnetosphere. NOAA/CIRES Geomagnetism (ngdc.noaa.gov/geomag/) group develops and distributes models that describe all these important sources to aid navigation. Our geomagnetic models are used in variety of platforms including airplanes, ships, submarines and smartphones. While the magnetic field from Earth's core can be described in relatively fewer parameters and is suitable for offline computation, the magnetic sources from Earth's crust, ionosphere and magnetosphere require either significant computational resources or real-time capabilities and are not suitable for offline calculation. This is especially important for small navigational devices or embedded systems, where computational resources are limited. Recognizing the need for a fast and reliable access to our geomagnetic field models, we developed cloud-based application program interfaces (APIs) for NOAA's ionospheric and magnetospheric magnetic field models. In this paper we will describe the need for reliable magnetic calculators, the challenges faced in running geomagnetic field models in the cloud in real-time and the feedback from our user community. We discuss lessons learned harvesting and validating the data which powers our cloud services, as well as our strategies for maintaining near real-time service, including load-balancing, real-time monitoring, and instance cloning. We will also briefly talk about the progress we achieved on NOAA's Big Earth Data Initiative (BEDI) funded project to develop API interface to our Enhanced Magnetic Model (EMM).

  10. Classification of large-scale fundus image data sets: a cloud-computing framework.

    PubMed

    Roychowdhury, Sohini

    2016-08-01

    Large medical image data sets with high dimensionality require substantial amount of computation time for data creation and data processing. This paper presents a novel generalized method that finds optimal image-based feature sets that reduce computational time complexity while maximizing overall classification accuracy for detection of diabetic retinopathy (DR). First, region-based and pixel-based features are extracted from fundus images for classification of DR lesions and vessel-like structures. Next, feature ranking strategies are used to distinguish the optimal classification feature sets. DR lesion and vessel classification accuracies are computed using the boosted decision tree and decision forest classifiers in the Microsoft Azure Machine Learning Studio platform, respectively. For images from the DIARETDB1 data set, 40 of its highest-ranked features are used to classify four DR lesion types with an average classification accuracy of 90.1% in 792 seconds. Also, for classification of red lesion regions and hemorrhages from microaneurysms, accuracies of 85% and 72% are observed, respectively. For images from STARE data set, 40 high-ranked features can classify minor blood vessels with an accuracy of 83.5% in 326 seconds. Such cloud-based fundus image analysis systems can significantly enhance the borderline classification performances in automated screening systems.

  11. Simple re-instantiation of small databases using cloud computing.

    PubMed

    Tan, Tin Wee; Xie, Chao; De Silva, Mark; Lim, Kuan Siong; Patro, C Pawan K; Lim, Shen Jean; Govindarajan, Kunde Ramamoorthy; Tong, Joo Chuan; Choo, Khar Heng; Ranganathan, Shoba; Khan, Asif M

    2013-01-01

    Small bioinformatics databases, unlike institutionally funded large databases, are vulnerable to discontinuation and many reported in publications are no longer accessible. This leads to irreproducible scientific work and redundant effort, impeding the pace of scientific progress. We describe a Web-accessible system, available online at http://biodb100.apbionet.org, for archival and future on demand re-instantiation of small databases within minutes. Depositors can rebuild their databases by downloading a Linux live operating system (http://www.bioslax.com), preinstalled with bioinformatics and UNIX tools. The database and its dependencies can be compressed into an ".lzm" file for deposition. End-users can search for archived databases and activate them on dynamically re-instantiated BioSlax instances, run as virtual machines over the two popular full virtualization standard cloud-computing platforms, Xen Hypervisor or vSphere. The system is adaptable to increasing demand for disk storage or computational load and allows database developers to use the re-instantiated databases for integration and development of new databases. Herein, we demonstrate that a relatively inexpensive solution can be implemented for archival of bioinformatics databases and their rapid re-instantiation should the live databases disappear.

  12. Simple re-instantiation of small databases using cloud computing

    PubMed Central

    2013-01-01

    Background Small bioinformatics databases, unlike institutionally funded large databases, are vulnerable to discontinuation and many reported in publications are no longer accessible. This leads to irreproducible scientific work and redundant effort, impeding the pace of scientific progress. Results We describe a Web-accessible system, available online at http://biodb100.apbionet.org, for archival and future on demand re-instantiation of small databases within minutes. Depositors can rebuild their databases by downloading a Linux live operating system (http://www.bioslax.com), preinstalled with bioinformatics and UNIX tools. The database and its dependencies can be compressed into an ".lzm" file for deposition. End-users can search for archived databases and activate them on dynamically re-instantiated BioSlax instances, run as virtual machines over the two popular full virtualization standard cloud-computing platforms, Xen Hypervisor or vSphere. The system is adaptable to increasing demand for disk storage or computational load and allows database developers to use the re-instantiated databases for integration and development of new databases. Conclusions Herein, we demonstrate that a relatively inexpensive solution can be implemented for archival of bioinformatics databases and their rapid re-instantiation should the live databases disappear. PMID:24564380

  13. Open NASA Earth Exchange (OpenNEX): Strategies for enabling cross organization collaboration in the earth sciences

    NASA Astrophysics Data System (ADS)

    Michaelis, A.; Ganguly, S.; Nemani, R. R.; Votava, P.; Wang, W.; Lee, T. J.; Dungan, J. L.

    2014-12-01

    Sharing community-valued codes, intermediary datasets and results from individual efforts with others that are not in a direct funded collaboration can be a challenge. Cross organization collaboration is often impeded due to infrastructure security constraints, rigid financial controls, bureaucracy, and workforce nationalities, etc., which can force groups to work in a segmented fashion and/or through awkward and suboptimal web services. We show how a focused community may come together, share modeling and analysis codes, computing configurations, scientific results, knowledge and expertise on a public cloud platform; diverse groups of researchers working together at "arms length". Through the OpenNEX experimental workshop, users can view short technical "how-to" videos and explore encapsulated working environment. Workshop participants can easily instantiate Amazon Machine Images (AMI) or launch full cluster and data processing configurations within minutes. Enabling users to instantiate computing environments from configuration templates on large public cloud infrastructures, such as Amazon Web Services, may provide a mechanism for groups to easily use each others work and collaborate indirectly. Moreover, using the public cloud for this workshop allowed a single group to host a large read only data archive, making datasets of interest to the community widely available on the public cloud, enabling other groups to directly connect to the data and reduce the costs of the collaborative work by freeing other individual groups from redundantly retrieving, integrating or financing the storage of the datasets of interest.

  14. The Application of a Cloud-Based Student, Teacher, and Parent Platform in English as a Foreign Language Education

    ERIC Educational Resources Information Center

    Chiu, Fu-Yuan

    2014-01-01

    This study constructed a cloud-based student, teacher, and parent platform (CSTPP) in collaboration with a Taiwanese textbook publisher. Junior high school students' attitudes to learning English using the developed system were subsequently examined. The study participants were divided into 3 groups: Those in Group A employed the CSTPP with…

  15. A Hybrid Cloud Computing Service for Earth Sciences

    NASA Astrophysics Data System (ADS)

    Yang, C. P.

    2016-12-01

    Cloud Computing is becoming a norm for providing computing capabilities for advancing Earth sciences including big Earth data management, processing, analytics, model simulations, and many other aspects. A hybrid spatiotemporal cloud computing service is bulit at George Mason NSF spatiotemporal innovation center to meet this demands. This paper will report the service including several aspects: 1) the hardware includes 500 computing services and close to 2PB storage as well as connection to XSEDE Jetstream and Caltech experimental cloud computing environment for sharing the resource; 2) the cloud service is geographically distributed at east coast, west coast, and central region; 3) the cloud includes private clouds managed using open stack and eucalyptus, DC2 is used to bridge these and the public AWS cloud for interoperability and sharing computing resources when high demands surfing; 4) the cloud service is used to support NSF EarthCube program through the ECITE project, ESIP through the ESIP cloud computing cluster, semantics testbed cluster, and other clusters; 5) the cloud service is also available for the earth science communities to conduct geoscience. A brief introduction about how to use the cloud service will be included.

  16. A Novel Deployment Method for Communication-Intensive Applications in Service Clouds

    PubMed Central

    Liu, Chuanchang; Yang, Jingqi

    2014-01-01

    The service platforms are migrating to clouds for reasonably solving long construction periods, low resource utilizations, and isolated constructions of service platforms. However, when the migration is conducted in service clouds, there is a little focus of deploying communication-intensive applications in previous deployment methods. To address this problem, this paper proposed the combination of the online deployment and the offline deployment for deploying communication-intensive applications in service clouds. Firstly, the system architecture was designed for implementing the communication-aware deployment method for communication-intensive applications in service clouds. Secondly, in the online-deployment algorithm and the offline-deployment algorithm, service instances were deployed in an optimal cloud node based on the communication overhead which is determined by the communication traffic between services, as well as the communication performance between cloud nodes. Finally, the experimental results demonstrated that the proposed methods deployed communication-intensive applications effectively with lower latency and lower load compared with existing algorithms. PMID:25140331

  17. A novel deployment method for communication-intensive applications in service clouds.

    PubMed

    Liu, Chuanchang; Yang, Jingqi

    2014-01-01

    The service platforms are migrating to clouds for reasonably solving long construction periods, low resource utilizations, and isolated constructions of service platforms. However, when the migration is conducted in service clouds, there is a little focus of deploying communication-intensive applications in previous deployment methods. To address this problem, this paper proposed the combination of the online deployment and the offline deployment for deploying communication-intensive applications in service clouds. Firstly, the system architecture was designed for implementing the communication-aware deployment method for communication-intensive applications in service clouds. Secondly, in the online-deployment algorithm and the offline-deployment algorithm, service instances were deployed in an optimal cloud node based on the communication overhead which is determined by the communication traffic between services, as well as the communication performance between cloud nodes. Finally, the experimental results demonstrated that the proposed methods deployed communication-intensive applications effectively with lower latency and lower load compared with existing algorithms.

  18. Do Clouds Compute? A Framework for Estimating the Value of Cloud Computing

    NASA Astrophysics Data System (ADS)

    Klems, Markus; Nimis, Jens; Tai, Stefan

    On-demand provisioning of scalable and reliable compute services, along with a cost model that charges consumers based on actual service usage, has been an objective in distributed computing research and industry for a while. Cloud Computing promises to deliver on this objective: consumers are able to rent infrastructure in the Cloud as needed, deploy applications and store data, and access them via Web protocols on a pay-per-use basis. The acceptance of Cloud Computing, however, depends on the ability for Cloud Computing providers and consumers to implement a model for business value co-creation. Therefore, a systematic approach to measure costs and benefits of Cloud Computing is needed. In this paper, we discuss the need for valuation of Cloud Computing, identify key components, and structure these components in a framework. The framework assists decision makers in estimating Cloud Computing costs and to compare these costs to conventional IT solutions. We demonstrate by means of representative use cases how our framework can be applied to real world scenarios.

  19. Satellite Sounder-Based OLR-, Cloud- and Atmospheric Temperature Climatologies for Climate Analyses

    NASA Technical Reports Server (NTRS)

    Molnar, Gyula I.; Susskind, Joel

    2006-01-01

    Global energy balance of the Earth-atmosphere system may change due to natural and man-made climate variations. For example, changes in the outgoing longwave radiation (OLR) can be regarded as a crucial indicator of climate variations. Clouds play an important role -still insufficiently assessed in the global energy balance on all spatial and temporal scales, and satellites provide an ideal platform to measure cloud and large-scale atmospheric variables simultaneously. The TOVS series of satellites were the first to provide this type of information since 1979. OLR [Mehta and Susskind], cloud cover and cloud top pressure [Susskind et al] are among the key climatic parameters computed by the TOVS Pathfinder Path-A algorithm using mainly the retrieved temperature and moisture profiles. AIRS, regarded as the new and improved TOVS , has a much higher spectral resolution and greater S/N ratio, retrieving climatic parameters with higher accuracy. First we present encouraging agreements between MODIS and AIRS cloud top pressure (C(sub tp) and effective (A(sub eff), a product of infrared emissivity at 11 microns and physical cloud cover or A(sub c)) cloud fraction seasonal and interannual variabilities for selected months. Next we present validation efforts and preliminary trend analyses of TOVS-retrieved C(sub tp) and A(sub eff). For example, decadal global trends of the TOVS Path-A and ISCCP-D2 P(sub c), and A(sub eff)/A(sub c), values are similar. Furthermore, the TOVS Path-A and ISCCP-AVHRR [available since 19831 cloud fractions correlate even more strongly, including regional trends. We also present TOVS and AIRS OLR validation effort results and (for the longer-term TOVS Pathfinder Path-A dataset) trend analyses. OLR interannual spatial variabilities from the available state-of-the-art CERES measurements and both from the AIRS [Susskind et al] and TOVS OLR computations are in remarkably good agreement. Global monthly mean CERES and TOVS OLR time series show very good agreement in absolute values also. Finally, we will assess correlations among long-term trends of selected parameters, derived simultaneously from the TOVS Pathfinder Path-A datase

  20. Cloud Computing Applications in Support of Earth Science Activities at Marshall Space Flight Center

    NASA Astrophysics Data System (ADS)

    Molthan, A.; Limaye, A. S.

    2011-12-01

    Currently, the NASA Nebula Cloud Computing Platform is available to Agency personnel in a pre-release status as the system undergoes a formal operational readiness review. Over the past year, two projects within the Earth Science Office at NASA Marshall Space Flight Center have been investigating the performance and value of Nebula's "Infrastructure as a Service", or "IaaS" concept and applying cloud computing concepts to advance their respective mission goals. The Short-term Prediction Research and Transition (SPoRT) Center focuses on the transition of unique NASA satellite observations and weather forecasting capabilities for use within the operational forecasting community through partnerships with NOAA's National Weather Service (NWS). SPoRT has evaluated the performance of the Weather Research and Forecasting (WRF) model on virtual machines deployed within Nebula and used Nebula instances to simulate local forecasts in support of regional forecast studies of interest to select NWS forecast offices. In addition to weather forecasting applications, rapidly deployable Nebula virtual machines have supported the processing of high resolution NASA satellite imagery to support disaster assessment following the historic severe weather and tornado outbreak of April 27, 2011. Other modeling and satellite analysis activities are underway in support of NASA's SERVIR program, which integrates satellite observations, ground-based data and forecast models to monitor environmental change and improve disaster response in Central America, the Caribbean, Africa, and the Himalayas. Leveraging SPoRT's experience, SERVIR is working to establish a real-time weather forecasting model for Central America. Other modeling efforts include hydrologic forecasts for Kenya, driven by NASA satellite observations and reanalysis data sets provided by the broader meteorological community. Forecast modeling efforts are supplemented by short-term forecasts of convective initiation, determined by geostationary satellite observations processed on virtual machines powered by Nebula. This presentation will provide an overview of these activities from a scientific and cloud computing applications perspective, identifying the strengths and weaknesses for deploying each project within an IaaS environment, and ways to collaborate with the Nebula or other cloud-user communities to collaborate on projects as they go forward.

  1. Cloud Computing and Its Applications in GIS

    NASA Astrophysics Data System (ADS)

    Kang, Cao

    2011-12-01

    Cloud computing is a novel computing paradigm that offers highly scalable and highly available distributed computing services. The objectives of this research are to: 1. analyze and understand cloud computing and its potential for GIS; 2. discover the feasibilities of migrating truly spatial GIS algorithms to distributed computing infrastructures; 3. explore a solution to host and serve large volumes of raster GIS data efficiently and speedily. These objectives thus form the basis for three professional articles. The first article is entitled "Cloud Computing and Its Applications in GIS". This paper introduces the concept, structure, and features of cloud computing. Features of cloud computing such as scalability, parallelization, and high availability make it a very capable computing paradigm. Unlike High Performance Computing (HPC), cloud computing uses inexpensive commodity computers. The uniform administration systems in cloud computing make it easier to use than GRID computing. Potential advantages of cloud-based GIS systems such as lower barrier to entry are consequently presented. Three cloud-based GIS system architectures are proposed: public cloud- based GIS systems, private cloud-based GIS systems and hybrid cloud-based GIS systems. Public cloud-based GIS systems provide the lowest entry barriers for users among these three architectures, but their advantages are offset by data security and privacy related issues. Private cloud-based GIS systems provide the best data protection, though they have the highest entry barriers. Hybrid cloud-based GIS systems provide a compromise between these extremes. The second article is entitled "A cloud computing algorithm for the calculation of Euclidian distance for raster GIS". Euclidean distance is a truly spatial GIS algorithm. Classical algorithms such as the pushbroom and growth ring techniques require computational propagation through the entire raster image, which makes it incompatible with the distributed nature of cloud computing. This paper presents a parallel Euclidean distance algorithm that works seamlessly with the distributed nature of cloud computing infrastructures. The mechanism of this algorithm is to subdivide a raster image into sub-images and wrap them with a one pixel deep edge layer of individually computed distance information. Each sub-image is then processed by a separate node, after which the resulting sub-images are reassembled into the final output. It is shown that while any rectangular sub-image shape can be used, those approximating squares are computationally optimal. This study also serves as a demonstration of this subdivide and layer-wrap strategy, which would enable the migration of many truly spatial GIS algorithms to cloud computing infrastructures. However, this research also indicates that certain spatial GIS algorithms such as cost distance cannot be migrated by adopting this mechanism, which presents significant challenges for the development of cloud-based GIS systems. The third article is entitled "A Distributed Storage Schema for Cloud Computing based Raster GIS Systems". This paper proposes a NoSQL Database Management System (NDDBMS) based raster GIS data storage schema. NDDBMS has good scalability and is able to use distributed commodity computers, which make it superior to Relational Database Management Systems (RDBMS) in a cloud computing environment. In order to provide optimized data service performance, the proposed storage schema analyzes the nature of commonly used raster GIS data sets. It discriminates two categories of commonly used data sets, and then designs corresponding data storage models for both categories. As a result, the proposed storage schema is capable of hosting and serving enormous volumes of raster GIS data speedily and efficiently on cloud computing infrastructures. In addition, the scheme also takes advantage of the data compression characteristics of Quadtrees, thus promoting efficient data storage. Through this assessment of cloud computing technology, the exploration of the challenges and solutions to the migration of GIS algorithms to cloud computing infrastructures, and the examination of strategies for serving large amounts of GIS data in a cloud computing infrastructure, this dissertation lends support to the feasibility of building a cloud-based GIS system. However, there are still challenges that need to be addressed before a full-scale functional cloud-based GIS system can be successfully implemented. (Abstract shortened by UMI.)

  2. High-Performance Compute Infrastructure in Astronomy: 2020 Is Only Months Away

    NASA Astrophysics Data System (ADS)

    Berriman, B.; Deelman, E.; Juve, G.; Rynge, M.; Vöckler, J. S.

    2012-09-01

    By 2020, astronomy will be awash with as much as 60 PB of public data. Full scientific exploitation of such massive volumes of data will require high-performance computing on server farms co-located with the data. Development of this computing model will be a community-wide enterprise that has profound cultural and technical implications. Astronomers must be prepared to develop environment-agnostic applications that support parallel processing. The community must investigate the applicability and cost-benefit of emerging technologies such as cloud computing to astronomy, and must engage the Computer Science community to develop science-driven cyberinfrastructure such as workflow schedulers and optimizers. We report here the results of collaborations between a science center, IPAC, and a Computer Science research institute, ISI. These collaborations may be considered pathfinders in developing a high-performance compute infrastructure in astronomy. These collaborations investigated two exemplar large-scale science-driver workflow applications: 1) Calculation of an infrared atlas of the Galactic Plane at 18 different wavelengths by placing data from multiple surveys on a common plate scale and co-registering all the pixels; 2) Calculation of an atlas of periodicities present in the public Kepler data sets, which currently contain 380,000 light curves. These products have been generated with two workflow applications, written in C for performance and designed to support parallel processing on multiple environments and platforms, but with different compute resource needs: the Montage image mosaic engine is I/O-bound, and the NASA Star and Exoplanet Database periodogram code is CPU-bound. Our presentation will report cost and performance metrics and lessons-learned for continuing development. Applicability of Cloud Computing: Commercial Cloud providers generally charge for all operations, including processing, transfer of input and output data, and for storage of data, and so the costs of running applications vary widely according to how they use resources. The cloud is well suited to processing CPU-bound (and memory bound) workflows such as the periodogram code, given the relatively low cost of processing in comparison with I/O operations. I/O-bound applications such as Montage perform best on high-performance clusters with fast networks and parallel file-systems. Science-driven Cyberinfrastructure: Montage has been widely used as a driver application to develop workflow management services, such as task scheduling in distributed environments, designing fault tolerance techniques for job schedulers, and developing workflow orchestration techniques. Running Parallel Applications Across Distributed Cloud Environments: Data processing will eventually take place in parallel distributed across cyber infrastructure environments having different architectures. We have used the Pegasus Work Management System (WMS) to successfully run applications across three very different environments: TeraGrid, OSG (Open Science Grid), and FutureGrid. Provisioning resources across different grids and clouds (also referred to as Sky Computing), involves establishing a distributed environment, where issues of, e.g, remote job submission, data management, and security need to be addressed. This environment also requires building virtual machine images that can run in different environments. Usually, each cloud provides basic images that can be customized with additional software and services. In most of our work, we provisioned compute resources using a custom application, called Wrangler. Pegasus WMS abstracts the architectures of the compute environments away from the end-user, and can be considered a first-generation tool suitable for scientists to run their applications on disparate environments.

  3. IBM Cloud Computing Powering a Smarter Planet

    NASA Astrophysics Data System (ADS)

    Zhu, Jinzy; Fang, Xing; Guo, Zhe; Niu, Meng Hua; Cao, Fan; Yue, Shuang; Liu, Qin Yu

    With increasing need for intelligent systems supporting the world's businesses, Cloud Computing has emerged as a dominant trend to provide a dynamic infrastructure to make such intelligence possible. The article introduced how to build a smarter planet with cloud computing technology. First, it introduced why we need cloud, and the evolution of cloud technology. Secondly, it analyzed the value of cloud computing and how to apply cloud technology. Finally, it predicted the future of cloud in the smarter planet.

  4. Cloud Computing Security Issue: Survey

    NASA Astrophysics Data System (ADS)

    Kamal, Shailza; Kaur, Rajpreet

    2011-12-01

    Cloud computing is the growing field in IT industry since 2007 proposed by IBM. Another company like Google, Amazon, and Microsoft provides further products to cloud computing. The cloud computing is the internet based computing that shared recourses, information on demand. It provides the services like SaaS, IaaS and PaaS. The services and recourses are shared by virtualization that run multiple operation applications on cloud computing. This discussion gives the survey on the challenges on security issues during cloud computing and describes some standards and protocols that presents how security can be managed.

  5. The information science of microbial ecology.

    PubMed

    Hahn, Aria S; Konwar, Kishori M; Louca, Stilianos; Hanson, Niels W; Hallam, Steven J

    2016-06-01

    A revolution is unfolding in microbial ecology where petabytes of 'multi-omics' data are produced using next generation sequencing and mass spectrometry platforms. This cornucopia of biological information has enormous potential to reveal the hidden metabolic powers of microbial communities in natural and engineered ecosystems. However, to realize this potential, the development of new technologies and interpretative frameworks grounded in ecological design principles are needed to overcome computational and analytical bottlenecks. Here we explore the relationship between microbial ecology and information science in the era of cloud-based computation. We consider microorganisms as individual information processing units implementing a distributed metabolic algorithm and describe developments in ecoinformatics and ubiquitous computing with the potential to eliminate bottlenecks and empower knowledge creation and translation. Copyright © 2016 The Authors. Published by Elsevier Ltd.. All rights reserved.

  6. Bringing education to your virtual doorstep

    NASA Astrophysics Data System (ADS)

    Kaurov, Vitaliy

    2013-03-01

    We currently witness significant migration of academic resources towards online CMS, social networking, and high-end computerized education. This happens for traditional academic programs as well as for outreach initiatives. The talk will go over a set of innovative integrated technologies, many of which are free. These were developed by Wolfram Research in order to facilitate and enhance the learning process in mathematical and physical sciences. Topics include: cloud computing with Mathematica Online; natural language programming; interactive educational resources and web publishing at the Wolfram Demonstrations Project; the computational knowledge engine Wolfram Alpha; Computable Document Format (CDF) and self-publishing with interactive e-books; course assistant apps for mobile platforms. We will also discuss outreach programs where such technologies are extensively used, such as the Wolfram Science Summer School and the Mathematica Summer Camp.

  7. T-Check in System-of-Systems Technologies: Cloud Computing

    DTIC Science & Technology

    2010-09-01

    T-Check in System-of-Systems Technologies: Cloud Computing Harrison D. Strowd Grace A. Lewis September 2010 TECHNICAL NOTE CMU/SEI-2010... Cloud Computing 1 1.2 Types of Cloud Computing 2 1.3 Drivers and Barriers to Cloud Computing Adoption 5 2 Using the T-Check Method 7 2.1 T-Check...Hypothesis 3 25 3.4.2 Deployment View of the Solution for Testing Hypothesis 3 27 3.5 Selecting Cloud Computing Providers 30 3.6 Implementing the T-Check

  8. Storage quality-of-service in cloud-based scientific environments: a standardization approach

    NASA Astrophysics Data System (ADS)

    Millar, Paul; Fuhrmann, Patrick; Hardt, Marcus; Ertl, Benjamin; Brzezniak, Maciej

    2017-10-01

    When preparing the Data Management Plan for larger scientific endeavors, PIs have to balance between the most appropriate qualities of storage space along the line of the planned data life-cycle, its price and the available funding. Storage properties can be the media type, implicitly determining access latency and durability of stored data, the number and locality of replicas, as well as available access protocols or authentication mechanisms. Negotiations between the scientific community and the responsible infrastructures generally happen upfront, where the amount of storage space, media types, like: disk, tape and SSD and the foreseeable data life-cycles are negotiated. With the introduction of cloud management platforms, both in computing and storage, resources can be brokered to achieve the best price per unit of a given quality. However, in order to allow the platform orchestrator to programmatically negotiate the most appropriate resources, a standard vocabulary for different properties of resources and a commonly agreed protocol to communicate those, has to be available. In order to agree on a basic vocabulary for storage space properties, the storage infrastructure group in INDIGO-DataCloud together with INDIGO-associated and external scientific groups, created a working group under the umbrella of the Research Data Alliance (RDA). As communication protocol, to query and negotiate storage qualities, the Cloud Data Management Interface (CDMI) has been selected. Necessary extensions to CDMI are defined in regular meetings between INDIGO and the Storage Network Industry Association (SNIA). Furthermore, INDIGO is contributing to the SNIA CDMI reference implementation as the basis for interfacing the various storage systems in INDIGO to the agreed protocol and to provide an official Open-Source skeleton for systems not being maintained by INDIGO partners.

  9. Information Security: Governmentwide Guidance Needed to Assist Agencies in Implementing Cloud Computing

    DTIC Science & Technology

    2010-07-01

    Cloud computing , an emerging form of computing in which users have access to scalable, on-demand capabilities that are provided through Internet... cloud computing , (2) the information security implications of using cloud computing services in the Federal Government, and (3) federal guidance and...efforts to address information security when using cloud computing . The complete report is titled Information Security: Federal Guidance Needed to

  10. Flexible Description Language for HPC based Processing of Remote Sense Data

    NASA Astrophysics Data System (ADS)

    Nandra, Constantin; Gorgan, Dorian; Bacu, Victor

    2016-04-01

    When talking about Big Data, the most challenging aspect lays in processing them in order to gain new insight, find new patterns and gain knowledge from them. This problem is likely most apparent in the case of Earth Observation (EO) data. With ever higher numbers of data sources and increasing data acquisition rates, dealing with EO data is indeed a challenge [1]. Geoscientists should address this challenge by using flexible and efficient tools and platforms. To answer this trend, the BigEarth project [2] aims to combine the advantages of high performance computing solutions with flexible processing description methodologies in order to reduce both task execution times and task definition time and effort. As a component of the BigEarth platform, WorDeL (Workflow Description Language) [3] is intended to offer a flexible, compact and modular approach to the task definition process. WorDeL, unlike other description alternatives such as Python or shell scripts, is oriented towards the description topologies, using them as abstractions for the processing programs. This feature is intended to make it an attractive alternative for users lacking in programming experience. By promoting modular designs, WorDeL not only makes the processing descriptions more user-readable and intuitive, but also helps organizing the processing tasks into independent sub-tasks, which can be executed in parallel on multi-processor platforms in order to improve execution times. As a BigEarth platform [4] component, WorDeL represents the means by which the user interacts with the system, describing processing algorithms in terms of existing operators and workflows [5], which are ultimately translated into sets of executable commands. The WorDeL language has been designed to help in the definition of compute-intensive, batch tasks which can be distributed and executed on high-performance, cloud or grid-based architectures in order to improve the processing time. Main references for further information: [1] Gorgan, D., "Flexible and Adaptive Processing of Earth Observation Data over High Performance Computation Architectures", International Conference and Exhibition Satellite 2015, August 17-19, Houston, Texas, USA. [2] Bigearth project - flexible processing of big earth data over high performance computing architectures. http://cgis.utcluj.ro/bigearth, (2014) [3] Nandra, C., Gorgan, D., "Workflow Description Language for Defining Big Earth Data Processing Tasks", Proceedings of the Intelligent Computer Communication and Processing (ICCP), IEEE-Press, pp. 461-468, (2015). [4] Bacu, V., Stefan, T., Gorgan, D., "Adaptive Processing of Earth Observation Data on Cloud Infrastructures Based on Workflow Description", Proceedings of the Intelligent Computer Communication and Processing (ICCP), IEEE-Press, pp.444-454, (2015). [5] Mihon, D., Bacu, V., Colceriu, V., Gorgan, D., "Modeling of Earth Observation Use Cases through the KEOPS System", Proceedings of the Intelligent Computer Communication and Processing (ICCP), IEEE-Press, pp. 455-460, (2015).

  11. Modeling the topography of shallow braided rivers using Structure-from-Motion photogrammetry

    NASA Astrophysics Data System (ADS)

    Javernick, L.; Brasington, J.; Caruso, B.

    2014-05-01

    Recent advances in computer vision and image analysis have led to the development of a novel, fully automated photogrammetric method to generate dense 3d point cloud data. This approach, termed Structure-from-Motion or SfM, requires only limited ground-control and is ideally suited to imagery obtained from low-cost, non-metric cameras acquired either at close-range or using aerial platforms. Terrain models generated using SfM have begun to emerge recently and with a growing spectrum of software now available, there is an urgent need to provide a robust quality assessment of the data products generated using standard field and computational workflows. To address this demand, we present a detailed error analysis of sub-meter resolution terrain models of two contiguous reaches (1.6 and 1.7 km long) of the braided Ahuriri River, New Zealand, generated using SfM. A six stage methodology is described, involving: i) hand-held image acquisition from an aerial platform, ii) 3d point cloud extraction modeling using Agisoft PhotoScan, iii) georeferencing on a redundant network of GPS-surveyed ground-control points, iv) point cloud filtering to reduce computational demand as well as reduce vegetation noise, v) optical bathymetric modeling of inundated areas; and vi) data fusion and surface modeling to generate sub-meter raster terrain models. Bootstrapped geo-registration as well as extensive distributed GPS and sonar-based bathymetric check-data were used to quantify the quality of the models generated after each processing step. The results obtained provide the first quantified analysis of SfM applied to model the complex terrain of a braided river. Results indicate that geo-registration errors of 0.04 m (planar) and 0.10 m (elevation) and vertical surface errors of 0.10 m in non-vegetation areas can be achieved from a dataset of photographs taken at 600 m and 800 m above the ground level. These encouraging results suggest that this low-cost, logistically simple method can deliver high quality terrain datasets competitive with those obtained with significantly more expensive laser scanning, and suitable for geomorphic change detection and hydrodynamic modeling.

  12. Scaling up ATLAS Event Service to production levels on opportunistic computing platforms

    NASA Astrophysics Data System (ADS)

    Benjamin, D.; Caballero, J.; Ernst, M.; Guan, W.; Hover, J.; Lesny, D.; Maeno, T.; Nilsson, P.; Tsulaia, V.; van Gemmeren, P.; Vaniachine, A.; Wang, F.; Wenaus, T.; ATLAS Collaboration

    2016-10-01

    Continued growth in public cloud and HPC resources is on track to exceed the dedicated resources available for ATLAS on the WLCG. Examples of such platforms are Amazon AWS EC2 Spot Instances, Edison Cray XC30 supercomputer, backfill at Tier 2 and Tier 3 sites, opportunistic resources at the Open Science Grid (OSG), and ATLAS High Level Trigger farm between the data taking periods. Because of specific aspects of opportunistic resources such as preemptive job scheduling and data I/O, their efficient usage requires workflow innovations provided by the ATLAS Event Service. Thanks to the finer granularity of the Event Service data processing workflow, the opportunistic resources are used more efficiently. We report on our progress in scaling opportunistic resource usage to double-digit levels in ATLAS production.

  13. Big Data Analytics with Datalog Queries on Spark.

    PubMed

    Shkapsky, Alexander; Yang, Mohan; Interlandi, Matteo; Chiu, Hsuan; Condie, Tyson; Zaniolo, Carlo

    2016-01-01

    There is great interest in exploiting the opportunity provided by cloud computing platforms for large-scale analytics. Among these platforms, Apache Spark is growing in popularity for machine learning and graph analytics. Developing efficient complex analytics in Spark requires deep understanding of both the algorithm at hand and the Spark API or subsystem APIs (e.g., Spark SQL, GraphX). Our BigDatalog system addresses the problem by providing concise declarative specification of complex queries amenable to efficient evaluation. Towards this goal, we propose compilation and optimization techniques that tackle the important problem of efficiently supporting recursion in Spark. We perform an experimental comparison with other state-of-the-art large-scale Datalog systems and verify the efficacy of our techniques and effectiveness of Spark in supporting Datalog-based analytics.

  14. Big Data Analytics with Datalog Queries on Spark

    PubMed Central

    Shkapsky, Alexander; Yang, Mohan; Interlandi, Matteo; Chiu, Hsuan; Condie, Tyson; Zaniolo, Carlo

    2017-01-01

    There is great interest in exploiting the opportunity provided by cloud computing platforms for large-scale analytics. Among these platforms, Apache Spark is growing in popularity for machine learning and graph analytics. Developing efficient complex analytics in Spark requires deep understanding of both the algorithm at hand and the Spark API or subsystem APIs (e.g., Spark SQL, GraphX). Our BigDatalog system addresses the problem by providing concise declarative specification of complex queries amenable to efficient evaluation. Towards this goal, we propose compilation and optimization techniques that tackle the important problem of efficiently supporting recursion in Spark. We perform an experimental comparison with other state-of-the-art large-scale Datalog systems and verify the efficacy of our techniques and effectiveness of Spark in supporting Datalog-based analytics. PMID:28626296

  15. A Computing Infrastructure for Supporting Climate Studies

    NASA Astrophysics Data System (ADS)

    Yang, C.; Bambacus, M.; Freeman, S. M.; Huang, Q.; Li, J.; Sun, M.; Xu, C.; Wojcik, G. S.; Cahalan, R. F.; NASA Climate @ Home Project Team

    2011-12-01

    Climate change is one of the major challenges facing us on the Earth planet in the 21st century. Scientists build many models to simulate the past and predict the climate change for the next decades or century. Most of the models are at a low resolution with some targeting high resolution in linkage to practical climate change preparedness. To calibrate and validate the models, millions of model runs are needed to find the best simulation and configuration. This paper introduces the NASA effort on Climate@Home project to build a supercomputer based-on advanced computing technologies, such as cloud computing, grid computing, and others. Climate@Home computing infrastructure includes several aspects: 1) a cloud computing platform is utilized to manage the potential spike access to the centralized components, such as grid computing server for dispatching and collecting models runs results; 2) a grid computing engine is developed based on MapReduce to dispatch models, model configuration, and collect simulation results and contributing statistics; 3) a portal serves as the entry point for the project to provide the management, sharing, and data exploration for end users; 4) scientists can access customized tools to configure model runs and visualize model results; 5) the public can access twitter and facebook to get the latest about the project. This paper will introduce the latest progress of the project and demonstrate the operational system during the AGU fall meeting. It will also discuss how this technology can become a trailblazer for other climate studies and relevant sciences. It will share how the challenges in computation and software integration were solved.

  16. Risk in the Clouds?: Security Issues Facing Government Use of Cloud Computing

    NASA Astrophysics Data System (ADS)

    Wyld, David C.

    Cloud computing is poised to become one of the most important and fundamental shifts in how computing is consumed and used. Forecasts show that government will play a lead role in adopting cloud computing - for data storage, applications, and processing power, as IT executives seek to maximize their returns on limited procurement budgets in these challenging economic times. After an overview of the cloud computing concept, this article explores the security issues facing public sector use of cloud computing and looks to the risk and benefits of shifting to cloud-based models. It concludes with an analysis of the challenges that lie ahead for government use of cloud resources.

  17. Tourism guide cloud service quality: What actually delights customers?

    PubMed

    Lin, Shu-Ping; Yang, Chen-Lung; Pi, Han-Chung; Ho, Thao-Minh

    2016-01-01

    The emergence of advanced IT and cloud services has beneficially supported the information-intensive tourism industry, simultaneously caused extreme competitions in attracting customers through building efficient service platforms. On response, numerous nations have implemented cloud platforms to provide value-added sightseeing information and personal intelligent service experiences. Despite these efforts, customers' actual perspectives have yet been sufficiently understood. To bridge the gap, this study attempts to investigate what aspects of tourism cloud services actually delight customers' satisfaction and loyalty. 336 valid survey questionnaire answers were analyzed using structural equation modeling method. The results prove positive impacts of function quality, enjoyment, multiple visual aids, and information quality on customers' satisfaction as well as of enjoyment and satisfaction on use loyalty. The findings hope to provide helpful references of customer use behaviors for enhancing cloud service quality in order to achieve better organizational competitiveness.

  18. A Review Study on Cloud Computing Issues

    NASA Astrophysics Data System (ADS)

    Kanaan Kadhim, Qusay; Yusof, Robiah; Sadeq Mahdi, Hamid; Al-shami, Sayed Samer Ali; Rahayu Selamat, Siti

    2018-05-01

    Cloud computing is the most promising current implementation of utility computing in the business world, because it provides some key features over classic utility computing, such as elasticity to allow clients dynamically scale-up and scale-down the resources in execution time. Nevertheless, cloud computing is still in its premature stage and experiences lack of standardization. The security issues are the main challenges to cloud computing adoption. Thus, critical industries such as government organizations (ministries) are reluctant to trust cloud computing due to the fear of losing their sensitive data, as it resides on the cloud with no knowledge of data location and lack of transparency of Cloud Service Providers (CSPs) mechanisms used to secure their data and applications which have created a barrier against adopting this agile computing paradigm. This study aims to review and classify the issues that surround the implementation of cloud computing which a hot area that needs to be addressed by future research.

  19. Sharing health data in Belgium: A home care case study using the Vitalink platform.

    PubMed

    De Backere, Femke; Bonte, Pieter; Verstichel, Stijn; Ongenae, Femke; De Turck, Filip

    2018-01-01

    In 2013, the Flemish Government launched the Vitalink platform. This initiative focuses on the sharing of health and welfare data to support primary healthcare. In this paper, the objectives and mission of the Vitalink initiative are discussed. Security and privacy measures are reviewed, and the technical implementation of the Vitalink platform is presented. Through a case study, the possibility of interaction with cloud solutions for healthcare is also investigated upon; this was initially not the focus of Vitalink. The Vitalink initiative provides support for secure data sharing in primary healthcare, which in the long term will improve the efficiency of care and will decrease costs. Based on the results of the case study, Vitalink allowed cloud solutions or applications not providing end-to-end security to use their system. The most important lesson learned during this research was the need for firm regulations and stipulations for cloud solutions to interact with the Vitalink platform. However, these are currently still vague.

  20. Microphysical properties and ice particle morphology of cirrus clouds inferred from combined CALIOP-IIR measurements

    NASA Astrophysics Data System (ADS)

    Saito, M.; Iwabuchi, H.; Yang, P.; Tang, G.; King, M. D.; Sekiguchi, M.

    2016-12-01

    Cirrus clouds cover about 25% of the globe. Knowledge about the optical and microphysical properties of these clouds [particularly, optical thickness (COT) and effective radius (CER)] is essential to radiative forcing assessment. Previous studies of those properties using satellite remote sensing techniques based on observations by passive and active sensors gave inconsistent retrievals. In particular, COTs from the Cloud Aerosol Lidar with Orthogonal Polarization (CALIOP) using the unconstrained method are affected by variable particle morphology, especially the fraction of horizontally oriented plate particles (HPLT), because the method assumes the lidar ratio to be constant, which should have different values for different ice particle shapes. More realistic ice particle morphology improves estimates of the optical and microphysical properties. In this study, we develop an optimal estimation-based algorithm to infer cirrus COT and CER in addition to morphological parameters (e.g., Fraction of HPLT) using the observations made by CALIOP and the Infrared Imaging Radiometer (IIR) on the CALIPSO platform. The assumed ice particle model is a mixture of a few habits with variable HPLT. Ice particle single-scattering properties are computed using state-of-the-art light-scattering computational capabilities. Rigorous estimation of uncertainties associated with surface properties, atmospheric gases and cloud heterogeneity is performed. The results based on the present method show that COTs are quite consistent with the MODIS and CALIOP counterparts, and CERs essentially agree with the IIR operational retrievals. The lidar ratio is calculated from the bulk optical properties based on the inferred parameters. The presentation will focus on latitudinal variations of particle morphology and the lidar ratio on a global scale.

Top