Evaluating the Efficacy of the Cloud for Cluster Computation
NASA Technical Reports Server (NTRS)
Knight, David; Shams, Khawaja; Chang, George; Soderstrom, Tom
2012-01-01
Computing requirements vary by industry, and it follows that NASA and other research organizations have computing demands that fall outside the mainstream. While cloud computing has made rapid inroads for tasks such as powering web applications, performance issues on highly distributed tasks hindered early adoption for scientific computation. One venture to address this problem is Nebula, NASA's homegrown cloud project tasked with delivering science-quality cloud computing resources. Another industry development is Amazon's high-performance computing (HPC) instances on Elastic Compute Cloud (EC2), which promise improved performance for cluster computation. This paper presents results from a series of benchmarks run on Amazon EC2 and discusses the efficacy of current commercial cloud technology for running scientific applications across a cluster. In particular, a 240-core cluster of cloud instances achieved 2 TFLOPS on High-Performance Linpack (HPL) at 70% of theoretical computational performance. The cluster's local network also demonstrated sub-100 μs inter-process latency with sustained inter-node throughput in excess of 8 Gbps. Beyond HPL, a real-world Hadoop image processing task from NASA's Lunar Mapping and Modeling Project (LMMP) was run on a 29-instance cluster to process lunar and Martian surface images with sizes on the order of tens of gigapixels. These results demonstrate that while not a rival of dedicated supercomputing clusters, commercial cloud technology is now a feasible option for moderately demanding scientific workloads.
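For readers who want to reproduce this kind of benchmark setup, a minimal sketch of launching a tightly coupled EC2 cluster is shown below, assuming boto3 with configured credentials; the AMI ID, instance type, key name, and node count are placeholders rather than values from the paper.

```python
# Hypothetical sketch: launch a small HPC-style EC2 cluster in one placement group
# so the nodes share a low-latency network segment. All identifiers are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Cluster placement groups pack instances onto nearby network hardware.
ec2.create_placement_group(GroupName="hpl-bench", Strategy="cluster")

resp = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder AMI with MPI and HPL preinstalled
    InstanceType="c5n.18xlarge",       # placeholder HPC-oriented instance type
    MinCount=8, MaxCount=8,            # adjust node count to reach the desired core count
    KeyName="my-key",                  # placeholder key pair
    Placement={"GroupName": "hpl-bench"},
)
print([i["InstanceId"] for i in resp["Instances"]])
```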
Application of microarray analysis on computer cluster and cloud platforms.
Bernau, C; Boulesteix, A-L; Knaus, J
2013-01-01
Analysis of recent high-dimensional biological data tends to be computationally intensive as many common approaches such as resampling or permutation tests require the basic statistical analysis to be repeated many times. A crucial advantage of these methods is that they can be easily parallelized due to the computational independence of the resampling or permutation iterations, which has induced many statistics departments to establish their own computer clusters. An alternative is to rent computing resources in the cloud, e.g. at Amazon Web Services. In this article we analyze whether a selection of statistical projects, recently implemented at our department, can be efficiently realized on these cloud resources. Moreover, we illustrate an opportunity to combine computer cluster and cloud resources. In order to compare the efficiency of computer cluster and cloud implementations and their respective parallelizations we use microarray analysis procedures and compare their runtimes on the different platforms. Amazon Web Services provide various instance types which meet the particular needs of the different statistical projects we analyzed in this paper. Moreover, the network capacity is sufficient and the parallelization is comparable in efficiency to standard computer cluster implementations. Our results suggest that many statistical projects can be efficiently realized on cloud resources. It is important to mention, however, that workflows can change substantially as a result of a shift from computer cluster to cloud computing.
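Because the resampling and permutation iterations described above are computationally independent, they parallelize trivially on either a departmental cluster or rented cloud nodes. The sketch below is an illustrative permutation test with made-up data, parallelized with Python's multiprocessing; the same worker function could be dispatched to cloud workers instead.

```python
# Illustrative embarrassingly parallel permutation test; data and statistic are made up.
import numpy as np
from multiprocessing import Pool

rng = np.random.default_rng(0)
group_a = rng.normal(0.0, 1.0, 50)
group_b = rng.normal(0.3, 1.0, 50)
observed = group_a.mean() - group_b.mean()
pooled = np.concatenate([group_a, group_b])

def one_permutation(seed):
    # Each iteration is independent, so it can run on any core or node.
    r = np.random.default_rng(seed)
    perm = r.permutation(pooled)
    return perm[:50].mean() - perm[50:].mean()

if __name__ == "__main__":
    with Pool() as pool:
        null_stats = pool.map(one_permutation, range(10_000))
    p_value = np.mean(np.abs(null_stats) >= abs(observed))
    print(f"permutation p-value: {p_value:.4f}")
```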
A Hybrid Cloud Computing Service for Earth Sciences
NASA Astrophysics Data System (ADS)
Yang, C. P.
2016-12-01
Cloud computing is becoming a norm for providing computing capabilities for advancing Earth sciences, including big Earth data management, processing, analytics, model simulations, and many other aspects. A hybrid spatiotemporal cloud computing service has been built at the George Mason NSF spatiotemporal innovation center to meet these demands. This paper will report on the service, covering several aspects: 1) the hardware includes 500 computing servers and close to 2 PB of storage, as well as connections to XSEDE Jetstream and the Caltech experimental cloud computing environment for sharing resources; 2) the cloud service is geographically distributed across the east coast, west coast, and central region; 3) the cloud includes private clouds managed using OpenStack and Eucalyptus, with DC2 used to bridge these and the public AWS cloud for interoperability and for sharing computing resources when demand surges; 4) the cloud service is used to support the NSF EarthCube program through the ECITE project, ESIP through the ESIP cloud computing cluster, the semantics testbed cluster, and other clusters; 5) the cloud service is also available to the Earth science communities to conduct geoscience research. A brief introduction on how to use the cloud service will be included.
Construction and application of Red5 cluster based on OpenStack
NASA Astrophysics Data System (ADS)
Wang, Jiaqing; Song, Jianxin
2017-08-01
With the application and development of cloud computing technology in various fields, the resource utilization rate of data centers has improved significantly, and systems based on cloud computing platforms have also improved in scalability and stability. In traditional deployments, Red5 cluster resource utilization is low and system stability is poor. This paper uses cloud computing's efficient resource allocation capability to build a Red5 server cluster based on OpenStack. Multimedia applications can be published to the Red5 cloud server cluster. The system not only achieves flexible provisioning of computing resources, but also greatly improves the stability of the cluster and the efficiency of its services.
A high performance scientific cloud computing environment for materials simulations
NASA Astrophysics Data System (ADS)
Jorissen, K.; Vila, F. D.; Rehr, J. J.
2012-09-01
We describe the development of a scientific cloud computing (SCC) platform that offers high performance computation capability. The platform consists of a scientific virtual machine prototype containing a UNIX operating system and several materials science codes, together with essential interface tools (an SCC toolset) that offers functionality comparable to local compute clusters. In particular, our SCC toolset provides automatic creation of virtual clusters for parallel computing, including tools for execution and monitoring performance, as well as efficient I/O utilities that enable seamless connections to and from the cloud. Our SCC platform is optimized for the Amazon Elastic Compute Cloud (EC2). We present benchmarks for prototypical scientific applications and demonstrate performance comparable to local compute clusters. To facilitate code execution and provide user-friendly access, we have also integrated cloud computing capability in a JAVA-based GUI. Our SCC platform may be an alternative to traditional HPC resources for materials science or quantum chemistry applications.
Galaxy CloudMan: delivering cloud compute clusters.
Afgan, Enis; Baker, Dannon; Coraor, Nate; Chapman, Brad; Nekrutenko, Anton; Taylor, James
2010-12-21
Widespread adoption of high-throughput sequencing has greatly increased the scale and sophistication of computational infrastructure needed to perform genomic research. An alternative to building and maintaining local infrastructure is "cloud computing", which, in principle, offers on demand access to flexible computational infrastructure. However, cloud computing resources are not yet suitable for immediate "as is" use by experimental biologists. We present a cloud resource management system that makes it possible for individual researchers to compose and control an arbitrarily sized compute cluster on Amazon's EC2 cloud infrastructure without any informatics requirements. Within this system, an entire suite of biological tools packaged by the NERC Bio-Linux team (http://nebc.nerc.ac.uk/tools/bio-linux) is available for immediate consumption. The provided solution makes it possible, using only a web browser, to create a completely configured compute cluster ready to perform analysis in less than five minutes. Moreover, we provide an automated method for building custom deployments of cloud resources. This approach promotes reproducibility of results and, if desired, allows individuals and labs to add or customize an otherwise available cloud system to better meet their needs. The expected knowledge and associated effort with deploying a compute cluster in the Amazon EC2 cloud is not trivial. The solution presented in this paper eliminates these barriers, making it possible for researchers to deploy exactly the amount of computing power they need, combined with a wealth of existing analysis software, to handle the ongoing data deluge.
Galaxy CloudMan: delivering cloud compute clusters
2010-01-01
Background Widespread adoption of high-throughput sequencing has greatly increased the scale and sophistication of computational infrastructure needed to perform genomic research. An alternative to building and maintaining local infrastructure is “cloud computing”, which, in principle, offers on demand access to flexible computational infrastructure. However, cloud computing resources are not yet suitable for immediate “as is” use by experimental biologists. Results We present a cloud resource management system that makes it possible for individual researchers to compose and control an arbitrarily sized compute cluster on Amazon’s EC2 cloud infrastructure without any informatics requirements. Within this system, an entire suite of biological tools packaged by the NERC Bio-Linux team (http://nebc.nerc.ac.uk/tools/bio-linux) is available for immediate consumption. The provided solution makes it possible, using only a web browser, to create a completely configured compute cluster ready to perform analysis in less than five minutes. Moreover, we provide an automated method for building custom deployments of cloud resources. This approach promotes reproducibility of results and, if desired, allows individuals and labs to add or customize an otherwise available cloud system to better meet their needs. Conclusions The expected knowledge and associated effort with deploying a compute cluster in the Amazon EC2 cloud is not trivial. The solution presented in this paper eliminates these barriers, making it possible for researchers to deploy exactly the amount of computing power they need, combined with a wealth of existing analysis software, to handle the ongoing data deluge. PMID:21210983
High-performance scientific computing in the cloud
NASA Astrophysics Data System (ADS)
Jorissen, Kevin; Vila, Fernando; Rehr, John
2011-03-01
Cloud computing has the potential to open up high-performance computational science to a much broader class of researchers, owing to its ability to provide on-demand, virtualized computational resources. However, before such approaches can become commonplace, user-friendly tools must be developed that hide the unfamiliar cloud environment and streamline the management of cloud resources for many scientific applications. We have recently shown that high-performance cloud computing is feasible for parallelized x-ray spectroscopy calculations. We now present benchmark results for a wider selection of scientific applications focusing on electronic structure and spectroscopic simulation software in condensed matter physics. These applications are driven by an improved portable interface that can manage virtual clusters and run various applications in the cloud. We also describe a next generation of cluster tools, aimed at improved performance and a more robust cluster deployment. Supported by NSF grant OCI-1048052.
Dynamic Extension of a Virtualized Cluster by using Cloud Resources
NASA Astrophysics Data System (ADS)
Oberst, Oliver; Hauth, Thomas; Kernert, David; Riedel, Stephan; Quast, Günter
2012-12-01
The specific requirements concerning the software environment within the HEP community constrain the choice of resource providers for the outsourcing of computing infrastructure. The use of virtualization in HPC clusters and in the context of cloud resources is therefore a subject of recent developments in scientific computing. The dynamic virtualization of worker nodes in common batch systems provided by ViBatch serves each user with a dynamically virtualized subset of worker nodes on a local cluster. Now it can be transparently extended by the use of common open source cloud interfaces like OpenNebula or Eucalyptus, launching a subset of the virtual worker nodes within the cloud. This paper demonstrates how a dynamically virtualized computing cluster is combined with cloud resources by attaching remotely started virtual worker nodes to the local batch system.
GATE Monte Carlo simulation in a cloud computing environment
NASA Astrophysics Data System (ADS)
Rowedder, Blake Austin
The GEANT4-based GATE is a unique and powerful Monte Carlo (MC) platform, which provides a single code library allowing the simulation of specific medical physics applications, e.g. PET, SPECT, CT, radiotherapy, and hadron therapy. However, this rigorous yet flexible platform is used only sparingly in the clinic due to its lengthy calculation time. By accessing the powerful computational resources of a cloud computing environment, GATE's runtime can be significantly reduced to clinically feasible levels without the sizable investment of a local high performance cluster. This study investigated reliable and efficient execution of GATE MC simulations using a commercial cloud computing service. Amazon's Elastic Compute Cloud was used to launch several nodes equipped with GATE. Job data was initially broken up on the local computer, then uploaded to the worker nodes on the cloud. The results were automatically downloaded and aggregated on the local computer for display and analysis. Five simulations were repeated for every cluster size between 1 and 20 nodes. Ultimately, increasing cluster size resulted in a decrease in calculation time that could be expressed with an inverse power model. Comparing the benchmark results to the published values and error margins indicated that the simulation results were not affected by the cluster size and thus that the integrity of a calculation is preserved in a cloud computing environment. The runtime of a 53-minute simulation was decreased to 3.11 minutes when run on a 20-node cluster. The ability to improve the speed of simulation suggests that fast MC simulations are viable for imaging and radiotherapy applications. With high performance computing continuing to fall in price and grow in accessibility, implementing Monte Carlo techniques with cloud computing for clinical applications will continue to become more attractive.
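The reported relationship between cluster size and runtime, a decrease following an inverse power model, can be written as T(N) ≈ a·N^(-b). The sketch below fits such a model with SciPy; the timing values are invented placeholders, not the study's measurements.

```python
# Fit an inverse power model T(N) = a * N**(-b) to runtime vs. cluster size.
# The timing data here are invented for illustration, not the study's results.
import numpy as np
from scipy.optimize import curve_fit

nodes = np.array([1, 2, 5, 10, 20])
runtime_min = np.array([53.0, 27.5, 11.8, 6.2, 3.3])   # hypothetical measurements

def inverse_power(n, a, b):
    return a * n ** (-b)

(a, b), _ = curve_fit(inverse_power, nodes, runtime_min, p0=(50.0, 1.0))
print(f"T(N) ~ {a:.1f} * N^-{b:.2f}")
print("predicted runtime on 20 nodes:", inverse_power(20, a, b), "min")
```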
Operating Dedicated Data Centers - Is It Cost-Effective?
NASA Astrophysics Data System (ADS)
Ernst, M.; Hogue, R.; Hollowell, C.; Strecker-Kellog, W.; Wong, A.; Zaytsev, A.
2014-06-01
The advent of cloud computing centres such as Amazon's EC2 and Google's Computing Engine has elicited comparisons with dedicated computing clusters. Discussions on appropriate usage of cloud resources (both academic and commercial) and costs have ensued. This presentation discusses a detailed analysis of the costs of operating and maintaining the RACF (RHIC and ATLAS Computing Facility) compute cluster at Brookhaven National Lab and compares them with the cost of cloud computing resources under various usage scenarios. An extrapolation of likely future cost effectiveness of dedicated computing resources is also presented.
Jungle Computing: Distributed Supercomputing Beyond Clusters, Grids, and Clouds
NASA Astrophysics Data System (ADS)
Seinstra, Frank J.; Maassen, Jason; van Nieuwpoort, Rob V.; Drost, Niels; van Kessel, Timo; van Werkhoven, Ben; Urbani, Jacopo; Jacobs, Ceriel; Kielmann, Thilo; Bal, Henri E.
In recent years, the application of high-performance and distributed computing in scientific practice has become increasingly widespread. Among the most widely available platforms to scientists are clusters, grids, and cloud systems. Such infrastructures are currently undergoing revolutionary change due to the integration of many-core technologies, providing orders-of-magnitude speed improvements for selected compute kernels. With high-performance and distributed computing systems thus becoming more heterogeneous and hierarchical, programming complexity is vastly increased. Further complexities arise because the urgent desire for scalability and issues including data distribution, software heterogeneity, and ad hoc hardware availability commonly force scientists into simultaneous use of multiple platforms (e.g., clusters, grids, and clouds used concurrently). A true computing jungle.
Tidal disruption of open clusters in their parent molecular clouds
NASA Technical Reports Server (NTRS)
Long, Kevin
1989-01-01
A simple model of tidal encounters has been applied to the problem of an open cluster in a clumpy molecular cloud. The parameters of the clumps are taken from the Blitz, Stark, and Long (1988) catalog of clumps in the Rosette molecular cloud. Encounters are modeled as impulsive, rectilinear collisions between Plummer spheres, but the tidal approximation is not invoked. Mass and binding energy changes during an encounter are computed by considering the velocity impulses given to individual stars in a random realization of a Plummer sphere. Mean rates of mass and binding energy loss are then computed by integrating over many encounters. Self-similar evolutionary calculations using these rates indicate that the disruption process is most sensitive to the cluster radius and relatively insensitive to cluster mass. The calculations indicate that clusters which are born in a cloud similar to the Rosette with a cluster radius greater than about 2.5 pc will not survive long enough to leave the cloud. The majority of clusters, however, have smaller radii and will survive the passage through their parent cloud.
Dynamic VM Provisioning for TORQUE in a Cloud Environment
NASA Astrophysics Data System (ADS)
Zhang, S.; Boland, L.; Coddington, P.; Sevior, M.
2014-06-01
Cloud computing, also known as an Infrastructure-as-a-Service (IaaS), is attracting more interest from the commercial and educational sectors as a way to provide cost-effective computational infrastructure. It is an ideal platform for researchers who must share common resources but need to be able to scale up to massive computational requirements for specific periods of time. This paper presents the tools and techniques developed to allow the open source TORQUE distributed resource manager and Maui cluster scheduler to dynamically integrate OpenStack cloud resources into existing high throughput computing clusters.
Oh, Jeongsu; Choi, Chi-Hwan; Park, Min-Kyu; Kim, Byung Kwon; Hwang, Kyuin; Lee, Sang-Heon; Hong, Soon Gyu; Nasir, Arshan; Cho, Wan-Sup; Kim, Kyung Mo
2016-01-01
High-throughput sequencing can produce hundreds of thousands of 16S rRNA sequence reads corresponding to different organisms present in the environmental samples. Typically, analysis of microbial diversity in bioinformatics starts from pre-processing followed by clustering 16S rRNA reads into relatively fewer operational taxonomic units (OTUs). The OTUs are reliable indicators of microbial diversity and greatly accelerate the downstream analysis time. However, existing hierarchical clustering algorithms that are generally more accurate than greedy heuristic algorithms struggle with large sequence datasets. To keep pace with the rapid rise in sequencing data, we present CLUSTOM-CLOUD, which is the first distributed sequence clustering program based on In-Memory Data Grid (IMDG) technology-a distributed data structure to store all data in the main memory of multiple computing nodes. The IMDG technology helps CLUSTOM-CLOUD to enhance both its capability of handling larger datasets and its computational scalability better than its ancestor, CLUSTOM, while maintaining high accuracy. Clustering speed of CLUSTOM-CLOUD was evaluated on published 16S rRNA human microbiome sequence datasets using the small laboratory cluster (10 nodes) and under the Amazon EC2 cloud-computing environments. Under the laboratory environment, it required only ~3 hours to process dataset of size 200 K reads regardless of the complexity of the human microbiome data. In turn, one million reads were processed in approximately 20, 14, and 11 hours when utilizing 20, 30, and 40 nodes on the Amazon EC2 cloud-computing environment. The running time evaluation indicates that CLUSTOM-CLOUD can handle much larger sequence datasets than CLUSTOM and is also a scalable distributed processing system. The comparative accuracy test using 16S rRNA pyrosequences of a mock community shows that CLUSTOM-CLOUD achieves higher accuracy than DOTUR, mothur, ESPRIT-Tree, UCLUST and Swarm. CLUSTOM-CLOUD is written in JAVA and is freely available at http://clustomcloud.kopri.re.kr.
Park, Min-Kyu; Kim, Byung Kwon; Hwang, Kyuin; Lee, Sang-Heon; Hong, Soon Gyu; Nasir, Arshan; Cho, Wan-Sup; Kim, Kyung Mo
2016-01-01
High-throughput sequencing can produce hundreds of thousands of 16S rRNA sequence reads corresponding to different organisms present in the environmental samples. Typically, analysis of microbial diversity in bioinformatics starts from pre-processing followed by clustering 16S rRNA reads into relatively fewer operational taxonomic units (OTUs). The OTUs are reliable indicators of microbial diversity and greatly accelerate the downstream analysis time. However, existing hierarchical clustering algorithms that are generally more accurate than greedy heuristic algorithms struggle with large sequence datasets. To keep pace with the rapid rise in sequencing data, we present CLUSTOM-CLOUD, which is the first distributed sequence clustering program based on In-Memory Data Grid (IMDG) technology–a distributed data structure to store all data in the main memory of multiple computing nodes. The IMDG technology helps CLUSTOM-CLOUD to enhance both its capability of handling larger datasets and its computational scalability better than its ancestor, CLUSTOM, while maintaining high accuracy. Clustering speed of CLUSTOM-CLOUD was evaluated on published 16S rRNA human microbiome sequence datasets using the small laboratory cluster (10 nodes) and under the Amazon EC2 cloud-computing environments. Under the laboratory environment, it required only ~3 hours to process dataset of size 200 K reads regardless of the complexity of the human microbiome data. In turn, one million reads were processed in approximately 20, 14, and 11 hours when utilizing 20, 30, and 40 nodes on the Amazon EC2 cloud-computing environment. The running time evaluation indicates that CLUSTOM-CLOUD can handle much larger sequence datasets than CLUSTOM and is also a scalable distributed processing system. The comparative accuracy test using 16S rRNA pyrosequences of a mock community shows that CLUSTOM-CLOUD achieves higher accuracy than DOTUR, mothur, ESPRIT-Tree, UCLUST and Swarm. CLUSTOM-CLOUD is written in JAVA and is freely available at http://clustomcloud.kopri.re.kr. PMID:26954507
Genotyping in the cloud with Crossbow.
Gurtowski, James; Schatz, Michael C; Langmead, Ben
2012-09-01
Crossbow is a scalable, portable, and automatic cloud computing tool for identifying SNPs from high-coverage, short-read resequencing data. It is built on Apache Hadoop, an implementation of the MapReduce software framework. Hadoop allows Crossbow to distribute read alignment and SNP calling subtasks over a cluster of commodity computers. Two robust tools, Bowtie and SOAPsnp, implement the fundamental alignment and variant calling operations respectively, and have demonstrated capabilities within Crossbow of analyzing approximately one billion short reads per hour on a commodity Hadoop cluster with 320 cores. Through protocol examples, this unit will demonstrate the use of Crossbow for identifying variations in three different operating modes: on a Hadoop cluster, on a single computer, and on the Amazon Elastic MapReduce cloud computing service.
Scaling predictive modeling in drug development with cloud computing.
Moghadam, Behrooz Torabi; Alvarsson, Jonathan; Holm, Marcus; Eklund, Martin; Carlsson, Lars; Spjuth, Ola
2015-01-26
Growing data sets and increasing analysis times are hampering predictive modeling in drug discovery. Model building can be carried out on high-performance computer clusters, but these can be expensive to purchase and maintain. We have evaluated ligand-based modeling on cloud computing resources where computations are parallelized and run on the Amazon Elastic Cloud. We trained models on open data sets of varying sizes for the end points logP and Ames mutagenicity and compared them with model building parallelized on a traditional high-performance computing cluster. We show that while high-performance computing results in faster model building, the use of cloud computing resources is feasible for large data sets and scales well within cloud instances. An additional advantage of cloud computing is that the costs of predictive models can be easily quantified, and a choice can be made between speed and economy. The easy access to computational resources with no up-front investments makes cloud computing an attractive alternative for scientists, especially for those without access to a supercomputer, and our study shows that it enables cost-efficient modeling of large data sets on demand within reasonable time.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Z; Gao, M
Purpose: Monte Carlo simulation plays an important role for the proton Pencil Beam Scanning (PBS) technique. However, MC simulation demands high computing power and is limited to the few large proton centers that can afford a computer cluster. We study the feasibility of utilizing cloud computing in the MC simulation of PBS beams. Methods: A GATE/GEANT4 based MC simulation software was installed on a commercial cloud computing virtual machine (Linux 64-bits, Amazon EC2). Single spot Integral Depth Dose (IDD) curves and in-air transverse profiles were used to tune the source parameters to simulate an IBA machine. With the use of the StarCluster software developed at MIT, a Linux cluster with 2-100 nodes can be conveniently launched in the cloud. A proton PBS plan was then exported to the cloud where the MC simulation was run. Results: The simulated PBS plan has a field size of 10×10 cm², 20 cm range, 10 cm modulation, and contains over 10,000 beam spots. EC2 instance type m1.medium was selected considering the CPU/memory requirement and 40 instances were used to form a Linux cluster. To minimize cost, the master node was created as an on-demand instance and the worker nodes were created as spot instances. The hourly cost for the 40-node cluster was $0.63 and the projected cost for a 100-node cluster was $1.41. Ten million events were simulated to plot PDD and profile, with each job containing 500k events. The simulation completed within 1 hour and an overall statistical uncertainty of < 2% was achieved. Good agreement between MC simulation and measurement was observed. Conclusion: Cloud computing is a cost-effective and easy to maintain platform to run proton PBS MC simulation. When proton MC packages such as GATE and TOPAS are combined with cloud computing, it will greatly facilitate the pursuit of PBS MC studies, especially for newly established proton centers or individual researchers.
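The hourly cost figures quoted above follow from simple arithmetic: one on-demand master plus spot-priced workers. The sketch below illustrates that calculation with assumed per-instance prices (roughly of the order of historical m1.medium rates, but placeholders nonetheless).

```python
# Rough cluster cost estimate: one on-demand master plus spot-priced workers.
# Prices are assumed placeholders; real spot prices fluctuate over time.
def hourly_cluster_cost(n_nodes, on_demand_price, spot_price):
    masters = 1
    workers = n_nodes - masters
    return masters * on_demand_price + workers * spot_price

for n in (40, 100):
    cost = hourly_cluster_cost(n, on_demand_price=0.12, spot_price=0.013)
    print(f"{n:3d}-node cluster: ~${cost:.2f}/hour")  # lands near the figures quoted above
```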
An Improved Clustering Algorithm of Tunnel Monitoring Data for Cloud Computing
Zhong, Luo; Tang, KunHao; Li, Lin; Yang, Guang; Ye, JingJing
2014-01-01
With the rapid development of urban construction, the number of urban tunnels is increasing and the data they produce are becoming more and more complex. As a result, traditional clustering algorithms cannot handle the massive volumes of tunnel monitoring data. To solve this problem, an improved parallel clustering algorithm based on k-means has been proposed. It is a clustering algorithm that uses MapReduce within cloud computing to process the data. It not only has the advantage of handling massive data sets but is also more efficient. Moreover, it is able to compute the average dissimilarity degree of each cluster in order to clean the abnormal data. PMID:24982971
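A minimal, self-contained illustration of the parallel k-means idea is given below: the map step assigns each point to its nearest centre and the reduce step recomputes the centres, with the average dissimilarity per cluster available for flagging abnormal data. This is a plain-Python sketch of the pattern, not the authors' MapReduce implementation.

```python
# Toy MapReduce-style k-means: map = assign points to the nearest centroid,
# reduce = average the points of each cluster into a new centroid.
import numpy as np

def map_phase(points, centroids):
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)            # cluster label for each point

def reduce_phase(points, labels, old_centroids):
    new = []
    for j in range(len(old_centroids)):
        members = points[labels == j]
        # Keep the old centroid if a cluster happens to be empty.
        new.append(members.mean(axis=0) if len(members) else old_centroids[j])
    return np.array(new)

def kmeans(points, k, iters=20, seed=0):
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), k, replace=False)]
    for _ in range(iters):
        labels = map_phase(points, centroids)
        centroids = reduce_phase(points, labels, centroids)
    return centroids, labels

if __name__ == "__main__":
    data = np.random.default_rng(1).normal(size=(1000, 3))
    centres, labels = kmeans(data, k=4)
    # Average dissimilarity per cluster, usable for spotting abnormal data.
    for j, c in enumerate(centres):
        print(j, np.linalg.norm(data[labels == j] - c, axis=1).mean())
```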
Cloud CPFP: a shotgun proteomics data analysis pipeline using cloud and high performance computing.
Trudgian, David C; Mirzaei, Hamid
2012-12-07
We have extended the functionality of the Central Proteomics Facilities Pipeline (CPFP) to allow use of remote cloud and high performance computing (HPC) resources for shotgun proteomics data processing. CPFP has been modified to include modular local and remote scheduling for data processing jobs. The pipeline can now be run on a single PC or server, a local cluster, a remote HPC cluster, and/or the Amazon Web Services (AWS) cloud. We provide public images that allow easy deployment of CPFP in its entirety in the AWS cloud. This significantly reduces the effort necessary to use the software, and allows proteomics laboratories to pay for compute time ad hoc, rather than obtaining and maintaining expensive local server clusters. Alternatively the Amazon cloud can be used to increase the throughput of a local installation of CPFP as necessary. We demonstrate that cloud CPFP allows users to process data at higher speed than local installations but with similar cost and lower staff requirements. In addition to the computational improvements, the web interface to CPFP is simplified, and other functionalities are enhanced. The software is under active development at two leading institutions and continues to be released under an open-source license at http://cpfp.sourceforge.net.
DESPIC: Detecting Early Signatures of Persuasion in Information Cascades
2015-08-27
Includes a publication (title fragment: "... over NoSQL Databases") presented at the 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2014), Chicago, IL, USA, 26 May 2014. Using distributed NoSQL databases including HBase and Riak, we finalized the requirements of the optimal computational architecture to support our framework.
Cost-effective cloud computing: a case study using the comparative genomics tool, roundup.
Kudtarkar, Parul; Deluca, Todd F; Fusaro, Vincent A; Tonellato, Peter J; Wall, Dennis P
2010-12-22
Comparative genomics resources, such as ortholog detection tools and repositories are rapidly increasing in scale and complexity. Cloud computing is an emerging technological paradigm that enables researchers to dynamically build a dedicated virtual cluster and may represent a valuable alternative for large computational tools in bioinformatics. In the present manuscript, we optimize the computation of a large-scale comparative genomics resource-Roundup-using cloud computing, describe the proper operating principles required to achieve computational efficiency on the cloud, and detail important procedures for improving cost-effectiveness to ensure maximal computation at minimal costs. Utilizing the comparative genomics tool, Roundup, as a case study, we computed orthologs among 902 fully sequenced genomes on Amazon's Elastic Compute Cloud. For managing the ortholog processes, we designed a strategy to deploy the web service, Elastic MapReduce, and maximize the use of the cloud while simultaneously minimizing costs. Specifically, we created a model to estimate cloud runtime based on the size and complexity of the genomes being compared that determines in advance the optimal order of the jobs to be submitted. We computed orthologous relationships for 245,323 genome-to-genome comparisons on Amazon's computing cloud, a computation that required just over 200 hours and cost $8,000 USD, at least 40% less than expected under a strategy in which genome comparisons were submitted to the cloud randomly with respect to runtime. Our cost savings projections were based on a model that not only demonstrates the optimal strategy for deploying RSD to the cloud, but also finds the optimal cluster size to minimize waste and maximize usage. Our cost-reduction model is readily adaptable for other comparative genomics tools and potentially of significant benefit to labs seeking to take advantage of the cloud as an alternative to local computing infrastructure.
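The scheduling idea described above, estimating each job's runtime in advance and submitting jobs in an order that keeps the cluster busy, can be sketched as follows; the runtime model coefficient and the example job list are illustrative assumptions, not the authors' fitted model.

```python
# Sketch of runtime-aware job ordering: estimate each comparison's runtime from
# genome sizes, then submit the longest jobs first so the cluster drains evenly.
# The linear runtime model and the job list are illustrative placeholders.
def estimated_runtime(proteins_a, proteins_b, coeff=2e-8):
    return coeff * proteins_a * proteins_b   # hours, under an assumed cost model

jobs = [
    ("ecoli", "salmonella", 4300, 4500),
    ("human", "mouse", 20000, 22000),
    ("yeast", "ecoli", 6000, 4300),
]

ordered = sorted(jobs, key=lambda j: estimated_runtime(j[2], j[3]), reverse=True)
for name_a, name_b, pa, pb in ordered:
    print(f"submit {name_a} vs {name_b}: ~{estimated_runtime(pa, pb):.2f} h (estimated)")
```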
Cloud Computing for Pharmacometrics: Using AWS, NONMEM, PsN, Grid Engine, and Sonic
Sanduja, S; Jewell, P; Aron, E; Pharai, N
2015-01-01
Cloud computing allows pharmacometricians to access advanced hardware, network, and security resources available to expedite analysis and reporting. Cloud-based computing environments are available at a fraction of the time and effort when compared to traditional local datacenter-based solutions. This tutorial explains how to get started with building your own personal cloud computer cluster using Amazon Web Services (AWS), NONMEM, PsN, Grid Engine, and Sonic. PMID:26451333
Cloud Computing for Pharmacometrics: Using AWS, NONMEM, PsN, Grid Engine, and Sonic.
Sanduja, S; Jewell, P; Aron, E; Pharai, N
2015-09-01
Cloud computing allows pharmacometricians to access advanced hardware, network, and security resources available to expedite analysis and reporting. Cloud-based computing environments are available at a fraction of the time and effort when compared to traditional local datacenter-based solutions. This tutorial explains how to get started with building your own personal cloud computer cluster using Amazon Web Services (AWS), NONMEM, PsN, Grid Engine, and Sonic.
Low cost, high performance processing of single particle cryo-electron microscopy data in the cloud.
Cianfrocco, Michael A; Leschziner, Andres E
2015-05-08
The advent of a new generation of electron microscopes and direct electron detectors has realized the potential of single particle cryo-electron microscopy (cryo-EM) as a technique to generate high-resolution structures. Calculating these structures requires high performance computing clusters, a resource that may be limiting to many likely cryo-EM users. To address this limitation and facilitate the spread of cryo-EM, we developed a publicly available 'off-the-shelf' computing environment on Amazon's elastic cloud computing infrastructure. This environment provides users with single particle cryo-EM software packages and the ability to create computing clusters with 16-480+ CPUs. We tested our computing environment using a publicly available 80S yeast ribosome dataset and estimate that laboratories could determine high-resolution cryo-EM structures for $50 to $1500 per structure within a timeframe comparable to local clusters. Our analysis shows that Amazon's cloud computing environment may offer a viable computing environment for cryo-EM.
Jade: using on-demand cloud analysis to give scientists back their flow
NASA Astrophysics Data System (ADS)
Robinson, N.; Tomlinson, J.; Hilson, A. J.; Arribas, A.; Powell, T.
2017-12-01
The UK's Met Office generates 400 TB of weather and climate data every day by running physical models on its Top 20 supercomputer. As data volumes explode, there is a danger that analysis workflows become dominated by watching progress bars, and not thinking about science. We have been researching how we can use distributed computing to allow analysts to process these large volumes of high velocity data in a way that is easy, effective and cheap. Our prototype analysis stack, Jade, tries to encapsulate this. Functionality includes: an under-the-hood Dask engine which parallelises and distributes computations without the need to retrain analysts; hybrid compute clusters (AWS, Alibaba, and local compute) comprising many thousands of cores; clusters which autoscale up/down in response to calculation load using Kubernetes, balanced across providers based on the current price of compute; and lazy data access from cloud storage via containerised OpenDAP. This technology stack allows us to perform calculations many orders of magnitude faster than is possible on local workstations. It is also possible to outperform dedicated local compute clusters, as cloud compute can, in principle, scale to much larger sizes. The use of ephemeral compute resources also makes this implementation cost efficient.
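A minimal example of the Dask pattern described above is shown below: the analyst writes ordinary array code and the distributed scheduler, whether backed by local cores or an autoscaled cloud cluster, executes it lazily in parallel. The scheduler address, array size, and chunking are placeholders.

```python
# Minimal Dask sketch: the same array code runs on a laptop or a distributed
# cluster; only the Client address changes. Sizes and addresses are placeholders.
import dask.array as da
from dask.distributed import Client

# With no argument a local cluster is started; a remote scheduler address could be
# passed instead, e.g. Client("tcp://scheduler.example.com:8786") (hypothetical).
client = Client()

# Lazily build a computation over a large chunked array.
field = da.random.random((20_000, 20_000), chunks=(2_000, 2_000))
zonal_mean = field.mean(axis=0)

# Nothing has been computed yet; .compute() ships tasks to the workers.
result = zonal_mean.compute()
print(result.shape, float(result.mean()))
```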
Integration of High-Performance Computing into Cloud Computing Services
NASA Astrophysics Data System (ADS)
Vouk, Mladen A.; Sills, Eric; Dreher, Patrick
High-Performance Computing (HPC) projects span a spectrum of computer hardware implementations ranging from peta-flop supercomputers, high-end tera-flop facilities running a variety of operating systems and applications, to mid-range and smaller computational clusters used for HPC application development, pilot runs and prototype staging clusters. What they all have in common is that they operate as a stand-alone system rather than a scalable and shared user re-configurable resource. The advent of cloud computing has changed the traditional HPC implementation. In this article, we will discuss a very successful production-level architecture and policy framework for supporting HPC services within a more general cloud computing infrastructure. This integrated environment, called Virtual Computing Lab (VCL), has been operating at NC State since fall 2004. Nearly 8,500,000 HPC CPU-Hrs were delivered by this environment to NC State faculty and students during 2009. In addition, we present and discuss operational data that show that integration of HPC and non-HPC (or general VCL) services in a cloud can substantially reduce the cost of delivering cloud services (down to cents per CPU hour).
TOSCA-based orchestration of complex clusters at the IaaS level
NASA Astrophysics Data System (ADS)
Caballer, M.; Donvito, G.; Moltó, G.; Rocha, R.; Velten, M.
2017-10-01
This paper describes the adoption and extension of the TOSCA standard by the INDIGO-DataCloud project for the definition and deployment of complex computing clusters, together with the required support in both OpenStack and OpenNebula, carried out in close collaboration with industry partners such as IBM. Two examples of these clusters are described in this paper: the definition of an elastic computing cluster to support the Galaxy bioinformatics application, where nodes are dynamically added to and removed from the cluster to adapt to the workload, and the definition of a scalable Apache Mesos cluster for the execution of batch jobs and support for long-running services. The coupling of TOSCA with Ansible Roles to perform automated installation has resulted in the definition of high-level, deterministic templates to provision complex computing clusters across different Cloud sites.
Tools for Analyzing Computing Resource Management Strategies and Algorithms for SDR Clouds
NASA Astrophysics Data System (ADS)
Marojevic, Vuk; Gomez-Miguelez, Ismael; Gelonch, Antoni
2012-09-01
Software defined radio (SDR) clouds centralize the computing resources of base stations. The computing resource pool is shared between radio operators and dynamically loads and unloads digital signal processing chains for providing wireless communications services on demand. Each new user session request, in particular, requires the allocation of computing resources for executing the corresponding SDR transceivers. The huge amount of computing resources of SDR cloud data centers and the numerous session requests at certain hours of a day require efficient computing resource management. We propose a hierarchical approach, where the data center is divided into clusters that are managed in a distributed way. This paper presents a set of computing resource management tools for analyzing computing resource management strategies and algorithms for SDR clouds. We use the tools to evaluate different strategies and algorithms. The results show that more sophisticated algorithms can achieve higher resource occupation and that a tradeoff exists between cluster size and algorithm complexity.
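The kind of allocation algorithm such tools are meant to evaluate can be illustrated with a simple first-fit strategy that places each transceiver's processing chain on the first cluster with enough spare capacity; this is a generic sketch, not one of the algorithms studied in the paper.

```python
# Generic first-fit allocator: place each incoming SDR processing chain on the
# first cluster with enough spare computing capacity. Illustrative only.
from dataclasses import dataclass, field

@dataclass
class Cluster:
    name: str
    capacity: float            # e.g. GFLOPS available
    used: float = 0.0
    chains: list = field(default_factory=list)

def first_fit(clusters, demand, session_id):
    for c in clusters:
        if c.capacity - c.used >= demand:
            c.used += demand
            c.chains.append(session_id)
            return c.name
    return None   # request rejected: no cluster has enough spare capacity

clusters = [Cluster("c0", 100.0), Cluster("c1", 100.0)]
for i, demand in enumerate([30, 45, 40, 25]):
    print(f"session {i} ({demand} GFLOPS) -> {first_fit(clusters, demand, i)}")
```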
MCloud: Secure Provenance for Mobile Cloud Users
2016-10-03
Final report: MCloud: Secure Provenance for Mobile Cloud Users. Bogdan Carbunar, Florida International University. Reporting period 31 May 2013 to 30 May 2016; report dated 3 October 2016; approved for public release, distribution unlimited. Includes the publication "Feasibility of Smartphone Clouds", 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), 4 May 2015, Shenzhen, China.
3D Viewer Platform of Cloud Clustering Management System: Google Map 3D
NASA Astrophysics Data System (ADS)
Choi, Sung-Ja; Lee, Gang-Soo
A new management framework for cloud environments is needed as computing environments converge and change. It is hard for ISVs and small businesses to adopt the platform management systems offered by large providers. This article proposes a clustering management system for cloud computing environments aimed at ISVs and small-business enterprises. It applies a 3D viewer adapted from Google Map 3D and Google Earth, and is called 3DV_CCMS, an extension of the CCMS [1].
Low cost, high performance processing of single particle cryo-electron microscopy data in the cloud
Cianfrocco, Michael A; Leschziner, Andres E
2015-01-01
The advent of a new generation of electron microscopes and direct electron detectors has realized the potential of single particle cryo-electron microscopy (cryo-EM) as a technique to generate high-resolution structures. Calculating these structures requires high performance computing clusters, a resource that may be limiting to many likely cryo-EM users. To address this limitation and facilitate the spread of cryo-EM, we developed a publicly available ‘off-the-shelf’ computing environment on Amazon's elastic cloud computing infrastructure. This environment provides users with single particle cryo-EM software packages and the ability to create computing clusters with 16–480+ CPUs. We tested our computing environment using a publicly available 80S yeast ribosome dataset and estimate that laboratories could determine high-resolution cryo-EM structures for $50 to $1500 per structure within a timeframe comparable to local clusters. Our analysis shows that Amazon's cloud computing environment may offer a viable computing environment for cryo-EM. DOI: http://dx.doi.org/10.7554/eLife.06664.001 PMID:25955969
Cost-Effective Cloud Computing: A Case Study Using the Comparative Genomics Tool, Roundup
Kudtarkar, Parul; DeLuca, Todd F.; Fusaro, Vincent A.; Tonellato, Peter J.; Wall, Dennis P.
2010-01-01
Background Comparative genomics resources, such as ortholog detection tools and repositories are rapidly increasing in scale and complexity. Cloud computing is an emerging technological paradigm that enables researchers to dynamically build a dedicated virtual cluster and may represent a valuable alternative for large computational tools in bioinformatics. In the present manuscript, we optimize the computation of a large-scale comparative genomics resource—Roundup—using cloud computing, describe the proper operating principles required to achieve computational efficiency on the cloud, and detail important procedures for improving cost-effectiveness to ensure maximal computation at minimal costs. Methods Utilizing the comparative genomics tool, Roundup, as a case study, we computed orthologs among 902 fully sequenced genomes on Amazon’s Elastic Compute Cloud. For managing the ortholog processes, we designed a strategy to deploy the web service, Elastic MapReduce, and maximize the use of the cloud while simultaneously minimizing costs. Specifically, we created a model to estimate cloud runtime based on the size and complexity of the genomes being compared that determines in advance the optimal order of the jobs to be submitted. Results We computed orthologous relationships for 245,323 genome-to-genome comparisons on Amazon’s computing cloud, a computation that required just over 200 hours and cost $8,000 USD, at least 40% less than expected under a strategy in which genome comparisons were submitted to the cloud randomly with respect to runtime. Our cost savings projections were based on a model that not only demonstrates the optimal strategy for deploying RSD to the cloud, but also finds the optimal cluster size to minimize waste and maximize usage. Our cost-reduction model is readily adaptable for other comparative genomics tools and potentially of significant benefit to labs seeking to take advantage of the cloud as an alternative to local computing infrastructure. PMID:21258651
CloudMC: a cloud computing application for Monte Carlo simulation.
Miras, H; Jiménez, R; Miras, C; Gomà, C
2013-04-21
This work presents CloudMC, a cloud computing application, developed in Windows Azure®, the platform of the Microsoft® cloud, for the parallelization of Monte Carlo simulations in a dynamic virtual cluster. CloudMC is a web application designed to be independent of the Monte Carlo code on which the simulations are based: the simulations just need to be of the form input files → executable → output files. To study the performance of CloudMC in Windows Azure®, Monte Carlo simulations with penelope were performed on different instance (virtual machine) sizes and for different numbers of instances. The instance size was found to have no effect on the simulation runtime. It was also found that the decrease in time with the number of instances followed Amdahl's law, with a slight deviation due to the increase in the fraction of non-parallelizable time with an increasing number of instances. A simulation that would have required 30 h of CPU on a single instance was completed in 48.6 min when executed on 64 instances in parallel (a speedup of 37×). Furthermore, the use of cloud computing for parallel computing offers some advantages over conventional clusters: high accessibility, scalability and pay-per-usage. Therefore, it is strongly believed that cloud computing will play an important role in making Monte Carlo dose calculation a reality in future clinical practice.
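The reported figures can be read back through Amdahl's law, S(n) = 1/((1-p) + p/n). The sketch below infers the parallel fraction p implied by a 37× speedup on 64 instances and predicts speedups at other instance counts; it is a back-of-the-envelope reading of the abstract, not the authors' analysis.

```python
# Back-of-the-envelope Amdahl's-law check: infer the parallel fraction p from the
# reported 37x speedup on 64 instances, then predict speedups at other sizes.
def amdahl_speedup(p, n):
    return 1.0 / ((1.0 - p) + p / n)

def parallel_fraction(speedup, n):
    # Invert Amdahl's law: 1/S = (1 - p) + p/n  =>  p = (1 - 1/S) / (1 - 1/n)
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / n)

p = parallel_fraction(37.0, 64)
print(f"implied parallel fraction: {p:.4f}")          # roughly 0.988
for n in (16, 32, 64, 128):
    print(f"{n:4d} instances -> predicted speedup {amdahl_speedup(p, n):.1f}x")
```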
Heads in the Cloud: A Primer on Neuroimaging Applications of High Performance Computing.
Shatil, Anwar S; Younas, Sohail; Pourreza, Hossein; Figley, Chase R
2015-01-01
With larger data sets and more sophisticated analyses, it is becoming increasingly common for neuroimaging researchers to push (or exceed) the limitations of standalone computer workstations. Nonetheless, although high-performance computing platforms such as clusters, grids and clouds are already in routine use by a small handful of neuroimaging researchers to increase their storage and/or computational power, the adoption of such resources by the broader neuroimaging community remains relatively uncommon. Therefore, the goal of the current manuscript is to: 1) inform prospective users about the similarities and differences between computing clusters, grids and clouds; 2) highlight their main advantages; 3) discuss when it may (and may not) be advisable to use them; 4) review some of their potential problems and barriers to access; and finally 5) give a few practical suggestions for how interested new users can start analyzing their neuroimaging data using cloud resources. Although the aim of cloud computing is to hide most of the complexity of the infrastructure management from end-users, we recognize that this can still be an intimidating area for cognitive neuroscientists, psychologists, neurologists, radiologists, and other neuroimaging researchers lacking a strong computational background. Therefore, with this in mind, we have aimed to provide a basic introduction to cloud computing in general (including some of the basic terminology, computer architectures, infrastructure and service models, etc.), a practical overview of the benefits and drawbacks, and a specific focus on how cloud resources can be used for various neuroimaging applications.
Heads in the Cloud: A Primer on Neuroimaging Applications of High Performance Computing
Shatil, Anwar S.; Younas, Sohail; Pourreza, Hossein; Figley, Chase R.
2015-01-01
With larger data sets and more sophisticated analyses, it is becoming increasingly common for neuroimaging researchers to push (or exceed) the limitations of standalone computer workstations. Nonetheless, although high-performance computing platforms such as clusters, grids and clouds are already in routine use by a small handful of neuroimaging researchers to increase their storage and/or computational power, the adoption of such resources by the broader neuroimaging community remains relatively uncommon. Therefore, the goal of the current manuscript is to: 1) inform prospective users about the similarities and differences between computing clusters, grids and clouds; 2) highlight their main advantages; 3) discuss when it may (and may not) be advisable to use them; 4) review some of their potential problems and barriers to access; and finally 5) give a few practical suggestions for how interested new users can start analyzing their neuroimaging data using cloud resources. Although the aim of cloud computing is to hide most of the complexity of the infrastructure management from end-users, we recognize that this can still be an intimidating area for cognitive neuroscientists, psychologists, neurologists, radiologists, and other neuroimaging researchers lacking a strong computational background. Therefore, with this in mind, we have aimed to provide a basic introduction to cloud computing in general (including some of the basic terminology, computer architectures, infrastructure and service models, etc.), a practical overview of the benefits and drawbacks, and a specific focus on how cloud resources can be used for various neuroimaging applications. PMID:27279746
Assessing the Amazon Cloud Suitability for CLARREO's Computational Needs
NASA Technical Reports Server (NTRS)
Goldin, Daniel; Vakhnin, Andrei A.; Currey, Jon C.
2015-01-01
In this document we compare the performance of Amazon Web Services (AWS), also known as the Amazon Cloud, with the CLARREO (Climate Absolute Radiance and Refractivity Observatory) cluster and assess its suitability for the computational needs of the CLARREO mission. A benchmark executable to process one month and one year of PARASOL (Polarization and Anisotropy of Reflectances for Atmospheric Sciences coupled with Observations from a Lidar) data was used. With the optimal AWS configuration, adequate data-processing times, comparable to the CLARREO cluster, were found. The assessment of alternatives to the CLARREO cluster continues and several options, such as a NASA-based cluster, are being considered.
Cloud computing for comparative genomics with windows azure platform.
Kim, Insik; Jung, Jae-Yoon; Deluca, Todd F; Nelson, Tristan H; Wall, Dennis P
2012-01-01
Cloud computing services have emerged as a cost-effective alternative for cluster systems as the number of genomes and required computation power to analyze them increased in recent years. Here we introduce the Microsoft Azure platform with detailed execution steps and a cost comparison with Amazon Web Services.
Cloud Computing for Comparative Genomics with Windows Azure Platform
Kim, Insik; Jung, Jae-Yoon; DeLuca, Todd F.; Nelson, Tristan H.; Wall, Dennis P.
2012-01-01
Cloud computing services have emerged as a cost-effective alternative for cluster systems as the number of genomes and required computation power to analyze them increased in recent years. Here we introduce the Microsoft Azure platform with detailed execution steps and a cost comparison with Amazon Web Services. PMID:23032609
NASA Astrophysics Data System (ADS)
Chen, Xiuhong; Huang, Xianglei; Jiao, Chaoyi; Flanner, Mark G.; Raeker, Todd; Palen, Brock
2017-01-01
The suites of numerical models used for simulating the climate of our planet are usually run on dedicated high-performance computing (HPC) resources. This study investigates an alternative to the usual approach, i.e. carrying out climate model simulations in a commercially available cloud computing environment. We test the performance and reliability of running the CESM (Community Earth System Model), a flagship climate model in the United States developed by the National Center for Atmospheric Research (NCAR), on Amazon Web Services (AWS) EC2, the cloud computing environment by Amazon.com, Inc. StarCluster is used to create a virtual computing cluster on AWS EC2 for the CESM simulations. The wall-clock time for one year of CESM simulation on the AWS EC2 virtual cluster is comparable to the time spent for the same simulation on a local dedicated high-performance computing cluster with InfiniBand connections. The CESM simulation can be efficiently scaled with the number of CPU cores on the AWS EC2 virtual cluster environment up to 64 cores. For the standard configuration of the CESM at a spatial resolution of 1.9° latitude by 2.5° longitude, increasing the number of cores from 16 to 64 reduces the wall-clock running time by more than 50% and the scaling is nearly linear. Beyond 64 cores, the communication latency starts to outweigh the benefit of distributed computing and the parallel speedup becomes nearly unchanged.
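Strong-scaling results like these are commonly summarised as speedup and parallel efficiency relative to the smallest run. The sketch below does exactly that for hypothetical wall-clock times chosen only to be qualitatively consistent with the abstract; the actual CESM timings are not reproduced here.

```python
# Strong-scaling summary relative to the smallest run. The wall-clock times are
# hypothetical placeholders, not the CESM/AWS measurements from the paper.
timings = {16: 10.0, 32: 5.3, 64: 2.9}   # cores -> hours per simulated year (assumed)

base_cores = min(timings)
base_time = timings[base_cores]
for cores, t in sorted(timings.items()):
    speedup = base_time / t
    efficiency = speedup / (cores / base_cores)
    print(f"{cores:3d} cores: speedup {speedup:.2f}x, parallel efficiency {efficiency:.0%}")
```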
Hybrid cloud and cluster computing paradigms for life science applications
2010-01-01
Background Clouds and MapReduce have shown themselves to be a broadly useful approach to scientific computing especially for parallel data intensive applications. However they have limited applicability to some areas such as data mining because MapReduce has poor performance on problems with an iterative structure present in the linear algebra that underlies much data analysis. Such problems can be run efficiently on clusters using MPI leading to a hybrid cloud and cluster environment. This motivates the design and implementation of an open source Iterative MapReduce system Twister. Results Comparisons of Amazon, Azure, and traditional Linux and Windows environments on common applications have shown encouraging performance and usability comparisons in several important non iterative cases. These are linked to MPI applications for final stages of the data analysis. Further we have released the open source Twister Iterative MapReduce and benchmarked it against basic MapReduce (Hadoop) and MPI in information retrieval and life sciences applications. Conclusions The hybrid cloud (MapReduce) and cluster (MPI) approach offers an attractive production environment while Twister promises a uniform programming environment for many Life Sciences applications. Methods We used commercial clouds Amazon and Azure and the NSF resource FutureGrid to perform detailed comparisons and evaluations of different approaches to data intensive computing. Several applications were developed in MPI, MapReduce and Twister in these different environments. PMID:21210982
Hybrid cloud and cluster computing paradigms for life science applications.
Qiu, Judy; Ekanayake, Jaliya; Gunarathne, Thilina; Choi, Jong Youl; Bae, Seung-Hee; Li, Hui; Zhang, Bingjing; Wu, Tak-Lon; Ruan, Yang; Ekanayake, Saliya; Hughes, Adam; Fox, Geoffrey
2010-12-21
Clouds and MapReduce have shown themselves to be a broadly useful approach to scientific computing especially for parallel data intensive applications. However they have limited applicability to some areas such as data mining because MapReduce has poor performance on problems with an iterative structure present in the linear algebra that underlies much data analysis. Such problems can be run efficiently on clusters using MPI leading to a hybrid cloud and cluster environment. This motivates the design and implementation of an open source Iterative MapReduce system Twister. Comparisons of Amazon, Azure, and traditional Linux and Windows environments on common applications have shown encouraging performance and usability comparisons in several important non iterative cases. These are linked to MPI applications for final stages of the data analysis. Further we have released the open source Twister Iterative MapReduce and benchmarked it against basic MapReduce (Hadoop) and MPI in information retrieval and life sciences applications. The hybrid cloud (MapReduce) and cluster (MPI) approach offers an attractive production environment while Twister promises a uniform programming environment for many Life Sciences applications. We used commercial clouds Amazon and Azure and the NSF resource FutureGrid to perform detailed comparisons and evaluations of different approaches to data intensive computing. Several applications were developed in MPI, MapReduce and Twister in these different environments.
Scalable cloud without dedicated storage
NASA Astrophysics Data System (ADS)
Batkovich, D. V.; Kompaniets, M. V.; Zarochentsev, A. K.
2015-05-01
We present a prototype of a scalable computing cloud. It is intended to be deployed on top of a cluster without separate dedicated storage; the dedicated storage is replaced by a distributed software storage. In addition, all cluster nodes are used both as computing nodes and as storage nodes. This solution increases utilization of the cluster resources as well as improving the fault tolerance and performance of the distributed storage. Another advantage of this solution is high scalability with relatively low initial and maintenance costs. The solution is built from open source components such as OpenStack, Ceph, etc.
Searching for SNPs with cloud computing
2009-01-01
As DNA sequencing outpaces improvements in computer speed, there is a critical need to accelerate tasks like alignment and SNP calling. Crossbow is a cloud-computing software tool that combines the aligner Bowtie and the SNP caller SOAPsnp. Executing in parallel using Hadoop, Crossbow analyzes data comprising 38-fold coverage of the human genome in three hours using a 320-CPU cluster rented from a cloud computing service for about $85. Crossbow is available from http://bowtie-bio.sourceforge.net/crossbow/. PMID:19930550
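A quick back-of-the-envelope check of the cost figure quoted above (a 320-CPU cluster, roughly three hours, about $85) gives the effective price per CPU-hour; the arithmetic below simply restates those numbers.

```python
# Cost arithmetic from the figures quoted in the abstract (2009 prices).
cpus = 320
hours = 3
total_cost_usd = 85.0

cpu_hours = cpus * hours
print(f"CPU-hours consumed: {cpu_hours}")
print(f"Approximate cost per CPU-hour: ${total_cost_usd / cpu_hours:.3f}")
```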
Radiative Feedback of Forming Star Clusters on Their GMC Environments: Theory and Simulation
NASA Astrophysics Data System (ADS)
Howard, C. S.; Pudritz, R. E.; Harris, W. E.
2013-07-01
Star clusters form from dense clumps within a molecular cloud. Radiation from these newly formed clusters feeds back on their natal molecular cloud through heating and ionization which ultimately stops gas accretion into the cluster. Recent studies suggest that radiative feedback effects from a single cluster may be sufficient to disrupt an entire cloud over a short timescale. Simulating cluster formation on a large scale, however, is computationally demanding due to the high number of stars involved. For this reason, we present a model for representing the radiative output of an entire cluster which involves randomly sampling an initial mass function (IMF) as the cluster accretes mass. We show that this model is able to reproduce the star formation histories of observed clusters. To examine the degree to which radiative feedback shapes the evolution of a molecular cloud, we use the FLASH adaptive-mesh refinement hydrodynamics code to simulate cluster formation in a turbulent cloud. Unlike previous studies, sink particles are used to represent a forming cluster rather than individual stars. Our cluster model is then coupled with a raytracing scheme to treat radiative transfer as the clusters grow in mass. This poster will outline the details of our model and present preliminary results from our 3D hydrodynamical simulations.
Analysis and Research on Spatial Data Storage Model Based on Cloud Computing Platform
NASA Astrophysics Data System (ADS)
Hu, Yong
2017-12-01
In this paper, the data processing and storage characteristics of cloud computing are analyzed and studied. On this basis, a cloud computing data storage model based on a BP neural network is proposed. The model selects a server cluster according to the attributes of the data, yielding a spatial data storage model with load balancing that is both feasible and practically advantageous.
2011-08-01
[Recovered figure captions only: Figure 4, architectural diagram of running Blender on Amazon EC2 through Nimbis; example input images and all digit prototypes (cluster centers) found, with size proportional to frequency, from classification of streaming data.]
Large-scale parallel genome assembler over cloud computing environment.
Das, Arghya Kusum; Koppa, Praveen Kumar; Goswami, Sayan; Platania, Richard; Park, Seung-Jong
2017-06-01
The size of high throughput DNA sequencing data has already reached the terabyte scale. To manage this huge volume of data, many downstream sequencing applications started using locality-based computing over different cloud infrastructures to take advantage of elastic (pay as you go) resources at a lower cost. However, the locality-based programming model (e.g. MapReduce) is relatively new. Consequently, developing scalable data-intensive bioinformatics applications using this model and understanding the hardware environment that these applications require for good performance, both require further research. In this paper, we present a de Bruijn graph oriented Parallel Giraph-based Genome Assembler (GiGA), as well as the hardware platform required for its optimal performance. GiGA uses the power of Hadoop (MapReduce) and Giraph (large-scale graph analysis) to achieve high scalability over hundreds of compute nodes by collocating the computation and data. GiGA achieves significantly higher scalability with competitive assembly quality compared to contemporary parallel assemblers (e.g. ABySS and Contrail) over traditional HPC cluster. Moreover, we show that the performance of GiGA is significantly improved by using an SSD-based private cloud infrastructure over traditional HPC cluster. We observe that the performance of GiGA on 256 cores of this SSD-based cloud infrastructure closely matches that of 512 cores of traditional HPC cluster.
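GiGA itself runs on Hadoop and Giraph; purely as an illustration of the de Bruijn graph idea it is built on, the sketch below constructs (k-1)-mer nodes and k-mer edges from a handful of short reads in plain Python. It is not GiGA code, and the reads and k value are arbitrary examples.

```python
from collections import defaultdict

def de_bruijn_edges(reads, k=5):
    """Build a toy de Bruijn graph: nodes are (k-1)-mers, edges come from k-mers."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])   # prefix node -> suffix node
    return graph

reads = ["ACGTACGTGACG", "CGTGACGTT"]        # illustrative reads only
for node, successors in sorted(de_bruijn_edges(reads).items()):
    print(node, "->", sorted(successors))
```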
Cloud Quantum Computing of an Atomic Nucleus
NASA Astrophysics Data System (ADS)
Dumitrescu, E. F.; McCaskey, A. J.; Hagen, G.; Jansen, G. R.; Morris, T. D.; Papenbrock, T.; Pooser, R. C.; Dean, D. J.; Lougovski, P.
2018-05-01
We report a quantum simulation of the deuteron binding energy on quantum processors accessed via cloud servers. We use a Hamiltonian from pionless effective field theory at leading order. We design a low-depth version of the unitary coupled-cluster ansatz, use the variational quantum eigensolver algorithm, and compute the binding energy to within a few percent. Our work is the first step towards scalable nuclear structure computations on a quantum processor via the cloud, and it sheds light on how to map scientific computing applications onto nascent quantum devices.
Cloud Quantum Computing of an Atomic Nucleus
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dumitrescu, Eugene F.; McCaskey, Alex J.; Hagen, Gaute
Here, we report a quantum simulation of the deuteron binding energy on quantum processors accessed via cloud servers. We use a Hamiltonian from pionless effective field theory at leading order. We design a low-depth version of the unitary coupled-cluster ansatz, use the variational quantum eigensolver algorithm, and compute the binding energy to within a few percent. Our work is the first step towards scalable nuclear structure computations on a quantum processor via the cloud, and it sheds light on how to map scientific computing applications onto nascent quantum devices.
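The sketch below only emulates, classically, the variational principle behind the VQE workflow described above: a one-parameter ansatz is optimised until its energy expectation value reaches the lowest eigenvalue. The 2x2 matrix is an illustrative placeholder, not the pionless-EFT deuteron Hamiltonian used in the paper.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Placeholder 2x2 Hermitian matrix standing in for a Hamiltonian (illustrative
# numbers only, not the deuteron Hamiltonian from the paper).
H = np.array([[0.5, -1.2],
              [-1.2, -0.3]])

def energy(theta):
    # One-parameter trial state, loosely mimicking a low-depth ansatz.
    psi = np.array([np.cos(theta), np.sin(theta)])
    return psi @ H @ psi

result = minimize_scalar(energy, bounds=(0.0, np.pi), method="bounded")
exact = np.linalg.eigvalsh(H)[0]
print(f"variational minimum: {result.fun:.4f}, exact ground state: {exact:.4f}")
```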
Translational bioinformatics in the cloud: an affordable alternative
2010-01-01
With the continued exponential expansion of publicly available genomic data and access to low-cost, high-throughput molecular technologies for profiling patient populations, computational technologies and informatics are becoming vital considerations in genomic medicine. Although cloud computing technology is being heralded as a key enabling technology for the future of genomic research, available case studies are limited to applications in the domain of high-throughput sequence data analysis. The goal of this study was to evaluate the computational and economic characteristics of cloud computing in performing a large-scale data integration and analysis representative of research problems in genomic medicine. We find that the cloud-based analysis compares favorably in both performance and cost in comparison to a local computational cluster, suggesting that cloud computing technologies might be a viable resource for facilitating large-scale translational research in genomic medicine. PMID:20691073
Cloud computing and validation of expandable in silico livers.
Ropella, Glen E P; Hunt, C Anthony
2010-12-03
In Silico Livers (ISLs) are works in progress. They are used to challenge multilevel, multi-attribute, mechanistic hypotheses about the hepatic disposition of xenobiotics coupled with hepatic responses. To enhance ISL-to-liver mappings, we added discrete time metabolism, biliary elimination, and bolus dosing features to a previously validated ISL and initiated re-validated experiments that required scaling experiments to use more simulated lobules than previously, more than could be achieved using the local cluster technology. Rather than dramatically increasing the size of our local cluster we undertook the re-validation experiments using the Amazon EC2 cloud platform. So doing required demonstrating the efficacy of scaling a simulation to use more cluster nodes and assessing the scientific equivalence of local cluster validation experiments with those executed using the cloud platform. The local cluster technology was duplicated in the Amazon EC2 cloud platform. Synthetic modeling protocols were followed to identify a successful parameterization. Experiment sample sizes (number of simulated lobules) on both platforms were 49, 70, 84, and 152 (cloud only). Experimental indistinguishability was demonstrated for ISL outflow profiles of diltiazem using both platforms for experiments consisting of 84 or more samples. The process was analogous to demonstration of results equivalency from two different wet-labs. The results provide additional evidence that disposition simulations using ISLs can cover the behavior space of liver experiments in distinct experimental contexts (there is in silico-to-wet-lab phenotype similarity). The scientific value of experimenting with multiscale biomedical models has been limited to research groups with access to computer clusters. The availability of cloud technology coupled with the evidence of scientific equivalency has lowered the barrier and will greatly facilitate model sharing as well as provide straightforward tools for scaling simulations to encompass greater detail with no extra investment in hardware.
Integration of Openstack cloud resources in BES III computing cluster
NASA Astrophysics Data System (ADS)
Li, Haibo; Cheng, Yaodong; Huang, Qiulan; Cheng, Zhenjing; Shi, Jingyan
2017-10-01
Cloud computing provides a new technical means for data processing in high energy physics experiments. However, in a traditional job management system the resources of each queue are fixed and their usage is static. In order to make the system simple and transparent for physicists to use, we developed a virtual cluster system (vpmanager) to integrate IHEPCloud and different batch systems such as Torque and HTCondor. Vpmanager provides dynamic virtual machine scheduling according to the job queue. The BES III use case results show that resource efficiency is greatly improved.
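The abstract does not give vpmanager's internals, so the sketch below only illustrates the general pattern of scaling virtual machines with the depth of a job queue; the queue, VM and cloud calls are stubbed-out placeholders rather than real Torque, HTCondor or IHEPCloud APIs.

```python
# Stubs standing in for a batch system and an IaaS API; hypothetical placeholders.
def queued_jobs():      return 42                     # pretend queue depth
def running_vms():      return 10                     # pretend current VM count
def launch_vms(n):      print(f"launching {n} VM(s)")
def terminate_vms(n):   print(f"terminating {n} VM(s)")

JOBS_PER_VM = 4
MAX_VMS = 50

def reconcile_once():
    """Scale the virtual cluster towards the demand implied by the job queue."""
    wanted = min(MAX_VMS, -(-queued_jobs() // JOBS_PER_VM))   # ceiling division
    delta = wanted - running_vms()
    if delta > 0:
        launch_vms(delta)
    elif delta < 0:
        terminate_vms(-delta)

reconcile_once()
```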
Spatial Analysis of Great Lakes Regional Icing Cloud Liquid Water Content
NASA Technical Reports Server (NTRS)
Ryerson, Charles C.; Koenig, George G.; Melloh, Rae A.; Meese, Debra A.; Reehorst, Andrew L.; Miller, Dean R.
2003-01-01
Clustering of cloud microphysical conditions, such as liquid water content (LWC) and drop size, can affect the rate and shape of ice accretion and the airworthiness of aircraft. Clustering may also degrade the accuracy of cloud LWC measurements from radars and microwave radiometers being developed by the government for remotely mapping icing conditions ahead of aircraft in flight. This paper evaluates spatial clustering of LWC in icing clouds using measurements collected during NASA research flights in the Great Lakes region. We used graphical and analytical approaches to describe clustering. The analytical approach involves determining the average size of clusters and computing a clustering intensity parameter. We analyzed flight data composed of 1-s-frequency LWC measurements for 12 periods ranging from 17.4 minutes (73 km) to 45.3 minutes (190 km) in duration. Graphically, some flight segments showed evidence of consistency with regard to clustering patterns. Cluster intensity varied from 0.06, indicating little clustering, to a high of 2.42. Cluster lengths ranged from 0.1 minutes (0.6 km) to 4.1 minutes (17.3 km). Additional analyses will allow us to determine if clustering climatologies can be developed to characterize cluster conditions by region, time period, or weather condition.
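As a rough illustration of the analytical approach described above, the sketch below identifies contiguous above-threshold runs in a 1-Hz LWC series and reports their average length; the LWC values, threshold and airspeed are assumptions, and the paper's clustering-intensity parameter is not reproduced because its definition is not given in the abstract.

```python
import numpy as np

# One-per-second LWC samples (g m^-3); synthetic values for illustration only.
lwc = np.array([0.0, 0.05, 0.32, 0.41, 0.38, 0.02, 0.0, 0.27, 0.30, 0.01])
threshold = 0.1         # assumed LWC level separating cloudy from clear air
airspeed_km_s = 0.07    # ~70 m/s true airspeed, assumed for the s -> km conversion

# Run-length encode contiguous above-threshold samples to get cluster lengths.
lengths, run = [], 0
for above in lwc > threshold:
    if above:
        run += 1
    elif run:
        lengths.append(run)
        run = 0
if run:
    lengths.append(run)

mean_len_s = float(np.mean(lengths))
print(f"clusters: {len(lengths)}, mean length: {mean_len_s:.1f} s "
      f"({mean_len_s * airspeed_km_s:.2f} km)")
```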
Formation of Very Young Massive Clusters and Implications for Globular Clusters
NASA Astrophysics Data System (ADS)
Banerjee, Sambaran; Kroupa, Pavel
How Very Young Massive star Clusters (VYMCs; also known as "starburst" clusters), which typically have masses ≳ 10⁴ M⊙ and are a few Myr old, form out of Giant Molecular Clouds is still largely an open question. Increasingly detailed observations of young star clusters and star-forming molecular clouds, together with computational studies, provide clues about their formation scenarios and the underlying physical processes involved. This chapter is focused on reviewing the decade-long studies that attempt to computationally reproduce the well-observed nearby VYMCs, such as the Orion Nebula Cluster, R136 and the NGC 3603 young cluster, thereby shedding light on the birth conditions of massive star clusters in general. In this regard, focus is given to direct N-body modelling of real-sized massive star clusters, with a monolithic structure and undergoing residual gas expulsion, which has consistently reproduced the observed characteristics of several VYMCs and of young star clusters in general. The connection of these relatively simplified model calculations with the structural richness of dense molecular clouds and the complexity of hydrodynamic calculations of star cluster formation is presented in detail. Furthermore, the connection of such VYMCs with globular clusters, which are nearly as old as our Universe, is discussed. The chapter concludes by addressing long-term deeply gas-embedded (at least apparently) and substructured systems like W3 Main. While most of the results are quoted from existing and up-to-date literature in an integrated fashion, several new insights and discussions are provided.
ATLAS user analysis on private cloud resources at GoeGrid
NASA Astrophysics Data System (ADS)
Glaser, F.; Nadal Serrano, J.; Grabowski, J.; Quadt, A.
2015-12-01
User analysis job demands can exceed available computing resources, especially before major conferences. ATLAS physics results can potentially be slowed down due to the lack of resources. For these reasons, cloud research and development activities are now included in the skeleton of the ATLAS computing model, which has been extended by using resources from commercial and private cloud providers to satisfy the demands. However, most of these activities are focused on Monte-Carlo production jobs, extending the resources at Tier-2. To evaluate the suitability of the cloud-computing model for user analysis jobs, we developed a framework to launch an ATLAS user analysis cluster in a cloud infrastructure on demand and evaluated two solutions. The first solution is entirely integrated in the Grid infrastructure by using the same mechanism, which is already in use at Tier-2: A designated Panda-Queue is monitored and additional worker nodes are launched in a cloud environment and assigned to a corresponding HTCondor queue according to the demand. Thereby, the use of cloud resources is completely transparent to the user. However, using this approach, submitted user analysis jobs can still suffer from a certain delay introduced by waiting time in the queue and the deployed infrastructure lacks customizability. Therefore, our second solution offers the possibility to easily deploy a totally private, customizable analysis cluster on private cloud resources belonging to the university.
Sector and Sphere: the design and implementation of a high-performance data cloud
Gu, Yunhong; Grossman, Robert L.
2009-01-01
Cloud computing has demonstrated that processing very large datasets over commodity clusters can be done simply, given the right programming model and infrastructure. In this paper, we describe the design and implementation of the Sector storage cloud and the Sphere compute cloud. By contrast with the existing storage and compute clouds, Sector can manage data not only within a data centre, but also across geographically distributed data centres. Similarly, the Sphere compute cloud supports user-defined functions (UDFs) over data both within and across data centres. As a special case, MapReduce-style programming can be implemented in Sphere by using a Map UDF followed by a Reduce UDF. We describe some experimental studies comparing Sector/Sphere and Hadoop using the Terasort benchmark. In these studies, Sector is approximately twice as fast as Hadoop. Sector/Sphere is open source. PMID:19451100
Sideloading - Ingestion of Large Point Clouds Into the Apache Spark Big Data Engine
NASA Astrophysics Data System (ADS)
Boehm, J.; Liu, K.; Alis, C.
2016-06-01
In the geospatial domain we have now reached the point where data volumes we handle have clearly grown beyond the capacity of most desktop computers. This is particularly true in the area of point cloud processing. It is therefore naturally lucrative to explore established big data frameworks for big geospatial data. The very first hurdle is the import of geospatial data into big data frameworks, commonly referred to as data ingestion. Geospatial data is typically encoded in specialised binary file formats, which are not naturally supported by the existing big data frameworks. Instead such file formats are supported by software libraries that are restricted to single CPU execution. We present an approach that allows the use of existing point cloud file format libraries on the Apache Spark big data framework. We demonstrate the ingestion of large volumes of point cloud data into a compute cluster. The approach uses a map function to distribute the data ingestion across the nodes of a cluster. We test the capabilities of the proposed method to load billions of points into a commodity hardware compute cluster and we discuss the implications on scalability and performance. The performance is benchmarked against an existing native Apache Spark data import implementation.
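A minimal sketch of the ingestion pattern described above, distributing a list of point cloud files so that an existing single-CPU reader runs on each executor: it assumes PySpark and the laspy library are installed, that the tile paths are visible on every node, and it is not the authors' implementation.

```python
import numpy as np
import laspy                      # assumed single-CPU LAS/LAZ reader
from pyspark import SparkContext

def read_points(path):
    """Read one LAS tile on a worker node and emit (x, y, z) tuples."""
    with laspy.open(path) as reader:
        las = reader.read()
    return list(zip(np.asarray(las.x), np.asarray(las.y), np.asarray(las.z)))

sc = SparkContext(appName="pointcloud-ingestion")
tiles = ["tile_001.las", "tile_002.las", "tile_003.las"]   # hypothetical paths

# Distribute the file list; each executor then runs the single-CPU reader locally.
points = sc.parallelize(tiles, numSlices=len(tiles)).flatMap(read_points)
print("total points ingested:", points.count())
```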
3D reconstruction from non-uniform point clouds via local hierarchical clustering
NASA Astrophysics Data System (ADS)
Yang, Jiaqi; Li, Ruibo; Xiao, Yang; Cao, Zhiguo
2017-07-01
Raw scanned 3D point clouds are usually irregularly distributed due to the essential shortcomings of laser sensors, which poses a great challenge for high-quality 3D surface reconstruction. This paper tackles this problem by proposing a local hierarchical clustering (LHC) method to improve the consistency of the point distribution. Specifically, LHC consists of two steps: 1) adaptive octree-based decomposition of 3D space, and 2) hierarchical clustering. The former aims at reducing the computational complexity and the latter transforms the non-uniform point set into a uniform one. Experimental results on real-world scanned point clouds validate the effectiveness of our method from both qualitative and quantitative aspects.
TethysCluster: A comprehensive approach for harnessing cloud resources for hydrologic modeling
NASA Astrophysics Data System (ADS)
Nelson, J.; Jones, N.; Ames, D. P.
2015-12-01
Advances in water resources modeling are improving the information that can be supplied to support decisions affecting the safety and sustainability of society. However, as water resources models become more sophisticated and data-intensive they require more computational power to run. Purchasing and maintaining the computing facilities needed to support certain modeling tasks has been cost-prohibitive for many organizations. With the advent of the cloud, the computing resources needed to address this challenge are now available and cost-effective, yet there still remains a significant technical barrier to leveraging these resources. This barrier inhibits many decision makers and even trained engineers from taking advantage of the best science and tools available. Here we present the Python tools TethysCluster and CondorPy, which have been developed to lower the barrier to model computation in the cloud by providing (1) programmatic access to dynamically scalable computing resources, (2) a batch scheduling system to queue and dispatch the jobs to the computing resources, (3) data management for job inputs and outputs, and (4) the ability to dynamically create, submit, and monitor computing jobs. These Python tools leverage the open source, computing-resource management, and job management software, HTCondor, to offer a flexible and scalable distributed-computing environment. While TethysCluster and CondorPy can be used independently to provision computing resources and perform large modeling tasks, they have also been integrated into Tethys Platform, a development platform for water resources web apps, to enable computing support for modeling workflows and decision-support systems deployed as web apps.
NASA Astrophysics Data System (ADS)
Romanchuk, V. A.; Lukashenko, V. V.
2018-05-01
A technique for operating a control system for a computing cluster based on neurocomputers is proposed. Particular attention is paid to the method of choosing the structure of the computing cluster, because existing methods are not effective for the specialized hardware base: neurocomputers, which are highly parallel computing devices with an architecture different from the von Neumann architecture. A developed algorithm for choosing the computational structure of a cloud cluster is described, starting from the direction of data transfer in the flow-control graph of the program and its adjacency matrix.
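The abstract only names the ingredients of the selection algorithm, so the sketch below is a loose illustration rather than the authors' method: it builds the adjacency matrix of a hypothetical flow-control graph from its data-transfer directions and peels off a layered structure that could be mapped onto groups of cluster nodes.

```python
import numpy as np

# Directed data-transfer edges of a hypothetical program flow-control graph.
edges = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4)]
n = 5

A = np.zeros((n, n), dtype=int)        # adjacency matrix of the graph
for src, dst in edges:
    A[src, dst] = 1

# Topological layering: repeatedly peel off nodes with no remaining predecessors.
# Each layer is a set of tasks that could run on one group of cluster nodes.
remaining = set(range(n))
layers = []
while remaining:
    layer = [v for v in remaining if not any(A[u, v] for u in remaining if u != v)]
    layers.append(sorted(layer))
    remaining -= set(layer)

print("adjacency matrix:\n", A)
print("layers:", layers)
```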
Cloud computing and validation of expandable in silico livers
2010-01-01
Background In Silico Livers (ISLs) are works in progress. They are used to challenge multilevel, multi-attribute, mechanistic hypotheses about the hepatic disposition of xenobiotics coupled with hepatic responses. To enhance ISL-to-liver mappings, we added discrete time metabolism, biliary elimination, and bolus dosing features to a previously validated ISL and initiated re-validated experiments that required scaling experiments to use more simulated lobules than previously, more than could be achieved using the local cluster technology. Rather than dramatically increasing the size of our local cluster we undertook the re-validation experiments using the Amazon EC2 cloud platform. So doing required demonstrating the efficacy of scaling a simulation to use more cluster nodes and assessing the scientific equivalence of local cluster validation experiments with those executed using the cloud platform. Results The local cluster technology was duplicated in the Amazon EC2 cloud platform. Synthetic modeling protocols were followed to identify a successful parameterization. Experiment sample sizes (number of simulated lobules) on both platforms were 49, 70, 84, and 152 (cloud only). Experimental indistinguishability was demonstrated for ISL outflow profiles of diltiazem using both platforms for experiments consisting of 84 or more samples. The process was analogous to demonstration of results equivalency from two different wet-labs. Conclusions The results provide additional evidence that disposition simulations using ISLs can cover the behavior space of liver experiments in distinct experimental contexts (there is in silico-to-wet-lab phenotype similarity). The scientific value of experimenting with multiscale biomedical models has been limited to research groups with access to computer clusters. The availability of cloud technology coupled with the evidence of scientific equivalency has lowered the barrier and will greatly facilitate model sharing as well as provide straightforward tools for scaling simulations to encompass greater detail with no extra investment in hardware. PMID:21129207
Cloudgene: A graphical execution platform for MapReduce programs on private and public clouds
2012-01-01
Background The MapReduce framework enables a scalable processing and analyzing of large datasets by distributing the computational load on connected computer nodes, referred to as a cluster. In Bioinformatics, MapReduce has already been adopted to various case scenarios such as mapping next generation sequencing data to a reference genome, finding SNPs from short read data or matching strings in genotype files. Nevertheless, tasks like installing and maintaining MapReduce on a cluster system, importing data into its distributed file system or executing MapReduce programs require advanced knowledge in computer science and could thus prevent scientists from usage of currently available and useful software solutions. Results Here we present Cloudgene, a freely available platform to improve the usability of MapReduce programs in Bioinformatics by providing a graphical user interface for the execution, the import and export of data and the reproducibility of workflows on in-house (private clouds) and rented clusters (public clouds). The aim of Cloudgene is to build a standardized graphical execution environment for currently available and future MapReduce programs, which can all be integrated by using its plug-in interface. Since Cloudgene can be executed on private clusters, sensitive datasets can be kept in house at all time and data transfer times are therefore minimized. Conclusions Our results show that MapReduce programs can be integrated into Cloudgene with little effort and without adding any computational overhead to existing programs. This platform gives developers the opportunity to focus on the actual implementation task and provides scientists a platform with the aim to hide the complexity of MapReduce. In addition to MapReduce programs, Cloudgene can also be used to launch predefined systems (e.g. Cloud BioLinux, RStudio) in public clouds. Currently, five different bioinformatic programs using MapReduce and two systems are integrated and have been successfully deployed. Cloudgene is freely available at http://cloudgene.uibk.ac.at. PMID:22888776
Cloud Computing: Virtual Clusters, Data Security, and Disaster Recovery
NASA Astrophysics Data System (ADS)
Hwang, Kai
Dr. Kai Hwang is a Professor of Electrical Engineering and Computer Science and Director of the Internet and Cloud Computing Lab at the University of Southern California (USC). He received his Ph.D. in Electrical Engineering and Computer Science from the University of California, Berkeley. Prior to joining USC, he taught at Purdue University for many years. He has also served as a visiting Chair Professor at Minnesota, Hong Kong University, Zhejiang University, and Tsinghua University. He has published 8 books and over 210 scientific papers in computer science and engineering.
GATE Monte Carlo simulation of dose distribution using MapReduce in a cloud computing environment.
Liu, Yangchuan; Tang, Yuguo; Gao, Xin
2017-12-01
The GATE Monte Carlo simulation platform has good application prospects for treatment planning and quality assurance. However, accurate dose calculation using GATE is time consuming. The purpose of this study is to implement a novel cloud computing method for accurate GATE Monte Carlo simulation of dose distribution using MapReduce. An Amazon Machine Image installed with Hadoop and GATE is created to set up Hadoop clusters on Amazon Elastic Compute Cloud (EC2). Macros, the input files for GATE, are split into a number of self-contained sub-macros. Through Hadoop Streaming, the sub-macros are executed by GATE in Map tasks and the sub-results are aggregated into final outputs in Reduce tasks. As an evaluation, GATE simulations were performed in a cubical water phantom for X-ray photons of 6 and 18 MeV. The parallel simulation on the cloud computing platform is as accurate as the single-threaded simulation on a local server, and the cloud-based simulation time is approximately inversely proportional to the number of worker nodes. For the simulation of 10 million photons on a cluster with 64 worker nodes, speed-ups of 41× and 32× were achieved compared to the single worker node case and the single-threaded case, respectively. The test of Hadoop's fault tolerance showed that the simulation correctness was not affected by the failure of some worker nodes. The results verify that the proposed method provides a feasible cloud computing solution for GATE.
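A minimal sketch of the mapper side of the Hadoop Streaming pattern described above: each input line names one self-contained sub-macro, which is run on the worker node. The 'Gate' executable name, the sub-macro file layout and the output naming scheme are assumptions; the reducer that aggregates sub-results is omitted.

```python
#!/usr/bin/env python
"""Hadoop Streaming mapper sketch for running GATE sub-macros (assumptions noted)."""
import subprocess
import sys

for line in sys.stdin:
    macro = line.strip()
    if not macro:
        continue
    # Run the Monte Carlo simulation for this sub-macro on the worker node.
    subprocess.run(["Gate", macro], check=True)   # executable name assumed
    output = macro.replace(".mac", ".root")       # assumed output naming scheme
    print(f"{macro}\t{output}")                   # key<TAB>value for the reducer
```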
Efficient and Flexible Climate Analysis with Python in a Cloud-Based Distributed Computing Framework
NASA Astrophysics Data System (ADS)
Gannon, C.
2017-12-01
As climate models become progressively more advanced, and spatial resolution further improved through various downscaling projects, climate projections at a local level are increasingly insightful and valuable. However, the raw size of climate datasets presents numerous hurdles for analysts wishing to develop customized climate risk metrics or perform site-specific statistical analysis. Four Twenty Seven, a climate risk consultancy, has implemented a Python-based distributed framework to analyze large climate datasets in the cloud. With the freedom afforded by efficiently processing these datasets, we are able to customize and continually develop new climate risk metrics using the most up-to-date data. Here we outline our process for using Python packages such as XArray and Dask to evaluate netCDF files in a distributed framework, StarCluster to operate in a cluster-computing environment, cloud computing services to access publicly hosted datasets, and how this setup is particularly valuable for generating climate change indicators and performing localized statistical analysis.
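A minimal sketch of the kind of distributed analysis described above, assuming a running dask.distributed scheduler, a set of daily-maximum temperature files matching tasmax_*.nc in degrees Celsius, and a simple hot-days indicator; none of these details come from the abstract.

```python
import xarray as xr
from dask.distributed import Client

# Connect to an existing Dask scheduler on the cluster (address assumed).
client = Client("tcp://scheduler:8786")

# Lazily open a multi-file netCDF dataset; chunking keeps memory bounded and
# lets the computation spread across the workers.
ds = xr.open_mfdataset("tasmax_*.nc", combine="by_coords", chunks={"time": 365})

# Example indicator: annual count of days above 35 degC at every grid cell
# (variable name and units are assumptions).
hot_days = (ds["tasmax"] > 35.0).groupby("time.year").sum("time")
print(hot_days.compute())      # triggers the distributed computation
```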
Integration of Cloud resources in the LHCb Distributed Computing
NASA Astrophysics Data System (ADS)
Úbeda García, Mario; Méndez Muñoz, Víctor; Stagni, Federico; Cabarrou, Baptiste; Rauschmayr, Nathalie; Charpentier, Philippe; Closier, Joel
2014-06-01
This contribution describes how Cloud resources have been integrated in the LHCb Distributed Computing. LHCb is using its specific Dirac extension (LHCbDirac) as an interware for its Distributed Computing. So far, it has seamlessly integrated Grid resources and computer clusters. The cloud extension of DIRAC (VMDIRAC) allows the integration of Cloud computing infrastructures. It is able to interact with multiple types of infrastructures in commercial and institutional clouds, supported by multiple interfaces (Amazon EC2, OpenNebula, OpenStack and CloudStack), and it instantiates, monitors and manages Virtual Machines running on this aggregation of Cloud resources. Moreover, specifications for institutional Cloud resources proposed by the Worldwide LHC Computing Grid (WLCG), mainly by the High Energy Physics Unix Information Exchange (HEPiX) group, have been taken into account. Several initiatives and computing resource providers in the eScience environment have already deployed IaaS in production during 2013. Keeping this in mind, the pros and cons of a cloud-based infrastructure have been studied in contrast with the current setup. As a result, this work addresses four different use cases which represent a major improvement on several levels of our infrastructure. We describe the solution implemented by LHCb for the contextualisation of the VMs based on the idea of Cloud Site. We report on operational experience of using in production several institutional Cloud resources that are thus becoming an integral part of the LHCb Distributed Computing resources. Furthermore, we describe the gradual migration of our Service Infrastructure towards a fully distributed architecture following the Service as a Service (SaaS) model.
NASA Astrophysics Data System (ADS)
Bassier, M.; Bonduel, M.; Van Genechten, B.; Vergauwen, M.
2017-11-01
Point cloud segmentation is a crucial step in scene understanding and interpretation. The goal is to decompose the initial data into sets of workable clusters with similar properties. Additionally, it is a key aspect in the automated procedure from point cloud data to BIM. Current approaches typically only segment a single type of primitive such as planes or cylinders. Also, current algorithms suffer from oversegmenting the data and are often sensor or scene dependent. In this work, a method is presented to automatically segment large unstructured point clouds of buildings. More specifically, the segmentation is formulated as a graph optimisation problem. First, the data is oversegmented with a greedy octree-based region growing method. The growing is conditioned on the segmentation of planes as well as smooth surfaces. Next, the candidate clusters are represented by a Conditional Random Field after which the most likely configuration of candidate clusters is computed given a set of local and contextual features. The experiments prove that the used method is a fast and reliable framework for unstructured point cloud segmentation. Processing speeds up to 40,000 points per second are recorded for the region growing. Additionally, the recall and precision of the graph clustering is approximately 80%. Overall, nearly 22% of oversegmentation is reduced by clustering the data. These clusters will be classified and used as a basis for the reconstruction of BIM models.
Arc4nix: A cross-platform geospatial analytical library for cluster and cloud computing
NASA Astrophysics Data System (ADS)
Tang, Jingyin; Matyas, Corene J.
2018-02-01
Big Data in geospatial technology is a grand challenge for processing capacity. The ability to use a GIS for geospatial analysis on Cloud Computing and High Performance Computing (HPC) clusters has emerged as a new approach to provide feasible solutions. However, users lack the ability to migrate existing research tools to a Cloud Computing or HPC-based environment because of the incompatibility between the market-dominating ArcGIS software stack and the Linux operating system. This manuscript details a cross-platform geospatial library, "arc4nix", to bridge this gap. Arc4nix provides an application programming interface compatible with ArcGIS and its Python library "arcpy". Arc4nix uses a decoupled client-server architecture that permits geospatial analytical functions to run on the remote server and other functions to run in the native Python environment. It uses functional programming and meta-programming techniques to dynamically construct Python code containing the actual geospatial calculations, send it to a server and retrieve the results. Arc4nix allows users to employ their arcpy-based scripts in a Cloud Computing and HPC environment with minimal or no modification. It also supports parallelizing tasks using multiple CPU cores and nodes for large-scale analyses. A case study of geospatial processing of a numerical weather model's output shows that arcpy scales linearly in a distributed environment. Arc4nix is open-source software.
Using the cloud to speed-up calibration of watershed-scale hydrologic models (Invited)
NASA Astrophysics Data System (ADS)
Goodall, J. L.; Ercan, M. B.; Castronova, A. M.; Humphrey, M.; Beekwilder, N.; Steele, J.; Kim, I.
2013-12-01
This research focuses on using the cloud to address computational challenges associated with hydrologic modeling. One example is calibration of a watershed-scale hydrologic model, which can take days of execution time on typical computers. While parallel algorithms for model calibration exist and some researchers have used multi-core computers or clusters to run these algorithms, these solutions do not fully address the challenge because (i) calibration can still be too time consuming even on multicore personal computers and (ii) few in the community have the time and expertise needed to manage a compute cluster. Given this, another option for addressing this challenge that we are exploring through this work is the use of the cloud for speeding up calibration of watershed-scale hydrologic models. The cloud used in this capacity provides a means for renting a specific number and type of machines for only the time needed to perform a calibration run. The cloud allows one to precisely balance the duration of the calibration against the financial cost so that, if the budget allows, the calibration can be performed more quickly by renting more machines. Focusing specifically on the SWAT hydrologic model and a parallel version of the DDS calibration algorithm, we show significant speed-ups across a range of watershed sizes using up to 256 cores to perform a model calibration. The tool provides a simple web-based user interface and the ability to monitor job submission and progress during the calibration. Finally, this talk concludes with initial work to leverage the cloud for other tasks associated with hydrologic modeling, including tasks related to preparing inputs for constructing place-based hydrologic models.
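The SWAT model and the DDS algorithm themselves are not shown here; the sketch below only illustrates the embarrassingly parallel pattern that cloud-based calibration exploits, farming candidate parameter sets out to many cores. run_swat is a hypothetical stand-in with a toy objective, not a hydrologic model.

```python
import random
from multiprocessing import Pool

def run_swat(params):
    """Hypothetical stand-in for one model run plus a goodness-of-fit score."""
    return -sum((p - 0.5) ** 2 for p in params)     # toy objective, not SWAT

def random_candidate(n_params=8):
    return [random.random() for _ in range(n_params)]

if __name__ == "__main__":
    candidates = [random_candidate() for _ in range(256)]
    with Pool(processes=32) as pool:                 # e.g. one worker per core
        scores = pool.map(run_swat, candidates)      # evaluations run in parallel
    best_score, best_params = max(zip(scores, candidates))
    print("best objective:", round(best_score, 4))
```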
Towards real-time photon Monte Carlo dose calculation in the cloud
NASA Astrophysics Data System (ADS)
Ziegenhein, Peter; Kozin, Igor N.; Kamerling, Cornelis Ph; Oelfke, Uwe
2017-06-01
Near real-time application of Monte Carlo (MC) dose calculation in clinic and research is hindered by the long computational runtimes of established software. Currently, fast MC software solutions are available utilising accelerators such as graphical processing units (GPUs) or clusters based on central processing units (CPUs). Both platforms are expensive in terms of purchase costs and maintenance and, in case of the GPU, provide only limited scalability. In this work we propose a cloud-based MC solution, which offers high scalability of accurate photon dose calculations. The MC simulations run on a private virtual supercomputer that is formed in the cloud. Computational resources can be provisioned dynamically at low cost without upfront investment in expensive hardware. A client-server software solution has been developed which controls the simulations and transports data to and from the cloud efficiently and securely. The client application integrates seamlessly into a treatment planning system. It runs the MC simulation workflow automatically and securely exchanges simulation data with the server side application that controls the virtual supercomputer. Advanced encryption standards were used to add an additional security layer, which encrypts and decrypts patient data on-the-fly at the processor register level. We could show that our cloud-based MC framework enables near real-time dose computation. It delivers excellent linear scaling for high-resolution datasets with absolute runtimes of 1.1 seconds to 10.9 seconds for simulating a clinical prostate and liver case up to 1% statistical uncertainty. The computation runtimes include the transportation of data to and from the cloud as well as process scheduling and synchronisation overhead. Cloud-based MC simulations offer a fast, affordable and easily accessible alternative for near real-time accurate dose calculations to currently used GPU or cluster solutions.
Uncover the Cloud for Geospatial Sciences and Applications to Adopt Cloud Computing
NASA Astrophysics Data System (ADS)
Yang, C.; Huang, Q.; Xia, J.; Liu, K.; Li, J.; Xu, C.; Sun, M.; Bambacus, M.; Xu, Y.; Fay, D.
2012-12-01
Cloud computing is emerging as the future infrastructure for providing computing resources to support and enable scientific research, engineering development, and application construction, as well as workforce education. On the other hand, there is considerable doubt about the readiness of cloud computing to support a variety of scientific research, development and education activities. This research, a project funded by NASA SMD, investigates through holistic studies how ready cloud computing is to support geosciences. Four applications with different computing characteristics, including data, computing, concurrent, and spatiotemporal intensities, are used to test the readiness of cloud computing to support geosciences. Three popular and representative cloud platforms, Amazon EC2, Microsoft Azure, and NASA Nebula, as well as a traditional cluster, are utilized in the study. Results illustrate that the cloud is ready to some degree, but more research needs to be done to fully realize the cloud benefits as advertised by many vendors and defined by NIST. Specifically, 1) most cloud platforms could help stand up new computing instances, i.e. a new computer, in a few minutes as envisioned, and are therefore ready to support most computing needs in an on-demand fashion; 2) load balancing and elasticity, a defining characteristic, is ready in some cloud platforms, such as Amazon EC2, to support bigger jobs, e.g., those needing responses in minutes, while some are not ready to support elasticity and load balancing well; all cloud platforms need further research and development to support real-time applications at the sub-minute level; 3) the user interface and functionality of cloud platforms vary a lot; some are very professional and well supported/documented, such as Amazon EC2, while some need significant improvement before the general public can adopt cloud computing without professional training or knowledge about computing infrastructure; 4) security is a big concern in cloud computing platforms; given the sharing spirit of cloud computing, it is very hard to ensure higher-level security unless a private cloud is built for a specific organization without public access; public cloud platforms do not support the FISMA medium level yet and may never be able to support the FISMA high level; 5) HPC jobs are not well supported in cloud computing, and only Amazon EC2 supports them well. The research is being used by NASA and other agencies to consider cloud computing adoption. We hope the publication of this research will also benefit the public in adopting cloud computing.
Halligan, Brian D.; Geiger, Joey F.; Vallejos, Andrew K.; Greene, Andrew S.; Twigger, Simon N.
2009-01-01
One of the major difficulties for many laboratories setting up proteomics programs has been obtaining and maintaining the computational infrastructure required for the analysis of the large flow of proteomics data. We describe a system that combines distributed cloud computing and open source software to allow laboratories to set up scalable virtual proteomics analysis clusters without the investment in computational hardware or software licensing fees. Additionally, the pricing structure of distributed computing providers, such as Amazon Web Services, allows laboratories or even individuals to have large-scale computational resources at their disposal at a very low cost per run. We provide detailed step by step instructions on how to implement the virtual proteomics analysis clusters as well as a list of current available preconfigured Amazon machine images containing the OMSSA and X!Tandem search algorithms and sequence databases on the Medical College of Wisconsin Proteomics Center website (http://proteomics.mcw.edu/vipdac). PMID:19358578
Halligan, Brian D; Geiger, Joey F; Vallejos, Andrew K; Greene, Andrew S; Twigger, Simon N
2009-06-01
One of the major difficulties for many laboratories setting up proteomics programs has been obtaining and maintaining the computational infrastructure required for the analysis of the large flow of proteomics data. We describe a system that combines distributed cloud computing and open source software to allow laboratories to set up scalable virtual proteomics analysis clusters without the investment in computational hardware or software licensing fees. Additionally, the pricing structure of distributed computing providers, such as Amazon Web Services, allows laboratories or even individuals to have large-scale computational resources at their disposal at a very low cost per run. We provide detailed step-by-step instructions on how to implement the virtual proteomics analysis clusters as well as a list of current available preconfigured Amazon machine images containing the OMSSA and X!Tandem search algorithms and sequence databases on the Medical College of Wisconsin Proteomics Center Web site ( http://proteomics.mcw.edu/vipdac ).
Infrastructures for Distributed Computing: the case of BESIII
NASA Astrophysics Data System (ADS)
Pellegrino, J.
2018-05-01
BESIII is an electron-positron collision experiment hosted at BEPCII in Beijing and aimed at investigating tau-charm physics. BESIII has now been running for several years and has gathered more than 1 PB of raw data. In order to analyze these data and perform massive Monte Carlo simulations, a large amount of computing and storage resources is needed. The distributed computing system is based on DIRAC and has been in production since 2012. It integrates computing and storage resources from different institutes and a variety of resource types such as cluster, grid, cloud or volunteer computing. About 15 sites from the BESIII Collaboration from all over the world have joined this distributed computing infrastructure, giving a significant contribution to the IHEP computing facility. Nowadays cloud computing is playing a key role in the HEP computing field, due to its scalability and elasticity. Cloud infrastructures take advantage of several tools, such as VMDirac, to manage virtual machines through cloud managers according to the job requirements. With the virtually unlimited resources from commercial clouds, the computing capacity could scale accordingly in order to deal with any burst demands. General computing models have been discussed in the talk and are addressed herewith, with particular focus on the BESIII infrastructure. Moreover, new computing tools and upcoming infrastructures will be addressed.
NASA Astrophysics Data System (ADS)
Huang, Qian
2014-09-01
Scientific computing often requires the availability of a massive number of computers for performing large-scale simulations, and computing in mineral physics is no exception. In order to investigate physical properties of minerals at extreme conditions in computational mineral physics, parallel computing technology is used to speed up performance by utilizing multiple computer resources to process a computational task simultaneously, thereby greatly reducing computation time. Traditionally, parallel computing has been addressed by using High Performance Computing (HPC) solutions and installed facilities such as clusters and supercomputers. Today, there has been tremendous growth in cloud computing. Infrastructure as a Service (IaaS), the on-demand and pay-as-you-go model, creates a flexible and cost-effective means to access computing resources. In this paper, a feasibility report of HPC on a cloud infrastructure is presented. It is found that current cloud services in the IaaS layer still need to improve performance to be useful for research projects. On the other hand, Software as a Service (SaaS), another type of cloud computing, is introduced into an HPC system for computing in mineral physics, and an application of it is developed. In this paper, an overall description of this SaaS application is presented. This contribution can promote cloud application development in computational mineral physics, and cross-disciplinary studies.
Design and deployment of an elastic network test-bed in IHEP data center based on SDN
NASA Astrophysics Data System (ADS)
Zeng, Shan; Qi, Fazhi; Chen, Gang
2017-10-01
High energy physics experiments produce huge amounts of raw data, but because network resources are shared, there is no guarantee of the bandwidth available to each experiment, which may cause link congestion problems. On the other side, with the development of cloud computing technologies, IHEP has established a cloud platform based on OpenStack which can ensure the flexibility of the computing and storage resources, and more and more computing applications have been deployed on virtual machines established by OpenStack. However, under the traditional network architecture, network capacity cannot be provisioned elastically, which becomes a bottleneck restricting the flexible application of cloud computing. In order to solve the above problems, we propose an elastic cloud data center network architecture based on SDN, and we also design a high performance controller cluster based on OpenDaylight. In the end, we present our current test results.
NASA Astrophysics Data System (ADS)
Micheletti, Natan; Tonini, Marj; Lane, Stuart N.
2017-02-01
Acquisition of high density point clouds using terrestrial laser scanners (TLSs) has become commonplace in geomorphic science. The derived point clouds are often interpolated onto regular grids and the grids compared to detect change (i.e. erosion and deposition/advancement movements). This procedure is necessary for some applications (e.g. digital terrain analysis), but it inevitably leads to a certain loss of potentially valuable information contained within the point clouds. In the present study, an alternative methodology for geomorphological analysis and feature detection from point clouds is proposed. It rests on the use of the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm, applied to TLS data for a rock glacier front slope in the Swiss Alps. The proposed method allowed movements to be detected and isolated directly from the point clouds, so that the accuracy of the subsequent volume computation depends only on the actual registered distance between points. We demonstrated that these values are more conservative than volumes computed with the traditional DEM comparison. The results are illustrated for the summer of 2015, a season of enhanced geomorphic activity associated with exceptionally high temperatures.
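A minimal sketch of the clustering step described above, assuming scikit-learn and SciPy: DBSCAN groups change points, and a convex hull gives one simple per-cluster volume estimate. The eps and min_samples values are placeholders, the data are synthetic, and the hull volume is not necessarily the authors' exact volume computation.

```python
import numpy as np
from scipy.spatial import ConvexHull
from sklearn.cluster import DBSCAN

# Synthetic stand-in for TLS change points (two compact blobs of surface change).
rng = np.random.default_rng(0)
xyz = np.vstack([rng.normal(loc=c, scale=0.2, size=(200, 3))
                 for c in ([0.0, 0.0, 0.0], [5.0, 5.0, 1.0])])

labels = DBSCAN(eps=0.5, min_samples=10).fit_predict(xyz)   # parameters assumed

for lab in sorted(set(labels) - {-1}):          # label -1 marks noise points
    cluster = xyz[labels == lab]
    volume = ConvexHull(cluster).volume         # one possible volume estimate
    print(f"cluster {lab}: {len(cluster)} points, ~{volume:.2f} m^3")
```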
Elastic Cloud Computing Architecture and System for Heterogeneous Spatiotemporal Computing
NASA Astrophysics Data System (ADS)
Shi, X.
2017-10-01
Spatiotemporal computation implements a variety of different algorithms. When big data are involved, a desktop computer or standalone application may not be able to complete the computation task due to limited memory and computing power. Now that a variety of hardware accelerators and computing platforms are available to improve the performance of geocomputation, different algorithms may behave differently on different computing infrastructures and platforms. Some are perfect for implementation on a cluster of graphics processing units (GPUs), while GPUs may not be useful for certain kinds of spatiotemporal computation. The same is true when utilizing a cluster of Intel's many-integrated-core (MIC) processors or Xeon Phi, as well as Hadoop or Spark platforms, to handle big spatiotemporal data. Furthermore, considering the energy efficiency requirement in general computation, a Field Programmable Gate Array (FPGA) may be a better solution for energy efficiency when the performance of the computation could be similar to or better than GPUs and MICs. It is expected that an elastic cloud computing architecture and system that integrates all of GPUs, MICs, and FPGAs could be developed and deployed to support spatiotemporal computing over heterogeneous data types and computational problems.
Coarse Point Cloud Registration by Egi Matching of Voxel Clusters
NASA Astrophysics Data System (ADS)
Wang, Jinhu; Lindenbergh, Roderik; Shen, Yueqian; Menenti, Massimo
2016-06-01
Laser scanning samples the surface geometry of objects efficiently and records versatile information as point clouds. However, often more scans are required to fully cover a scene. Therefore, a registration step is required that transforms the different scans into a common coordinate system. The registration of point clouds is usually conducted in two steps, i.e. coarse registration followed by fine registration. In this study an automatic marker-free coarse registration method for pair-wise scans is presented. First the two input point clouds are re-sampled as voxels and dimensionality features of the voxels are determined by principal component analysis (PCA). Then voxel cells with the same dimensionality are clustered. Next, the Extended Gaussian Image (EGI) descriptor of those voxel clusters are constructed using significant eigenvectors of each voxel in the cluster. Correspondences between clusters in source and target data are obtained according to the similarity between their EGI descriptors. The random sampling consensus (RANSAC) algorithm is employed to remove outlying correspondences until a coarse alignment is obtained. If necessary, a fine registration is performed in a final step. This new method is illustrated on scan data sampling two indoor scenarios. The results of the tests are evaluated by computing the point to point distance between the two input point clouds. The presented two tests resulted in mean distances of 7.6 mm and 9.5 mm respectively, which are adequate for fine registration.
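A minimal sketch of the per-voxel PCA step described above: points are binned into voxels and each voxel is labelled linear, planar or volumetric from the normalised covariance eigenvalues. The voxel size and eigenvalue-ratio thresholds are assumptions, and the EGI construction and RANSAC matching stages are not shown.

```python
import numpy as np
from collections import defaultdict

def voxel_dimensionality(points, voxel_size=0.5):
    """Bin points into voxels and classify each voxel by its PCA eigenvalues."""
    voxels = defaultdict(list)
    for p in points:
        voxels[tuple((p // voxel_size).astype(int))].append(p)

    labels = {}
    for key, pts in voxels.items():
        pts = np.asarray(pts)
        if len(pts) < 5:                         # too few points for a stable PCA
            continue
        evals = np.linalg.eigvalsh(np.cov(pts.T))[::-1]     # descending order
        l1, l2, l3 = evals / evals.sum()
        if l1 > 0.8:                             # thresholds are illustrative
            labels[key] = "linear"
        elif l1 + l2 > 0.95:
            labels[key] = "planar"
        else:
            labels[key] = "volumetric"
    return labels

demo = np.random.default_rng(1).uniform(0.0, 5.0, size=(5000, 3))
print(list(voxel_dimensionality(demo).items())[:5])
```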
Implementation of Grid Tier 2 and Tier 3 facilities on a Distributed OpenStack Cloud
NASA Astrophysics Data System (ADS)
Limosani, Antonio; Boland, Lucien; Coddington, Paul; Crosby, Sean; Huang, Joanna; Sevior, Martin; Wilson, Ross; Zhang, Shunde
2014-06-01
The Australian Government is making a AUD 100 million investment in Compute and Storage for the academic community. The Compute facilities are provided in the form of 30,000 CPU cores located at 8 nodes around Australia in a distributed virtualized Infrastructure as a Service facility based on OpenStack. The storage will eventually consist of over 100 petabytes located at 6 nodes. All will be linked via a 100 Gb/s network. This proceeding describes the development of a fully connected WLCG Tier-2 grid site as well as a general purpose Tier-3 computing cluster based on this architecture. The facility employs an extension to Torque to enable dynamic allocations of virtual machine instances. A base Scientific Linux virtual machine (VM) image is deployed in the OpenStack cloud and automatically configured as required using Puppet. Custom scripts are used to launch multiple VMs, integrate them into the dynamic Torque cluster and to mount remote file systems. We report on our experience in developing this nation-wide ATLAS and Belle II Tier 2 and Tier 3 computing infrastructure using the national Research Cloud and storage facilities.
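A hedged sketch of how such custom VM-launching scripts might look with the openstacksdk Python client is shown below; the cloud name, image, flavor, network and join script are hypothetical placeholders, not the scripts actually used at the site.

```python
# Sketch under stated assumptions: boot a handful of Scientific Linux worker
# VMs on an OpenStack cloud and hand each a cloud-init script that configures
# it (e.g. via Puppet) and joins it to the dynamic batch cluster.
import base64
import openstack

conn = openstack.connect(cloud="research-cloud")          # entry in clouds.yaml (assumed)
image = conn.compute.find_image("SL6-worker")             # pre-built worker image (assumed)
flavor = conn.compute.find_flavor("m1.large")
network = conn.network.find_network("internal")

join_script = "#!/bin/bash\npuppet agent -t && /opt/cluster/join_batch.sh\n"  # hypothetical

for i in range(4):                                        # launch four workers
    server = conn.compute.create_server(
        name=f"worker-{i:02d}",
        image_id=image.id,
        flavor_id=flavor.id,
        networks=[{"uuid": network.id}],
        user_data=base64.b64encode(join_script.encode()).decode(),
    )
    conn.compute.wait_for_server(server)                  # block until ACTIVE
    print(server.name, "is up at", server.access_ipv4)
```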
Methods of editing cloud and atmospheric layer affected pixels from satellite data
NASA Technical Reports Server (NTRS)
Nixon, P. R. (Principal Investigator); Wiegand, C. L.; Richardson, A. J.; Johnson, M. P.
1982-01-01
Practical methods of computer screening cloud-contaminated pixels from data of various satellite systems are proposed. Examples are given of the location of clouds and representative landscape features in HCMM spectral space of reflectance (VIS) vs emission (IR). Methods of screening out cloud-affected HCMM data are discussed. The character of subvisible absorbing-emitting atmospheric layers (subvisible cirrus or SCi) in HCMM data is considered and radiosonde soundings are examined in relation to the presence of SCi. The statistical characteristics of multispectral meteorological satellite data in clear and SCi-affected areas are discussed. Examples in TIROS-N and NOAA-7 data from several states and Mexico are presented. The VIS-IR cluster screening method for removing clouds is applied to a 262,144-pixel HCMM scene from south Texas and northeast Mexico. The SCi that remain after cluster screening are sifted out by applying a statistically determined IR limit.
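A toy version of the VIS-IR screening idea is sketched below; the thresholds and synthetic imagery are invented for illustration and are not the statistically determined limits used in the study.

```python
# Toy pixel screening: bright-and-cold pixels are flagged as cloud, and a
# separate IR limit catches residual subvisible cirrus (SCi) suspects.
import numpy as np

def screen_pixels(vis, ir, vis_cloud=0.35, ir_cloud=280.0, ir_sci=292.0):
    """vis: reflectance in [0, 1]; ir: brightness temperature in kelvin."""
    cloud = (vis > vis_cloud) & (ir < ir_cloud)     # bright and cold -> cloud
    sci = ~cloud & (ir < ir_sci)                    # clear-looking but cool -> SCi suspect
    clear = ~cloud & ~sci
    return cloud, sci, clear

vis = np.random.default_rng(1).uniform(0.05, 0.6, (512, 512))
ir = np.random.default_rng(2).uniform(270.0, 305.0, (512, 512))
cloud, sci, clear = screen_pixels(vis, ir)
print("cloud:", cloud.mean(), "SCi:", sci.mean(), "clear:", clear.mean())
```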
Hybrid Cloud Computing Environment for EarthCube and Geoscience Community
NASA Astrophysics Data System (ADS)
Yang, C. P.; Qin, H.
2016-12-01
The NSF EarthCube Integration and Test Environment (ECITE) has built a hybrid cloud computing environment that provides cloud resources from private cloud environments, built using the cloud system software OpenStack and Eucalyptus, and also manages a public cloud, Amazon Web Services, allowing resource synchronization and bursting between the private and public clouds. On the ECITE hybrid cloud platform, the EarthCube and geoscience communities can deploy and manage applications using base or customized virtual machine images, analyze big datasets using virtual clusters, and monitor virtual resource usage on the cloud in real time. Currently, a number of EarthCube projects have deployed or started migrating to this platform, such as CHORDS, BCube, CINERGI, OntoSoft, and other EarthCube building blocks. To accomplish the deployment or migration, the administrators of the ECITE hybrid cloud platform prepare the specific needs of each project (e.g. images, port numbers, usable cloud capacity) in advance, based on communication between ECITE and the participating projects; the scientists or IT technicians in those projects then launch one or more virtual machines, access the virtual machine(s) to set up the computing environment if need be, and migrate their code, documents, or data without having to deal with the heterogeneity in structure and operations among different cloud platforms.
Accelerating statistical image reconstruction algorithms for fan-beam x-ray CT using cloud computing
NASA Astrophysics Data System (ADS)
Srivastava, Somesh; Rao, A. Ravishankar; Sheinin, Vadim
2011-03-01
Statistical image reconstruction algorithms potentially offer many advantages to x-ray computed tomography (CT), e.g. lower radiation dose. But, their adoption in practical CT scanners requires extra computation power, which is traditionally provided by incorporating additional computing hardware (e.g. CPU-clusters, GPUs, FPGAs etc.) into a scanner. An alternative solution is to access the required computation power over the internet from a cloud computing service, which is orders-of-magnitude more cost-effective. This is because users only pay a small pay-as-you-go fee for the computation resources used (i.e. CPU time, storage etc.), and completely avoid purchase, maintenance and upgrade costs. In this paper, we investigate the benefits and shortcomings of using cloud computing for statistical image reconstruction. We parallelized the most time-consuming parts of our application, the forward and back projectors, using MapReduce, the standard parallelization library on clouds. From preliminary investigations, we found that a large speedup is possible at a very low cost. But, communication overheads inside MapReduce can limit the maximum speedup, and a better MapReduce implementation might become necessary in the future. All the experiments for this paper, including development and testing, were completed on the Amazon Elastic Compute Cloud (EC2) for less than $20.
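A much-simplified sketch of how a back projector maps onto the Hadoop Streaming model is given below; the ray geometry is a crude placeholder and the code is not the authors' implementation.

```python
#!/usr/bin/env python
# Hadoop Streaming sketch of a MapReduce back projector: the mapper spreads
# each detector reading onto the image pixels its ray crosses, and the reducer
# sums contributions per pixel. Input lines: "angle_deg<TAB>detector_pos<TAB>value".
# The parallel-ray walk below is a deliberately crude stand-in for fan-beam geometry.
import math
import sys

GRID = 256  # image is GRID x GRID, centred at the origin, pixel size 1

def pixels_on_ray(angle_deg, detector_pos, step=0.5):
    theta = math.radians(angle_deg)
    dx, dy = math.cos(theta), math.sin(theta)          # ray direction
    ox, oy = -detector_pos * dy, detector_pos * dx     # offset along detector
    half, t = GRID / 2, -GRID / 2
    while t <= GRID / 2:
        i, j = int(ox + t * dx + half), int(oy + t * dy + half)
        if 0 <= i < GRID and 0 <= j < GRID:
            yield i, j
        t += step

def mapper():
    for line in sys.stdin:
        angle, det, value = line.split("\t")
        for i, j in pixels_on_ray(float(angle), float(det)):
            print(f"{i},{j}\t{float(value)}")

def reducer():
    current, total = None, 0.0
    for line in sys.stdin:                             # Hadoop delivers keys sorted
        key, contrib = line.rstrip("\n").split("\t")
        if key != current:
            if current is not None:
                print(f"{current}\t{total}")
            current, total = key, 0.0
        total += float(contrib)
    if current is not None:
        print(f"{current}\t{total}")

if __name__ == "__main__":
    mode = sys.argv[1] if len(sys.argv) > 1 else "map"
    mapper() if mode == "map" else reducer()
```

The shuffle between the two phases is exactly where the communication overhead mentioned above is paid.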
CE-ACCE: The Cloud Enabled Advanced sCience Compute Environment
NASA Astrophysics Data System (ADS)
Cinquini, L.; Freeborn, D. J.; Hardman, S. H.; Wong, C.
2017-12-01
Traditionally, Earth Science data from NASA remote sensing instruments has been processed by building custom data processing pipelines (often based on a common workflow engine or framework), which are typically deployed and run on an internal cluster of computing resources. This approach has some intrinsic limitations: it requires each mission to develop and deploy a custom software package on top of the adopted framework; it makes use of dedicated hardware, network and storage resources, which must be specifically purchased, maintained and re-purposed at mission completion; and computing services cannot be scaled on demand beyond the capability of the available servers. More recently, the rise of Cloud computing, coupled with other advances in containerization technology (most prominently, Docker) and micro-services architecture, has enabled a new paradigm, whereby space mission data can be processed through standard system architectures, which can be seamlessly deployed and scaled on demand on either on-premise clusters or commercial Cloud providers. In this talk, we will present one such architecture named CE-ACCE ("Cloud Enabled Advanced sCience Compute Environment"), which we have been developing at the NASA Jet Propulsion Laboratory over the past year. CE-ACCE is based on the Apache OODT ("Object Oriented Data Technology") suite of services for full data lifecycle management, which are turned into a composable array of Docker images and complemented by a plug-in model for mission-specific customization. We have applied this infrastructure to both flying and upcoming NASA missions, such as ECOSTRESS and SMAP, and demonstrated deployment on the Amazon Cloud, either using simple EC2 instances or advanced AWS services such as Amazon Lambda and ECS (EC2 Container Services).
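The container-composition pattern can be sketched with the Docker SDK for Python as below; the image names, command flags and volume paths are hypothetical, not the actual CE-ACCE/OODT images.

```python
# Illustrative composition of two containerized pipeline services sharing a
# staging volume; the same script works on an on-premise node or a cloud VM.
import docker

client = docker.from_env()
volumes = {"/data/staging": {"bind": "/data", "mode": "rw"}}   # shared staging area

filemgr = client.containers.run(
    "example/oodt-filemgr:latest",          # hypothetical image name
    detach=True, name="filemgr", volumes=volumes)

crawler = client.containers.run(
    "example/oodt-crawler:latest",          # hypothetical image name
    command="--ingest /data/incoming",      # hypothetical flag
    detach=True, name="crawler", volumes=volumes)

for c in (filemgr, crawler):
    print(c.name, c.status)
```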
Research on elastic resource management for multi-queue under cloud computing environment
NASA Astrophysics Data System (ADS)
CHENG, Zhenjing; LI, Haibo; HUANG, Qiulan; Cheng, Yaodong; CHEN, Gang
2017-10-01
As a new approach to managing computing resources, virtualization technology is increasingly widely applied in the high-energy physics field. A virtual computing cluster based on OpenStack was built at IHEP, using HTCondor as the job queue management system. In a traditional static cluster, a fixed number of virtual machines are pre-allocated to the job queues of the different experiments. However, this method cannot adapt well to the volatility of computing resource requirements. To solve this problem, an elastic computing resource management system for the cloud computing environment has been designed. This system performs unified management of virtual computing nodes on the basis of the HTCondor job queues, using dual resource thresholds as well as a quota service. A two-stage pool is designed to improve the efficiency of resource pool expansion. This paper presents several use cases of the elastic resource management system in IHEPCloud. In practical runs, virtual computing resources dynamically expand or shrink as computing requirements change. Additionally, the CPU utilization ratio of the computing resources increased significantly compared with traditional resource management. The system also performs well when there are multiple Condor schedulers and multiple job queues.
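A minimal sketch of the dual-threshold idea is given below; the thresholds, quota and stubbed query/boot functions are invented stand-ins, not the IHEPCloud implementation.

```python
# Dual-threshold elasticity sketch: expand a queue's virtual node pool when
# idle jobs pile up, shrink it when nodes sit idle, never exceed the quota.
import random
import time

QUOTA = 50               # max VMs this queue may hold
EXPAND_THRESHOLD = 20    # queued (idle) jobs that trigger expansion
SHRINK_THRESHOLD = 5     # idle VMs that trigger shrinking

def idle_jobs(queue):      return random.randint(0, 40)     # stub for a condor_q query
def idle_nodes(queue):     return random.randint(0, 10)     # stub for a condor_status query
def running_nodes(queue):  return random.randint(0, QUOTA)  # stub for the cloud inventory
def boot_vms(queue, n):    print(f"[{queue}] booting {n} VMs")
def delete_vms(queue, n):  print(f"[{queue}] deleting {n} idle VMs")

def balance(queue):
    if idle_jobs(queue) > EXPAND_THRESHOLD:
        headroom = QUOTA - running_nodes(queue)
        if headroom > 0:
            boot_vms(queue, min(headroom, 5))       # expand in small steps
    elif idle_nodes(queue) > SHRINK_THRESHOLD:
        delete_vms(queue, idle_nodes(queue) - SHRINK_THRESHOLD)

if __name__ == "__main__":
    for _ in range(3):
        balance("experiment-queue")                 # hypothetical queue name
        time.sleep(1)
```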
A Model for Protostellar Cluster Luminosities and the Impact on the CO–H2 Conversion Factor
NASA Astrophysics Data System (ADS)
Gaches, Brandt A. L.; Offner, Stella S. R.
2018-02-01
We construct a semianalytic model to study the effect of far-ultraviolet (FUV) radiation on gas chemistry from embedded protostars. We use the protostellar luminosity function (PLF) formalism of Offner & McKee to calculate the total, FUV, and ionizing cluster luminosity for various protostellar accretion histories and cluster sizes. We compare the model predictions with surveys of Gould Belt star-forming regions and find that the tapered turbulent core model best matches the mean luminosities and the spread in the data. We combine the cluster model with the photodissociation region astrochemistry code, 3D-PDR, to compute the impact of the FUV luminosity from embedded protostars on the CO-to-H2 conversion factor, X_CO, as a function of cluster size, gas mass, and star formation efficiency. We find that X_CO has a weak dependence on the FUV radiation from embedded sources for large clusters owing to high cloud optical depths. In smaller and more efficient clusters the embedded FUV increases X_CO to levels consistent with the average Milky Way values. The internal physical and chemical structures of the cloud are significantly altered, and X_CO depends strongly on the protostellar cluster mass for small efficient clouds.
Architectural Principles and Experimentation of Distributed High Performance Virtual Clusters
ERIC Educational Resources Information Center
Younge, Andrew J.
2016-01-01
With the advent of virtualization and Infrastructure-as-a-Service (IaaS), the broader scientific computing community is considering the use of clouds for their scientific computing needs. This is due to the relative scalability, ease of use, advanced user environment customization abilities, and the many novel computing paradigms available for…
A Cloud-Based Simulation Architecture for Pandemic Influenza Simulation
Eriksson, Henrik; Raciti, Massimiliano; Basile, Maurizio; Cunsolo, Alessandro; Fröberg, Anders; Leifler, Ola; Ekberg, Joakim; Timpka, Toomas
2011-01-01
High-fidelity simulations of pandemic outbreaks are resource consuming. Cluster-based solutions have been suggested for executing such complex computations. We present a cloud-based simulation architecture that utilizes computing resources both locally available and dynamically rented online. The approach uses the Condor framework for job distribution and management of the Amazon Elastic Compute Cloud (EC2) as well as local resources. The architecture has a web-based user interface that allows users to monitor and control simulation execution. In a benchmark test, the best cost-adjusted performance was recorded for the EC2 H-CPU Medium instance, while a field trial showed that the job configuration had significant influence on the execution time and that the network capacity of the master node could become a bottleneck. We conclude that it is possible to develop a scalable simulation environment that uses cloud-based solutions, while providing an easy-to-use graphical user interface. PMID:22195089
HPC on Competitive Cloud Resources
NASA Astrophysics Data System (ADS)
Bientinesi, Paolo; Iakymchuk, Roman; Napper, Jeff
Computing as a utility has reached the mainstream. Scientists can now easily rent time on large commercial clusters that can be expanded and reduced on demand in real time. However, current commercial cloud computing performance falls short of systems specifically designed for scientific applications. Scientific computing needs are quite different from those of the web applications that have been the focus of cloud computing vendors. In this chapter we demonstrate through empirical evaluation the computational efficiency of high-performance numerical applications in a commercial cloud environment when resources are shared under high contention. Using the Linpack benchmark as a case study, we show that cache utilization becomes highly unpredictable and correspondingly affects computation time. For some problems, not only is it more efficient to underutilize resources, but the solution can be reached sooner in real time (wall time). We also show that the smallest, cheapest (64-bit) instance in the studied environment offers the best price-to-performance ratio. In light of the high contention we witness, we believe that alternative definitions of efficiency should be introduced for commercial cloud environments where strong performance guarantees do not exist. Concepts like average and expected performance, expected execution time, expected cost to completion, and variance measures--traditionally ignored in the high-performance computing context--should now complement or even replace the standard definitions of efficiency.
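One possible formalization of the suggested probabilistic metrics (our own reading, not equations taken from the chapter) is:

```latex
% R = achieved rate under contention (random), R_peak = theoretical peak,
% T = wall time (random), m = number of instances, p = price per instance-hour.
\[
  \widehat{\eta} \;=\; \frac{\mathbb{E}[R]}{R_{\mathrm{peak}}},
  \qquad
  \mathbb{E}[C] \;=\; p\, m\, \mathbb{E}[T],
  \qquad
  \sigma_T^{2} \;=\; \operatorname{Var}(T),
\]
\[
  \text{with sample estimates over } n \text{ repeated runs, e.g. }
  \mathbb{E}[T] \;\approx\; \tfrac{1}{n}\textstyle\sum_{i=1}^{n} T_i .
\]
```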
Toward real-time Monte Carlo simulation using a commercial cloud computing infrastructure.
Wang, Henry; Ma, Yunzhi; Pratx, Guillem; Xing, Lei
2011-09-07
Monte Carlo (MC) methods are the gold standard for modeling photon and electron transport in a heterogeneous medium; however, their computational cost prohibits their routine use in the clinic. Cloud computing, wherein computing resources are allocated on-demand from a third party, is a new approach for high performance computing and is implemented to perform ultra-fast MC calculation in radiation therapy. We deployed the EGS5 MC package in a commercial cloud environment. Launched from a single local computer with Internet access, a Python script allocates a remote virtual cluster. A handshaking protocol designates master and worker nodes. The EGS5 binaries and the simulation data are initially loaded onto the master node. The simulation is then distributed among independent worker nodes via the message passing interface, and the results aggregated on the local computer for display and data analysis. The described approach is evaluated for pencil beams and broad beams of high-energy electrons and photons. The output of cloud-based MC simulation is identical to that produced by single-threaded implementation. For 1 million electrons, a simulation that takes 2.58 h on a local computer can be executed in 3.3 min on the cloud with 100 nodes, a 47× speed-up. Simulation time scales inversely with the number of parallel nodes. The parallelization overhead is also negligible for large simulations. Cloud computing represents one of the most important recent advances in supercomputing technology and provides a promising platform for substantially improved MC simulation. In addition to the significant speed up, cloud computing builds a layer of abstraction for high performance parallel computing, which may change the way dose calculations are performed and radiation treatment plans are completed.
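A toy master/worker sketch with mpi4py is shown below to illustrate the distribution-and-aggregation pattern; the attenuation model is invented and this is not the EGS5 workflow itself.

```python
# Toy MPI Monte Carlo: each rank simulates its share of particle histories of
# a crude slab-attenuation model, and the results are reduced onto rank 0,
# mirroring how worker-node output is aggregated. Run with e.g.
#   mpirun -n 100 python mc_toy.py
import random
from mpi4py import MPI

TOTAL_HISTORIES = 1_000_000
MU = 0.2          # attenuation coefficient per cm (illustrative)
DEPTH = 5.0       # slab thickness in cm (illustrative)

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

random.seed(rank)                        # independent streams per worker
local_n = TOTAL_HISTORIES // size
transmitted = sum(
    1 for _ in range(local_n)
    if random.expovariate(MU) > DEPTH    # free path longer than the slab
)

total = comm.reduce(transmitted, op=MPI.SUM, root=0)
if rank == 0:
    print("transmitted fraction:", total / (local_n * size))
```

Because the histories are independent, the runtime scales inversely with the number of ranks, which is the behaviour reported above for the cloud cluster.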
Abdulhamid, Shafi’i Muhammad; Abd Latiff, Muhammad Shafie; Abdul-Salaam, Gaddafi; Hussain Madni, Syed Hamid
2016-01-01
A cloud computing system is a huge cluster of interconnected servers residing in a datacenter, dynamically provisioned to clients on demand via a front-end interface. Scheduling scientific applications in the cloud computing environment is identified as an NP-hard problem due to the dynamic nature of heterogeneous resources. Recently, a number of metaheuristic optimization schemes have been applied to address the challenges of application scheduling in cloud systems, without much emphasis on the issue of secure global scheduling. In this paper, a scientific application scheduling technique using the Global League Championship Algorithm (GBLCA) optimization technique is first presented for global task scheduling in the cloud environment. The experiments are carried out using the CloudSim simulator. The experimental results show that the proposed GBLCA technique produces a remarkable performance improvement on the makespan, ranging from 14.44% to 46.41%. It also shows a significant reduction in the time taken to securely schedule applications, measured in terms of response time. In view of the experimental results, the proposed technique provides better-quality scheduling solutions, suitable for scientific application task execution in the cloud computing environment, than the MinMin, MaxMin, Genetic Algorithm (GA) and Ant Colony Optimization (ACO) scheduling techniques. PMID:27384239
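For orientation, the sketch below computes the makespan of a simple greedy assignment, the quantity the compared metaheuristics try to minimize; it is not GBLCA, and the task lengths and VM speeds are invented.

```python
# Greedy baseline scheduler: assign each task to the VM that would finish it
# earliest, then report the makespan (completion time of the busiest VM).
def greedy_schedule(task_lengths, vm_speeds):
    finish = [0.0] * len(vm_speeds)                      # current finish time per VM
    assignment = []
    for length in sorted(task_lengths, reverse=True):    # longest tasks first
        candidate = min(range(len(vm_speeds)),
                        key=lambda v: finish[v] + length / vm_speeds[v])
        finish[candidate] += length / vm_speeds[candidate]
        assignment.append(candidate)                     # VM index, in sorted-task order
    return max(finish), assignment

tasks = [400, 250, 900, 120, 640, 330, 500]              # million instructions (made up)
vms = [1000, 500, 750]                                   # MIPS per VM (made up)
makespan, plan = greedy_schedule(tasks, vms)
print("makespan:", round(makespan, 2), "s", "plan:", plan)
```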
DOE Office of Scientific and Technical Information (OSTI.GOV)
Heidelberg, S T; Fitzgerald, K J; Richmond, G H
2006-01-24
There has been substantial development of the Lustre parallel filesystem prior to the configuration described below for this milestone. The initial Lustre filesystems that were deployed were directly connected to the cluster interconnect, i.e. Quadrics Elan3. That is, the clients (OSSes) and Meta-data Servers (MDS) were all directly connected to the cluster's internal high speed interconnect. This configuration serves a single cluster very well, but does not provide sharing of the filesystem among clusters. LLNL funded the development of high-efficiency "portals router" code by CFS (the company that develops Lustre) to enable us to move the Lustre servers to a GigE-connected network configuration, thus making it possible to connect to the servers from several clusters. With portals routing available, here is what changes: (1) another storage-only cluster is deployed to front the Lustre storage devices (these become the Lustre OSSes and MDS), (2) this "Lustre cluster" is attached via GigE connections to a large GigE switch/router cloud, (3) a small number of compute-cluster nodes are designated as "gateway" or "portal router" nodes, and (4) the portals router nodes are GigE-connected to the switch/router cloud. The Lustre configuration is then changed to reflect the new network paths. A typical example of this is a compute cluster and a related visualization cluster: the compute cluster produces the data (writes it to the Lustre filesystem), and the visualization cluster consumes some of the data (reads it from the Lustre filesystem). This process can be expanded by aggregating several collections of Lustre backend storage resources into one or more "centralized" Lustre filesystems, and then arranging to have several "client" clusters mount these centralized filesystems. The "client clusters" can be any combination of compute, visualization, archiving, or other types of cluster. This milestone demonstrates the operation and performance of a scaled-down version of such a large, centralized, shared Lustre filesystem concept.
NASA Technical Reports Server (NTRS)
Endlich, R. M.; Wolf, D. E.
1980-01-01
The automatic cloud tracking system was applied to METEOSAT 6.7 micrometer water vapor measurements to learn whether the system can track the motions of water vapor patterns. Data for the midlatitudes, subtropics, and tropics were selected from a sequence of METEOSAT pictures for 25 April 1978. Trackable features in the water vapor patterns were identified using a clustering technique and the features were tracked by two different methods. In flat (low contrast) water vapor fields, the automatic motion computations were not reliable, but in areas where the water vapor fields contained small-scale structure (such as in the vicinity of active weather phenomena) the computations were successful. Cloud motions were computed using METEOSAT infrared observations (including tropical convective systems and midlatitude jet stream cirrus).
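A bare-bones sketch of tracking a feature between two frames by minimizing the sum of squared differences is shown below; it is a stand-in for the report's tracking methods, with synthetic data and window sizes chosen for illustration.

```python
# Track one feature between two frames by exhaustive search for the
# displacement whose template window differs least (sum of squared differences).
import numpy as np

def track_feature(frame0, frame1, center, win=8, search=5):
    y, x = center
    tmpl = frame0[y - win:y + win, x - win:x + win]
    best, best_dyx = np.inf, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            cand = frame1[y + dy - win:y + dy + win, x + dx - win:x + dx + win]
            score = np.sum((cand - tmpl) ** 2)
            if score < best:
                best, best_dyx = score, (dy, dx)
    return best_dyx                                   # displacement in pixels

rng = np.random.default_rng(0)
frame0 = rng.uniform(size=(64, 64))
frame1 = np.roll(frame0, shift=(2, -3), axis=(0, 1))  # pattern moved 2 down, 3 left
print(track_feature(frame0, frame1, center=(32, 32)))  # -> (2, -3)
```

The failure mode noted above for flat fields shows up here too: when the template window has little contrast, many displacements give nearly identical scores and the estimate is unreliable.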
Distributed MRI reconstruction using Gadgetron-based cloud computing.
Xue, Hui; Inati, Souheil; Sørensen, Thomas Sangild; Kellman, Peter; Hansen, Michael S
2015-03-01
To expand the open source Gadgetron reconstruction framework to support distributed computing and to demonstrate that a multinode version of the Gadgetron can be used to provide nonlinear reconstruction with clinically acceptable latency. The Gadgetron framework was extended with new software components that enable an arbitrary number of Gadgetron instances to collaborate on a reconstruction task. This cloud-enabled version of the Gadgetron was deployed on three different distributed computing platforms ranging from a heterogeneous collection of commodity computers to the commercial Amazon Elastic Compute Cloud. The Gadgetron cloud was used to provide nonlinear, compressed sensing reconstruction on a clinical scanner with low reconstruction latency (e.g., cardiac and neuroimaging applications). The proposed setup was able to handle acquisition and l1-SPIRiT reconstruction of nine high temporal resolution real-time, cardiac short axis cine acquisitions, covering the ventricles for functional evaluation, in under 1 min. A three-dimensional high-resolution brain acquisition with 1 mm³ isotropic pixel size was acquired and reconstructed with nonlinear reconstruction in less than 5 min. A distributed computing enabled Gadgetron provides a scalable way to improve reconstruction performance using commodity cluster computing. Nonlinear, compressed sensing reconstruction can be deployed clinically with low image reconstruction latency. © 2014 Wiley Periodicals, Inc.
Clustering, randomness, and regularity in cloud fields. 4. Stratocumulus cloud fields
NASA Astrophysics Data System (ADS)
Lee, J.; Chou, J.; Weger, R. C.; Welch, R. M.
1994-07-01
To complete the analysis of the spatial distribution of boundary layer cloudiness, the present study focuses on nine stratocumulus Landsat scenes. The results indicate many similarities between stratocumulus and cumulus spatial distributions. Most notably, at full spatial resolution all scenes exhibit a decidedly clustered distribution. The strength of the clustering signal decreases with increasing cloud size; the clusters themselves consist of a few clouds (less than 10), occupy a small percentage of the cloud field area (less than 5%), contain between 20% and 60% of the cloud field population, and are randomly located within the scene. In contrast, stratocumulus in almost every respect are more strongly clustered than are cumulus cloud fields. For instance, stratocumulus clusters contain more clouds per cluster, occupy a larger percentage of the total area, and have a larger percentage of clouds participating in clusters than the corresponding cumulus examples. To investigate clustering at intermediate spatial scales, the local dimensionality statistic is introduced. Results obtained from this statistic provide the first direct evidence for regularity among large (>900 m in diameter) clouds in stratocumulus and cumulus cloud fields, in support of the inhibition hypothesis of Ramirez and Bras (1990). Also, the size compensated point-to-cloud cumulative distribution function statistic is found to be necessary to obtain a consistent description of stratocumulus cloud distributions. A hypothesis regarding the underlying physical mechanisms responsible for cloud clustering is presented. It is suggested that cloud clusters often arise from 4 to 10 triggering events localized within regions less than 2 km in diameter and randomly distributed within the cloud field. As the size of the cloud surpasses the scale of the triggering region, the clustering signal weakens and the larger cloud locations become more random.
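The raw ingredient behind such point-to-cloud statistics can be sketched as an empirical nearest-neighbour distance CDF; the synthetic cloud centres below are illustrative and the size compensation itself is not reproduced.

```python
# Empirical point-to-cloud cumulative distribution: for random sampling points,
# find the distance to the nearest cloud centre and report F(r) at a few radii.
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(3)
cloud_centres = rng.uniform(0, 60_000, size=(400, 2))       # metres, synthetic scene
test_points = rng.uniform(0, 60_000, size=(5_000, 2))       # random sampling points

dists, _ = cKDTree(cloud_centres).query(test_points, k=1)   # nearest-cloud distance

for r in (500, 1_000, 2_000, 4_000):
    print(f"F({r} m) = {np.mean(dists <= r):.3f}")
```

Clustered cloud fields push this curve above the random (Poisson) expectation at small radii, while regularity pulls it below.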
Clustering, randomness, and regularity in cloud fields. 4: Stratocumulus cloud fields
NASA Technical Reports Server (NTRS)
Lee, J.; Chou, J.; Weger, R. C.; Welch, R. M.
1994-01-01
To complete the analysis of the spatial distribution of boundary layer cloudiness, the present study focuses on nine stratocumulus Landsat scenes. The results indicate many similarities between stratocumulus and cumulus spatial distributions. Most notably, at full spatial resolution all scenes exhibit a decidedly clustered distribution. The strength of the clustering signal decreases with increasing cloud size; the clusters themselves consist of a few clouds (less than 10), occupy a small percentage of the cloud field area (less than 5%), contain between 20% and 60% of the cloud field population, and are randomly located within the scene. In contrast, stratocumulus in almost every respect are more strongly clustered than are cumulus cloud fields. For instance, stratocumulus clusters contain more clouds per cluster, occupy a larger percentage of the total area, and have a larger percentage of clouds participating in clusters than the corresponding cumulus examples. To investigate clustering at intermediate spatial scales, the local dimensionality statistic is introduced. Results obtained from this statistic provide the first direct evidence for regularity among large (more than 900 m in diameter) clouds in stratocumulus and cumulus cloud fields, in support of the inhibition hypothesis of Ramirez and Bras (1990). Also, the size compensated point-to-cloud cumulative distribution function statistic is found to be necessary to obtain a consistent description of stratocumulus cloud distributions. A hypothesis regarding the underlying physical mechanisms responsible for cloud clustering is presented. It is suggested that cloud clusters often arise from 4 to 10 triggering events localized within regions less than 2 km in diameter and randomly distributed within the cloud field. As the size of the cloud surpasses the scale of the triggering region, the clustering signal weakens and the larger cloud locations become more random.
Federal Register 2010, 2011, 2012, 2013, 2014
2010-05-18
... device to function as a cloud computing device similar to a network storage RAID array (HDDs strung... contract. This final determination, in HQ H082476, was issued at the request of Scale Computing under... response to your request dated October 15, 2009, made on behalf of Scale Computing (``Scale''). You ask for...
On-demand provisioning of HEP compute resources on cloud sites and shared HPC centers
NASA Astrophysics Data System (ADS)
Erli, G.; Fischer, F.; Fleig, G.; Giffels, M.; Hauth, T.; Quast, G.; Schnepf, M.; Heese, J.; Leppert, K.; Arnaez de Pedro, J.; Sträter, R.
2017-10-01
This contribution reports on solutions, experiences and recent developments with the dynamic, on-demand provisioning of remote computing resources for analysis and simulation workflows. Local resources of a physics institute are extended by private and commercial cloud sites, ranging from desktop clusters and institute clusters to HPC centers. Rather than relying on dedicated HEP computing centers, it is nowadays more reasonable and flexible to utilize remote computing capacity via virtualization techniques or container concepts. We report on recent experience from incorporating a remote HPC center (NEMO Cluster, Freiburg University) and resources dynamically requested from the commercial provider 1&1 Internet SE into our institute's computing infrastructure. The Freiburg HPC resources are requested via the standard batch system, allowing HPC and HEP applications to be executed simultaneously, such that regular batch jobs run side by side with virtual machines managed via OpenStack [1]. For the inclusion of the 1&1 commercial resources, a Python API and SDK as well as the possibility to upload images were available. Large scale tests prove the capability to serve the scientific use case in the European 1&1 datacenters. The described environment at the Institute of Experimental Nuclear Physics (IEKP) at KIT serves the needs of researchers participating in the CMS and Belle II experiments. In total, resources exceeding half a million CPU hours have been provided by remote sites.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hasenkamp, Daren; Sim, Alexander; Wehner, Michael
Extensive computing power has been used to tackle issues such as climate change, fusion energy, and other pressing scientific challenges. These computations produce a tremendous amount of data; however, many of the data analysis programs currently run on only a single processor. In this work, we explore the possibility of using the emerging cloud computing platform to parallelize such sequential data analysis tasks. As a proof of concept, we wrap a program for analyzing trends of tropical cyclones in a set of virtual machines (VMs). This approach allows the user to keep their familiar data analysis environment in the VMs, while we provide the coordination and data transfer services to ensure the necessary input and output are directed to the desired locations. This work extensively exercises the networking capability of cloud computing systems and has revealed a number of weaknesses in the current cloud system software. In our tests, we are able to scale the parallel data analysis job to a modest number of VMs and achieve a speedup that is comparable to running the same analysis task using MPI. However, compared to MPI-based parallelization, the cloud-based approach has a number of advantages. The cloud-based approach is more flexible because the VMs can capture arbitrary software dependencies without requiring the user to rewrite their programs. The cloud-based approach is also more resilient to failure: as long as a single VM is running, it can make progress, whereas as soon as one MPI node fails the whole analysis job fails. In short, this initial work demonstrates that a cloud computing system is a viable platform for distributed scientific data analyses traditionally conducted on dedicated supercomputing systems.
Scalable computing for evolutionary genomics.
Prins, Pjotr; Belhachemi, Dominique; Möller, Steffen; Smant, Geert
2012-01-01
Genomic data analysis in evolutionary biology is becoming so computationally intensive that analysis of multiple hypotheses and scenarios takes too long on a single desktop computer. In this chapter, we discuss techniques for scaling computations through parallelization of calculations, after giving a quick overview of advanced programming techniques. Unfortunately, parallel programming is difficult and requires special software design. The alternative, especially attractive for legacy software, is to introduce poor man's parallelization by running whole programs in parallel as separate processes, using job schedulers. Such pipelines are often deployed on bioinformatics computer clusters. Recent advances in PC virtualization have made it possible to run a full computer operating system, with all of its installed software, on top of another operating system, inside a "box," or virtual machine (VM). Such a VM can flexibly be deployed on multiple computers, in a local network, e.g., on existing desktop PCs, and even in the Cloud, to create a "virtual" computer cluster. Many bioinformatics applications in evolutionary biology can be run in parallel, running processes in one or more VMs. Here, we show how a ready-made bioinformatics VM image, named BioNode, effectively creates a computing cluster, and pipeline, in a few steps. This allows researchers to scale-up computations from their desktop, using available hardware, anytime it is required. BioNode is based on Debian Linux and can run on networked PCs and in the Cloud. Over 200 bioinformatics and statistical software packages, of interest to evolutionary biology, are included, such as PAML, Muscle, MAFFT, MrBayes, and BLAST. Most of these software packages are maintained through the Debian Med project. In addition, BioNode contains convenient configuration scripts for parallelizing bioinformatics software. Where Debian Med encourages packaging free and open source bioinformatics software through one central project, BioNode encourages creating free and open source VM images, for multiple targets, through one central project. BioNode can be deployed on Windows, OSX, Linux, and in the Cloud. Next to the downloadable BioNode images, we provide tutorials online, which empower bioinformaticians to install and run BioNode in different environments, as well as information for future initiatives, on creating and building such images.
Toward real-time Monte Carlo simulation using a commercial cloud computing infrastructure
NASA Astrophysics Data System (ADS)
Wang, Henry; Ma, Yunzhi; Pratx, Guillem; Xing, Lei
2011-09-01
Monte Carlo (MC) methods are the gold standard for modeling photon and electron transport in a heterogeneous medium; however, their computational cost prohibits their routine use in the clinic. Cloud computing, wherein computing resources are allocated on-demand from a third party, is a new approach for high performance computing and is implemented to perform ultra-fast MC calculation in radiation therapy. We deployed the EGS5 MC package in a commercial cloud environment. Launched from a single local computer with Internet access, a Python script allocates a remote virtual cluster. A handshaking protocol designates master and worker nodes. The EGS5 binaries and the simulation data are initially loaded onto the master node. The simulation is then distributed among independent worker nodes via the message passing interface, and the results aggregated on the local computer for display and data analysis. The described approach is evaluated for pencil beams and broad beams of high-energy electrons and photons. The output of cloud-based MC simulation is identical to that produced by single-threaded implementation. For 1 million electrons, a simulation that takes 2.58 h on a local computer can be executed in 3.3 min on the cloud with 100 nodes, a 47× speed-up. Simulation time scales inversely with the number of parallel nodes. The parallelization overhead is also negligible for large simulations. Cloud computing represents one of the most important recent advances in supercomputing technology and provides a promising platform for substantially improved MC simulation. In addition to the significant speed up, cloud computing builds a layer of abstraction for high performance parallel computing, which may change the way dose calculations are performed and radiation treatment plans are completed. This work was presented in part at the 2010 Annual Meeting of the American Association of Physicists in Medicine (AAPM), Philadelphia, PA.
Cloud computing for detecting high-order genome-wide epistatic interaction via dynamic clustering.
Guo, Xuan; Meng, Yu; Yu, Ning; Pan, Yi
2014-04-10
Taking advantage of high-throughput single nucleotide polymorphism (SNP) genotyping technology, large genome-wide association studies (GWASs) have been considered to hold promise for unravelling complex relationships between genotype and phenotype. At present, traditional single-locus-based methods are insufficient to detect multiple-locus interactions, which broadly exist in complex traits. In addition, statistical tests for high-order epistatic interactions with more than 2 SNPs pose computational and analytical challenges because the computation increases exponentially as the cardinality of the SNP combinations gets larger. In this paper, we provide a simple, fast and powerful method using dynamic clustering and cloud computing to detect genome-wide multi-locus epistatic interactions. We have constructed systematic experiments to compare power performance against some recently proposed algorithms, including TEAM, SNPRuler, EDCF and BOOST. Furthermore, we have applied our method to two real GWAS datasets, the Age-related macular degeneration (AMD) and Rheumatoid arthritis (RA) datasets, where we find some novel potential disease-related genetic factors which do not show up in detections of 2-locus epistatic interactions. Experimental results on simulated data demonstrate that our method is more powerful than some recently proposed methods on both two- and three-locus disease models. Our method has discovered many novel high-order associations that are significantly enriched in cases from two real GWAS datasets. Moreover, the running times of the cloud implementation of our method on the AMD dataset and the RA dataset are roughly 2 hours and 50 hours, respectively, on a cluster with forty small virtual machines for detecting two-locus interactions. Therefore, we believe that our method is suitable and effective for the full-scale analysis of multiple-locus epistatic interactions in GWAS.
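A single ingredient of such scans, shown for orientation only, is the per-pair contingency test sketched below; the genotypes are simulated, and the full GWAS-scale loop over all pairs is what the dynamic clustering and cloud parallelization are designed to tame.

```python
# Two-locus association test for one SNP pair: build a 9x2 table of joint
# genotype classes against case/control status and score it with chi-square.
import numpy as np
from scipy.stats import chi2_contingency

rng = np.random.default_rng(7)
n = 2_000
snp_a = rng.integers(0, 3, size=n)          # genotypes coded 0/1/2
snp_b = rng.integers(0, 3, size=n)
status = rng.integers(0, 2, size=n)         # 0 = control, 1 = case

table = np.zeros((9, 2), dtype=int)         # 9 joint genotype classes x 2 phenotypes
np.add.at(table, (snp_a * 3 + snp_b, status), 1)

chi2, pvalue, dof, _ = chi2_contingency(table)
print(f"chi2={chi2:.2f}, dof={dof}, p={pvalue:.3g}")
```

With hundreds of thousands of SNPs, the number of such pairs (and triples) is what makes the exhaustive version computationally prohibitive on a single machine.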
Cloud computing for detecting high-order genome-wide epistatic interaction via dynamic clustering
2014-01-01
Background Taking advantage of high-throughput single nucleotide polymorphism (SNP) genotyping technology, large genome-wide association studies (GWASs) have been considered to hold promise for unravelling complex relationships between genotype and phenotype. At present, traditional single-locus-based methods are insufficient to detect multiple-locus interactions, which broadly exist in complex traits. In addition, statistical tests for high-order epistatic interactions with more than 2 SNPs pose computational and analytical challenges because the computation increases exponentially as the cardinality of the SNP combinations gets larger. Results In this paper, we provide a simple, fast and powerful method using dynamic clustering and cloud computing to detect genome-wide multi-locus epistatic interactions. We have constructed systematic experiments to compare power performance against some recently proposed algorithms, including TEAM, SNPRuler, EDCF and BOOST. Furthermore, we have applied our method to two real GWAS datasets, the Age-related macular degeneration (AMD) and Rheumatoid arthritis (RA) datasets, where we find some novel potential disease-related genetic factors which do not show up in detections of 2-locus epistatic interactions. Conclusions Experimental results on simulated data demonstrate that our method is more powerful than some recently proposed methods on both two- and three-locus disease models. Our method has discovered many novel high-order associations that are significantly enriched in cases from two real GWAS datasets. Moreover, the running times of the cloud implementation of our method on the AMD dataset and the RA dataset are roughly 2 hours and 50 hours, respectively, on a cluster with forty small virtual machines for detecting two-locus interactions. Therefore, we believe that our method is suitable and effective for the full-scale analysis of multiple-locus epistatic interactions in GWAS. PMID:24717145
Enhanced K-means clustering with encryption on cloud
NASA Astrophysics Data System (ADS)
Singh, Iqjot; Dwivedi, Prerna; Gupta, Taru; Shynu, P. G.
2017-11-01
This paper addresses the problem of storing and managing big files on the cloud by implementing hashing on Hadoop for big data, while ensuring security when uploading and downloading files. Cloud computing is a paradigm that emphasizes sharing data and facilitates sharing infrastructure and resources. [10] Hadoop is open source software that allows big files to be stored and managed on the cloud according to our needs. The K-means clustering algorithm assigns data points to clusters based on the distance between each point and the cluster centroids. Hashing is a technique in which data are stored and retrieved using hash keys; the hash function maps the original data and is later used to fetch the data stored under the specific key. [17] Encryption is a process that transforms electronic data into a non-readable form known as cipher text. Decryption is the opposite process: it transforms the cipher text back into plain text that the end user can read and understand. For encryption and decryption we use a symmetric-key cryptographic algorithm; specifically, the DES algorithm is used for secure storage of the files. [3]
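A toy sketch of the storage idea alone (hash-based placement plus DES encryption) is given below; it is not the paper's Hadoop pipeline, and the key, bucket count and file contents are placeholders. It assumes the pycryptodome package.

```python
# Hash key picks the storage bucket; file contents are DES-encrypted with a
# shared symmetric key before upload. DES is shown because the paper names it;
# a modern cipher such as AES would normally be preferred.
import hashlib
from Crypto.Cipher import DES

KEY = b"8bytekey"                      # symmetric key shared by uploader/downloader
N_BUCKETS = 16

def bucket_for(filename):
    digest = hashlib.sha256(filename.encode()).hexdigest()
    return int(digest, 16) % N_BUCKETS            # hash key -> storage bucket

def des_encrypt(data: bytes) -> bytes:
    pad = 8 - len(data) % 8                       # DES works on 8-byte blocks
    data += bytes([pad]) * pad
    return DES.new(KEY, DES.MODE_ECB).encrypt(data)

def des_decrypt(blob: bytes) -> bytes:
    data = DES.new(KEY, DES.MODE_ECB).decrypt(blob)
    return data[: -data[-1]]                      # strip the padding

name, payload = "results_2017.csv", b"id,value\n1,3.14\n"
print("bucket:", bucket_for(name))
cipher = des_encrypt(payload)
assert des_decrypt(cipher) == payload
```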
The StratusLab cloud distribution: Use-cases and support for scientific applications
NASA Astrophysics Data System (ADS)
Floros, E.
2012-04-01
The StratusLab project is integrating an open cloud software distribution that enables organizations to set up and provide their own private or public IaaS (Infrastructure as a Service) computing clouds. The StratusLab distribution capitalizes on popular infrastructure virtualization solutions like KVM, the OpenNebula virtual machine manager, the Claudia service manager and the SlipStream deployment platform, which are further enhanced and expanded with additional components developed within the project. The StratusLab distribution covers the core aspects of a cloud IaaS architecture, namely Computing (life-cycle management of virtual machines), Storage, Appliance management and Networking. The resulting software stack provides a packaged turn-key solution for deploying cloud computing services. The cloud computing infrastructures deployed using StratusLab can support a wide range of scientific and business use cases. Grid computing has been the primary use case pursued by the project, and for this reason the initial priority has been support for the deployment and operation of fully virtualized production-level grid sites; a goal that has already been achieved by operating such a site as part of EGI's (European Grid Initiative) pan-European grid infrastructure. In this area the project is currently working to provide non-trivial capabilities like elastic and autonomic management of grid site resources. Although grid computing has been the motivating paradigm, StratusLab's cloud distribution can support a wider range of use cases. Towards this direction, we have developed and currently provide support for setting up general purpose computing solutions like Hadoop, MPI and Torque clusters. Regarding scientific applications, the project is collaborating closely with the bioinformatics community in order to prepare VM appliances and deploy optimized services for bioinformatics applications. In a similar manner, additional scientific disciplines like Earth Science can take advantage of StratusLab cloud solutions. Interested users are welcome to join StratusLab's user community by getting access to the reference cloud services deployed by the project and offered to the public.
Cloud-Based Perception and Control of Sensor Nets and Robot Swarms
2016-04-01
... a distributed stream processing framework provides the necessary API and infrastructure to develop and execute such applications in a cluster of computation ... streaming DDDAS applications based on the challenges they present to the backend Cloud control system. [Figure 2: Parallel SLAM Application] ... state-of-the-art deep learning-based object detectors can recognize among hundreds of object classes, and this capability would be very useful for mobile ...
A convergent model for distributed processing of Big Sensor Data in urban engineering networks
NASA Astrophysics Data System (ADS)
Parygin, D. S.; Finogeev, A. G.; Kamaev, V. A.; Finogeev, A. A.; Gnedkova, E. P.; Tyukov, A. P.
2017-01-01
The problems of developing and studying a convergent model of grid, cloud, fog and mobile computing for analytical Big Sensor Data processing are reviewed. The model is meant for building monitoring systems for spatially distributed objects and processes of urban engineering networks. The proposed approach is a convergent model for organizing distributed data processing. The fog computing model is used for processing and aggregating sensor data at the network nodes and/or industrial controllers; program agents are loaded onto them to perform the computing tasks of primary processing and data aggregation. The grid and cloud computing models are used for mining and accumulating integral indicators. The computing cluster has a three-tier architecture: the main server at the first level, a cluster of SCADA system servers at the second level, and a set of GPU cards supporting the Compute Unified Device Architecture (CUDA) at the third level. The mobile computing model is applied to visualize the results of the analysis with elements of augmented reality and geo-information technologies. The integral indicators are transferred to the data center for accumulation in a multidimensional storage for the purpose of data mining and knowledge gaining.
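A schematic of the fog-level division of labour is sketched below; all names, the endpoint URL and the sensor window are invented placeholders.

```python
# Fog-node agent sketch: collapse a window of raw sensor readings into compact
# integral indicators and forward them to the central (grid/cloud) tier.
import json
import statistics
from urllib import request

CENTRAL_ENDPOINT = "http://datacenter.example/indicators"   # hypothetical URL

def aggregate(readings):
    """Collapse one window of raw sensor readings into integral indicators."""
    return {
        "count": len(readings),
        "mean": statistics.fmean(readings),
        "max": max(readings),
        "min": min(readings),
    }

def push(indicators):
    body = json.dumps(indicators).encode()
    req = request.Request(CENTRAL_ENDPOINT, data=body,
                          headers={"Content-Type": "application/json"})
    # request.urlopen(req)   # disabled here: the endpoint is only a placeholder
    print("would send:", body.decode())

window = [4.1, 3.9, 4.4, 5.0, 4.2]       # e.g. pressure readings from one sensor
push(aggregate(window))
```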
Chung, Wei-Chun; Chen, Chien-Chih; Ho, Jan-Ming; Lin, Chung-Yen; Hsu, Wen-Lian; Wang, Yu-Chun; Lee, D T; Lai, Feipei; Huang, Chih-Wei; Chang, Yu-Jung
2014-01-01
Explosive growth of next-generation sequencing data has resulted in ultra-large-scale data sets and ensuing computational problems. Cloud computing provides an on-demand and scalable environment for large-scale data analysis. Using a MapReduce framework, data and workload can be distributed via a network to computers in the cloud to substantially reduce computational latency. Hadoop/MapReduce has been successfully adopted in bioinformatics for genome assembly, mapping reads to genomes, and finding single nucleotide polymorphisms. Major cloud providers offer Hadoop cloud services to their users. However, it remains technically challenging to deploy a Hadoop cloud for those who prefer to run MapReduce programs in a cluster without built-in Hadoop/MapReduce. We present CloudDOE, a platform-independent software package implemented in Java. CloudDOE encapsulates technical details behind a user-friendly graphical interface, thus liberating scientists from having to perform complicated operational procedures. Users are guided through the user interface to deploy a Hadoop cloud within in-house computing environments and to run applications specifically targeted for bioinformatics, including CloudBurst, CloudBrush, and CloudRS. One may also use CloudDOE on top of a public cloud. CloudDOE consists of three wizards, i.e., Deploy, Operate, and Extend wizards. Deploy wizard is designed to aid the system administrator to deploy a Hadoop cloud. It installs Java runtime environment version 1.6 and Hadoop version 0.20.203, and initiates the service automatically. Operate wizard allows the user to run a MapReduce application on the dashboard list. To extend the dashboard list, the administrator may install a new MapReduce application using Extend wizard. CloudDOE is a user-friendly tool for deploying a Hadoop cloud. Its smart wizards substantially reduce the complexity and costs of deployment, execution, enhancement, and management. Interested users may collaborate to improve the source code of CloudDOE to further incorporate more MapReduce bioinformatics tools into CloudDOE and support next-generation big data open source tools, e.g., Hadoop BigTop and Spark. CloudDOE is distributed under Apache License 2.0 and is freely available at http://clouddoe.iis.sinica.edu.tw/.
Chung, Wei-Chun; Chen, Chien-Chih; Ho, Jan-Ming; Lin, Chung-Yen; Hsu, Wen-Lian; Wang, Yu-Chun; Lee, D. T.; Lai, Feipei; Huang, Chih-Wei; Chang, Yu-Jung
2014-01-01
Background Explosive growth of next-generation sequencing data has resulted in ultra-large-scale data sets and ensuing computational problems. Cloud computing provides an on-demand and scalable environment for large-scale data analysis. Using a MapReduce framework, data and workload can be distributed via a network to computers in the cloud to substantially reduce computational latency. Hadoop/MapReduce has been successfully adopted in bioinformatics for genome assembly, mapping reads to genomes, and finding single nucleotide polymorphisms. Major cloud providers offer Hadoop cloud services to their users. However, it remains technically challenging to deploy a Hadoop cloud for those who prefer to run MapReduce programs in a cluster without built-in Hadoop/MapReduce. Results We present CloudDOE, a platform-independent software package implemented in Java. CloudDOE encapsulates technical details behind a user-friendly graphical interface, thus liberating scientists from having to perform complicated operational procedures. Users are guided through the user interface to deploy a Hadoop cloud within in-house computing environments and to run applications specifically targeted for bioinformatics, including CloudBurst, CloudBrush, and CloudRS. One may also use CloudDOE on top of a public cloud. CloudDOE consists of three wizards, i.e., Deploy, Operate, and Extend wizards. Deploy wizard is designed to aid the system administrator to deploy a Hadoop cloud. It installs Java runtime environment version 1.6 and Hadoop version 0.20.203, and initiates the service automatically. Operate wizard allows the user to run a MapReduce application on the dashboard list. To extend the dashboard list, the administrator may install a new MapReduce application using Extend wizard. Conclusions CloudDOE is a user-friendly tool for deploying a Hadoop cloud. Its smart wizards substantially reduce the complexity and costs of deployment, execution, enhancement, and management. Interested users may collaborate to improve the source code of CloudDOE to further incorporate more MapReduce bioinformatics tools into CloudDOE and support next-generation big data open source tools, e.g., Hadoop BigTop and Spark. Availability: CloudDOE is distributed under Apache License 2.0 and is freely available at http://clouddoe.iis.sinica.edu.tw/. PMID:24897343
Volunteer Clouds and Citizen Cyberscience for LHC Physics
NASA Astrophysics Data System (ADS)
Aguado Sanchez, Carlos; Blomer, Jakob; Buncic, Predrag; Chen, Gang; Ellis, John; Garcia Quintas, David; Harutyunyan, Artem; Grey, Francois; Lombrana Gonzalez, Daniel; Marquina, Miguel; Mato, Pere; Rantala, Jarno; Schulz, Holger; Segal, Ben; Sharma, Archana; Skands, Peter; Weir, David; Wu, Jie; Wu, Wenjing; Yadav, Rohit
2011-12-01
Computing for the LHC, and for HEP more generally, is traditionally viewed as requiring specialized infrastructure and software environments, and therefore not compatible with the recent trend in "volunteer computing", where volunteers supply free processing time on ordinary PCs and laptops via standard Internet connections. In this paper, we demonstrate that with the use of virtual machine technology, at least some standard LHC computing tasks can be tackled with volunteer computing resources. Specifically, by presenting volunteer computing resources to HEP scientists as a "volunteer cloud", essentially identical to a Grid or dedicated cluster from a job submission perspective, LHC simulations can be processed effectively. This article outlines both the technical steps required for such a solution and the implications for LHC computing as well as for LHC public outreach and for participation by scientists from developing regions in LHC research.
Federated and Cloud Enabled Resources for Data Management and Utilization
NASA Astrophysics Data System (ADS)
Rankin, R.; Gordon, M.; Potter, R. G.; Satchwill, B.
2011-12-01
The emergence of cloud computing over the past three years has led to a paradigm shift in how data can be managed, processed and made accessible. Building on the federated data management system offered through the Canadian Space Science Data Portal (www.cssdp.ca), we demonstrate how heterogeneous and geographically distributed data sets and modeling tools have been integrated to form a virtual data center and computational modeling platform that has services for data processing and visualization embedded within it. We also discuss positive and negative experiences in utilizing Eucalyptus and OpenStack cloud applications, and job scheduling facilitated by Condor and Star Cluster. We summarize our findings by demonstrating use of these technologies in the Cloud Enabled Space Weather Data Assimilation and Modeling Platform CESWP (www.ceswp.ca), which is funded through Canarie's (canarie.ca) Network Enabled Platforms program in Canada.
Biomedical cloud computing with Amazon Web Services.
Fusaro, Vincent A; Patil, Prasad; Gafni, Erik; Wall, Dennis P; Tonellato, Peter J
2011-08-01
In this overview to biomedical computing in the cloud, we discussed two primary ways to use the cloud (a single instance or cluster), provided a detailed example using NGS mapping, and highlighted the associated costs. While many users new to the cloud may assume that entry is as straightforward as uploading an application and selecting an instance type and storage options, we illustrated that there is substantial up-front effort required before an application can make full use of the cloud's vast resources. Our intention was to provide a set of best practices and to illustrate how those apply to a typical application pipeline for biomedical informatics, but also general enough for extrapolation to other types of computational problems. Our mapping example was intended to illustrate how to develop a scalable project and not to compare and contrast alignment algorithms for read mapping and genome assembly. Indeed, with a newer aligner such as Bowtie, it is possible to map the entire African genome using one m2.2xlarge instance in 48 hours for a total cost of approximately $48 in computation time. In our example, we were not concerned with data transfer rates, which are heavily influenced by the amount of available bandwidth, connection latency, and network availability. When transferring large amounts of data to the cloud, bandwidth limitations can be a major bottleneck, and in some cases it is more efficient to simply mail a storage device containing the data to AWS (http://aws.amazon.com/importexport/). More information about cloud computing, detailed cost analysis, and security can be found in references.
Cloud GPU-based simulations for SQUAREMR.
Kantasis, George; Xanthis, Christos G; Haris, Kostas; Heiberg, Einar; Aletras, Anthony H
2017-01-01
Quantitative Magnetic Resonance Imaging (MRI) is a research tool, used more and more in clinical practice, as it provides objective information with respect to the tissues being imaged. Pixel-wise T1 quantification (T1 mapping) of the myocardium is one such application with diagnostic significance. A number of mapping sequences have been developed for myocardial T1 mapping with a wide range in terms of measurement accuracy and precision. Furthermore, measurement results obtained with these pulse sequences are affected by errors introduced by the particular acquisition parameters used. SQUAREMR is a new method which has the potential of improving the accuracy of these mapping sequences through the use of massively parallel simulations on Graphical Processing Units (GPUs) by taking into account different acquisition parameter sets. This method has been shown to be effective in myocardial T1 mapping; however, execution times may exceed 30 min, which is prohibitively long for clinical applications. The purpose of this study was to accelerate the construction of SQUAREMR's multi-parametric database to more clinically acceptable levels. The aim of this study was to develop a cloud-based cluster in order to distribute the computational load to several GPU-enabled nodes and accelerate SQUAREMR. This would accommodate high demands for computational resources without the need for major upfront equipment investment. Moreover, the parameter space explored by the simulations was optimized in order to reduce the computational load without compromising the T1 estimates compared to a non-optimized parameter space approach. A cloud-based cluster with 16 nodes resulted in a speedup of up to 13.5 times compared to a single-node execution. Finally, the optimized parameter set approach allowed for an execution time of 28 s using the 16-node cluster, without compromising the T1 estimates by more than 10 ms. The developed cloud-based cluster and optimization of the parameter set reduced the execution time of the simulations involved in constructing the SQUAREMR multi-parametric database, thus bringing SQUAREMR's applicability within time frames that would likely be acceptable in the clinic. Copyright © 2016 Elsevier Inc. All rights reserved.
Templet Web: the use of volunteer computing approach in PaaS-style cloud
NASA Astrophysics Data System (ADS)
Vostokin, Sergei; Artamonov, Yuriy; Tsarev, Daniil
2018-03-01
This article presents the Templet Web cloud service, which is designed to automate high-performance scientific computing. High-performance technology is specifically required by new fields of computational science such as data mining, artificial intelligence, and machine learning. Cloud technologies provide a significant cost reduction for high-performance scientific applications. The main objectives for achieving this cost reduction in the Templet Web service design are: (a) the implementation of "on-demand" access; (b) source code deployment management; and (c) automation of high-performance computing program development. The distinctive feature of the service is an approach mainly used in the field of volunteer computing, whereby a person who has access to a computer system delegates their access rights to the requesting user. We developed an access procedure, algorithms, and software for utilizing the free computational resources of an academic cluster system in line with the methods of volunteer computing. The Templet Web service has been in operation for five years. It has been successfully used for conducting laboratory workshops and solving research problems, some of which are considered in this article. The article also provides an overview of research directions related to service development.
A case study of tuning MapReduce for efficient Bioinformatics in the cloud
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shi, Lizhen; Wang, Zhong; Yu, Weikuan
The combination of the Hadoop MapReduce programming model and cloud computing allows biological scientists to analyze next-generation sequencing (NGS) data in a timely and cost-effective manner. Cloud computing platforms remove the burden of IT facility procurement and management from end users and provide ease of access to Hadoop clusters. However, biological scientists are still expected to choose appropriate Hadoop parameters for running their jobs. More importantly, the available Hadoop tuning guidelines are either obsolete or too general to capture the particular characteristics of bioinformatics applications. In this paper, we aim to minimize the cloud computing cost spent on bioinformatics data analysis by optimizing the Hadoop parameters identified as significant. When using MapReduce-based bioinformatics tools in the cloud, the default settings often lead to resource underutilization and wasteful expenses. We choose k-mer counting, a representative application used in a large number of NGS data analysis tools, as our study case. Experimental results show that, with the fine-tuned parameters, we achieve a 4x speedup compared with the original performance (using the default settings). Finally, this paper presents an exemplary case for tuning MapReduce-based bioinformatics applications in the cloud, and documents the key parameters that could lead to significant performance benefits.
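To make the tuned workload concrete, the following is a minimal Hadoop Streaming-style k-mer counter; it only illustrates the shape of the MapReduce job being tuned and is not the tool benchmarked in the paper. The k-mer length and the local sort in the usage line are assumptions; on a real cluster the memory, sort-buffer and reducer-count settings discussed in the paper would be supplied per job.

    #!/usr/bin/env python3
    # Minimal Hadoop Streaming-style k-mer counting sketch (not the paper's tool).
    # Run mapper and reducer as separate -mapper / -reducer scripts, or locally:
    #   cat reads.txt | python3 kmer_mr.py map | sort | python3 kmer_mr.py reduce
    import sys
    from itertools import groupby

    K = 21  # k-mer length (assumption; typical values are 21-31)

    def mapper(stream):
        """Emit (k-mer, 1) pairs for every k-mer in every input read."""
        for line in stream:
            read = line.strip().upper()
            for i in range(len(read) - K + 1):
                print(f"{read[i:i + K]}\t1")

    def reducer(stream):
        """Sum counts per k-mer (input sorted by key, as Hadoop guarantees)."""
        pairs = (line.rstrip("\n").split("\t") for line in stream)
        for kmer, group in groupby(pairs, key=lambda kv: kv[0]):
            print(f"{kmer}\t{sum(int(count) for _, count in group)}")

    if __name__ == "__main__":
        (mapper if sys.argv[1] == "map" else reducer)(sys.stdin)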
Dashti, Ali; Komarov, Ivan; D'Souza, Roshan M
2013-01-01
This paper presents an implementation of brute-force exact k-Nearest Neighbor Graph (k-NNG) construction for ultra-large, high-dimensional data clouds. The proposed method uses Graphics Processing Units (GPUs) and is scalable with multiple levels of parallelism (between nodes of a cluster, between different GPUs on a single node, and within a GPU). The method is applicable to homogeneous computing clusters with a varying number of nodes and GPUs per node. We achieve a 6-fold speedup in data processing as compared with an optimized method running on a cluster of CPUs and bring a hitherto impossible k-NNG generation for a dataset of twenty million images with 15k dimensionality into the realm of practical possibility.
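As a point of reference, the sketch below builds a brute-force exact k-NNG on a CPU with NumPy, processing the distance matrix in row blocks so it never has to be held in full; a GPU or multi-GPU version would shard these same blocks across devices and nodes. It mirrors the structure of the method only conceptually and is not the authors' implementation; the block size and test data are arbitrary.

    # CPU sketch of brute-force k-NNG construction in row blocks (illustrative).
    import numpy as np

    def knn_graph(X, k, block=1024):
        """Return an (N, k) array of neighbor indices for each row of X."""
        n = X.shape[0]
        sq_norms = np.einsum("ij,ij->i", X, X)
        neighbors = np.empty((n, k), dtype=np.int64)
        for start in range(0, n, block):
            stop = min(start + block, n)
            # Squared Euclidean distances of this block against all points.
            d2 = sq_norms[start:stop, None] - 2.0 * X[start:stop] @ X.T + sq_norms[None, :]
            d2[np.arange(stop - start), np.arange(start, stop)] = np.inf  # drop self
            neighbors[start:stop] = np.argpartition(d2, k, axis=1)[:, :k]
        return neighbors

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        X = rng.standard_normal((5000, 64)).astype(np.float32)
        print(knn_graph(X, k=10).shape)  # (5000, 10)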
Are Cloud Environments Ready for Scientific Applications?
NASA Astrophysics Data System (ADS)
Mehrotra, P.; Shackleford, K.
2011-12-01
Cloud computing environments are becoming widely available in both the commercial and government sectors. They provide flexibility to rapidly provision resources in order to meet dynamic and changing computational needs without the customers incurring capital expenses and/or requiring technical expertise. Clouds also provide reliable access to resources even though the end-user may not have in-house expertise for acquiring or operating such resources. Consolidation and pooling in a cloud environment allow organizations to achieve economies of scale in provisioning or procuring computing resources and services. Because of these and other benefits, many businesses and organizations are migrating their business applications (e.g., websites, social media, and business processes) to cloud environments, as evidenced by the commercial success of offerings such as Amazon EC2. In this paper, we focus on the feasibility of utilizing cloud environments for scientific workloads and workflows of particular interest to NASA scientists and engineers. There is a wide spectrum of such technical computations. These applications range from small workstation-level computations, to mid-range computing requiring small clusters, to high-performance simulations requiring supercomputing systems with high-bandwidth/low-latency interconnects. Data-centric applications manage and manipulate large data sets such as satellite observational data and/or data previously produced by high-fidelity modeling and simulation computations. Most of the applications are run in batch mode with static resource requirements. However, there do exist situations that have dynamic demands, particularly ones with public-facing interfaces providing information to the general public, collaborators and partners, as well as to internal NASA users. In the last few months we have been studying the suitability of cloud environments for NASA's technical and scientific workloads. We have ported several applications to multiple cloud environments including NASA's Nebula environment, Amazon's EC2, Magellan at NERSC, and SGI's Cyclone system. We critically examined the performance of the applications on these systems. We also collected information on the usability of these cloud environments. In this talk we will present the results of our study focusing on the efficacy of using clouds for NASA's scientific applications.
NGScloud: RNA-seq analysis of non-model species using cloud computing.
Mora-Márquez, Fernando; Vázquez-Poletti, José Luis; López de Heredia, Unai
2018-05-03
RNA-seq analysis usually requires large computing infrastructures. NGScloud is a bioinformatic system developed to analyze RNA-seq data using Amazon's cloud computing services, which permit access to ad hoc computing infrastructure scaled to the complexity of the experiment, so that its costs and run times can be optimized. The application provides a user-friendly front-end to operate Amazon's hardware resources and to control a workflow of RNA-seq analysis oriented to non-model species, incorporating the cluster concept, which allows parallel runs of common RNA-seq analysis programs in several virtual machines for faster analysis. NGScloud is freely available at https://github.com/GGFHF/NGScloud/. A manual detailing installation and how-to-use instructions is available with the distribution. unai.lopezdeheredia@upm.es.
Federated data storage system prototype for LHC experiments and data intensive science
NASA Astrophysics Data System (ADS)
Kiryanov, A.; Klimentov, A.; Krasnopevtsev, D.; Ryabinkin, E.; Zarochentsev, A.
2017-10-01
The rapid increase of data volume from the experiments running at the Large Hadron Collider (LHC) has prompted the physics computing community to evaluate new data handling and processing solutions. Russian grid sites and university clusters scattered over a large area aim to unite their resources for future productive work, while also providing an opportunity to support large physics collaborations. In our project we address the fundamental problem of designing a computing architecture to integrate distributed storage resources for LHC experiments and other data-intensive science applications and to provide access to data from heterogeneous computing facilities. Studies include the development and implementation of a federated data storage prototype for Worldwide LHC Computing Grid (WLCG) centres of different levels and university clusters within one National Cloud. The prototype is based on computing resources located in Moscow, Dubna, Saint Petersburg, Gatchina and Geneva. This project intends to implement a federated distributed storage for all kinds of operations, such as read/write/transfer and access via WAN from Grid centres, university clusters, supercomputers, academic and commercial clouds. The efficiency and performance of the system are demonstrated using synthetic and experiment-specific tests, including real data processing and analysis workflows from the ATLAS and ALICE experiments, as well as compute-intensive bioinformatics applications (PALEOMIX) running on supercomputers. We present the topology and architecture of the designed system, report performance and statistics for different access patterns, and show how federated data storage can be used efficiently by physicists and biologists. We also describe how sharing data on a widely distributed storage system can lead to a new computing model and a reformation of computing style, for instance how a bioinformatics program running on supercomputers can read/write data from the federated storage.
Cloud BioLinux: pre-configured and on-demand bioinformatics computing for the genomics community.
Krampis, Konstantinos; Booth, Tim; Chapman, Brad; Tiwari, Bela; Bicak, Mesude; Field, Dawn; Nelson, Karen E
2012-03-19
A steep drop in the cost of next-generation sequencing during recent years has made the technology affordable to the majority of researchers, but downstream bioinformatic analysis still poses a resource bottleneck for smaller laboratories and institutes that do not have access to substantial computational resources. Sequencing instruments are typically bundled with only the minimal processing and storage capacity required for data capture during sequencing runs. Given the scale of sequence datasets, scientific value cannot be obtained from acquiring a sequencer unless it is accompanied by an equal investment in informatics infrastructure. Cloud BioLinux is a publicly accessible Virtual Machine (VM) that enables scientists to quickly provision on-demand infrastructures for high-performance bioinformatics computing using cloud platforms. Users have instant access to a range of pre-configured command line and graphical software applications, including a full-featured desktop interface, documentation and over 135 bioinformatics packages for applications including sequence alignment, clustering, assembly, display, editing, and phylogeny. Each tool's functionality is fully described in the documentation directly accessible from the graphical interface of the VM. Besides the Amazon EC2 cloud, we have started instances of Cloud BioLinux on a private Eucalyptus cloud installed at the J. Craig Venter Institute, and demonstrated access to the bioinformatic tools interface through a remote connection to EC2 instances from a local desktop computer. Documentation for using Cloud BioLinux on EC2 is available from our project website, while a Eucalyptus cloud image and VirtualBox Appliance is also publicly available for download and use by researchers with access to private clouds. Cloud BioLinux provides a platform for developing bioinformatics infrastructures on the cloud. An automated and configurable process builds Virtual Machines, allowing the development of highly customized versions from a shared code base. This shared community toolkit enables application specific analysis platforms on the cloud by minimizing the effort required to prepare and maintain them.
High Performance Molecular Visualization: In-Situ and Parallel Rendering with EGL.
Stone, John E; Messmer, Peter; Sisneros, Robert; Schulten, Klaus
2016-05-01
Large scale molecular dynamics simulations produce terabytes of data that is impractical to transfer to remote facilities. It is therefore necessary to perform visualization tasks in-situ as the data are generated, or by running interactive remote visualization sessions and batch analyses co-located with direct access to high performance storage systems. A significant challenge for deploying visualization software within clouds, clusters, and supercomputers involves the operating system software required to initialize and manage graphics acceleration hardware. Recently, it has become possible for applications to use the Embedded-system Graphics Library (EGL) to eliminate the requirement for windowing system software on compute nodes, thereby eliminating a significant obstacle to broader use of high performance visualization applications. We outline the potential benefits of this approach in the context of visualization applications used in the cloud, on commodity clusters, and supercomputers. We discuss the implementation of EGL support in VMD, a widely used molecular visualization application, and we outline benefits of the approach for molecular visualization tasks on petascale computers, clouds, and remote visualization servers. We then provide a brief evaluation of the use of EGL in VMD, with tests using developmental graphics drivers on conventional workstations and on Amazon EC2 G2 GPU-accelerated cloud instance types. We expect that the techniques described here will be of broad benefit to many other visualization applications.
NASA Astrophysics Data System (ADS)
Capone, V.; Esposito, R.; Pardi, S.; Taurino, F.; Tortone, G.
2012-12-01
Over the last few years we have seen an increasing number of services and applications needed to manage and maintain cloud computing facilities. This is particularly true for computing in high energy physics, which often requires complex configurations and distributed infrastructures. In this scenario a cost effective rationalization and consolidation strategy is the key to success in terms of scalability and reliability. In this work we describe an IaaS (Infrastructure as a Service) cloud computing system, with high availability and redundancy features, which is currently in production at INFN-Naples and ATLAS Tier-2 data centre. The main goal we intended to achieve was a simplified method to manage our computing resources and deliver reliable user services, reusing existing hardware without incurring heavy costs. A combined usage of virtualization and clustering technologies allowed us to consolidate our services on a small number of physical machines, reducing electric power costs. As a result of our efforts, we developed a complete solution for data and computing centres that can be easily replicated using commodity hardware. Our architecture consists of 2 main subsystems: a clustered storage solution, built on top of disk servers running the GlusterFS file system, and a virtual machines execution environment. GlusterFS is a network file system able to perform parallel writes on multiple disk servers, thereby providing live replication of data. High availability is also achieved via a network configuration using redundant switches and multiple paths between hypervisor hosts and disk servers. We also developed a set of management scripts to easily perform basic system administration tasks such as automatic deployment of new virtual machines, adaptive scheduling of virtual machines on hypervisor hosts, live migration and automated restart in case of hypervisor failures.
EMBEDDED CLUSTERS IN THE LARGE MAGELLANIC CLOUD USING THE VISTA MAGELLANIC CLOUDS SURVEY
DOE Office of Scientific and Technical Information (OSTI.GOV)
Romita, Krista; Lada, Elizabeth; Cioni, Maria-Rosa, E-mail: k.a.romita@ufl.edu, E-mail: elada@ufl.edu, E-mail: mcioni@aip.de
We present initial results of the first large-scale survey of embedded star clusters in molecular clouds in the Large Magellanic Cloud (LMC) using near-infrared imaging from the Visible and Infrared Survey Telescope for Astronomy Magellanic Clouds Survey. We explored a ∼1.65 deg² area of the LMC, which contains the well-known star-forming region 30 Doradus as well as ∼14% of the galaxy’s CO clouds, and identified 67 embedded cluster candidates, 45 of which are newly discovered as clusters. We have determined the sizes, luminosities, and masses for these embedded clusters, examined the star formation rates (SFRs) of their corresponding molecular clouds, and made a comparison between the LMC and the Milky Way. Our preliminary results indicate that embedded clusters in the LMC are generally larger, more luminous, and more massive than those in the local Milky Way. We also find that the surface densities of both embedded clusters and molecular clouds are ∼3 times higher than in our local environment, the embedded cluster mass surface density is ∼40 times higher, the SFR is ∼20 times higher, and the star formation efficiency is ∼10 times higher. Despite these differences, the SFRs of the LMC molecular clouds are consistent with the SFR scaling law presented in Lada et al. This consistency indicates that while the conditions of embedded cluster formation may vary between environments, the overall process within molecular clouds may be universal.
Multivariate Spatial Condition Mapping Using Subtractive Fuzzy Cluster Means
Sabit, Hakilo; Al-Anbuky, Adnan
2014-01-01
Wireless sensor networks are usually deployed for monitoring given physical phenomena taking place in a specific space and over a specific duration of time. The spatio-temporal distribution of these phenomena often correlates to certain physical events. To appropriately characterise these events-phenomena relationships over a given space for a given time frame, we require continuous monitoring of the conditions. WSNs are perfectly suited for these tasks, due to their inherent robustness. This paper presents a subtractive fuzzy cluster means algorithm and its application in data stream mining for wireless sensor systems over a cloud-computing-like architecture, which we call sensor cloud data stream mining. Benchmarking against the standard k-means and FCM mining algorithms, we demonstrate that the subtractive fuzzy cluster means model can perform high-quality distributed data stream mining tasks comparable to centralised data stream mining. PMID:25313495
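For orientation, the sketch below implements a compact Chiu-style subtractive clustering step, the density-based centre selection that subtractive fuzzy cluster means builds on before any fuzzy membership refinement. The radii, squash factor and acceptance threshold are illustrative assumptions and may differ from the algorithm evaluated in the paper.

    # Compact subtractive clustering sketch (Chiu-style); parameters are illustrative.
    import numpy as np

    def subtractive_centers(X, ra=0.5, squash=1.5, accept=0.5):
        alpha = 4.0 / ra**2
        beta = 4.0 / (squash * ra) ** 2
        d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
        potential = np.exp(-alpha * d2).sum(axis=1)      # density potential per point
        first_peak, centers = potential.max(), []
        while True:
            c = int(np.argmax(potential))
            if potential[c] < accept * first_peak:       # stop once peaks become weak
                break
            centers.append(X[c])
            potential -= potential[c] * np.exp(-beta * d2[c])  # suppress nearby density
        return np.array(centers)

    if __name__ == "__main__":
        rng = np.random.default_rng(1)
        X = np.vstack([rng.normal(0, 0.1, (100, 2)), rng.normal(1, 0.1, (100, 2))])
        print(subtractive_centers(X))   # roughly one centre per blob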
Demonstration of measurement-only blind quantum computing
NASA Astrophysics Data System (ADS)
Greganti, Chiara; Roehsner, Marie-Christine; Barz, Stefanie; Morimae, Tomoyuki; Walther, Philip
2016-01-01
Blind quantum computing allows for secure cloud networks of quasi-classical clients and a fully fledged quantum server. Recently, a new protocol has been proposed, which requires a client to perform only measurements. We demonstrate a proof-of-principle implementation of this measurement-only blind quantum computing, exploiting a photonic setup to generate four-qubit cluster states for computation and verification. Feasible technological requirements for the client and the device-independent blindness make this scheme very applicable for future secure quantum networks.
OCCAM: a flexible, multi-purpose and extendable HPC cluster
NASA Astrophysics Data System (ADS)
Aldinucci, M.; Bagnasco, S.; Lusso, S.; Pasteris, P.; Rabellino, S.; Vallero, S.
2017-10-01
The Open Computing Cluster for Advanced data Manipulation (OCCAM) is a multipurpose flexible HPC cluster designed and operated by a collaboration between the University of Torino and the Sezione di Torino of the Istituto Nazionale di Fisica Nucleare. It is aimed at providing a flexible, reconfigurable and extendable infrastructure to cater to a wide range of different scientific computing use cases, including ones from solid-state chemistry, high-energy physics, computer science, big data analytics, computational biology, genomics and many others. Furthermore, it will serve as a platform for R&D activities on computational technologies themselves, with topics ranging from GPU acceleration to Cloud Computing technologies. A heterogeneous and reconfigurable system like this poses a number of challenges related to the frequency at which heterogeneous hardware resources might change their availability and shareability status, which in turn affects the methods and means used to allocate, manage, optimize, bill, and monitor VMs, containers, virtual farms, jobs, interactive bare-metal sessions, etc. This work describes some of the use cases that prompted the design and construction of the HPC cluster, its architecture and resource provisioning model, along with a first characterization of its performance by some synthetic benchmark tools and a few realistic use-case tests.
MC-GenomeKey: a multicloud system for the detection and annotation of genomic variants.
Elshazly, Hatem; Souilmi, Yassine; Tonellato, Peter J; Wall, Dennis P; Abouelhoda, Mohamed
2017-01-20
Next-generation genome sequencing techniques have become affordable for massive sequencing efforts devoted to the clinical characterization of human diseases. However, the cost of providing cloud-based data analysis of the mounting datasets remains a concerning bottleneck for providing cost-effective clinical services. To address this computational problem, it is important to optimize the variant analysis workflow and the analysis tools used in order to reduce the overall computational processing time and, concomitantly, the processing cost. Furthermore, it is important to capitalize on recent developments in the cloud computing market, which has witnessed more providers competing in terms of products and prices. In this paper, we present a new package called MC-GenomeKey (Multi-Cloud GenomeKey) that efficiently executes the variant analysis workflow for detecting and annotating mutations using cloud resources from different commercial cloud providers. Our package supports the Amazon, Google, and Azure clouds, as well as any other cloud platform based on OpenStack. Our package allows different scenarios of execution with different levels of sophistication, up to one where a workflow can be executed using a cluster whose nodes come from different clouds. MC-GenomeKey also supports scenarios that exploit the spot instance model of Amazon in combination with other cloud platforms to provide significant cost reduction. To the best of our knowledge, this is the first solution that optimizes the execution of the workflow using computational resources from different cloud providers. MC-GenomeKey provides an efficient multicloud-based solution to detect and annotate mutations. The package can run on different commercial cloud platforms, which enables the user to seize the best offers. The package also provides a reliable means to make use of the low-cost spot instance model of Amazon, as it provides an efficient solution to the sudden termination of spot machines as a result of a sudden price increase. The package has a web interface and is available for free for academic use.
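One common pattern for coping with the sudden termination of Amazon spot instances, which the abstract identifies as the key difficulty of the spot model, is to poll the EC2 instance metadata service for the two-minute termination notice and checkpoint when it appears. The sketch below assumes IMDSv1-style metadata access and a placeholder checkpoint() hook; it is generic and is not MC-GenomeKey's own recovery mechanism.

    # Generic spot-termination watcher (assumes IMDSv1 metadata access on EC2).
    import time
    import urllib.request
    from urllib.error import HTTPError, URLError

    NOTICE_URL = "http://169.254.169.254/latest/meta-data/spot/termination-time"

    def termination_scheduled(timeout=2):
        """True if EC2 has scheduled this spot instance for termination."""
        try:
            with urllib.request.urlopen(NOTICE_URL, timeout=timeout) as resp:
                return resp.status == 200
        except HTTPError:        # 404 -> no termination notice yet
            return False
        except URLError:         # not on EC2, or metadata service unreachable
            return False

    def checkpoint():
        # Placeholder: flush partial results to durable storage (e.g. S3).
        print("checkpointing partial results...")

    if __name__ == "__main__":
        while True:
            if termination_scheduled():
                checkpoint()
                break
            time.sleep(5)        # the notice is issued roughly 2 minutes ahead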
Yim, Wen-Wai; Chien, Shu; Kusumoto, Yasuyuki; Date, Susumu; Haga, Jason
2010-01-01
Large-scale in-silico screening is a necessary part of drug discovery, and Grid computing is one answer to this demand. A disadvantage of using Grid computing is the heterogeneous computational environments characteristic of a Grid. In our study, we have found that for the molecular docking simulation program DOCK, different clusters within a Grid organization can yield inconsistent results. Because DOCK in-silico virtual screening (VS) is currently used to help select chemical compounds to test with in-vitro experiments, such differences have little effect on the validity of using virtual screening before subsequent steps in the drug discovery process. However, it is difficult to predict whether the accumulation of these discrepancies over sequentially repeated VS experiments will significantly alter the results if VS is used as the primary means for identifying potential drugs. Moreover, such discrepancies may be unacceptable for other applications requiring more stringent thresholds. This highlights the need for establishing a more complete solution to provide the best scientific accuracy when executing an application across Grids. One possible solution to platform heterogeneity in DOCK performance explored in our study involved the use of virtual machines as a layer of abstraction. This study investigated the feasibility and practicality of using virtual machine and recent cloud computing technologies in a biological research application. We examined the differences and variations of DOCK VS variables across a Grid environment composed of different clusters, with and without virtualization. The uniform computer environment provided by virtual machines eliminated inconsistent DOCK VS results caused by heterogeneous clusters; however, the execution time for the DOCK VS increased. In our particular experiments, overhead costs were found to be an average of 41% and 2% in execution time for two different clusters, while the actual magnitudes of the execution time costs were minimal. Despite the increase in overhead, virtual clusters are an ideal solution for Grid heterogeneity. With greater development of virtual cluster technology in Grid environments, the problem of platform heterogeneity may be eliminated through virtualization, allowing greater usage of VS and benefiting Grid applications in general.
Cuenca-Alba, Jesús; Del Cano, Laura; Gómez Blanco, Josué; de la Rosa Trevín, José Miguel; Conesa Mingo, Pablo; Marabini, Roberto; S Sorzano, Carlos Oscar; Carazo, Jose María
2017-10-01
New instrumentation for cryo electron microscopy (cryoEM) has significantly increased data collection rates as well as data quality, creating bottlenecks at the image processing level. The current image processing model of moving the acquired images from the data source (the electron microscope) to desktops or local clusters for processing is encountering many practical limitations. However, computing may also take place in distributed and decentralized environments. In this way, the cloud is a new form of accessing computing and storage resources on demand. Here, we evaluate how this new computational paradigm can be effectively used by extending our current integrative framework for image processing, creating ScipionCloud. This new development has resulted in a full installation of Scipion in both public and private clouds, accessible as public "images" with all the required cryoEM software preinstalled, requiring just a Web browser to access all graphical user interfaces. We have profiled the performance of different configurations on Amazon Web Services and the European Federated Cloud, always on architectures incorporating GPUs, and compared them with a local facility. We have also analyzed the cost-effectiveness of different scenarios, so that cryoEM scientists have a clearer picture of the setup that is best suited for their needs and budgets. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
X-ray and IR Surveys of the Orion Molecular Clouds and the Cepheus OB3b Cluster
NASA Astrophysics Data System (ADS)
Megeath, S. Thomas; Wolk, Scott J.; Pillitteri, Ignazio; Allen, Tom
2014-08-01
X-ray and IR surveys of molecular clouds between 400 and 700 pc provide complementary means to map the spatial distribution of young low mass stars associated with the clouds. We overview an XMM survey of the Orion Molecular Clouds, at a distance of 400 pc. By using the fraction of X-ray sources with disks as a proxy for age, this survey has revealed three older clusters rich in diskless X-ray sources. Two are smaller clusters found at the northern and southern edges of the Orion A molecular cloud. The third cluster surrounds the O-star Iota Ori (the point of Orion's sword) and is in the foreground to the Orion molecular cloud. In addition, we present a Chandra and Spitzer survey of the Cep OB3b cluster at 700 pc. These data show a spatially variable disk fraction indicative of age variations within the cluster. We discuss the implication of these results for understanding the spread of ages in young clusters and the star formation histories of molecular clouds.
Lightweight Data Systems in the Cloud: Costs, Benefits and Best Practices
NASA Astrophysics Data System (ADS)
Fatland, R.; Arendt, A. A.; Howe, B.; Hess, N. J.; Futrelle, J.
2015-12-01
We present here a simple analysis of both the cost and the benefit of using the cloud in environmental science circa 2016. We present this set of ideas to enable the potential 'cloud adopter' research scientist to explore and understand the tradeoffs in moving some aspect of their compute work to the cloud. We present examples, design patterns and best practices as an evolving body of knowledge that helps optimize the benefit to the research team. Thematically this generally means not starting from a blank page but rather learning how to find 90% of the solution to a problem pre-built. We will touch on four topics of interest. (1) Existing cloud data resources (NASA, WHOI BCO DMO, etc.) and how they can be discovered, used and improved. (2) How to explore, compare and evaluate cost and compute power from many cloud options, particularly in relation to data scale (size/complexity). (3) Simple, fast 'Lightweight Data System' procedures that take from 20 minutes to one day to implement and that have a clear immediate payoff in environmental data-driven research. Examples include publishing a SQL Share URL at (EarthCube's) CINERGI as a registered data resource and creating executable papers on a cloud-hosted Jupyter instance, particularly iPython notebooks. (4) Translating the computational terminology landscape ('cloud', 'HPC cluster', 'hadoop', 'spark', 'machine learning') into examples from the community of practice to help the geoscientist build or expand their mental map. In the course of this discussion, which is about resource discovery, adoption and mastery, we provide direction to online resources in support of these themes.
Monte Carlo simulation of photon migration in a cloud computing environment with MapReduce
Pratx, Guillem; Xing, Lei
2011-01-01
Monte Carlo simulation is considered the most reliable method for modeling photon migration in heterogeneous media. However, its widespread use is hindered by the high computational cost. The purpose of this work is to report on our implementation of a simple MapReduce method for performing fault-tolerant Monte Carlo computations in a massively parallel cloud computing environment. We ported the MC321 Monte Carlo package to Hadoop, an open-source MapReduce framework. In this implementation, Map tasks compute photon histories in parallel while a Reduce task scores photon absorption. The distributed implementation was evaluated on a commercial compute cloud. The simulation time was found to be linearly dependent on the number of photons and inversely proportional to the number of nodes. For a cluster size of 240 nodes, the simulation of 100 billion photon histories took 22 min, a 1258x speed-up compared to the single-threaded Monte Carlo program. The overall computational throughput was 85,178 photon histories per node per second, with a latency of 100 s. The distributed simulation produced the same output as the original implementation and was resilient to hardware failure: the correctness of the simulation was unaffected by the shutdown of 50% of the nodes. PMID:22191916
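The map/reduce split described above (Map tasks compute photon histories, a Reduce task scores absorption) can be mimicked with a toy Hadoop Streaming job. The sketch below uses a deliberately simplified hop-drop random walk and illustrative optical coefficients; it is not the MC321 code that was actually ported.

    #!/usr/bin/env python3
    # Toy Hadoop Streaming analogue: map simulates photon histories and emits
    # (depth_bin, absorbed_weight); reduce sums absorbed weight per bin.
    # Local test: printf '1\n2\n' | python3 photon_mr.py map | sort | python3 photon_mr.py reduce
    import math
    import random
    import sys
    from collections import defaultdict

    MU_A, MU_S, BIN_CM, PHOTONS = 0.1, 10.0, 0.05, 1000   # illustrative values
    MU_T = MU_A + MU_S

    def mapper(stream):
        for line in stream:                          # one input line = one RNG seed
            rng = random.Random(int(line))
            local = defaultdict(float)               # in-mapper combining
            for _ in range(PHOTONS):
                z, weight = 0.0, 1.0
                while weight > 1e-3:
                    z += -math.log(1.0 - rng.random()) / MU_T * rng.uniform(-1.0, 1.0)
                    absorbed = weight * MU_A / MU_T  # "drop" a fraction of the weight
                    local[int(abs(z) / BIN_CM)] += absorbed
                    weight -= absorbed
            for depth_bin, w in local.items():
                print(f"{depth_bin}\t{w}")

    def reducer(stream):
        totals = defaultdict(float)
        for line in stream:
            depth_bin, w = line.split("\t")
            totals[int(depth_bin)] += float(w)
        for depth_bin in sorted(totals):
            print(f"{depth_bin}\t{totals[depth_bin]:.6g}")

    if __name__ == "__main__":
        (mapper if sys.argv[1] == "map" else reducer)(sys.stdin)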
SCIMITAR: Scalable Stream-Processing for Sensor Information Brokering
2013-11-01
... (IaaS) cloud frameworks including Amazon Web Services and Eucalyptus. For load testing, we used The Grinder [9], a Java load testing framework that ... internal Eucalyptus cluster which we could not scale as large as the Amazon environment due to a lack of computation resources. We recreated our ...
The Integration of CloudStack and OCCI/OpenNebula with DIRAC
NASA Astrophysics Data System (ADS)
Méndez Muñoz, Víctor; Fernández Albor, Víctor; Graciani Diaz, Ricardo; Casajús Ramo, Adriàn; Fernández Pena, Tomás; Merino Arévalo, Gonzalo; José Saborido Silva, Juan
2012-12-01
The increasing availability of Cloud resources is emerging as a realistic alternative to the Grid as a paradigm for enabling scientific communities to access large distributed computing resources. The DIRAC framework for distributed computing is an easy way to efficiently access resources from both systems. This paper explains the integration of DIRAC with two open-source Cloud Managers: OpenNebula (taking advantage of the OCCI standard) and CloudStack. These are computing tools to manage the complexity and heterogeneity of distributed data center infrastructures, allowing the creation of virtual clusters on demand, including public, private and hybrid clouds. This approach required the development of an extension to the previous DIRAC Virtual Machine engine, which was developed for Amazon EC2, to allow connection with these new cloud managers. In the OpenNebula case, the development has been based on the CernVM Virtual Software Appliance with appropriate contextualization, while in the case of CloudStack, the infrastructure has been kept more general, which permits other Virtual Machine sources and operating systems to be used. In both cases, the CernVM File System has been used to facilitate software distribution to the computing nodes. With the resulting infrastructure, the cloud resources are transparent to the users through a friendly interface, such as the DIRAC Web Portal. The main purpose of this integration is to obtain a system that can manage cloud and grid resources at the same time. This particular feature pushes DIRAC towards a new conceptual denomination as interware, integrating different middleware. Users from different communities do not need to care about the installation of the standard software that is available at the nodes, nor about the operating system of the host machine, both of which are transparent to the user. This paper presents an analysis of the overhead of the virtual layer, with tests comparing the proposed approach with the existing Grid solution. License Notice: Published under licence in Journal of Physics: Conference Series by IOP Publishing Ltd.
Combinations of SNP genotypes from the Wellcome Trust Case Control Study of bipolar patients.
Mellerup, Erling; Jørgensen, Martin Balslev; Dam, Henrik; Møller, Gert Lykke
2018-04-01
Combinations of genetic variants are the basis for polygenic disorders. We examined combinations of SNP genotypes taken from the 446 729 SNPs in The Wellcome Trust Case Control Study of bipolar patients. Parallel computing by graphics processing units, cloud computing, and data mining tools were used to scan The Wellcome Trust data set for combinations. Two clusters of combinations were significantly associated with bipolar disorder. One cluster contained 68 combinations, each of which included five SNP genotypes. Of the 1998 patients, 305 had combinations from this cluster in their genome, but none of the 1500 controls had any of these combinations in their genome. The other cluster contained six combinations, each of which included five SNP genotypes. Of the 1998 patients, 515 had combinations from the cluster in their genome, but none of the 1500 controls had any of these combinations in their genome. Clusters of combinations of genetic variants can be considered general risk factors for polygenic disorders, whereas accumulation of combinations from the clusters in the genome of a patient can be considered a personal risk factor.
Calibration of radio-astronomical data on the cloud. LOFAR, the pathway to SKA
NASA Astrophysics Data System (ADS)
Sabater, J.; Sánchez-Expósito, S.; Garrido, J.; Ruiz, J. E.; Best, P. N.; Verdes-Montenegro, L.
2015-05-01
The radio interferometer LOFAR (LOw Frequency ARray) is now fully operational. This Square Kilometre Array (SKA) pathfinder allows the observation of the sky at frequencies between 10 and 240 MHz, a relatively unexplored region of the spectrum. LOFAR is a software-defined telescope: the data are mainly processed using specialized software running in common computing facilities. That means that the capabilities of the telescope are virtually defined by software and mainly limited by the available computing power. However, the quantity of data produced can quickly reach huge volumes (several petabytes per day). After the correlation and pre-processing of the data in a dedicated cluster, the final dataset is handed over to the user (typically several terabytes). The calibration of these data requires a powerful computing facility in which the specific state-of-the-art software, under heavy continuous development, can be easily installed and updated. That makes this case a perfect candidate for a cloud infrastructure, which adds the advantages of an on-demand, flexible solution. We present our approach to the calibration of LOFAR data using Ibercloud, the cloud infrastructure provided by Ibergrid. With the calibration workflow adapted to the cloud, we can explore calibration strategies for the SKA and show how private or commercial cloud infrastructures (Ibercloud, Amazon EC2, Google Compute Engine, etc.) can help to solve the problems with big datasets that will be prevalent in the future of astronomy.
The performance of low-cost commercial cloud computing as an alternative in computational chemistry.
Thackston, Russell; Fortenberry, Ryan C
2015-05-05
The growth of commercial cloud computing (CCC) as a viable means of computational infrastructure is largely unexplored for the purposes of quantum chemistry. In this work, the PSI4 suite of computational chemistry programs is installed on five different types of Amazon Web Services CCC platforms. The performance for a set of electronically excited state single-point energies is compared between these CCC platforms and typical, "in-house" physical machines. Further considerations are made for the number of cores or virtual CPUs (vCPUs, for the CCC platforms), but no considerations are made for full parallelization of the program (even though parallelization of the BLAS library is implemented), complete high-performance computing cluster utilization, or steal time. Even with this most pessimistic view of the computations, CCC resources are shown to be more cost effective for significant numbers of typical quantum chemistry computations. Large numbers of large computations are still best handled by more traditional means, but smaller-scale research may be more effectively undertaken through CCC services. © 2015 Wiley Periodicals, Inc.
NASA Technical Reports Server (NTRS)
Weger, R. C.; Lee, J.; Zhu, Tianri; Welch, R. M.
1992-01-01
The current controversy regarding regularity vs. clustering in cloud fields is examined by means of analysis and simulation studies based upon nearest-neighbor cumulative distribution statistics. It is shown that the Poisson representation of random point processes is superior to pseudorandom-number-generated models and that pseudorandom-number-generated models bias the observed nearest-neighbor statistics towards regularity. The interpretation of these nearest-neighbor statistics is discussed for many cases of superpositions of clustering, randomness, and regularity. A detailed analysis is carried out of cumulus cloud field spatial distributions based upon Landsat, AVHRR, and Skylab data, showing that, when both large and small clouds are included in the cloud field distributions, the cloud field always has a strong clustering signal.
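To make the statistic concrete, the sketch below computes the empirical nearest-neighbor cumulative distribution for a homogeneous Poisson field and for a strongly clustered field, alongside the analytic complete-spatial-randomness curve G(r) = 1 - exp(-lambda*pi*r^2). Field sizes, cluster parameters and radii are arbitrary illustrative choices, not values from the Landsat/AVHRR/Skylab analysis.

    # Nearest-neighbor cumulative distribution: Poisson vs. clustered point fields.
    import numpy as np
    from scipy.spatial import cKDTree

    def nn_distances(points):
        d, _ = cKDTree(points).query(points, k=2)    # k=2: nearest neighbor other than self
        return np.sort(d[:, 1])

    def empirical_cdf(sorted_values, r):
        return np.searchsorted(sorted_values, r, side="right") / sorted_values.size

    rng = np.random.default_rng(42)
    n, lam = 2000, 2000.0                            # expected points in the unit square

    poisson = rng.random((rng.poisson(lam), 2))
    parents = rng.random((40, 2))                    # clustered (Thomas-like) field
    clustered = (parents[rng.integers(0, 40, n)] + rng.normal(0, 0.01, (n, 2))) % 1.0

    nnd_poisson, nnd_clustered = nn_distances(poisson), nn_distances(clustered)
    print("r       CSR      Poisson  Clustered")
    for r in (0.005, 0.010, 0.020, 0.040):
        g_csr = 1.0 - np.exp(-lam * np.pi * r**2)
        print(f"{r:.3f}  {g_csr:.3f}    {empirical_cdf(nnd_poisson, r):.3f}"
              f"    {empirical_cdf(nnd_clustered, r):.3f}")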
The destruction of an Oort Cloud in a rich stellar cluster
NASA Astrophysics Data System (ADS)
Nordlander, T.; Rickman, H.; Gustafsson, B.
2017-07-01
Context. It is possible that the formation of the Oort Cloud dates back to the earliest epochs of solar system history. At that time, the Sun was almost certainly a member of the stellar cluster where it was born. Since the solar birth cluster is likely to have been massive (10³-10⁴ ℳ⊙), and therefore long-lived, an issue concerns the survival of such a primordial Oort Cloud. Aims: We have investigated this issue by simulating the orbital evolution of Oort Cloud comets for several hundred Myr, assuming the Sun to start its life as a typical member of such a massive cluster. Methods: We have devised a synthetic representation of the relevant dynamics, where the cluster potential is represented by a King model, and about 20 close encounters with individual cluster stars are selected and integrated based on the solar orbit and the cluster structure. Thousands of individual simulations are made, each including 3000 comets with orbits with three different initial semi-major axes. Results: Practically the entire initial Oort Cloud is found to be lost for our choice of semi-major axes (5000-20 000 au), independent of the cluster mass, although the chance of survival is better for the smaller cluster, since in a certain fraction of the simulations the Sun orbits at relatively safe distances from the dense cluster centre. Conclusions: For the range of birth cluster sizes that we investigate, a primordial Oort Cloud will likely survive only as a small inner core with semi-major axes ≲3000 au. Such a population of comets would be inert to orbital diffusion into an outer halo and subsequent injection into observable orbits. Some mechanism is therefore needed to accomplish this transfer, in case the Oort Cloud is primordial and the birth cluster did not have a low mass. From this point of view, our results lend some support to a delayed formation of the Oort Cloud, that occurred after the Sun had left its birth cluster.
Small-Scale Drop-Size Variability: Empirical Models for Drop-Size-Dependent Clustering in Clouds
NASA Technical Reports Server (NTRS)
Marshak, Alexander; Knyazikhin, Yuri; Larsen, Michael L.; Wiscombe, Warren J.
2005-01-01
By analyzing aircraft measurements of individual drop sizes in clouds, it has been shown in a companion paper that the probability of finding a drop of radius r at a linear scale l decreases as l^D(r), where 0 ≤ D(r) ≤ 1. This paper shows striking examples of the spatial distribution of large cloud drops using models that simulate the observed power laws. In contrast to currently used models that assume homogeneity and a Poisson distribution of cloud drops, these models illustrate strong drop clustering, especially with larger drops. The degree of clustering is determined by the observed exponents D(r). The strong clustering of large drops arises naturally from the observed power-law statistics. This clustering has vital consequences for rain physics, including how fast rain can form. For radiative transfer theory, clustering of large drops enhances their impact on the cloud optical path. The clustering phenomenon also helps explain why remotely sensed cloud drop size is generally larger than that measured in situ.
Clustering molecular dynamics trajectories for optimizing docking experiments.
De Paris, Renata; Quevedo, Christian V; Ruiz, Duncan D; Norberto de Souza, Osmar; Barros, Rodrigo C
2015-01-01
Molecular dynamics simulations of protein receptors have become an attractive tool for rational drug discovery. However, the high computational cost of employing molecular dynamics trajectories in virtual screening of large repositories threatens the feasibility of this task. Computational intelligence techniques have been applied in this context, with the ultimate goal of reducing the overall computational cost so the task can become feasible. Particularly, clustering algorithms have been widely used as a means to reduce the dimensionality of molecular dynamics trajectories. In this paper, we develop a novel methodology for clustering entire trajectories using structural features from the substrate-binding cavity of the receptor in order to optimize docking experiments in a cloud-based environment. The resulting partition was selected based on three clustering validity criteria, and it was further validated by analyzing the interactions between 20 ligands and a fully flexible receptor (FFR) model containing a 20 ns molecular dynamics simulation trajectory. Our proposed methodology shows that taking into account features of the substrate-binding cavity as input for the k-means algorithm is a promising technique for accurately selecting ensembles of representative structures tailored to a specific ligand.
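A hedged sketch of the general approach described above: cluster trajectory snapshots by binding-cavity features with k-means, select k by a validity criterion, and keep the frame closest to each centroid as a representative structure for docking. The random feature matrix, the silhouette criterion and the range of k are placeholders; the paper's own descriptors, validity criteria and cluster counts may differ.

    # k-means over per-frame cavity descriptors, with representative frame selection.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.metrics import silhouette_score
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(0)
    features = rng.normal(size=(2000, 6))     # stand-in for per-frame cavity descriptors

    X = StandardScaler().fit_transform(features)
    best = max(
        (KMeans(n_clusters=k, n_init=10, random_state=0).fit(X) for k in range(2, 9)),
        key=lambda km: silhouette_score(X, km.labels_),
    )

    # Representative frame per cluster: the snapshot closest to each centroid.
    dists = np.linalg.norm(X[:, None, :] - best.cluster_centers_[None, :, :], axis=2)
    representatives = dists.argmin(axis=0)
    print(f"k = {best.n_clusters}, representative frames: {representatives}")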
Genomics Virtual Laboratory: A Practical Bioinformatics Workbench for the Cloud.
Afgan, Enis; Sloggett, Clare; Goonasekera, Nuwan; Makunin, Igor; Benson, Derek; Crowe, Mark; Gladman, Simon; Kowsar, Yousef; Pheasant, Michael; Horst, Ron; Lonie, Andrew
2015-01-01
Analyzing high throughput genomics data is a complex and compute intensive task, generally requiring numerous software tools and large reference data sets, tied together in successive stages of data transformation and visualisation. A computational platform enabling best practice genomics analysis ideally meets a number of requirements, including: a wide range of analysis and visualisation tools, closely linked to large user and reference data sets; workflow platform(s) enabling accessible, reproducible, portable analyses, through a flexible set of interfaces; highly available, scalable computational resources; and flexibility and versatility in the use of these resources to meet demands and expertise of a variety of users. Access to an appropriate computational platform can be a significant barrier to researchers, as establishing such a platform requires a large upfront investment in hardware, experience, and expertise. We designed and implemented the Genomics Virtual Laboratory (GVL) as a middleware layer of machine images, cloud management tools, and online services that enable researchers to build arbitrarily sized compute clusters on demand, pre-populated with fully configured bioinformatics tools, reference datasets and workflow and visualisation options. The platform is flexible in that users can conduct analyses through web-based (Galaxy, RStudio, IPython Notebook) or command-line interfaces, and add/remove compute nodes and data resources as required. Best-practice tutorials and protocols provide a path from introductory training to practice. The GVL is available on the OpenStack-based Australian Research Cloud (http://nectar.org.au) and the Amazon Web Services cloud. The principles, implementation and build process are designed to be cloud-agnostic. This paper provides a blueprint for the design and implementation of a cloud-based Genomics Virtual Laboratory. We discuss scope, design considerations and technical and logistical constraints, and explore the value added to the research community through the suite of services and resources provided by our implementation.
Reconstructing evolutionary trees in parallel for massive sequences.
Zou, Quan; Wan, Shixiang; Zeng, Xiangxiang; Ma, Zhanshan Sam
2017-12-14
Building evolutionary trees for massive sets of unaligned DNA sequences is challenging and crucial. However, reconstructing evolutionary trees for ultra-large sequence sets is hard, and massive multiple sequence alignment is also challenging and time/space consuming. Hadoop and Spark, developed recently, bring new opportunities for classical computational biology problems. In this paper, we tried to solve multiple sequence alignment and evolutionary reconstruction in parallel. HPTree, which is developed in this paper, can deal with big DNA sequence files quickly. It works well on files larger than 1 GB and achieves better performance than other evolutionary reconstruction tools. Users can use HPTree to reconstruct evolutionary trees on computer clusters or cloud platforms (e.g., Amazon Cloud). HPTree can help with population evolution research and metagenomics analysis. In this paper, we employ the Hadoop and Spark platforms and design an evolutionary tree reconstruction software tool for unaligned massive DNA sequences. Clustering and multiple sequence alignment are done in parallel. A neighbour-joining model was employed for building the evolutionary trees. We have released our software together with its source code at http://lab.malab.cn/soft/HPtree/.
Scalable and cost-effective NGS genotyping in the cloud.
Souilmi, Yassine; Lancaster, Alex K; Jung, Jae-Yoon; Rizzo, Ettore; Hawkins, Jared B; Powles, Ryan; Amzazi, Saaïd; Ghazal, Hassan; Tonellato, Peter J; Wall, Dennis P
2015-10-15
While next-generation sequencing (NGS) costs have plummeted in recent years, the cost and complexity of computation remain substantial barriers to the use of NGS in routine clinical care. The clinical potential of NGS will not be realized until robust and routine whole genome sequencing data can be accurately rendered to medically actionable reports within a time window of hours and at scales of economy in the tens of dollars. We take a step towards addressing this challenge by using COSMOS, a cloud-enabled workflow management system, to develop GenomeKey, an NGS whole genome analysis workflow. COSMOS implements complex workflows making optimal use of high-performance compute clusters. Here we show that the Amazon Web Services (AWS) implementation of GenomeKey via COSMOS provides a fast, scalable, and cost-effective analysis of both public benchmarking and large-scale heterogeneous clinical NGS datasets. Our systematic benchmarking reveals important new insights and considerations for producing clinical turn-around of whole genome analysis, including optimization of workflow management, strategic batching of individual genomes, and efficient cluster resource configuration.
Detection of long duration cloud contamination in hyper-temporal NDVI imagery
NASA Astrophysics Data System (ADS)
Ali, A.; de Bie, C. A. J. M.; Skidmore, A. K.; Scarrott, R. G.
2012-04-01
NDVI time series imagery is commonly used as a reliable source for land use and land cover mapping and monitoring. However, long-duration cloud can significantly reduce its precision in areas where persistent cloud prevails. Quantifying errors related to cloud contamination is therefore essential for accurate land cover mapping and monitoring. This study aims to detect long-duration cloud contamination in hyper-temporal NDVI imagery used for land cover mapping and monitoring. MODIS-Terra NDVI imagery (250 m; 16-day; Feb'03-Dec'09) was used after the necessary pre-processing using quality flags and an upper envelope filter (ASAVOGOL). Subsequently, the stacked MODIS-Terra NDVI image (161 layers) was classified into 10 to 100 clusters using ISODATA. After classification, the 97-cluster image was selected as the best classification with the help of divergence statistics. To detect long-duration cloud contamination, the mean NDVI class profiles of the 97-cluster image were analyzed for temporal artifacts. Results showed that long-duration cloud affects the normal temporal progression of NDVI and causes anomalies. Out of the 97 clusters, 32 were found to be cloud contaminated. Cloud contamination was found to be more prominent in areas where high rainfall occurs. This study can help to stop the propagation of errors caused by long-duration cloud contamination into regional land cover mapping and monitoring.
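For readers who want to experiment with the screening idea described above (inspecting class-mean NDVI profiles for temporal artifacts), the sketch below clusters synthetic NDVI time series and flags classes whose mean profile shows a sustained negative departure from a smoothed version of itself. The use of k-means instead of ISODATA, the smoothing window and the drop threshold are illustrative assumptions, not the paper's method.

    # Flag cluster-mean NDVI profiles with cloud-like dips (illustrative thresholds).
    import numpy as np
    from scipy.ndimage import median_filter
    from sklearn.cluster import KMeans

    def flag_contaminated_classes(profiles, window=9, max_drop=0.15):
        """profiles: (n_classes, n_epochs) array of class-mean NDVI values."""
        smoothed = median_filter(profiles, size=(1, window), mode="nearest")
        drop = smoothed - profiles               # positive where NDVI dips below trend
        return np.where(drop.max(axis=1) > max_drop)[0]

    if __name__ == "__main__":
        rng = np.random.default_rng(3)
        t = np.arange(161)                                     # 16-day epochs, 2003-2009
        pixels = 0.4 + 0.25 * np.sin(2 * np.pi * t / 23) + rng.normal(0, 0.02, (500, 161))
        pixels[:150, 60:63] -= 0.3                             # synthetic cloud episode
        classes = KMeans(n_clusters=8, n_init=10, random_state=0).fit_predict(pixels)
        means = np.vstack([pixels[classes == c].mean(axis=0) for c in range(8)])
        print("suspect classes:", flag_contaminated_classes(means))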
Beating the tyranny of scale with a private cloud configured for Big Data
NASA Astrophysics Data System (ADS)
Lawrence, Bryan; Bennett, Victoria; Churchill, Jonathan; Juckes, Martin; Kershaw, Philip; Pepler, Sam; Pritchard, Matt; Stephens, Ag
2015-04-01
The Joint Analysis System, JASMIN, consists of five significant hardware components: a batch computing cluster, a hypervisor cluster, bulk disk storage, high performance disk storage, and access to a tape robot. Each of the computing clusters consists of a heterogeneous set of servers, supporting a range of possible data analysis tasks, and a unique network environment makes it relatively trivial to migrate servers between the two clusters. The high performance disk storage will include the world's largest (publicly visible) deployment of the Panasas parallel disk system. Initially deployed in April 2012, JASMIN has already undergone two major upgrades, culminating in a system which, by April 2015, will have in excess of 16 PB of disk and 4000 cores. Layered on the basic hardware are a range of services, from managed services, such as the curated archives of the Centre for Environmental Data Archival or the data analysis environment for the National Centres for Atmospheric Science and Earth Observation, to a generic Infrastructure as a Service (IaaS) offering for the UK environmental science community. Here we present examples of some of the big data workloads being supported in this environment, ranging from data management tasks, such as checksumming 3 PB of data held in over one hundred million files, to science tasks, such as re-processing satellite observations with new algorithms, or calculating new diagnostics on petascale climate simulation outputs. We will demonstrate how the provision of a cloud environment closely coupled to a batch computing environment, all sharing the same high performance disk system, allows massively parallel processing without the necessity to shuffle data excessively, even as it supports many different virtual communities, each with guaranteed performance. We will discuss the advantages of having a heterogeneous range of servers with available memory from tens of GB at the low end to (currently) two TB at the high end. There are some limitations of the JASMIN environment: the high performance disk environment is not fully available in the IaaS environment, and a planned ability to burst compute-heavy jobs into the public cloud is not yet fully available. There are also load balancing and performance issues that need to be understood. We will conclude with projections for future usage, and our plans to meet those requirements.
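The checksumming task mentioned above is an example of an embarrassingly parallel data-management workload; a minimal sketch of such a job, with a hypothetical archive path, might look as follows (this is not JASMIN's actual tooling).

# Minimal sketch of parallel checksumming of an archive, the kind of
# data-management task described above. The archive path is hypothetical.
import glob
import hashlib
from multiprocessing import Pool

def sha256_of(path):
    """Stream a file through SHA-256 and return (path, hexdigest)."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return path, h.hexdigest()

if __name__ == "__main__":
    # Hypothetical archive root; in practice the file list would be millions long.
    files = glob.glob("/archive/**/*.nc", recursive=True)
    with Pool(processes=16) as pool:
        for path, digest in pool.imap_unordered(sha256_of, files):
            print(digest, path)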
Climate simulations and services on HPC, Cloud and Grid infrastructures
NASA Astrophysics Data System (ADS)
Cofino, Antonio S.; Blanco, Carlos; Minondo Tshuma, Antonio
2017-04-01
Cloud, Grid and High Performance Computing have changed the accessibility and availability of computing resources for Earth Science research communities, especially for the climate community. These paradigms are modifying the way climate applications are executed. By using these technologies the number, variety and complexity of experiments and resources are increasing substantially. But although computational capacity is increasing, the traditional applications and tools used by the community are not sufficient to manage this large volume and variety of experiments and computing resources. In this contribution, we evaluate the challenges of running climate simulations and services on Grid, Cloud and HPC infrastructures and how to tackle them. The Grid and Cloud infrastructures provided by EGI's VOs (esr, earth.vo.ibergrid and fedcloud.egi.eu) will be evaluated, as well as HPC resources from the PRACE infrastructure and institutional clusters. To address those challenges, solutions using the DRM4G framework will be shown; DRM4G provides a good framework to manage a large volume and variety of computing resources for climate experiments. This work has been supported by the Spanish National R&D Plan under projects WRF4G (CGL2011-28864), INSIGNIA (CGL2016-79210-R) and MULTI-SDM (CGL2015-66583-R); the IS-ENES2 project from the 7FP of the European Commission (grant agreement no. 312979); the European Regional Development Fund (ERDF); and the Programa de Personal Investigador en Formación Predoctoral from Universidad de Cantabria and Government of Cantabria.
Large-Scale, Parallel, Multi-Sensor Atmospheric Data Fusion Using Cloud Computing
NASA Astrophysics Data System (ADS)
Wilson, B. D.; Manipon, G.; Hua, H.; Fetzer, E. J.
2013-12-01
NASA's Earth Observing System (EOS) is an ambitious facility for studying global climate change. The mandate now is to combine measurements from the instruments on the 'A-Train' platforms (AIRS, AMSR-E, MODIS, MISR, MLS, and CloudSat) and other Earth probes to enable large-scale studies of climate change over decades. Moving to multi-sensor, long-duration analyses of important climate variables presents serious challenges for large-scale data mining and fusion. For example, one might want to compare temperature and water vapor retrievals from one instrument (AIRS) to another (MODIS), and to a model (MERRA), stratify the comparisons using a classification of the 'cloud scenes' from CloudSat, and repeat the entire analysis over 10 years of data. To efficiently assemble such datasets, we are utilizing Elastic Computing in the Cloud and parallel map/reduce-based algorithms. However, these are data-intensive computing problems, so data transfer times and storage costs (for caching) are key issues. SciReduce is a Hadoop-like parallel analysis system, programmed in parallel python, that is designed from the ground up for Earth science. SciReduce executes inside VMWare images and scales to any number of nodes in the Cloud. Unlike Hadoop, SciReduce operates on bundles of named numeric arrays, which can be passed in memory or serialized to disk in netCDF4 or HDF5. Figure 1 shows the architecture of the full computational system, with SciReduce at the core. Multi-year datasets are automatically 'sharded' by time and space across a cluster of nodes so that years of data (millions of files) can be processed in a massively parallel way. Input variables (arrays) are pulled on-demand into the Cloud using OPeNDAP URLs or other subsetting services, thereby minimizing the size of the cached input and intermediate datasets. We are using SciReduce to automate the production of multiple versions of a ten-year A-Train water vapor climatology under a NASA MEASURES grant. We will present the architecture of SciReduce, describe the achieved 'clock time' speedups in fusing datasets on our own compute nodes and in the public Cloud, and discuss the Cloud cost tradeoffs for storage, compute, and data transfer. We will also present a concept/prototype for staging NASA's A-Train Atmospheric datasets (Levels 2 & 3) in the Amazon Cloud so that any number of compute jobs can be executed 'near' the multi-sensor data. Given such a system, multi-sensor climate studies over 10-20 years of data could be performed in an efficient way, with the researcher paying only his own Cloud compute bill.
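The on-demand input pattern described above can be illustrated with a short sketch that opens one variable through an OPeNDAP URL with the netCDF4 library and reduces it locally; the URL and variable name are placeholders, and this is not SciReduce code.

# Sketch of the on-demand input pattern described above: pull one variable through
# an OPeNDAP URL and reduce it locally. URL and variable name are placeholders.
import numpy as np
from netCDF4 import Dataset

OPENDAP_URL = "http://example.org/opendap/airs/granule_2010_001.nc"  # placeholder

def mean_water_vapor(url, varname="H2O_MMR"):
    """Open a remote granule via OPeNDAP and return the mean of one variable."""
    with Dataset(url) as ds:
        data = ds.variables[varname][:]   # slicing here would be subset server-side by OPeNDAP
        return float(np.ma.masked_invalid(data).mean())

# A map step would call mean_water_vapor over many granule URLs in parallel;
# a reduce step would combine the per-granule results into a climatology.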
A Lagrangian analysis of cold cloud clusters and their life cycles with satellite observations
Esmaili, Rebekah Bradley; Tian, Yudong; Vila, Daniel Alejandro; Kim, Kyu-Myong
2018-01-01
Cloud movement and evolution signify the complex water and energy transport in the atmosphere-ocean-land system. Detecting, clustering, and tracking clouds as semi-coherent cluster objects enables study of their evolution which can complement climate model simulations and enhance satellite retrieval algorithms, where there are large gaps between overpasses. Using an area-overlap cluster tracking algorithm, in this study we examine the trajectories, horizontal extent, and brightness temperature variations of millions of individual cloud clusters over their lifespan, from infrared satellite observations at 30-minute, 4-km resolution, for a period of 11 years. We found that the majority of cold clouds were both small and short-lived and that their frequency and location are influenced by El Niño. More importantly, this large sample of individually tracked clouds shows their horizontal size and temperature evolution. Longer lived clusters tended to achieve their temperature and size maturity milestones at different times, while these stages often occurred simultaneously in shorter lived clusters. On average, clusters with this lag also exhibited a greater rainfall contribution than those where minimum temperature and maximum size stages occurred simultaneously. Furthermore, by examining the diurnal cycle of cluster development over Africa and the Indian subcontinent, we observed differences in the local timing of the maximum occurrence at different life cycle stages. Over land there was a strong diurnal peak in the afternoon while over the ocean there was a semi-diurnal peak composed of longer-lived clusters in the early morning hours and shorter-lived clusters in the afternoon. Building on regional specific work, this study provides a long-term, high-resolution, and global survey of object-based cloud characteristics. PMID:29744257
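The area-overlap idea can be illustrated with a compact sketch that labels cold-cloud pixels in two consecutive brightness-temperature frames and links clusters that share pixels; the temperature threshold and toy arrays are assumptions, not the settings used in the study.

# Compact sketch of area-overlap tracking: label cold-cloud pixels in two consecutive
# brightness-temperature frames and link labels that overlap. Threshold and toy
# arrays are assumptions, not the study's settings.
import numpy as np
from scipy import ndimage

def track_overlap(tb_prev, tb_next, threshold=235.0):
    """Return {label_in_prev: label_in_next} for clusters sharing pixels."""
    lab_prev, _ = ndimage.label(tb_prev < threshold)
    lab_next, _ = ndimage.label(tb_next < threshold)
    links = {}
    for lp in np.unique(lab_prev[lab_prev > 0]):
        overlap = lab_next[(lab_prev == lp) & (lab_next > 0)]
        if overlap.size:
            # Link to the successor cluster with the largest shared area.
            links[int(lp)] = int(np.bincount(overlap).argmax())
    return links

tb1 = np.full((10, 10), 290.0); tb1[2:5, 2:5] = 220.0
tb2 = np.full((10, 10), 290.0); tb2[3:6, 3:6] = 218.0
print(track_overlap(tb1, tb2))   # -> {1: 1}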
A Lagrangian analysis of cold cloud clusters and their life cycles with satellite observations.
Esmaili, Rebekah Bradley; Tian, Yudong; Vila, Daniel Alejandro; Kim, Kyu-Myong
2016-10-16
Cloud movement and evolution signify the complex water and energy transport in the atmosphere-ocean-land system. Detecting, clustering, and tracking clouds as semi-coherent cluster objects enables study of their evolution which can complement climate model simulations and enhance satellite retrieval algorithms, where there are large gaps between overpasses. Using an area-overlap cluster tracking algorithm, in this study we examine the trajectories, horizontal extent, and brightness temperature variations of millions of individual cloud clusters over their lifespan, from infrared satellite observations at 30-minute, 4-km resolution, for a period of 11 years. We found that the majority of cold clouds were both small and short-lived and that their frequency and location are influenced by El Niño. More importantly, this large sample of individually tracked clouds shows their horizontal size and temperature evolution. Longer lived clusters tended to achieve their temperature and size maturity milestones at different times, while these stages often occurred simultaneously in shorter lived clusters. On average, clusters with this lag also exhibited a greater rainfall contribution than those where minimum temperature and maximum size stages occurred simultaneously. Furthermore, by examining the diurnal cycle of cluster development over Africa and the Indian subcontinent, we observed differences in the local timing of the maximum occurrence at different life cycle stages. Over land there was a strong diurnal peak in the afternoon while over the ocean there was a semi-diurnal peak composed of longer-lived clusters in the early morning hours and shorter-lived clusters in the afternoon. Building on regional specific work, this study provides a long-term, high-resolution, and global survey of object-based cloud characteristics.
A Lagrangian Analysis of Cold Cloud Clusters and Their Life Cycles With Satellite Observations
NASA Technical Reports Server (NTRS)
Esmaili, Rebekah Bradley; Tian, Yudong; Vila, Daniel Alejandro; Kim, Kyu-Myong
2016-01-01
Cloud movement and evolution signify the complex water and energy transport in the atmosphere-ocean-land system. Detecting, clustering, and tracking clouds as semi coherent cluster objects enables study of their evolution which can complement climate model simulations and enhance satellite retrieval algorithms, where there are large gaps between overpasses. Using an area-overlap cluster tracking algorithm, in this study we examine the trajectories, horizontal extent, and brightness temperature variations of millions of individual cloud clusters over their lifespan, from infrared satellite observations at 30-minute, 4-km resolution, for a period of 11 years. We found that the majority of cold clouds were both small and short-lived and that their frequency and location are influenced by El Nino. More importantly, this large sample of individually tracked clouds shows their horizontal size and temperature evolution. Longer lived clusters tended to achieve their temperature and size maturity milestones at different times, while these stages often occurred simultaneously in shorter lived clusters. On average, clusters with this lag also exhibited a greater rainfall contribution than those where minimum temperature and maximum size stages occurred simultaneously. Furthermore, by examining the diurnal cycle of cluster development over Africa and the Indian subcontinent, we observed differences in the local timing of the maximum occurrence at different life cycle stages. Over land there was a strong diurnal peak in the afternoon while over the ocean there was a semi-diurnal peak composed of longer-lived clusters in the early morning hours and shorter-lived clusters in the afternoon. Building on regional specific work, this study provides a long-term, high-resolution, and global survey of object-based cloud characteristics.
Formation of young massive clusters from turbulent molecular clouds
NASA Astrophysics Data System (ADS)
Fujii, Michiko; Portegies Zwart, Simon
2015-08-01
We simulate the formation and evolution of young star clusters using smoothed-particle hydrodynamics (SPH) and direct N-body methods. We start by performing SPH simulations of giant molecular clouds with a turbulent velocity field, a mass of 10^4 to 10^6 M_sun, and a density between 17 and 1700 cm^-3. We continue the SPH simulations for a free-fall time scale, and analyze the resulting structure of the collapsed cloud. We subsequently replace a density-selected subset of SPH particles with stars. As a consequence, the local star formation efficiency exceeds 30 per cent, whereas globally only a few per cent of the gas is converted to stars. The stellar distribution is very clumpy with typically a dozen bound conglomerates that consist of 100 to 10000 stars. We continue to evolve the stars dynamically using the collisional N-body method, which accurately treats all pairwise interactions, stellar collisions and stellar evolution. We analyze the results of the N-body simulations at 2 Myr and 10 Myr. From dense massive molecular clouds, massive clusters grow via hierarchical merging of smaller clusters. The shape of the cluster mass function that originates from an individual molecular cloud is consistent with a Schechter function with a power-law slope of beta = -1.73 at 2 Myr and beta = -1.67 at 10 Myr, which fits the observed cluster mass function of the Carina region. The superposition of mass functions has a power-law slope of < -2, which fits the observed mass function of star clusters in the Milky Way, M31 and M83. We further find that the mass of the most massive cluster formed in a single molecular cloud with a mass of M_g scales as 6.1 M_g^0.51, which also agrees with recent observations in M51. The molecular clouds which can form massive clusters are much denser than those typical in the Milky Way. The velocity dispersion of such molecular clouds reaches 20 km/s, which is consistent with the relative velocity of the molecular clouds observed near NGC 3603 and Westerlund 2, for which triggered star formation by cloud-cloud collisions has been suggested.
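For reference, the Schechter form referred to here can be written in standard notation (the characteristic cutoff mass M_* is not quoted in the abstract), together with the reported most-massive-cluster scaling:

\frac{dN}{dM} \propto \left(\frac{M}{M_*}\right)^{\beta} \exp\left(-\frac{M}{M_*}\right),
\qquad \beta(2\,\mathrm{Myr}) \simeq -1.73, \quad \beta(10\,\mathrm{Myr}) \simeq -1.67,
\qquad M_{\mathrm{max}} \simeq 6.1\, M_g^{0.51}.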
NASA Astrophysics Data System (ADS)
Spiegel, Johanna K.; Buchmann, Nina; Mayol-Bracero, Olga L.; Cuadra-Rodriguez, Luis A.; Valle Díaz, Carlos J.; Prather, Kimberly A.; Mertes, Stephan; Eugster, Werner
2014-09-01
We investigated cloud properties of warm clouds in a tropical montane cloud forest at Pico del Este (1,051 m a.s.l.) in the northeastern part of Puerto Rico to address the question of whether cloud properties in the Caribbean could potentially be affected by African dust transported across the Atlantic Ocean. We analyzed data collected during 12 days in July 2011. Cloud droplet size spectra were measured using the FM-100 fog droplet spectrometer that measured droplet size distributions in the range from 2 to 49 µm, primarily during fog events. The droplet size spectra revealed a bimodal structure, with the first peak (D < 6 µm) being more pronounced in terms of droplet number concentrations, whereas the second peak (10 µm < D < 20 µm) was found to be the one relevant for total liquid water content (LWC) of the cloud. We identified three major clusters of characteristic droplet size spectra by means of hierarchical clustering. All clusters differed significantly from each other in droplet number concentration, effective diameter (ED), and median volume diameter (MVD). For the cluster comprising the largest droplets and the lowest droplet number concentrations, we found evidence of inhomogeneous mixing in the cloud. Contrastingly, the other two clusters revealed microphysical behavior, which could be expected under homogeneous mixing conditions. For those conditions, an increase in cloud condensation nuclei, e.g., from processed African dust transported to the site, is supposed to lead to an increased droplet concentration. In fact, one of these two clusters showed a clear shift of cloud droplet size spectra towards smaller droplet diameters. Since this cluster occurred during periods with strong evidence for the presence of long-range transported African dust, we hypothesize a link between the observed dust episodes and cloud characteristics in the Caribbean at our site, which is similar to the anthropogenic aerosol indirect effect.
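A minimal sketch of hierarchical clustering applied to droplet size spectra, in the spirit of the analysis described above, is given below; the synthetic spectra, the Ward linkage and the cut into three clusters are assumptions rather than the settings used in the study.

# Minimal sketch: hierarchical clustering of droplet size spectra. Synthetic
# spectra, the linkage method and the cut into three clusters are assumptions.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)

# Hypothetical spectra: rows = fog samples, columns = 20 droplet size bins
# (2-49 micrometres), values = number concentrations.
small_mode = np.exp(-0.5 * ((np.arange(20) - 3) / 2.0) ** 2)
large_mode = np.exp(-0.5 * ((np.arange(20) - 12) / 3.0) ** 2)
spectra = np.vstack([small_mode + 0.05 * rng.random(20) for _ in range(30)] +
                    [large_mode + 0.05 * rng.random(20) for _ in range(30)])

# Ward linkage on the spectra, then cut the tree into three clusters.
Z = linkage(spectra, method="ward")
labels = fcluster(Z, t=3, criterion="maxclust")
print(np.bincount(labels)[1:])   # cluster sizes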
Investigation of Storage Options for Scientific Computing on Grid and Cloud Facilities
NASA Astrophysics Data System (ADS)
Garzoglio, Gabriele
2012-12-01
In recent years, several new storage technologies, such as Lustre, Hadoop, OrangeFS, and BlueArc, have emerged. While several groups have run benchmarks to characterize them under a variety of configurations, more work is needed to evaluate these technologies for the use cases of scientific computing on Grid clusters and Cloud facilities. This paper discusses our evaluation of the technologies as deployed on a test bed at FermiCloud, one of the Fermilab infrastructure-as-a-service Cloud facilities. The test bed consists of 4 server-class nodes with 40 TB of disk space and up to 50 virtual machine clients, some running on the storage server nodes themselves. With this configuration, the evaluation compares the performance of some of these technologies when deployed on virtual machines and on “bare metal” nodes. In addition to running standard benchmarks such as IOZone to check the sanity of our installation, we have run I/O intensive tests using physics-analysis applications. This paper presents how the storage solutions perform in a variety of realistic use cases of scientific computing. One interesting difference among the storage systems tested is found in a decrease in total read throughput with increasing number of client processes, which occurs in some implementations but not others.
Investigation of storage options for scientific computing on Grid and Cloud facilities
DOE Office of Scientific and Technical Information (OSTI.GOV)
Garzoglio, Gabriele
In recent years, several new storage technologies, such as Lustre, Hadoop, OrangeFS, and BlueArc, have emerged. While several groups have run benchmarks to characterize them under a variety of configurations, more work is needed to evaluate these technologies for the use cases of scientific computing on Grid clusters and Cloud facilities. This paper discusses our evaluation of the technologies as deployed on a test bed at FermiCloud, one of the Fermilab infrastructure-as-a-service Cloud facilities. The test bed consists of 4 server-class nodes with 40 TB of disk space and up to 50 virtual machine clients, some running on the storage server nodes themselves. With this configuration, the evaluation compares the performance of some of these technologies when deployed on virtual machines and on bare metal nodes. In addition to running standard benchmarks such as IOZone to check the sanity of our installation, we have run I/O intensive tests using physics-analysis applications. This paper presents how the storage solutions perform in a variety of realistic use cases of scientific computing. One interesting difference among the storage systems tested is found in a decrease in total read throughput with increasing number of client processes, which occurs in some implementations but not others.
OGLE Collection of Star Clusters. New Objects in the Outskirts of the Large Magellanic Cloud
NASA Astrophysics Data System (ADS)
Sitek, M.; Szymański, M. K.; Skowron, D. M.; Udalski, A.; Kostrzewa-Rutkowska, Z.; Skowron, J.; Karczmarek, P.; Cieślar, M.; Wyrzykowski, Ł.; Kozłowski, S.; Pietrukowicz, P.; Soszyński, I.; Mróz, P.; Pawlak, M.; Poleski, R.; Ulaczyk, K.
2016-09-01
The Magellanic System (MS), consisting of the Large Magellanic Cloud (LMC), the Small Magellanic Cloud (SMC) and the Magellanic Bridge (MBR), contains a diverse sample of star clusters. Their spatial distribution, ages and chemical abundances may provide important information about the history of formation of the whole System. We use deep photometric maps derived from the images collected during the fourth phase of the Optical Gravitational Lensing Experiment (OGLE-IV) to construct the most complete catalog of star clusters in the Large Magellanic Cloud using homogeneous photometric data. In this paper we present the collection of star clusters found in an area of about 225 square degrees in the outer regions of the LMC. Our sample contains 679 visually identified star cluster candidates, 226 of which were not listed in any of the previously published catalogs. The new clusters are mainly young small open clusters or clusters similar to associations.
Ultrafast and scalable cone-beam CT reconstruction using MapReduce in a cloud computing environment.
Meng, Bowen; Pratx, Guillem; Xing, Lei
2011-12-01
Four-dimensional CT (4DCT) and cone beam CT (CBCT) are widely used in radiation therapy for accurate tumor target definition and localization. However, high-resolution and dynamic image reconstruction is computationally demanding because of the large amount of data processed. Efficient use of these imaging techniques in the clinic requires high-performance computing. The purpose of this work is to develop a novel ultrafast, scalable and reliable image reconstruction technique for 4D CBCT/CT using a parallel computing framework called MapReduce. We show the utility of MapReduce for solving large-scale medical physics problems in a cloud computing environment. In this work, we accelerated the Feldkamp-Davis-Kress (FDK) algorithm by porting it to Hadoop, an open-source MapReduce implementation. Gated phases from a 4DCT scan were reconstructed independently. Following the MapReduce formalism, Map functions were used to filter and backproject subsets of projections, and a Reduce function to aggregate the partial backprojections into the whole volume. MapReduce automatically parallelized the reconstruction process on a large cluster of computer nodes. As a validation, reconstruction of a digital phantom and an acquired CatPhan 600 phantom was performed on a commercial cloud computing environment using the proposed 4D CBCT/CT reconstruction algorithm. Speedup of reconstruction time is found to be roughly linear with the number of nodes employed. For instance, greater than 10 times speedup was achieved using 200 nodes for all cases, compared to the same code executed on a single machine. Without modifying the code, faster reconstruction is readily achievable by allocating more nodes in the cloud computing environment. Root mean square error between the images obtained using MapReduce and a single-threaded reference implementation was on the order of 10^-7. Our study also proved that cloud computing with MapReduce is fault tolerant: the reconstruction completed successfully with identical results even when half of the nodes were manually terminated in the middle of the process. An ultrafast, reliable and scalable 4D CBCT/CT reconstruction method was developed using the MapReduce framework. Unlike other parallel computing approaches, the parallelization and speedup required little modification of the original reconstruction code. MapReduce provides an efficient and fault tolerant means of solving large-scale computing problems in a cloud computing environment.
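The map/reduce decomposition described above can be illustrated schematically in pure Python: each map call backprojects one subset of projections into a partial volume and the reduce step sums the partial volumes; the backprojection below is a trivial stand-in, not the FDK algorithm or the authors' Hadoop implementation.

# Schematic illustration of the map/reduce split: each map call "backprojects" a
# subset of projections into a partial volume; the reduce step sums the partial
# volumes. The backprojection here is a trivial stand-in, not FDK.
from functools import reduce
import numpy as np

VOLUME_SHAPE = (64, 64, 64)

def map_backproject(projection_subset):
    """Return a partial volume accumulated from one subset of projections."""
    partial = np.zeros(VOLUME_SHAPE)
    for proj in projection_subset:
        partial += proj.mean()          # placeholder for filtered backprojection
    return partial

def reduce_sum(vol_a, vol_b):
    """Aggregate two partial volumes."""
    return vol_a + vol_b

# Hypothetical projections split into subsets, one per mapper.
projections = [np.random.rand(128, 128) for _ in range(16)]
subsets = [projections[i::4] for i in range(4)]

volume = reduce(reduce_sum, map(map_backproject, subsets))
print(volume.shape)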
Ultrafast and scalable cone-beam CT reconstruction using MapReduce in a cloud computing environment
Meng, Bowen; Pratx, Guillem; Xing, Lei
2011-01-01
Purpose: Four-dimensional CT (4DCT) and cone beam CT (CBCT) are widely used in radiation therapy for accurate tumor target definition and localization. However, high-resolution and dynamic image reconstruction is computationally demanding because of the large amount of data processed. Efficient use of these imaging techniques in the clinic requires high-performance computing. The purpose of this work is to develop a novel ultrafast, scalable and reliable image reconstruction technique for 4D CBCT/CT using a parallel computing framework called MapReduce. We show the utility of MapReduce for solving large-scale medical physics problems in a cloud computing environment. Methods: In this work, we accelerated the Feldkamp-Davis-Kress (FDK) algorithm by porting it to Hadoop, an open-source MapReduce implementation. Gated phases from a 4DCT scan were reconstructed independently. Following the MapReduce formalism, Map functions were used to filter and backproject subsets of projections, and a Reduce function to aggregate the partial backprojections into the whole volume. MapReduce automatically parallelized the reconstruction process on a large cluster of computer nodes. As a validation, reconstruction of a digital phantom and an acquired CatPhan 600 phantom was performed on a commercial cloud computing environment using the proposed 4D CBCT/CT reconstruction algorithm. Results: Speedup of reconstruction time is found to be roughly linear with the number of nodes employed. For instance, greater than 10 times speedup was achieved using 200 nodes for all cases, compared to the same code executed on a single machine. Without modifying the code, faster reconstruction is readily achievable by allocating more nodes in the cloud computing environment. Root mean square error between the images obtained using MapReduce and a single-threaded reference implementation was on the order of 10^-7. Our study also proved that cloud computing with MapReduce is fault tolerant: the reconstruction completed successfully with identical results even when half of the nodes were manually terminated in the middle of the process. Conclusions: An ultrafast, reliable and scalable 4D CBCT/CT reconstruction method was developed using the MapReduce framework. Unlike other parallel computing approaches, the parallelization and speedup required little modification of the original reconstruction code. MapReduce provides an efficient and fault tolerant means of solving large-scale computing problems in a cloud computing environment. PMID:22149842
Bionimbus: a cloud for managing, analyzing and sharing large genomics datasets
Heath, Allison P; Greenway, Matthew; Powell, Raymond; Spring, Jonathan; Suarez, Rafael; Hanley, David; Bandlamudi, Chai; McNerney, Megan E; White, Kevin P; Grossman, Robert L
2014-01-01
Background As large genomics and phenotypic datasets are becoming more common, it is increasingly difficult for most researchers to access, manage, and analyze them. One possible approach is to provide the research community with several petabyte-scale cloud-based computing platforms containing these data, along with tools and resources to analyze it. Methods Bionimbus is an open source cloud-computing platform that is based primarily upon OpenStack, which manages on-demand virtual machines that provide the required computational resources, and GlusterFS, which is a high-performance clustered file system. Bionimbus also includes Tukey, which is a portal, and associated middleware that provides a single entry point and a single sign on for the various Bionimbus resources; and Yates, which automates the installation, configuration, and maintenance of the software infrastructure required. Results Bionimbus is used by a variety of projects to process genomics and phenotypic data. For example, it is used by an acute myeloid leukemia resequencing project at the University of Chicago. The project requires several computational pipelines, including pipelines for quality control, alignment, variant calling, and annotation. For each sample, the alignment step requires eight CPUs for about 12 h. BAM file sizes ranged from 5 GB to 10 GB for each sample. Conclusions Most members of the research community have difficulty downloading large genomics datasets and obtaining sufficient storage and computer resources to manage and analyze the data. Cloud computing platforms, such as Bionimbus, with data commons that contain large genomics datasets, are one choice for broadening access to research data in genomics. PMID:24464852
Ages of intermediate-age Magellanic Cloud star clusters
NASA Technical Reports Server (NTRS)
Flower, P. J.
1984-01-01
Ages of intermediate-age Large Magellanic Cloud star clusters have been estimated without locating the faint, unevolved portion of cluster main sequences. Six clusters with established color-magnitude diagrams were selected for study: SL 868, NGC 1783, NGC 1868, NGC 2121, NGC 2209, and NGC 2231. Since red giant photometry is more accurate than the necessarily fainter main-sequence photometry, the distributions of red giants on the cluster color-magnitude diagrams were compared to a grid of 33 stellar evolutionary tracks, evolved from the main sequence through core-helium exhaustion, spanning the expected mass and metallicity range for Magellanic Cloud cluster red giants. The time-dependent behavior of the luminosity of the model red giants was used to estimate cluster ages from the observed cluster red giant luminosities. Except for the possibility of SL 868 being an old globular cluster, all clusters studied were found to have ages less than 10^9 yr. It is concluded that there is currently no substantial evidence for a major cluster population of large, populous clusters older than 10^9 yr in the Large Magellanic Cloud.
Probing Massive Star Cluster Formation with ALMA
NASA Astrophysics Data System (ADS)
Johnson, Kelsey
2015-08-01
Observationally constraining the physical conditions that give rise to massive star clusters has been a long-standing challenge. Now with the ALMA Observatory coming on-line, we can finally begin to probe the birth environments of massive clusters in a variety of galaxies with sufficient angular resolution. In this talk I will give an overview of ALMA observations of galaxies in which candidate proto-super star cluster molecular clouds have been identified. These new data probe the physical conditions that give rise to super star clusters, providing information on their densities, pressures, and temperatures. In particular, the observations indicate that these clouds may be subject to external pressures of P/k > 10^8 K cm^-3, which is consistent with the prevalence of optically observed adolescent super star clusters in interacting galaxy systems and other high pressure environments. ALMA observations also enable an assessment of the molecular cloud chemical abundances in the regions surrounding super star clusters. Molecular clouds associated with existing super star clusters are strongly correlated with HCO+ emission, but appear to have a relatively low ratio of CO/HCO+ emission compared to other clouds, indicating that the super star clusters are impacting the molecular abundances in their vicinity.
Zhu, Lingyun; Li, Lianjie; Meng, Chunyan
2014-12-01
There have been problems in existing systems for real-time monitoring of multiple physiological parameters, such as insufficient server capacity for physiological data storage and analysis, so that data consistency cannot be guaranteed, poor real-time performance, and other issues caused by the growing scale of data. We therefore proposed a new solution for multiple physiological parameters with clustered background data storage and processing based on cloud computing. Through our studies, a batch process for longitudinal analysis of patients' historical data was introduced. The work covered the resource virtualization of the IaaS layer of the cloud platform, the construction of the real-time computing platform of the PaaS layer, the reception and analysis of data streams at the SaaS layer, and the bottleneck problem of multi-parameter data transmission. The result was real-time transmission, storage and analysis of large amounts of physiological data. The simulation test results showed that the remote multiple physiological parameter monitoring system based on the cloud platform had obvious advantages in processing time and load balancing over the traditional server model. This architecture solves the problems of long turnaround time, poor real-time analysis performance, lack of extensibility and other issues that exist in traditional remote medical services, and provides technical support for a "wearable wireless sensor plus mobile wireless transmission plus cloud computing service" mode moving towards home health monitoring with wireless monitoring of multiple physiological parameters.
Clustering, randomness, and regularity in cloud fields: 2. Cumulus cloud fields
NASA Astrophysics Data System (ADS)
Zhu, T.; Lee, J.; Weger, R. C.; Welch, R. M.
1992-12-01
During the last decade a major controversy has been brewing concerning the proper characterization of cumulus convection. The prevailing view has been that cumulus clouds form in clusters, in which cloud spacing is closer than that found for the overall cloud field and which maintains its identity over many cloud lifetimes. This "mutual protection hypothesis" of Randall and Huffman (1980) has been challenged by the "inhibition hypothesis" of Ramirez et al. (1990) which strongly suggests that the spatial distribution of cumuli must tend toward a regular distribution. A dilemma has resulted because observations have been reported to support both hypotheses. The present work reports a detailed analysis of cumulus cloud field spatial distributions based upon Landsat, Advanced Very High Resolution Radiometer, and Skylab data. Both nearest-neighbor and point-to-cloud cumulative distribution function statistics are investigated. The results show unequivocally that when both large and small clouds are included in the cloud field distribution, the cloud field always has a strong clustering signal. The strength of clustering is largest at cloud diameters of about 200-300 m, diminishing with increasing cloud diameter. In many cases, clusters of small clouds are found which are not closely associated with large clouds. As the small clouds are eliminated from consideration, the cloud field typically tends towards regularity. Thus it would appear that the "inhibition hypothesis" of Ramirez and Bras (1990) has been verified for the large clouds. However, these results are based upon the analysis of point processes. A more exact analysis also is made which takes into account the cloud size distributions. Since distinct clouds are by definition nonoverlapping, cloud size effects place a restriction upon the possible locations of clouds in the cloud field. The net effect of this analysis is that the large clouds appear to be randomly distributed, with only weak tendencies towards regularity. For clouds less than 1 km in diameter, the average nearest-neighbor distance is equal to 3-7 cloud diameters. For larger clouds, the ratio of cloud nearest-neighbor distance to cloud diameter increases sharply with increasing cloud diameter. This demonstrates that large clouds inhibit the growth of other large clouds in their vicinity. Nevertheless, this leads to random distributions of large clouds, not regularity.
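A nearest-neighbour test of clustering versus regularity, of the kind used in such analyses, can be sketched as follows; the Clark-Evans-style ratio and the synthetic centroids are illustrative choices, not the paper's exact statistics.

# Sketch of a nearest-neighbour test for clustering vs. regularity of cloud
# centroids (Clark-Evans style): R < 1 suggests clustering, R > 1 regularity.
# The centroids below are synthetic; this is not the paper's exact statistic.
import numpy as np
from scipy.spatial import cKDTree

def clark_evans_ratio(points, area):
    """Ratio of observed mean nearest-neighbour distance to the value
    expected for complete spatial randomness at the same density."""
    tree = cKDTree(points)
    d, _ = tree.query(points, k=2)          # k=1 is the point itself
    observed = d[:, 1].mean()
    expected = 0.5 / np.sqrt(len(points) / area)
    return observed / expected

rng = np.random.default_rng(1)
centroids = rng.uniform(0, 100, size=(500, 2))        # km, random field
print(clark_evans_ratio(centroids, area=100 * 100))   # ~1 for randomness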
Star clusters: age, metallicity and extinction from integrated spectra
NASA Astrophysics Data System (ADS)
González Delgado, Rosa M.; Cid Fernandes, Roberto
2010-01-01
Integrated optical spectra of star clusters in the Magellanic Clouds and a few Galactic globular clusters are fitted using high-resolution spectral models for single stellar populations. The goal is to estimate the age, metallicity and extinction of the clusters, and evaluate the degeneracies among these parameters. Several sets of evolutionary models that were computed with recent high-spectral-resolution stellar libraries (MILES, GRANADA, STELIB), are used as inputs to the starlight code to perform the fits. The comparison of the results derived from this method and previous estimates available in the literature allow us to evaluate the pros and cons of each set of models to determine star cluster properties. In addition, we quantify the uncertainties associated with the age, metallicity and extinction determinations resulting from variance in the ingredients for the analysis.
A new type of simplified fuzzy rule-based system
NASA Astrophysics Data System (ADS)
Angelov, Plamen; Yager, Ronald
2012-02-01
Over the last quarter of a century, two types of fuzzy rule-based (FRB) systems have dominated, namely the Mamdani and Takagi-Sugeno types. They use the same type of scalar fuzzy sets defined per input variable in their antecedent part, which are aggregated at the inference stage by t-norms or co-norms representing logical AND/OR operations. In this paper, we propose a significantly simplified alternative that defines the antecedent part of FRB systems by data Clouds and density distribution. This new type of FRB system goes further in the conceptual and computational simplification while preserving the best features (flexibility, modularity, and human intelligibility) of its predecessors. The proposed concept offers an alternative non-parametric form of the rule antecedents, which fully reflects the real data distribution and does not require any explicit aggregation operations and scalar membership functions to be imposed. Instead, it derives the fuzzy membership of a particular data sample to a Cloud from the density distribution of the data associated with that Cloud. Contrast this to clustering, which is a parametric data space decomposition/partitioning in which the fuzzy membership to a cluster is measured by the distance to the cluster centre/prototype, ignoring all the data that form that cluster or approximating their distribution. The proposed new approach takes into account fully and exactly the spatial distribution and similarity of all the real data by proposing an innovative and much simplified form of the antecedent part. In this paper, we provide several numerical examples aiming to illustrate the concept.
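A small sketch of the density-based membership idea follows: the degree to which a sample belongs to a data Cloud is derived from its distances to all samples associated with that Cloud rather than from the distance to a prototype; the Cauchy-like formula used here is an illustrative choice, not the authors' exact definition.

# Small sketch of density-based membership to a data Cloud: membership is derived
# from the mean squared distance of a new sample to all samples of the Cloud,
# not from the distance to a single prototype. The Cauchy-like formula below is
# an illustrative choice, not the authors' exact definition.
import numpy as np

def cloud_membership(x, cloud_samples):
    """Local density of x with respect to one data Cloud, in (0, 1]."""
    cloud = np.asarray(cloud_samples, dtype=float)
    mean_sq_dist = np.mean(np.sum((cloud - x) ** 2, axis=1))
    return 1.0 / (1.0 + mean_sq_dist)

cloud_a = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2]]
cloud_b = [[2.0, 2.1], [2.2, 1.9], [1.9, 2.2]]
x = np.array([0.15, 0.1])
print(cloud_membership(x, cloud_a), cloud_membership(x, cloud_b))
# Membership to the nearby Cloud is much higher than to the distant one.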
NASA Technical Reports Server (NTRS)
Yanai, M.; Esbensen, S.; Chu, J.
1972-01-01
The bulk properties of tropical cloud clusters, such as the vertical mass flux, the excess temperature and moisture, and the liquid water content of the clouds, are determined from a combination of the observed large-scale heat and moisture budgets over an area covering the cloud cluster, and a model of a cumulus ensemble which exchanges mass, heat, vapor and liquid water with the environment through entrainment and detrainment. The method also provides an understanding of how the environmental air is heated and moistened by the cumulus convection. An estimate of the average cloud cluster properties and the heat and moisture balance of the environment, obtained from 1956 Marshall Islands data, is presented.
Large-Scale, Parallel, Multi-Sensor Atmospheric Data Fusion Using Cloud Computing
NASA Astrophysics Data System (ADS)
Wilson, B. D.; Manipon, G.; Hua, H.; Fetzer, E.
2013-05-01
NASA's Earth Observing System (EOS) is an ambitious facility for studying global climate change. The mandate now is to combine measurements from the instruments on the "A-Train" platforms (AIRS, AMSR-E, MODIS, MISR, MLS, and CloudSat) and other Earth probes to enable large-scale studies of climate change over decades. Moving to multi-sensor, long-duration analyses of important climate variables presents serious challenges for large-scale data mining and fusion. For example, one might want to compare temperature and water vapor retrievals from one instrument (AIRS) to another (MODIS), and to a model (ECMWF), stratify the comparisons using a classification of the "cloud scenes" from CloudSat, and repeat the entire analysis over 10 years of data. To efficiently assemble such datasets, we are utilizing Elastic Computing in the Cloud and parallel map/reduce-based algorithms. However, these are data-intensive computing problems, so data transfer times and storage costs (for caching) are key issues. SciReduce is a Hadoop-like parallel analysis system, programmed in parallel python, that is designed from the ground up for Earth science. SciReduce executes inside VMWare images and scales to any number of nodes in the Cloud. Unlike Hadoop, SciReduce operates on bundles of named numeric arrays, which can be passed in memory or serialized to disk in netCDF4 or HDF5. Figure 1 shows the architecture of the full computational system, with SciReduce at the core. Multi-year datasets are automatically "sharded" by time and space across a cluster of nodes so that years of data (millions of files) can be processed in a massively parallel way. Input variables (arrays) are pulled on-demand into the Cloud using OPeNDAP URLs or other subsetting services, thereby minimizing the size of the cached input and intermediate datasets. We are using SciReduce to automate the production of multiple versions of a ten-year A-Train water vapor climatology under a NASA MEASURES grant. We will present the architecture of SciReduce, describe the achieved "clock time" speedups in fusing datasets on our own nodes and in the Cloud, and discuss the Cloud cost tradeoffs for storage, compute, and data transfer. We will also present a concept/prototype for staging NASA's A-Train Atmospheric datasets (Levels 2 & 3) in the Amazon Cloud so that any number of compute jobs can be executed "near" the multi-sensor data. Given such a system, multi-sensor climate studies over 10-20 years of data could be performed in an efficient way, with the researcher paying only his own Cloud compute bill.
NASA Astrophysics Data System (ADS)
Puzyrkov, Dmitry; Polyakov, Sergey; Podryga, Viktoriia; Markizov, Sergey
2018-02-01
At the present stage of computer technology development it is possible to study the properties and processes of complex systems at the molecular and even atomic levels, for example by means of molecular dynamics methods. The most interesting problems are those related to the study of complex processes under real physical conditions. Solving such problems requires the use of high performance computing systems of various types, for example GRID systems and HPC clusters. Because these computational tasks are so time consuming, software is needed for automatic and unified monitoring of such computations. A complex computational task can be performed over different HPC systems, which requires synchronization of output data between the storage chosen by a scientist and the HPC system used for the computations. The design of the computational domain is also a challenge and requires complex software tools and algorithms for proper atomistic data generation on HPC systems. The paper describes the prototype of a cloud service intended for the design of large-volume atomistic systems for further detailed molecular dynamics calculations and for the management of these calculations, and presents the part of its concept aimed at initial data generation on HPC systems.
Processing ARM VAP data on an AWS cluster
NASA Astrophysics Data System (ADS)
Martin, T.; Macduff, M.; Shippert, T.
2017-12-01
The Atmospheric Radiation Measurement (ARM) Data Management Facility (DMF) manages over 18,000 processes and 1.3 TB of data each day. This includes many Value Added Products (VAPs) that make use of multiple instruments to produce the derived products that are scientifically relevant. A thermodynamic and cloud profile VAP is being developed to provide input to the ARM Large-eddy simulation (LES) ARM Symbiotic Simulation and Observation (LASSO) project (https://www.arm.gov/capabilities/vaps/lasso-122). This algorithm is CPU intensive, and the processing requirements exceeded the available DMF computing capacity. Amazon Web Services (AWS), along with CfnCluster, was investigated to see how it would perform. This cluster environment is cost effective and scales dynamically based on demand. We were able to take advantage of autoscaling, which allowed the cluster to grow and shrink based on the size of the processing queue. We were also able to take advantage of the Amazon Web Services spot market to further reduce the cost. Our test was very successful, and we found that cloud resources can be used to efficiently and effectively process time series data. This poster will present the resources and methodology used to successfully run the algorithm.
Star Clusters in the Magellanic Clouds
NASA Astrophysics Data System (ADS)
Gallagher, J. S., III
2014-09-01
The Magellanic Clouds (MC) are prime locations for studies of star clusters covering a full range in age and mass. This contribution briefly reviews selected properties of Magellanic star clusters, by focusing first on young systems that show evidence for hierarchical star formation. The structures and chemical abundance patterns of older intermediate age star clusters in the Small Magellanic Cloud (SMC) are a second topic. These suggest a complex history has affected the chemical enrichment in the SMC and that low tidal stresses in the SMC foster star cluster survival.
The JASMIN Cloud: specialised and hybrid to meet the needs of the Environmental Sciences Community
NASA Astrophysics Data System (ADS)
Kershaw, Philip; Lawrence, Bryan; Churchill, Jonathan; Pritchard, Matt
2014-05-01
Cloud computing provides enormous opportunities for the research community. The large public cloud providers provide near-limitless scaling capability. However, adapting Cloud to scientific workloads is not without its problems. The commodity nature of the public cloud infrastructure can be at odds with the specialist requirements of the research community. Issues such as trust, ownership of data, WAN bandwidth and costing models make additional barriers to more widespread adoption. Alongside the application of public cloud for scientific applications, a number of private cloud initiatives are underway in the research community of which the JASMIN Cloud is one example. Here, cloud service models are being effectively super-imposed over more established services such as data centres, compute cluster facilities and Grids. These have the potential to deliver the specialist infrastructure needed for the science community coupled with the benefits of a Cloud service model. The JASMIN facility based at the Rutherford Appleton Laboratory was established in 2012 to support the data analysis requirements of the climate and Earth Observation community. In its first year of operation, the 5PB of available storage capacity was filled and the hosted compute capability used extensively. JASMIN has modelled the concept of a centralised large-volume data analysis facility. Key characteristics have enabled success: peta-scale fast disk connected via low latency networks to compute resources and the use of virtualisation for effective management of the resources for a range of users. A second phase is now underway funded through NERC's (Natural Environment Research Council) Big Data initiative. This will see significant expansion to the resources available with a doubling of disk-based storage to 12PB and an increase of compute capacity by a factor of ten to over 3000 processing cores. This expansion is accompanied by a broadening in the scope for JASMIN, as a service available to the entire UK environmental science community. Experience with the first phase demonstrated the range of user needs. A trade-off is needed between access privileges to resources, flexibility of use and security. This has influenced the form and types of service under development for the new phase. JASMIN will deploy a specialised private cloud organised into "Managed" and "Unmanaged" components. In the Managed Cloud, users have direct access to the storage and compute resources for optimal performance but for reasons of security, via a more restrictive PaaS (Platform-as-a-Service) interface. The Unmanaged Cloud is deployed in an isolated part of the network but co-located with the rest of the infrastructure. This enables greater liberty to tenants - full IaaS (Infrastructure-as-a-Service) capability to provision customised infrastructure - whilst at the same time protecting more sensitive parts of the system from direct access using these elevated privileges. The private cloud will be augmented with cloud-bursting capability so that it can exploit the resources available from public clouds, making it effectively a hybrid solution. A single interface will overlay the functionality of both the private cloud and external interfaces to public cloud providers giving users the flexibility to migrate resources between infrastructures as requirements dictate.
NASA Astrophysics Data System (ADS)
Böhm, J.; Bredif, M.; Gierlinger, T.; Krämer, M.; Lindenberg, R.; Liu, K.; Michel, F.; Sirmacek, B.
2016-06-01
Current 3D data capture, as implemented on, for example, airborne or mobile laser scanning systems, can efficiently sample the surface of a city with billions of unselective points during one working day. What is still difficult is to extract and visualize meaningful information hidden in these point clouds with the same efficiency. This is where the FP7 IQmulus project enters the scene. IQmulus is an interactive facility for processing and visualizing big spatial data. In this study the potential of IQmulus is demonstrated on a laser mobile mapping point cloud of 1 billion points sampling ~ 10 km of street environment in Toulouse, France. After the data is uploaded to the IQmulus Hadoop Distributed File System, the user defines a workflow consisting of retiling the data followed by a PCA-driven local dimensionality analysis, which runs efficiently on the IQmulus cloud facility using a Spark implementation. Points scattering in three directions are assigned to the tree class and are then separated into individual trees. Five hours of processing on the 12-node computing cluster results in the automatic identification of over 4000 urban trees. Visualization of the results in the IQmulus fat client helps users to appreciate the results, and developers to identify remaining flaws in the processing workflow.
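The PCA-driven local dimensionality analysis can be illustrated with a condensed sketch in which the eigenvalues of each point's neighbourhood covariance indicate whether its surroundings scatter along a line, across a plane, or in all three directions; the neighbourhood size and eigenvalue threshold are assumptions, and this is not the IQmulus workflow itself.

# Condensed sketch of PCA-driven local dimensionality analysis: the eigenvalues of
# each point's neighbourhood covariance indicate whether its surroundings scatter
# along a line (1D), a plane (2D) or in all three directions (3D, e.g. vegetation).
# Neighbourhood size and threshold are assumptions; this is not IQmulus code.
import numpy as np
from scipy.spatial import cKDTree

def local_dimensionality(points, k=20):
    """Return an integer 1, 2 or 3 per point from neighbourhood PCA."""
    tree = cKDTree(points)
    _, idx = tree.query(points, k=k)
    dims = np.empty(len(points), dtype=int)
    for i, neighbours in enumerate(idx):
        nbr = points[neighbours]
        cov = np.cov(nbr - nbr.mean(axis=0), rowvar=False)
        ev = np.sort(np.linalg.eigvalsh(cov))[::-1]     # descending eigenvalues
        ev = ev / ev.sum()
        # Crude rule: count how many eigenvalues carry a meaningful share.
        dims[i] = int(np.sum(ev > 0.05))
    return dims

pts = np.random.rand(1000, 3)            # hypothetical point cloud tile
print(np.bincount(local_dimensionality(pts))[1:])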
Automated extraction and analysis of rock discontinuity characteristics from 3D point clouds
NASA Astrophysics Data System (ADS)
Bianchetti, Matteo; Villa, Alberto; Agliardi, Federico; Crosta, Giovanni B.
2016-04-01
A reliable characterization of fractured rock masses requires an exhaustive geometrical description of discontinuities, including orientation, spacing, and size. These are required to describe discontinuum rock mass structure, perform Discrete Fracture Network and DEM modelling, or provide input for rock mass classification or equivalent continuum estimate of rock mass properties. Although several advanced methodologies have been developed in the last decades, a complete characterization of discontinuity geometry in practice is still challenging, due to scale-dependent variability of fracture patterns and difficult accessibility to large outcrops. Recent advances in remote survey techniques, such as terrestrial laser scanning and digital photogrammetry, allow a fast and accurate acquisition of dense 3D point clouds, which promoted the development of several semi-automatic approaches to extract discontinuity features. Nevertheless, these often need user supervision on algorithm parameters which can be difficult to assess. To overcome this problem, we developed an original Matlab tool, allowing fast, fully automatic extraction and analysis of discontinuity features with no requirements on point cloud accuracy, density and homogeneity. The tool consists of a set of algorithms which: (i) process raw 3D point clouds, (ii) automatically characterize discontinuity sets, (iii) identify individual discontinuity surfaces, and (iv) analyse their spacing and persistence. The tool operates in either a supervised or unsupervised mode, starting from an automatic preliminary exploration data analysis. The identification and geometrical characterization of discontinuity features is divided in steps. First, coplanar surfaces are identified in the whole point cloud using K-Nearest Neighbor and Principal Component Analysis algorithms optimized on point cloud accuracy and specified typical facet size. Then, discontinuity set orientation is calculated using Kernel Density Estimation and principal vector similarity criteria. Poles to points are assigned to individual discontinuity objects using easy custom vector clustering and Jaccard distance approaches, and each object is segmented into planar clusters using an improved version of the DBSCAN algorithm. Modal set orientations are then recomputed by cluster-based orientation statistics to avoid the effects of biases related to cluster size and density heterogeneity of the point cloud. Finally, spacing values are measured between individual discontinuity clusters along scanlines parallel to modal pole vectors, whereas individual feature size (persistence) is measured using 3D convex hull bounding boxes. Spacing and size are provided both as raw population data and as summary statistics. The tool is optimized for parallel computing on 64bit systems, and a Graphic User Interface (GUI) has been developed to manage data processing, provide several outputs, including reclassified point clouds, tables, plots, derived fracture intensity parameters, and export to modelling software tools. We present test applications performed both on synthetic 3D data (simple 3D solids) and real case studies, validating the results with existing geomechanical datasets.
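One step in such a processing chain, grouping facet normals into discontinuity sets, can be sketched with a generic DBSCAN clustering as below; the synthetic normals and the eps/min_samples values are assumptions, and the tool described above uses its own improved DBSCAN variant and further per-set segmentation.

# Brief sketch of one step in such a chain: grouping unit normals of local facets
# into discontinuity sets with DBSCAN. Synthetic normals and the eps/min_samples
# values are assumptions; the tool above uses its own improved variant.
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(2)

def noisy_normals(direction, n=200, sigma=0.05):
    """Unit vectors scattered around one mean discontinuity orientation."""
    v = np.asarray(direction, dtype=float) + sigma * rng.standard_normal((n, 3))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

normals = np.vstack([noisy_normals([0, 0, 1]),      # sub-horizontal set
                     noisy_normals([1, 0, 0.2])])   # steep set

labels = DBSCAN(eps=0.1, min_samples=10).fit_predict(normals)
print(np.unique(labels, return_counts=True))        # two sets (plus possible noise)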
ALMA Detects CO(3-2) within a Super Star Cluster in NGC 5253
NASA Astrophysics Data System (ADS)
Turner, Jean L.; Consiglio, S. Michelle; Beck, Sara C.; Goss, W. M.; Ho, Paul. T. P.; Meier, David S.; Silich, Sergiy; Zhao, Jun-Hui
2017-09-01
We present observations of CO(3-2) and 13CO(3-2) emission near the supernebula in the dwarf galaxy NGC 5253, which contains one of the best examples of a potential globular cluster in formation. The 0.″3 resolution images reveal an unusual molecular cloud, “Cloud D1,” that is coincident with the radio-infrared supernebula. The ~6 pc diameter cloud has a linewidth, Δv = 21.7 km s^-1, that reflects only the gravitational potential of the star cluster residing within it. The corresponding virial mass is 2.5 × 10^5 M_⊙. The cluster appears to have a top-heavy initial mass function, with M_* ≳ 1-2 M_⊙. Cloud D1 is optically thin in CO(3-2), probably because the gas is hot. Molecular gas mass is very uncertain but constitutes <35% of the dynamical mass within the cloud boundaries. In spite of the presence of an estimated ~1500-2000 O stars within the small cloud, the CO appears relatively undisturbed. We propose that Cloud D1 consists of molecular clumps or cores, possibly star-forming, orbiting with more evolved stars in the core of the giant cluster.
Climbing the Ladder of Star Formation Feedback
NASA Astrophysics Data System (ADS)
Frank, Adam
2012-10-01
While much is understood about isolated star formation, the opposite is true for star formation in clusters of both low and high mass. In particular the mechanisms by which many coevally formed stars affect their parent cloud environment remains poorly characterized. Fundamental questions such as interplay between multiple outflows, ionization fronts and turbulence are just beginning to be fully articulated. Distinguishing between the nature of feedback in clusters of different mass is also critical. In high mass clusters O stars are expected to dominate energetics while in low mass clusters multiple collimated outflows may represent the dominant feedback mechanism. Thus the issue of feedback modalities in clusters of different masses represents one of the major challenges to the next generation of star formation studies. In this proposal we seek to carry forward a focused theoretical study of feedback in both low and high-mass cluster environments with direct connections to observations. Using a state-of-the-art Adaptive Mesh Refinement MHD multi-physics code {developed by our group} we propose two computational studies: {1} multiple, interacting outflows and their role in altering the properties of a parent low mass cluster {2} Poorly collimated outburst/outflows from massive star{s} and their effect on high mass cluster star forming environments. In both cases we will use initial conditions derived from high-resolution AMR MHD simulations of cloud/cluster formation. Synthetic observations derived from the simulations {in a variety of emission lines from ions to atoms to molecules} will allow for direct contact with HST and other star formation databases.
A hybrid computational strategy to address WGS variant analysis in >5000 samples.
Huang, Zhuoyi; Rustagi, Navin; Veeraraghavan, Narayanan; Carroll, Andrew; Gibbs, Richard; Boerwinkle, Eric; Venkata, Manjunath Gorentla; Yu, Fuli
2016-09-10
The decreasing costs of sequencing are driving the need for cost-effective and real-time variant calling of whole genome sequencing data. The scale of these projects is far beyond the capacity of the computing resources typically available to most research labs. Other infrastructures, such as the AWS cloud environment and supercomputers, also have limitations that make large-scale joint variant calling infeasible, and infrastructure-specific variant calling strategies either fail to scale up to large datasets or abandon joint calling altogether. We present a high-throughput framework including multiple variant callers for single nucleotide variant (SNV) calling, which leverages a hybrid computing infrastructure consisting of the AWS cloud, supercomputers, and local high-performance computing infrastructures. We present a novel binning approach for large-scale joint variant calling and imputation which can scale up to over 10,000 samples while producing SNV callsets with high sensitivity and specificity. As a proof of principle, we present results of an analysis of the Cohorts for Heart And Aging Research in Genomic Epidemiology (CHARGE) WGS freeze 3 dataset, in which joint calling, imputation, and phasing of over 5300 whole genome samples were produced in under 6 weeks using four state-of-the-art callers. The callers used were SNPTools, GATK-HaplotypeCaller, GATK-UnifiedGenotyper and GotCloud. We used Amazon AWS, a 4000-core in-house cluster at Baylor College of Medicine, IBM power PC Blue BioU at Rice and Rhea at Oak Ridge National Laboratory (ORNL) for the computation. AWS was used for joint calling of 180 TB of BAM files, and ORNL and Rice supercomputers were used for the imputation and phasing step. All other steps were carried out on the local compute cluster. The entire operation used 5.2 million core hours and only transferred a total of 6 TB of data across the platforms. Even with increasing sizes of whole genome datasets, ensemble joint calling of SNVs for low coverage data can be accomplished in a scalable, cost-effective, and fast manner by using heterogeneous computing platforms without compromising on the quality of variants.
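The binning idea (independent genomic regions dispatched to whichever platform is available) can be sketched in Python. This is an illustrative toy, not the CHARGE pipeline: the chromosome lengths, the 5 Mb bin size, and the round-robin placement are all assumptions made for the example.

```python
# Illustrative toy (not the CHARGE pipeline): split the genome into fixed-size
# bins so joint calling over many samples can run as independent jobs on
# heterogeneous platforms. Chromosome lengths, the 5 Mb bin size, and the
# round-robin placement are example assumptions.
CHROM_LENGTHS = {"chr20": 63_000_000, "chr21": 48_000_000}
BIN_SIZE = 5_000_000
PLATFORMS = ("aws", "local_cluster", "supercomputer")

def genome_bins(chrom_lengths, bin_size):
    """Yield (chrom, start, end) regions covering each chromosome."""
    for chrom, length in chrom_lengths.items():
        for start in range(0, length, bin_size):
            yield chrom, start, min(start + bin_size, length)

jobs = [(region, PLATFORMS[i % len(PLATFORMS)])
        for i, region in enumerate(genome_bins(CHROM_LENGTHS, BIN_SIZE))]
for (chrom, start, end), platform in jobs[:3]:
    print(f"joint-call {chrom}:{start}-{end} on {platform}")
```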
Young star clusters in nearby molecular clouds
NASA Astrophysics Data System (ADS)
Getman, K. V.; Kuhn, M. A.; Feigelson, E. D.; Broos, P. S.; Bate, M. R.; Garmire, G. P.
2018-06-01
The SFiNCs (Star Formation in Nearby Clouds) project is an X-ray/infrared study of the young stellar populations in 22 star-forming regions with distances ≲ 1 kpc, designed to extend our earlier MYStIX (Massive Young Star-Forming Complex Study in Infrared and X-ray) survey of more distant clusters. Our central goal is to give empirical constraints on cluster formation mechanisms. Using parametric mixture models applied homogeneously to the catalogue of SFiNCs young stars, we identify 52 SFiNCs clusters and 19 unclustered stellar structures. The procedure gives cluster properties including location, population, morphology, association with molecular clouds, absorption, age (AgeJX), and infrared spectral energy distribution (SED) slope. Absorption, SED slope, and AgeJX are age indicators. SFiNCs clusters are examined individually, and collectively with MYStIX clusters, to give the following results. (1) SFiNCs is dominated by smaller, younger, and more heavily obscured clusters than MYStIX. (2) SFiNCs cloud-associated clusters have high ellipticities aligned with their host molecular filaments, indicating morphology inherited from their parental clouds. (3) The effect of cluster expansion is evident from the radius-age, radius-absorption, and radius-SED correlations. Core radii increase dramatically from ~0.08 to ~0.9 pc over the age range 1-3.5 Myr. Inferred gas removal time-scales are longer than 1 Myr. (4) Rich, spatially distributed stellar populations are present in SFiNCs clouds representing early generations of star formation. An appendix compares the performance of the mixture models and the non-parametric minimum spanning tree method for identifying clusters. This work is a foundation for future SFiNCs/MYStIX studies including disc longevity, age gradients, and dynamical modelling.
NASA Astrophysics Data System (ADS)
Furht, Borko
In the introductory chapter we define the concept of cloud computing and cloud services, and we introduce layers and types of cloud computing. We discuss the differences between cloud computing and cloud services. New technologies that enabled cloud computing are presented next. We also discuss cloud computing features, standards, and security issues. We introduce the key cloud computing platforms, their vendors, and their offerings. We discuss cloud computing challenges and the future of cloud computing.
A Local Index of Cloud Immersion in Tropical Forests Using Time-Lapse Photography
NASA Astrophysics Data System (ADS)
Bassiouni, M.; Scholl, M. A.
2015-12-01
Data on the frequency, duration, and elevation of cloud immersion are essential to improve estimates of cloud water deposition in water budgets in cloud forests. Here, we present a methodology to detect local cloud immersion in remote tropical forests using time-lapse photography. A simple approach is developed to detect cloudy conditions in photographs taken within the canopy, where image depth during clear conditions may be less than 10 meters and where moving leaves, branches, and changes in lighting are unpredictable. A primary innovation of this study is that cloudiness is determined from images without using a reference clear image and without determination of a minimum threshold value or human judgment for calibration. Five sites ranging from 600 to 1000 meters elevation along a ridge in the Luquillo Critical Zone Observatory, Puerto Rico, were each equipped with a trail camera programmed to take an image every 30 minutes since March 2014. Images were classified using four selected cloud-sensitive image characteristics (SCICs) computed for small image regions: contrast, the coefficient of variation and the entropy of the luminance of each image pixel, and image colorfulness. K-means clustering provided reasonable results for discriminating cloudy from clear conditions. Preliminary results indicate that 79-94% (daytime) and 85-93% (nighttime) of validation images were classified accurately at one open and two closed canopy sites. The Euclidean distances between SCIC vectors of images during cloudy conditions and the SCIC vector of the centroid of the cluster of clear images show potential to quantify cloud density in addition to immersion. The classification method will be applied to determine spatial and temporal patterns of cloud immersion in the study area. The presented approach offers promising applications for increasing observations of low-lying clouds at remote mountain sites where standard instruments to measure visibility and cloud base may not be practical.
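A minimal Python sketch of the classification idea follows: compute simplified versions of the four SCICs per image and split the feature vectors into two groups with k-means. The exact feature definitions, image regions, and preprocessing used in the study may differ; everything below is an illustrative approximation.

```python
# Illustrative approximation, not the study's implementation: simplified
# versions of the four SCICs per image, then a two-cluster k-means split.
import numpy as np
from sklearn.cluster import KMeans

def scic_features(rgb):
    """rgb: HxWx3 float array in [0, 1]; returns a 4-element feature vector."""
    lum = rgb.mean(axis=2)
    contrast = lum.max() - lum.min()
    cv = lum.std() / (lum.mean() + 1e-9)                 # coefficient of variation
    hist, _ = np.histogram(lum, bins=64, range=(0, 1))
    p = hist / (hist.sum() + 1e-9)
    entropy = -np.sum(p[p > 0] * np.log2(p[p > 0]))      # entropy of luminance
    rg = rgb[..., 0] - rgb[..., 1]
    yb = 0.5 * (rgb[..., 0] + rgb[..., 1]) - rgb[..., 2]
    colorfulness = np.hypot(rg.std(), yb.std()) + 0.3 * np.hypot(rg.mean(), yb.mean())
    return np.array([contrast, cv, entropy, colorfulness])

def classify_images(images):
    """images: list of HxWx3 arrays; returns a 0/1 label per image."""
    X = np.array([scic_features(im) for im in images])
    return KMeans(n_clusters=2, n_init=10).fit_predict(X)
# Which label corresponds to "cloudy" is decided afterwards from the features
# (e.g. the cluster with lower contrast), mirroring the unsupervised setup.
```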
Clustering Molecular Dynamics Trajectories for Optimizing Docking Experiments
De Paris, Renata; Quevedo, Christian V.; Ruiz, Duncan D.; Norberto de Souza, Osmar; Barros, Rodrigo C.
2015-01-01
Molecular dynamics simulations of protein receptors have become an attractive tool for rational drug discovery. However, the high computational cost of employing molecular dynamics trajectories in virtual screening of large repositories threatens the feasibility of this task. Computational intelligence techniques have been applied in this context, with the ultimate goal of reducing the overall computational cost so the task can become feasible. In particular, clustering algorithms have been widely used as a means to reduce the dimensionality of molecular dynamics trajectories. In this paper, we develop a novel methodology for clustering entire trajectories using structural features from the substrate-binding cavity of the receptor in order to optimize docking experiments in a cloud-based environment. The resulting partition was selected based on three clustering validity criteria, and it was further validated by analyzing the interactions between 20 ligands and a fully flexible receptor (FFR) model containing a 20 ns molecular dynamics simulation trajectory. Our proposed methodology shows that taking into account features of the substrate-binding cavity as input for the k-means algorithm is a promising technique for accurately selecting ensembles of representative structures tailored to a specific ligand. PMID:25873944
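The clustering step can be sketched as follows. This is not the published protocol: it simply clusters snapshot feature vectors (here, any matrix of cavity descriptors) with k-means and picks the partition by silhouette score, standing in for the three validity criteria used in the paper.

```python
# Sketch of the general idea only (not the published protocol): cluster MD
# snapshots by cavity descriptors with k-means and keep the partition with the
# best silhouette, standing in for the paper's three validity criteria.
# X is any (n_snapshots, n_features) matrix of substrate-binding-cavity features.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def cluster_trajectory(X, k_range=range(2, 11)):
    best = None
    for k in k_range:
        labels = KMeans(n_clusters=k, n_init=10).fit_predict(X)
        score = silhouette_score(X, labels)
        if best is None or score > best[0]:
            best = (score, k, labels)
    return best   # (silhouette, n_clusters, labels)

# Representative snapshots (e.g. the member closest to each centroid) can then
# be docked instead of every frame of the 20 ns trajectory.
```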
Bionimbus: a cloud for managing, analyzing and sharing large genomics datasets.
Heath, Allison P; Greenway, Matthew; Powell, Raymond; Spring, Jonathan; Suarez, Rafael; Hanley, David; Bandlamudi, Chai; McNerney, Megan E; White, Kevin P; Grossman, Robert L
2014-01-01
As large genomics and phenotypic datasets are becoming more common, it is increasingly difficult for most researchers to access, manage, and analyze them. One possible approach is to provide the research community with several petabyte-scale cloud-based computing platforms containing these data, along with tools and resources to analyze them. Bionimbus is an open-source cloud-computing platform that is based primarily upon OpenStack, which manages on-demand virtual machines that provide the required computational resources, and GlusterFS, which is a high-performance clustered file system. Bionimbus also includes Tukey, a portal and associated middleware that provide a single entry point and single sign-on for the various Bionimbus resources; and Yates, which automates the installation, configuration, and maintenance of the software infrastructure required. Bionimbus is used by a variety of projects to process genomics and phenotypic data. For example, it is used by an acute myeloid leukemia resequencing project at the University of Chicago. The project requires several computational pipelines, including pipelines for quality control, alignment, variant calling, and annotation. For each sample, the alignment step requires eight CPUs for about 12 h. BAM file sizes ranged from 5 GB to 10 GB for each sample. Most members of the research community have difficulty downloading large genomics datasets and obtaining sufficient storage and computer resources to manage and analyze the data. Cloud computing platforms, such as Bionimbus, with data commons that contain large genomics datasets, are one choice for broadening access to research data in genomics. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
A Super Voxel-Based Riemannian Graph for Multi-Scale Segmentation of LiDAR Point Clouds
NASA Astrophysics Data System (ADS)
Li, Minglei
2018-04-01
Automatically segmenting LiDAR points into independent partitions has become a topic of great importance in photogrammetry, remote sensing, and computer vision. In this paper, we cast the problem of point cloud segmentation as a graph optimization problem by constructing a Riemannian graph. The scale space of the observed scene is explored by an octree-based over-segmentation with different depths. The over-segmentation produces many supervoxels that reflect the structure of the scene and are used as the nodes of the graph. The Kruskal coordinates are used to compute edge weights that are proportional to the geodesic distance between nodes. Then we compute the edge-weight matrix in which the elements reflect the sectional curvatures associated with the geodesic paths between supervoxel nodes on the scene surface. The final segmentation results are generated by clustering similar supervoxels and cutting off the weak edges in the graph. The performance of this method was evaluated on LiDAR point clouds for both indoor and outdoor scenes. Additionally, extensive comparisons to state-of-the-art techniques show that our algorithm outperforms them on many metrics.
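The final clustering stage (cutting weak edges in the supervoxel graph and keeping the resulting connected components) can be sketched with a small Python example. The geodesic, curvature-based edge weights derived from the Kruskal coordinates are not reproduced here; the edge list and threshold below are illustrative.

```python
# Schematic sketch of the final clustering step only: cut weak edges in a
# supervoxel graph and return the connected components as segments. The
# curvature-based, geodesic edge weights of the paper are not reproduced;
# the edge list and threshold below are illustrative.
import networkx as nx

def segment_supervoxels(n_nodes, edges, threshold=0.5):
    """edges: iterable of (i, j, weight); keep only edges with weight >= threshold."""
    g = nx.Graph()
    g.add_nodes_from(range(n_nodes))
    g.add_weighted_edges_from((i, j, w) for i, j, w in edges if w >= threshold)
    return [sorted(c) for c in nx.connected_components(g)]

# Example: two supervoxel groups joined by a weak edge that gets cut.
edges = [(0, 1, 0.9), (1, 2, 0.8), (2, 3, 0.1), (3, 4, 0.85)]
print(segment_supervoxels(5, edges))   # -> [[0, 1, 2], [3, 4]]
```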
NASA Astrophysics Data System (ADS)
Sano, Hidetoshi; Enokiya, Rei; Hayashi, Katsuhiro; Yamagishi, Mitsuyoshi; Saeki, Shun; Okawa, Kazuki; Tsuge, Kisetsu; Tsutsumi, Daichi; Kohno, Mikito; Hattori, Yusuke; Yoshiike, Satoshi; Fujita, Shinji; Nishimura, Atsushi; Ohama, Akio; Tachihara, Kengo; Torii, Kazufumi; Hasegawa, Yutaka; Kimura, Kimihiro; Ogawa, Hideo; Wong, Graeme F.; Braiding, Catherine; Rowell, Gavin; Burton, Michael G.; Fukui, Yasuo
2018-02-01
A collision between two molecular clouds is one possible candidate for high-mass star formation. The H II region RCW 36, located in the Vela molecular ridge, contains a young star cluster (~1 Myr old) and two O-type stars. We present new CO observations of RCW 36 made with NANTEN2, Mopra, and ASTE using 12CO(J = 1-0, 2-1, 3-2) and 13CO(J = 2-1) emission lines. We have discovered two molecular clouds lying at the velocities VLSR ~ 5.5 and 9 km s^-1. Both clouds are likely to be physically associated with the star cluster, as verified by the good spatial correspondence among the two clouds, infrared filaments, and the star cluster. We also found a high intensity ratio of ~0.6-1.2 for CO J = 3-2/1-0 toward both clouds, indicating that the gas temperature has been increased due to heating by the O-type stars. We propose that the O-type stars in RCW 36 were formed by a collision between the two clouds, with a relative velocity separation of 5 km s^-1. The complementary spatial distributions and the velocity separation of the two clouds are in good agreement with observational signatures expected for O-type star formation triggered by a cloud-cloud collision. We also found a displacement between the complementary spatial distributions of the two clouds, which we estimate to be 0.3 pc assuming the collision angle to be 45° relative to the line-of-sight. We estimate the collision timescale to be ~10^5 yr. It is probable that the cluster age found by Ellerbroek et al. (2013b, A&A, 558, A102) is dominated by low-mass members that were not formed by the cloud-cloud collision triggering, and that the O-type stars in the center of the cluster are explained by the collisional triggering independently of the low-mass star formation.
NASA Astrophysics Data System (ADS)
Ascenso, Joana
The past decade has seen an increase of star formation studies made at the molecular cloud scale, motivated mostly by the deployment of a wealth of sensitive infrared telescopes and instruments. Embedded clusters, long recognised as the basic units of coherent star formation in molecular clouds, are now seen to inhabit preferentially cluster complexes tens of parsecs across. This chapter gives an overview of some important properties of the embedded clusters in these complexes and of the complexes themselves, along with the implications of viewing star formation as a molecular-cloud scale process rather than an isolated process at the scale of clusters.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Megeath, S. T.; Kryukova, E.; Gutermuth, R.
2016-01-15
We analyze the spatial distribution of dusty young stellar objects (YSOs) identified in the Spitzer Survey of the Orion Molecular clouds, augmenting these data with Chandra X-ray observations to correct for incompleteness in dense clustered regions. We also devise a scheme to correct for spatially varying incompleteness when X-ray data are not available. The local surface densities of the YSOs range from 1 pc^-2 to over 10,000 pc^-2, with protostars tending to be in higher density regions. This range of densities is similar to other surveyed molecular clouds with clusters, but broader than clouds without clusters. By identifying clusters and groups as continuous regions with surface densities ≥10 pc^-2, we find that 59% of the YSOs are in the largest cluster, the Orion Nebula Cluster (ONC), while 13% of the YSOs are found in a distributed population. A lower fraction of protostars in the distributed population is evidence that it is somewhat older than the groups and clusters. An examination of the structural properties of the clusters and groups shows that the peak surface densities of the clusters increase approximately linearly with the number of members. Furthermore, all clusters with more than 70 members exhibit asymmetric and/or highly elongated structures. The ONC becomes azimuthally symmetric in the inner 0.1 pc, suggesting that the cluster is only ∼2 Myr in age. We find that the star formation efficiency (SFE) of the Orion B cloud is unusually low, and that the SFEs of individual groups and clusters are an order of magnitude higher than those of the clouds. Finally, we discuss the relationship between the young low mass stars in the Orion clouds and the Orion OB 1 association, and we determine upper limits to the fraction of disks that may be affected by UV radiation from OB stars or dynamical interactions in dense, clustered regions.
Scientific Cluster Deployment and Recovery - Using puppet to simplify cluster management
NASA Astrophysics Data System (ADS)
Hendrix, Val; Benjamin, Doug; Yao, Yushu
2012-12-01
Deployment, maintenance and recovery of a scientific cluster, which has complex, specialized services, can be a time-consuming task requiring the assistance of Linux system administrators, network engineers, and domain experts. Universities and small institutions that have a part-time FTE with limited time for and knowledge of the administration of such clusters can be strained by such maintenance tasks. This current work is the result of an effort to maintain a data analysis cluster (DAC) with minimal effort by a local system administrator. The realized benefit is that the scientist, who is also the local system administrator, is able to focus on the data analysis instead of the intricacies of managing a cluster. Our work provides a cluster deployment and recovery process (CDRP) based on the Puppet configuration engine, allowing a part-time FTE to easily deploy and recover entire clusters with minimal effort. Puppet is a configuration management system (CMS) used widely in computing centers for the automatic management of resources. Domain experts use Puppet's declarative language to define reusable modules for service configuration and deployment. Our CDRP has three actors: domain experts, a cluster designer and a cluster manager. The domain experts first write the Puppet modules for the cluster services. A cluster designer would then define a cluster. This includes the creation of cluster roles, mapping the services to those roles and determining the relationships between the services. Finally, a cluster manager would acquire the resources (machines, networking), enter the cluster input parameters (hostnames, IP addresses) and automatically generate deployment scripts used by Puppet to configure each machine to act in its designated role. In the event of a machine failure, the originally generated deployment scripts, along with Puppet, can be used to easily reconfigure a new machine. The cluster definition produced in our CDRP is an integral part of automating cluster deployment in a cloud environment. Our future cloud efforts will further build on this work.
NASA Astrophysics Data System (ADS)
Bitsakis, Theodoros; González-Lópezlira, R. A.; Bonfini, P.; Bruzual, G.; Maravelias, G.; Zaritsky, D.; Charlot, S.; Ramírez-Siordia, V. H.
2018-02-01
We present a new study of the spatial distribution and ages of the star clusters in the Small Magellanic Cloud (SMC). To detect and estimate the ages of the star clusters we rely on the new fully automated method developed by Bitsakis et al. Our code detects 1319 star clusters in the central 18 deg^2 of the SMC we surveyed (1108 of which have never been reported before). The age distribution of those clusters suggests enhanced cluster formation around 240 Myr ago. It also implies significant differences in the cluster distribution of the bar with respect to the rest of the galaxy, with the younger clusters being predominantly located in the bar. Having used the same setup, and data from the same surveys as for our previous study of the LMC, we are able to robustly compare the cluster properties between the two galaxies. Our results suggest that the bulk of the clusters in both galaxies were formed approximately 300 Myr ago, probably during a direct collision between the two galaxies. On the other hand, the locations of the young (≤50 Myr) clusters in both Magellanic Clouds, found where their bars join the H I arms, suggest that cluster formation in those regions is a result of internal dynamical processes. Finally, we discuss the potential causes of the apparent outside-in quenching of cluster formation that we observe in the SMC. Our findings are consistent with an evolutionary scheme where the interactions between the Magellanic Clouds constitute the major mechanism driving their overall evolution.
NASA Technical Reports Server (NTRS)
Mighell, Kenneth J.; Sarajedini, Ata; French, Rica S.
1998-01-01
We present our analysis of archival Hubble Space Telescope Wide Field Planetary Camera 2 (WFPC2) observations in F450W (approximately B) and F555W (approximately V) of the intermediate-age populous star clusters NGC 121, NGC 339, NGC 361, NGC 416, and Kron 3 in the Small Magellanic Cloud. We use published photometry of two other SMC populous star clusters, Lindsay 1 and Lindsay 113, to investigate the age sequence of these seven populous star clusters in order to improve our understanding of the formation chronology of the SMC. We analyzed the V vs B-V and M_V vs (B-V)_0 color-magnitude diagrams of these populous Small Magellanic Cloud star clusters using a variety of techniques and determined their ages, metallicities, and reddenings. These new data enable us to improve the age-metallicity relation of star clusters in the Small Magellanic Cloud. In particular, we find that a closed-box continuous star-formation model does not reproduce the age-metallicity relation adequately. However, a theoretical model punctuated by bursts of star formation is in better agreement with the observational data presented herein.
Molgenis-impute: imputation pipeline in a box.
Kanterakis, Alexandros; Deelen, Patrick; van Dijk, Freerk; Byelas, Heorhiy; Dijkstra, Martijn; Swertz, Morris A
2015-08-19
Genotype imputation is an important procedure in current genomic analyses such as genome-wide association studies, meta-analyses and fine mapping. Although high-quality tools are available that perform the steps of this process, considerable effort and expertise are required to set up and run a best-practice imputation pipeline, particularly for larger genotype datasets, where imputation has to scale out in parallel on computer clusters. Here we present MOLGENIS-impute, an 'imputation in a box' solution that seamlessly and transparently automates the setup and running of all the steps of the imputation process. These steps include genome build liftover (liftovering), genotype phasing with SHAPEIT2, quality control, sample and chromosomal chunking/merging, and imputation with IMPUTE2. MOLGENIS-impute builds on MOLGENIS-compute, a simple pipeline management platform for submission and monitoring of bioinformatics tasks in High Performance Computing (HPC) environments like local/cloud servers, clusters and grids. All the required tools, data and scripts are downloaded and installed in a single step. Researchers with diverse backgrounds and expertise have tested MOLGENIS-impute at different locations and imputed over 30,000 samples so far using the 1,000 Genomes Project and new Genome of the Netherlands data as the imputation reference. The tests have been performed on PBS/SGE clusters, cloud VMs and in a grid HPC environment. MOLGENIS-impute gives priority to the ease of setting up, configuring and running an imputation. It has minimal dependencies and wraps the pipeline in a simple command-line interface, without sacrificing flexibility to adapt or limiting the options of underlying imputation tools. It does not require knowledge of a workflow system or programming, and is targeted at researchers who just want to apply best practices in imputation via simple commands. It is built on the MOLGENIS compute workflow framework to enable customization with additional computational steps, or it can be included in other bioinformatics pipelines. It is available as open source from: https://github.com/molgenis/molgenis-imputation.
NASA Technical Reports Server (NTRS)
Goldenberg, Stanley B.; Houze, Robert A., Jr.; Churchill, Dean D.
1990-01-01
The horizontal precipitation structure of cloud clusters observed over the South China Sea during the Winter Monsoon Experiment (WMONEX) is analyzed using a convective-stratiform technique (CST) developed by Adler and Negri (1988). The technique was modified by altering the method for identifying convective cells in the satellite data, accounting for the extremely cold cloud tops characteristic of the WMONEX region, and modifying the threshold infrared temperature for the boundary of the stratiform rain area. The precipitation analysis was extended to the entire history of the cloud cluster by applying the modified CST to IR imagery from geosynchronous-satellite observations. The ship and aircraft data from the later period of the cluster's lifetime make it possible to check the locations of convective and stratiform precipitation identified by the CST using in situ observations. The extended CST is considered to be effective for determining the climatology of the convective-stratiform structure of tropical cloud clusters.
An AZTEC/ASTE 1.1mm Survey Of The Young, Dense, Nearby Star-forming Region, Serpens South
NASA Astrophysics Data System (ADS)
Gutermuth, Robert A.; Bourke, T.; Matthews, B.; Dunham, M.; Allen, L.; Myers, P.; Jorgensen, J.; Wilson, G.; Yun, M.; Hughes, D.; Aretxaga, I.; Ryohei, K.; Kotaro, K.; Scott, K.; Austermann, J.
2010-01-01
The Serpens South embedded cluster, recently discovered by the Spitzer Gould Belt Legacy Survey, stands out among over 100 clusters and groups surveyed by Spitzer as the densest (>430 pc^-2) and youngest (77% Class I protostars) clustered star-forming region known within the nearest 400 pc. In order to better characterize the primordial structure of the cluster's natal cloud, we have made a 1.1 mm dust continuum map of Serpens South with the AzTEC instrument on the 10 m Atacama Submillimeter Telescope Experiment (ASTE). The projected morphology of the emission is best described by a central dense hub with numerous 0.5 pc-long filaments radiating away from the center. Large-scale flux features that are typically removed via modern sky subtraction techniques are recovered using a novel iterative flux retrieval algorithm. Using standard assumptions (emissivity, dust-to-gas ratio, and T = 10 K), we compute the total mass of the Serpens South cloud core and filaments to be 480 Msun. We construct separate large- and small-scale structure maps via wavelet decomposition, and apply a watershed structure isolation technique separately to each map in order to isolate all empirically observed substructure. This technique confirms our qualitative observation that the filaments north of the hub are notably less clumpy than those to the south, while the total mass is similar between the two regions. Both regions have relatively small numbers of young stellar objects; thus we speculate that we have caught this cloud in the act of fragmenting into pre-stellar cores.
A holistic image segmentation framework for cloud detection and extraction
NASA Astrophysics Data System (ADS)
Shen, Dan; Xu, Haotian; Blasch, Erik; Horvath, Gregory; Pham, Khanh; Zheng, Yufeng; Ling, Haibin; Chen, Genshe
2013-05-01
Atmospheric clouds are commonly encountered phenomena affecting visual tracking from airborne or space-borne sensors. Generally clouds are difficult to detect and extract because they are complex in shape and interact with sunlight in a complex fashion. In this paper, we propose a clustering-game-theoretic image segmentation approach to identify, extract, and patch clouds. In our framework, the first step is to decompose a given image containing clouds. The problem of image segmentation is considered as a "clustering game". Within this context, the notion of a cluster is equivalent to a classical equilibrium concept from game theory, as the game equilibrium reflects both the internal and external (e.g., two-player) cluster conditions. To obtain the evolutionarily stable strategies, we explore three evolutionary dynamics: fictitious play, replicator dynamics, and infection and immunization dynamics (InImDyn). Secondly, we use the boundary and shape features to refine the cloud segments. This step can lower the false alarm rate. In the third step, we remove the detected clouds and patch the empty spots by performing background recovery. A demonstration of our cloud detection framework on a video clip provides supportive results.
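Of the three dynamics mentioned, replicator dynamics is the simplest to sketch: iterating the update on a pairwise similarity matrix concentrates the strategy vector on one cluster, i.e. one game equilibrium. The sketch below is a hedged illustration of that idea only; fictitious play, InImDyn, and the paper's actual feature extraction are not reproduced.

```python
# Hedged illustration of replicator dynamics on a pairwise similarity matrix A:
# the support of the stationary strategy vector identifies one cluster.
import numpy as np

def replicator_cluster(A, iters=2000, support_tol=1e-4):
    """A: (n, n) nonnegative symmetric similarity matrix with zero diagonal."""
    n = A.shape[0]
    x = np.full(n, 1.0 / n)                  # start from the barycentre
    for _ in range(iters):
        Ax = A @ x
        x = x * Ax / (x @ Ax + 1e-12)        # discrete replicator update
    return np.flatnonzero(x > support_tol)   # indices forming the cluster

# Toy example: two blocks of mutually similar regions; the denser block wins.
A = np.array([[0, 1, 1, 0.05, 0.05],
              [1, 0, 1, 0.05, 0.05],
              [1, 1, 0, 0.05, 0.05],
              [0.05, 0.05, 0.05, 0, 1],
              [0.05, 0.05, 0.05, 1, 0]], dtype=float)
print(replicator_cluster(A))                 # -> [0 1 2]
```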
Spatial Distribution of Large Cloud Drops
NASA Technical Reports Server (NTRS)
Marshak, A.; Knyazikhin, Y.; Larsen, M.; Wiscombe, W.
2004-01-01
By analyzing aircraft measurements of individual drop sizes in clouds, we have shown in a companion paper (Knyazikhin et al., 2004) that the probability of finding a drop of radius r at a linear scale l decreases as l^D(r), where 0 ≤ D(r) ≤ 1. This paper shows striking examples of the spatial distribution of large cloud drops using models that simulate the observed power laws. In contrast to currently used models that assume homogeneity and therefore a Poisson distribution of cloud drops, these models show strong drop clustering, the more so the larger the drops. The degree of clustering is determined by the observed exponents D(r). The strong clustering of large drops arises naturally from the observed power-law statistics. This clustering has vital consequences for rain physics, explaining how rain can form so fast. It also helps explain why remotely sensed cloud drop size is generally biased and why clouds absorb more sunlight than conventional radiative transfer models predict.
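One simple way to read the power law is that, for drops of a given size class, the probability of finding at least one drop in a cell of linear scale l behaves as l^D. A hedged sketch of such an estimate is given below; it illustrates the statistic rather than the companion paper's estimator, and the flight-track length, scales, and drop counts are arbitrary example values.

```python
# Illustration of the statistic, not the companion paper's estimator: for drops
# of one size class along a 1-D flight track, estimate D from how the
# probability of finding at least one drop in a cell of scale l varies with l,
# expecting P(l) ~ l^D with 0 <= D <= 1. Track length, scales, and drop counts
# are arbitrary example values.
import numpy as np

def occupancy_exponent(positions, track_length, scales):
    probs = []
    for l in scales:
        n_cells = max(1, int(round(track_length / l)))
        counts = np.histogram(positions, bins=n_cells, range=(0, track_length))[0]
        probs.append((counts > 0).mean())             # fraction of occupied cells
    slope, _ = np.polyfit(np.log(scales), np.log(probs), 1)
    return slope                                       # estimate of D

rng = np.random.default_rng(0)
drops = rng.uniform(0, 1000.0, size=200)               # unclustered (Poisson-like)
print(occupancy_exponent(drops, 1000.0, np.array([0.1, 0.2, 0.4, 0.8, 1.6])))
# Close to 1 for unclustered drops; strong clustering pushes the estimate below 1.
```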
Large-Scale, Multi-Sensor Atmospheric Data Fusion Using Hybrid Cloud Computing
NASA Astrophysics Data System (ADS)
Wilson, Brian; Manipon, Gerald; Hua, Hook; Fetzer, Eric
2014-05-01
NASA's Earth Observing System (EOS) is an ambitious facility for studying global climate change. The mandate now is to combine measurements from the instruments on the "A-Train" platforms (AIRS, AMSR-E, MODIS, MISR, MLS, and CloudSat) and other Earth probes to enable large-scale studies of climate change over decades. Moving to multi-sensor, long-duration analyses of important climate variables presents serious challenges for large-scale data mining and fusion. For example, one might want to compare temperature and water vapor retrievals from one instrument (AIRS) to another (MODIS), and to a model (ECMWF), stratify the comparisons using a classification of the "cloud scenes" from CloudSat, and repeat the entire analysis over 10 years of data. To efficiently assemble such datasets, we are utilizing Elastic Computing in the Cloud and parallel map-reduce-based algorithms. However, these problems involve data-intensive computing, so the data transfer times and storage costs (for caching) are key issues. SciReduce is a Hadoop-like parallel analysis system, programmed in parallel python, that is designed from the ground up for Earth science. SciReduce executes inside VMWare images and scales to any number of nodes in a hybrid Cloud (private eucalyptus & public Amazon). Unlike Hadoop, SciReduce operates on bundles of named numeric arrays, which can be passed in memory or serialized to disk in netCDF4 or HDF5. Multi-year datasets are automatically "sharded" by time and space across a cluster of nodes so that years of data (millions of files) can be processed in a massively parallel way. Input variables (arrays) are pulled on-demand into the Cloud using OPeNDAP URLs or other subsetting services, thereby minimizing the size of the cached input and intermediate datasets. We are using SciReduce to automate the production of multiple versions of a ten-year A-Train water vapor climatology under a NASA MEASURES grant. We will present the architecture of SciReduce, describe the achieved "clock time" speedups in fusing datasets on our own nodes and in the Cloud, and discuss the Cloud cost tradeoffs for storage, compute, and data transfer. We will also present a concept and prototype for staging NASA's A-Train Atmospheric datasets (Levels 2 & 3) in the Amazon Cloud so that any number of compute jobs can be executed "near" the multi-sensor data. Given such a system, multi-sensor climate studies over 10-20 years of data could be performed.
The FOSS GIS Workbench on the GFZ Load Sharing Facility compute cluster
NASA Astrophysics Data System (ADS)
Löwe, P.; Klump, J.; Thaler, J.
2012-04-01
Compute clusters can be used as GIS workbenches; their wealth of resources allows us to take on geocomputation tasks which exceed the limitations of smaller systems. Harnessing these capabilities requires a Geographic Information System (GIS) able to utilize the available cluster configuration/architecture, together with a sufficient degree of user friendliness to allow for wide application. In this paper we report on the first successful porting of GRASS GIS, the oldest and largest Free Open Source (FOSS) GIS project, onto a compute cluster using Platform Computing's Load Sharing Facility (LSF). In 2008, GRASS6.3 was installed on the GFZ compute cluster, which at that time comprised 32 nodes. The interaction with the GIS was limited to the command line interface, which required further development to encapsulate the GRASS GIS business layer and facilitate its use by users not familiar with GRASS GIS. During the summer of 2011, multiple versions of GRASS GIS (v 6.4, 6.5 and 7.0) were installed on the upgraded GFZ compute cluster, now consisting of 234 nodes with 480 CPUs providing 3084 cores. The GFZ compute cluster currently offers 19 different processing queues with varying hardware capabilities and priorities, allowing for fine-grained scheduling and load balancing. After successful testing of core GIS functionalities, including the graphical user interface, mechanisms were developed to deploy scripted geocomputation tasks onto dedicated processing queues. The mechanisms are based on earlier work by Neteler et al. (2008). A first application of the new GIS functionality was the generation of maps of simulated tsunamis in the Mediterranean Sea for the Tsunami Atlas of the FP-7 TRIDEC Project (www.tridec-online.eu). For this, up to 500 processing nodes were used in parallel. Further trials included the processing of geometrically complex problems requiring significant amounts of processing time. The GIS cluster successfully completed all these tasks, with processing times lasting up to a full 20 CPU-days. The deployment of GRASS GIS on a compute cluster allows our users to tackle GIS tasks previously out of reach of single workstations. In addition, this GRASS GIS cluster implementation will be made available to other users at GFZ in the course of 2012. It will thus become a research utility in the sense of "Software as a Service" (SaaS) and can be seen as our first step towards building a GFZ corporate cloud service.
Running Neuroimaging Applications on Amazon Web Services: How, When, and at What Cost?
Madhyastha, Tara M; Koh, Natalie; Day, Trevor K M; Hernández-Fernández, Moises; Kelley, Austin; Peterson, Daniel J; Rajan, Sabreena; Woelfer, Karl A; Wolf, Jonathan; Grabowski, Thomas J
2017-01-01
The contribution of this paper is to identify and describe current best practices for using Amazon Web Services (AWS) to execute neuroimaging workflows "in the cloud." Neuroimaging offers a vast set of techniques by which to interrogate the structure and function of the living brain. However, many of the scientists for whom neuroimaging is an extremely important tool have limited training in parallel computation. At the same time, the field is experiencing a surge in computational demands, driven by a combination of data-sharing efforts, improvements in scanner technology that allow acquisition of images with higher image resolution, and by the desire to use statistical techniques that stress processing requirements. Most neuroimaging workflows can be executed as independent parallel jobs and are therefore excellent candidates for running on AWS, but the overhead of learning to do so and determining whether it is worth the cost can be prohibitive. In this paper we describe how to identify neuroimaging workloads that are appropriate for running on AWS, how to benchmark execution time, and how to estimate cost of running on AWS. By benchmarking common neuroimaging applications, we show that cloud computing can be a viable alternative to on-premises hardware. We present guidelines that neuroimaging labs can use to provide a cluster-on-demand type of service that should be familiar to users, and scripts to estimate cost and create such a cluster.
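Following the paper's guidance (benchmark a representative workflow once, then project cost), a toy cost estimator might look like the sketch below. All prices and parameters are placeholders, not current AWS rates or the authors' numbers.

```python
# Toy cost estimator in the spirit of the paper's guidelines: benchmark the
# per-subject runtime once, then project compute and storage cost for a cohort.
# All prices and parameters are placeholders, not current AWS rates.
def estimate_cost(n_subjects, hours_per_subject, subjects_per_instance,
                  instance_usd_per_hour, storage_gb,
                  storage_usd_per_gb_month=0.02, months=1.0):
    instance_hours = n_subjects / subjects_per_instance * hours_per_subject
    compute = instance_hours * instance_usd_per_hour
    storage = storage_gb * storage_usd_per_gb_month * months
    return {"compute_usd": round(compute, 2),
            "storage_usd": round(storage, 2),
            "total_usd": round(compute + storage, 2)}

# Example: 100 subjects, 6 h each, 4 subjects per instance at $0.80/h,
# keeping 500 GB of derivatives for one month.
print(estimate_cost(100, 6.0, 4, 0.80, 500))
```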
NASA Technical Reports Server (NTRS)
Putnam, William M.
2011-01-01
Earth system models like the Goddard Earth Observing System model (GEOS-5) have been pushing the limits of large clusters of multi-core microprocessors, producing breathtaking fidelity in resolving cloud systems at a global scale. GPU computing presents an opportunity for improving the efficiency of these leading-edge models. A GPU implementation of GEOS-5 will facilitate the use of cloud-system-resolving resolutions, near 3.5 km, in data assimilation and weather prediction, improving our ability to extract detailed information from high-resolution satellite observations and ultimately produce better weather and climate predictions.
OceanXtremes: Scalable Anomaly Detection in Oceanographic Time-Series
NASA Astrophysics Data System (ADS)
Wilson, B. D.; Armstrong, E. M.; Chin, T. M.; Gill, K. M.; Greguska, F. R., III; Huang, T.; Jacob, J. C.; Quach, N.
2016-12-01
The oceanographic community must meet the challenge of rapidly identifying features and anomalies in complex and voluminous observations to further science and improve decision support. Given this data-intensive reality, we are developing an anomaly detection system, called OceanXtremes, powered by an intelligent, elastic Cloud-based analytic service backend that enables execution of domain-specific, multi-scale anomaly and feature detection algorithms across the entire archive of 15- to 30-year ocean science datasets. Our parallel analytics engine is extending the NEXUS system and exploits multiple open-source technologies: Apache Cassandra as a distributed spatial "tile" cache, Apache Spark for in-memory parallel computation, and Apache Solr for spatial search and storing pre-computed tile statistics and other metadata. OceanXtremes provides these key capabilities: Parallel generation (Spark on a compute cluster) of 15- to 30-year ocean climatologies (e.g. sea surface temperature or SST) in hours or overnight, using simple pixel averages or customizable Gaussian-weighted "smoothing" over latitude, longitude, and time; Parallel pre-computation, tiling, and caching of anomaly fields (daily variables minus a chosen climatology) with pre-computed tile statistics; Parallel detection (over the time-series of tiles) of anomalies or phenomena by regional area-averages exceeding a specified threshold (e.g. high SST in El Nino or SST "blob" regions), or more complex, custom data mining algorithms; Shared discovery and exploration of ocean phenomena and anomalies (facet search using Solr), along with unexpected correlations between key measured variables; Scalable execution for all capabilities on a hybrid Cloud, using our on-premise OpenStack Cloud cluster or at Amazon. The key idea is that the parallel data-mining operations will be run "near" the ocean data archives (a local "network" hop) so that we can efficiently access the thousands of files making up a three-decade time series. The presentation will cover the architecture of OceanXtremes, parallelization of the climatology computation and anomaly detection algorithms using Spark, example results for SST and other time-series, and parallel performance metrics.
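The core computation (climatology, anomaly field, threshold on a regional area average) can be shown in a simplified, single-machine Python sketch. The real system distributes this work with Spark over tiles cached in Cassandra; the array shapes and the threshold value below are illustrative assumptions.

```python
# Simplified single-machine sketch of the core computation (the real system
# distributes it with Spark over tiles cached in Cassandra): build a
# day-of-year climatology, subtract it to get anomalies, and flag times when
# the regional area-average anomaly exceeds a threshold.
import numpy as np

def detect_anomalies(sst, day_of_year, threshold=2.0):
    """sst: (time, lat, lon) array; day_of_year: (time,) ints; the threshold is
    in the same units as sst and the 2.0 value is illustrative."""
    clim = np.empty_like(sst)
    for d in np.unique(day_of_year):
        sel = day_of_year == d
        clim[sel] = np.nanmean(sst[sel], axis=0)      # per-day-of-year climatology
    anomaly = sst - clim
    area_mean = np.nanmean(anomaly, axis=(1, 2))       # regional area average
    return np.flatnonzero(area_mean > threshold), anomaly

# flagged, anom = detect_anomalies(sst_cube, doy)
# `flagged` holds time indices of candidate events (e.g. an SST "blob").
```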
Data Characterization Using Artificial-Star Tests: Performance Evaluation
NASA Astrophysics Data System (ADS)
Hu, Yi; Deng, Licai; de Grijs, Richard; Liu, Qiang
2011-01-01
Traditional artificial-star tests are widely applied to photometry in crowded stellar fields. However, to obtain reliable binary fractions (and their uncertainties) of remote, dense, and rich star clusters, one needs to recover huge numbers of artificial stars. Hence, this will consume much computation time for data reduction of the images to which the artificial stars must be added. In this article, we present a new method applicable to data sets characterized by stable, well-defined, point-spread functions, in which we add artificial stars to the retrieved-data catalog instead of to the raw images. Taking the young Large Magellanic Cloud cluster NGC 1818 as an example, we compare results from both methods and show that they are equivalent, while our new method saves significant computational time.
Optimizing R with SparkR on a commodity cluster for biomedical research.
Sedlmayr, Martin; Würfl, Tobias; Maier, Christian; Häberle, Lothar; Fasching, Peter; Prokosch, Hans-Ulrich; Christoph, Jan
2016-12-01
Medical researchers are challenged today by the enormous amount of data collected in healthcare. Analysis methods such as genome-wide association studies (GWAS) are often computationally intensive and thus require enormous resources to be performed in a reasonable amount of time. While dedicated clusters and public clouds may deliver the desired performance, their use requires upfront financial efforts or anonymous data, which is often not possible for preliminary or occasional tasks. We explored the possibility of building a private, flexible cluster for processing scripts in R based on commodity, non-dedicated hardware of our department. For this, a GWAS calculation in R on a single desktop computer, a Message Passing Interface (MPI) cluster, and a SparkR cluster were compared with regard to performance, scalability, quality, and simplicity. The original script had a projected runtime of three years on a single desktop computer. Optimizing the script in R already yielded a significant reduction in computing time (2 weeks). By using R-MPI and SparkR, we were able to parallelize the computation and reduce the time to less than three hours (2.6 h) on already available, standard office computers. While MPI is a proven approach in high-performance clusters, it requires rather static, dedicated nodes. SparkR and its Hadoop siblings allow for a dynamic, elastic environment with automated failure handling. SparkR also scales better with the number of nodes in the cluster than MPI due to optimized data communication. R is a popular environment for clinical data analysis. The new SparkR solution offers elastic resources and supports big data analysis using R even on non-dedicated resources with minimal change to the original code. To unleash the full potential, additional efforts should be invested to customize and improve the algorithms, especially with regard to data distribution. Copyright © 2016 The Authors. Published by Elsevier Ireland Ltd. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fukui, Y.; Torii, K.; Ohama, A.
We present distributions of two molecular clouds having velocities of 2 and 14 km s^-1 toward RCW 38, the youngest super star cluster in the Milky Way, in the 12CO J = 1–0 and 3–2 and 13CO J = 1–0 transitions. The two clouds are likely physically associated with the cluster as verified by the high intensity ratio of the J = 3–2 emission to the J = 1–0 emission, the bridging feature connecting the two clouds in velocity, and their morphological correspondence with the infrared dust emission. The velocity difference is too large for the clouds to be gravitationally bound. We frame a hypothesis that the two clouds are colliding with each other by chance to trigger formation of the ∼20 O stars that are localized within ∼0.5 pc of the cluster center in the 2 km s^-1 cloud. We suggest that the collision is currently continuing toward part of the 2 km s^-1 cloud where the bridging feature is localized. This is the third super star cluster alongside Westerlund 2 and NGC 3603 where cloud–cloud collision has triggered the cluster formation. RCW 38 is the youngest super star cluster in the Milky Way, holding a possible sign of on-going O star formation, and is a promising site where we may be able to witness the moment of O star formation.
Star cluster formation in a turbulent molecular cloud self-regulated by photoionization feedback
NASA Astrophysics Data System (ADS)
Gavagnin, Elena; Bleuler, Andreas; Rosdahl, Joakim; Teyssier, Romain
2017-12-01
Most stars in the Galaxy are believed to be formed within star clusters from collapsing molecular clouds. However, the complete process of star formation, from the parent cloud to a gas-free star cluster, is still poorly understood. We perform radiation-hydrodynamical simulations of the collapse of a turbulent molecular cloud using the RAMSES-RT code. Stars are modelled using sink particles, from which we self-consistently follow the propagation of the ionizing radiation. We study how different feedback models affect the gas expulsion from the cloud and how they shape the final properties of the emerging star cluster. We find that the star formation efficiency is lower for stronger feedback models. Feedback also changes the high-mass end of the stellar mass function. Stronger feedback also allows the establishment of a lower density star cluster, which can maintain a virial or sub-virial state. In the absence of feedback, the star formation efficiency is very high, as well as the final stellar density. As a result, high-energy close encounters make the cluster evaporate quickly. Other indicators, such as mass segregation, statistics of multiple systems and escaping stars confirm this picture. Observations of young star clusters are in best agreement with our strong feedback simulation.
On-demand Simulation of Atmospheric Transport Processes on the AlpEnDAC Cloud
NASA Astrophysics Data System (ADS)
Hachinger, S.; Harsch, C.; Meyer-Arnek, J.; Frank, A.; Heller, H.; Giemsa, E.
2016-12-01
The "Alpine Environmental Data Analysis Centre" (AlpEnDAC) develops a data-analysis platform for high-altitude research facilities within the "Virtual Alpine Observatory" project (VAO). This platform, with its web portal, will support use cases going much beyond data management: On user request, the data are augmented with "on-demand" simulation results, such as air-parcel trajectories for tracing down the source of pollutants when they appear in high concentration. The respective back-end mechanism uses the Compute Cloud of the Leibniz Supercomputing Centre (LRZ) to transparently calculate results requested by the user, as far as they have not yet been stored in AlpEnDAC. The queuing-system operation model common in supercomputing is replaced by a model in which Virtual Machines (VMs) on the cloud are automatically created/destroyed, providing the necessary computing power immediately on demand. From a security point of view, this allows to perform simulations in a sandbox defined by the VM configuration, without direct access to a computing cluster. Within few minutes, the user receives conveniently visualized results. The AlpEnDAC infrastructure is distributed among two participating institutes [front-end at German Aerospace Centre (DLR), simulation back-end at LRZ], requiring an efficient mechanism for synchronization of measured and augmented data. We discuss our iRODS-based solution for these data-management tasks as well as the general AlpEnDAC framework. Our cloud-based offerings aim at making scientific computing for our users much more convenient and flexible than it has been, and to allow scientists without a broad background in scientific computing to benefit from complex numerical simulations.
Large-scale virtual screening on public cloud resources with Apache Spark.
Capuccini, Marco; Ahmed, Laeeq; Schaal, Wesley; Laure, Erwin; Spjuth, Ola
2017-01-01
Structure-based virtual screening is an in-silico method to screen a target receptor against a virtual molecular library. Applying docking-based screening to large molecular libraries can be computationally expensive; however, it constitutes a trivially parallelizable task. Most of the available parallel implementations are based on the message passing interface (MPI), relying on low-failure-rate hardware and fast network connections. Google's MapReduce revolutionized large-scale analysis, enabling the processing of massive datasets on commodity hardware and cloud resources, providing transparent scalability and fault tolerance at the software level. Open source implementations of MapReduce include Apache Hadoop and the more recent Apache Spark. We developed a method to run existing docking-based screening software on distributed cloud resources, utilizing the MapReduce approach. We benchmarked our method, which is implemented in Apache Spark, docking a publicly available target receptor against approximately 2.2 million compounds. The performance experiments show a good parallel efficiency (87%) when running in a public cloud environment. Our method enables parallel structure-based virtual screening on public cloud resources or commodity computer clusters. The degree of scalability that we achieve allows our method to be tried on relatively small libraries first and then scaled to larger libraries. Our implementation is named Spark-VS and it is freely available as open source from GitHub (https://github.com/mcapuccini/spark-vs).
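The MapReduce pattern behind this kind of screening can be sketched in PySpark. This is not the Spark-VS API: run_docking is a stand-in for whatever docking engine is wrapped on the workers, and the input path, partition count, and hit count are arbitrary example values.

```python
# Sketch of the MapReduce pattern only, not the Spark-VS API: partition a
# compound library as an RDD, score each compound, keep the best hits.
# `run_docking` and "compounds.smi" are stand-ins.
from pyspark import SparkContext

def run_docking(smiles):
    """Placeholder for the wrapped docking engine; returns a score (lower = better)."""
    return float(len(smiles))          # dummy score so the sketch runs end to end

if __name__ == "__main__":
    sc = SparkContext(appName="docking-screen-sketch")
    library = sc.textFile("compounds.smi", minPartitions=100)   # one SMILES per line
    scored = library.map(lambda s: (run_docking(s), s))
    top_hits = scored.takeOrdered(1000, key=lambda t: t[0])     # 1000 best scores
    for score, smiles in top_hits[:5]:
        print(score, smiles)
    sc.stop()
```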
Automated cloud screening of AVHRR imagery using split-and-merge clustering
NASA Technical Reports Server (NTRS)
Gallaudet, Timothy C.; Simpson, James J.
1991-01-01
Previous methods to segment clouds from ocean in AVHRR imagery have shown varying degrees of success, with nighttime approaches being the most limited. An improved method of automatic image segmentation, the principal component transformation split-and-merge clustering (PCTSMC) algorithm, is presented and applied to cloud screening of both nighttime and daytime AVHRR data. The method combines spectral differencing, the principal component transformation, and split-and-merge clustering to sample objectively the natural classes in the data. This segmentation method is then augmented by supervised classification techniques to screen clouds from the imagery. Comparisons with other nighttime methods demonstrate its improved capability in this application. The sensitivity of the method to clustering parameters is presented; the results show that the method is insensitive to the split-and-merge thresholds.
Discovery of a loose star cluster in the Large Magellanic Cloud
NASA Astrophysics Data System (ADS)
Piatti, Andrés E.
2016-06-01
We present results for a hitherto uncatalogued star cluster projected towards the eastern side of the Large Magellanic Cloud (LMC) outer disc. The new object was discovered from a search for loose star clusters in the Magellanic Clouds' (MCs) outskirts using kernel density estimators on Washington CT1 deep images. Contrary to what would be commonly expected, the star cluster turned out to be a young object (log(t yr^-1) = 8.45) with a slightly subsolar metal content (Z = 0.013) and a total mass of 650 M⊙. Its core, half-mass, and tidal radii are also within the range of values typical of LMC star clusters. However, the new star cluster lies at the distance of the Small Magellanic Cloud and at 11.3 kpc from the LMC centre. We speculate on the possibility that it was born in the inner body of the LMC and soon after expelled into intergalactic space during the recent Milky Way/MCs interaction. Nevertheless, radial velocity and chemical abundance measurements are needed to further understand its origin, as well as an extensive search for loose star clusters in order to constrain the effectiveness of star cluster scattering during galaxy interactions.
High performance data transfer
NASA Astrophysics Data System (ADS)
Cottrell, R.; Fang, C.; Hanushevsky, A.; Kreuger, W.; Yang, W.
2017-10-01
The exponentially increasing need for high-speed data transfer is driven by big data and cloud computing, together with the needs of data-intensive science, High Performance Computing (HPC), defense, the oil and gas industry, etc. We report on the Zettar ZX software. This has been developed since 2013 to meet these growing needs by providing high-performance data transfer and encryption in a scalable, balanced way that is easy to deploy and use, while minimizing power and space utilization. In collaboration with several commercial vendors, Proofs of Concept (PoC) consisting of clusters have been put together using off-the-shelf components to test ZX's scalability and its ability to balance services across multiple cores and links. The PoCs are based on SSD flash storage that is managed by a parallel file system. Each cluster occupies 4 rack units. Using the PoCs, we have achieved almost 200 Gbps memory-to-memory between clusters over two 100 Gbps links, and 70 Gbps parallel-file-to-parallel-file with encryption over a 5000-mile 100 Gbps link.
The Cloud Area Padovana: from pilot to production
NASA Astrophysics Data System (ADS)
Andreetto, P.; Costa, F.; Crescente, A.; Dorigo, A.; Fantinel, S.; Fanzago, F.; Sgaravatto, M.; Traldi, S.; Verlato, M.; Zangrando, L.
2017-10-01
The Cloud Area Padovana has been running for almost two years. This is an OpenStack-based scientific cloud, spread across two different sites: the INFN Padova Unit and the INFN Legnaro National Labs. The hardware resources have been scaled horizontally and vertically, by upgrading some hypervisors and by adding new ones: currently it provides about 1100 cores. Some in-house developments were also integrated into the OpenStack dashboard, such as a tool for user and project registrations with direct support for the INFN-AAI Identity Provider as a new option for user authentication. In collaboration with the EU-funded Indigo DataCloud project, the integration with Docker-based containers has been experimented with and will be available in production soon. This computing facility now satisfies the computational and storage demands of more than 70 users affiliated with about 20 research projects. We present here the architecture of this Cloud infrastructure and the tools and procedures used to operate it. We also focus on the lessons learnt in these two years, describing the problems that were found and the corrective actions that had to be applied. We also discuss the chosen strategy for upgrades, which combines the need to promptly integrate new OpenStack developments, the demand to reduce the downtimes of the infrastructure, and the need to limit the effort requested for such updates. We also discuss how this Cloud infrastructure is being used. In particular we focus on two big physics experiments which are intensively exploiting this computing facility: CMS and SPES. CMS deployed on the cloud a complex computational infrastructure, composed of several user interfaces for job submission in the Grid environment/local batch queues or for interactive processes; this is fully integrated with the local Tier-2 facility. To avoid a static allocation of the resources, an elastic cluster, based on cernVM, has been configured: it allows virtual machines to be created and deleted automatically according to user needs. SPES, using a client-server system called TraceWin, exploits INFN's virtual resources, performing a very large number of simulations on about a thousand elastically managed nodes.
Leveraging the Cloud for Robust and Efficient Lunar Image Processing
NASA Technical Reports Server (NTRS)
Chang, George; Malhotra, Shan; Wolgast, Paul
2011-01-01
The Lunar Mapping and Modeling Project (LMMP) is tasked to aggregate lunar data, from the Apollo era to the latest instruments on the LRO spacecraft, into a central repository accessible by scientists and the general public. A critical function of this task is to provide users with the best solution for browsing the vast amounts of imagery available. The image files LMMP manages range from a few gigabytes to hundreds of gigabytes in size with new data arriving every day. Despite this ever-increasing amount of data, LMMP must make the data readily available in a timely manner for users to view and analyze. This is accomplished by tiling large images into smaller images using Hadoop, a distributed computing software platform implementation of the MapReduce framework, running on a small cluster of machines locally. Additionally, the software is implemented to use Amazon's Elastic Compute Cloud (EC2) facility. We also developed a hybrid solution to serve images to users by leveraging cloud storage using Amazon's Simple Storage Service (S3) for public data while keeping private information on our own data servers. By using Cloud Computing, we improve upon our local solution by reducing the need to manage our own hardware and computing infrastructure, thereby reducing costs. Further, by using a hybrid of local and cloud storage, we are able to provide data to our users more efficiently and securely. This paper examines the use of a distributed approach with Hadoop to tile images, an approach that provides significant improvements in image processing time, from hours to minutes. This paper describes the constraints imposed on the solution and the resulting techniques developed for the hybrid solution of a customized Hadoop infrastructure over local and cloud resources in managing this ever-growing data set. It examines the performance trade-offs of using the more plentiful resources of the cloud, such as those provided by S3, against the bandwidth limitations such use encounters with remote resources. As part of this discussion this paper will outline some of the technologies employed, the reasons for their selection, the resulting performance metrics and the direction the project is headed based upon the demonstrated capabilities thus far.
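The abstract above describes the tiling step only at a high level. As a rough, hypothetical illustration, the sketch below shows how such a step could be expressed as a Hadoop Streaming mapper in Python, with a made-up tab-separated image catalogue as input and an arbitrary 512-pixel tile size; it is not the LMMP code.

```python
#!/usr/bin/env python3
# Hypothetical Hadoop Streaming mapper: emit one tile task per sub-image.
# Input lines (invented for this sketch):  image_id<TAB>width<TAB>height
# Output lines: tile key and its pixel bounding box, for a later job to cut and store.
import sys

TILE = 512  # tile edge length in pixels (illustrative choice, not the LMMP value)

def tile_bounds(width, height, tile=TILE):
    """Yield (col, row, x0, y0, x1, y1) for every tile covering the image."""
    for row, y0 in enumerate(range(0, height, tile)):
        for col, x0 in enumerate(range(0, width, tile)):
            yield col, row, x0, y0, min(x0 + tile, width), min(y0 + tile, height)

def main():
    for line in sys.stdin:
        try:
            image_id, width, height = line.rstrip("\n").split("\t")
            width, height = int(width), int(height)
        except ValueError:
            continue  # skip malformed catalogue records
        for col, row, x0, y0, x1, y1 in tile_bounds(width, height):
            print(f"{image_id}/{col}/{row}\t{x0},{y0},{x1},{y1}")

if __name__ == "__main__":
    main()
```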
Short-term Power Load Forecasting Based on Balanced KNN
NASA Astrophysics Data System (ADS)
Lv, Xianlong; Cheng, Xingong; YanShuang; Tang, Yan-mei
2018-03-01
To improve the accuracy of load forecasting, a short-term load forecasting model based on the balanced KNN algorithm is proposed. According to the load characteristics, the massive historical power load data are divided into scenes by the K-means algorithm; in view of unbalanced load scenes, the balanced KNN algorithm is proposed to classify scenes accurately; the locally weighted linear regression algorithm is then used to fit and predict the load. Adopting the Apache Hadoop programming framework for cloud computing, the proposed model is parallelized to enhance its ability to deal with massive, high-dimensional data. Household electricity consumption data for a residential district are analyzed on a 23-node cloud computing cluster, and experimental results show that the load forecasting accuracy and execution time of the proposed model are better than those of traditional forecasting algorithms.
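As a rough, serial illustration of the pipeline described above (K-means scene partitioning, scene classification, locally weighted linear regression), the sketch below uses scikit-learn and synthetic data; the balanced-KNN class reweighting and the Hadoop parallelization are not reproduced, and all parameter values are placeholders.

```python
# Serial sketch of the pipeline: K-means scenes -> scene classification ->
# locally weighted linear regression.  Synthetic data and placeholder
# parameters; the balanced-KNN reweighting and Hadoop parallelization of the
# paper are not reproduced.
import numpy as np
from sklearn.cluster import KMeans

def fit_scenes(X, n_scenes=4, seed=0):
    km = KMeans(n_clusters=n_scenes, n_init=10, random_state=seed)
    return km, km.fit_predict(X)

def locally_weighted_forecast(X_scene, y_scene, x_query, tau=1.0):
    """Weighted least squares centred on the query day's load profile."""
    w = np.exp(-np.sum((X_scene - x_query) ** 2, axis=1) / (2 * tau ** 2))
    A = np.hstack([np.ones((len(X_scene), 1)), X_scene])        # add intercept column
    W = np.diag(w)
    theta = np.linalg.pinv(A.T @ W @ A) @ A.T @ W @ y_scene
    return np.concatenate([[1.0], x_query]) @ theta

# toy usage: random profiles stand in for hourly load history (n_days x 24)
rng = np.random.default_rng(0)
X = rng.random((200, 24))
y = X.sum(axis=1) + rng.normal(0, 0.1, 200)                     # next-day peak proxy
km, labels = fit_scenes(X)
x_new = rng.random(24)
scene = km.predict(x_new[None, :])[0]                           # classify the query day
print(locally_weighted_forecast(X[labels == scene], y[labels == scene], x_new))
```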
2010-04-29
Cloud Computing (Bingue and Cook, STSC 2010 briefing). "The answer, my friend, is blowing in the wind." Objectives: define the cloud; risks of cloud computing; essence of cloud computing; deployed clouds in DoD. Definitions of Cloud Computing: cloud computing is a model for enabling ...
MarDRe: efficient MapReduce-based removal of duplicate DNA reads in the cloud.
Expósito, Roberto R; Veiga, Jorge; González-Domínguez, Jorge; Touriño, Juan
2017-09-01
This article presents MarDRe, a de novo cloud-ready duplicate and near-duplicate removal tool that can process single- and paired-end reads from FASTQ/FASTA datasets. MarDRe takes advantage of the widely adopted MapReduce programming model to fully exploit Big Data technologies on cloud-based infrastructures. Written in Java to maximize cross-platform compatibility, MarDRe is built upon the open-source Apache Hadoop project, the most popular distributed computing framework for scalable Big Data processing. On a 16-node cluster deployed on the Amazon EC2 cloud platform, MarDRe is up to 8.52 times faster than a representative state-of-the-art tool. Source code in Java and Hadoop as well as a user's guide are freely available under the GNU GPLv3 license at http://mardre.des.udc.es . rreye@udc.es. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
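The following single-process sketch illustrates the map/reduce idea behind exact-duplicate removal (key each read by a sequence prefix, keep one read per key); it is not the MarDRe implementation, handles single-end FASTQ only, and ignores near-duplicates and quality-based selection of the representative read.

```python
# Single-process sketch of exact-duplicate removal for single-end FASTQ:
# "map" each read to a sequence-prefix key, "reduce" by keeping one read per
# key.  Not the MarDRe implementation; near-duplicates and quality-based
# selection of the representative read are ignored.
import sys
from collections import OrderedDict

def read_fastq(handle):
    """Yield (header, sequence, plus, quality) 4-line records."""
    while True:
        rec = [handle.readline().rstrip("\n") for _ in range(4)]
        if not rec[0]:
            return
        yield tuple(rec)

def dedup(records, prefix_len=50):
    kept = OrderedDict()
    for rec in records:
        key = rec[1][:prefix_len]      # key on the first prefix_len bases
        kept.setdefault(key, rec)      # first read seen for a key is kept
    return kept.values()

if __name__ == "__main__":
    for rec in dedup(read_fastq(sys.stdin)):
        print("\n".join(rec))
```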
NASA Astrophysics Data System (ADS)
Jensen, Sigurd S.; Haugbølle, Troels
2018-02-01
Hertzsprung-Russell diagrams of star-forming regions show a large luminosity spread. This is incompatible with well-defined isochrones based on classic non-accreting protostellar evolution models. Protostars do not evolve in isolation of their environment, but grow through accretion of gas. In addition, while an age can be defined for a star-forming region, the ages of individual stars in the region will vary. We show how the combined effect of a protostellar age spread, a consequence of sustained star formation in the molecular cloud, and time-varying protostellar accretion for individual protostars can explain the observed luminosity spread. We use a global magnetohydrodynamic simulation including a sub-scale sink particle model of a star-forming region to follow the accretion process of each star. The accretion profiles are used to compute stellar evolution models for each star, incorporating a model of how the accretion energy is distributed to the disc, radiated away at the accretion shock, or incorporated into the outer layers of the protostar. Using a modelled cluster age of 5 Myr, we naturally reproduce the luminosity spread and find good agreement with observations of the Collinder 69 cluster, and the Orion Nebular Cluster. It is shown how stars in binary and multiple systems can be externally forced creating recurrent episodic accretion events. We find that in a realistic global molecular cloud model massive stars build up mass over relatively long time-scales. This leads to an important conceptual change compared to the classic picture of non-accreting stellar evolution segmented into low-mass Hayashi tracks and high-mass Henyey tracks.
Hodor, Paul; Chawla, Amandeep; Clark, Andrew; Neal, Lauren
2016-01-01
Summary: One of the solutions proposed for addressing the challenge of the overwhelming abundance of genomic sequence and other biological data is the use of the Hadoop computing framework. Appropriate tools are needed to set up computational environments that facilitate research of novel bioinformatics methodology using Hadoop. Here, we present cl-dash, a complete starter kit for setting up such an environment. Configuring and deploying new Hadoop clusters can be done in minutes. Use of Amazon Web Services ensures no initial investment and minimal operation costs. Two sample bioinformatics applications help the researcher understand and learn the principles of implementing an algorithm using the MapReduce programming pattern. Availability and implementation: Source code is available at https://bitbucket.org/booz-allen-sci-comp-team/cl-dash.git. Contact: hodor_paul@bah.com PMID:26428290
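To illustrate "implementing an algorithm using the MapReduce programming pattern" mentioned above, here is a generic Hadoop Streaming style k-mer counter in Python; it is a sketch only and is not one of the cl-dash sample applications.

```python
#!/usr/bin/env python3
# Generic Hadoop Streaming style k-mer counter (sketch only; not a cl-dash
# sample application).  Run with "map" as the mapper and without arguments as
# the reducer; Hadoop sorts the mapper output by key in between.
import sys

K = 8  # k-mer length (illustrative)

def mapper():
    for line in sys.stdin:
        seq = line.strip().upper()
        if not seq or seq.startswith(">"):          # skip FASTA headers
            continue
        for i in range(len(seq) - K + 1):
            print(f"{seq[i:i + K]}\t1")

def reducer():
    current, total = None, 0
    for line in sys.stdin:                          # assumes key-sorted input
        kmer, count = line.rstrip("\n").split("\t")
        if kmer != current:
            if current is not None:
                print(f"{current}\t{total}")
            current, total = kmer, 0
        total += int(count)
    if current is not None:
        print(f"{current}\t{total}")

if __name__ == "__main__":
    mapper() if len(sys.argv) > 1 and sys.argv[1] == "map" else reducer()
```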
An Overview of Cloud Computing in Distributed Systems
NASA Astrophysics Data System (ADS)
Divakarla, Usha; Kumari, Geetha
2010-11-01
Cloud computing is an emerging trend in the field of distributed computing, having evolved from grid and distributed computing. The cloud plays an important role in large organizations by maintaining huge volumes of data with limited resources. The cloud also supports resource sharing through virtual machines provided by the cloud service provider. This paper gives an overview of cloud organization and some of the basic security issues pertaining to the cloud.
NASA Astrophysics Data System (ADS)
Portegies Zwart, S. F.; Chen, H.-C.
2008-06-01
We reconstruct the initial two-body relaxation time at the half-mass radius for a sample of young (≲ 300 Myr) star clusters in the Large Magellanic Cloud. We achieve this by simulating star clusters with 12288 to 131072 stars using direct N-body integration. The equations of motion of all stars are calculated with high-precision direct N-body simulations that include the effects of the evolution of single stars and binaries. We find that the initial relaxation times of the sample of observed clusters in the Large Magellanic Cloud range from about 200 Myr to about 2 Gyr. The reconstructed initial half-mass relaxation times for these clusters have a much narrower distribution than the currently observed distribution, which spans more than two orders of magnitude.
Analysis on the security of cloud computing
NASA Astrophysics Data System (ADS)
He, Zhonglin; He, Yuhua
2011-02-01
Cloud computing is a new technology arising from the fusion of computer technology and Internet development, and it will lead a revolution in the IT and information fields. However, in cloud computing, data and application software are stored at large data centers, and the management of data and services is not completely trustworthy, resulting in security problems that make it difficult to improve the quality of cloud services. This paper briefly introduces the concept of cloud computing and, considering its characteristics, constructs a security architecture for it. At the same time, with an eye toward the security threats cloud computing faces, several corresponding strategies are provided from the perspective of both cloud computing users and service providers.
VizieR Online Data Catalog: M33 molecular clouds and young stellar clusters (Corbelli+, 2017)
NASA Astrophysics Data System (ADS)
Corbelli, E.; Braine, J.; Bandiera, R.; Brouillet, N.; Combes, F.; Druard, C.; Gratier, P.; Mata, J.; Schuster, K.; Xilouris, M.; Palla, F.
2017-04-01
Table 5: Physical parameters for the 566 molecular clouds identified through the IRAM 30m CO J=2-1 survey of the star-forming disk of M33. For each cloud the cloud type and the following properties are listed: celestial coordinates, galactocentric radius, cloud deconvolved effective radius and its uncertainty, CO(2-1) line velocity dispersion from CPROPS and its uncertainty, line velocity dispersion from a Gaussian fit, CO luminous mass and its uncertainty, and virial mass from a Gaussian fit. In the last column the identification numbers of the young stellar cluster candidates associated with the molecular cloud are listed. Notes: We identify up to four young stellar cluster candidates (YSCCs) associated with each molecular cloud and we list them according to the identification number of Sharma et al. (2011, Cat. J/A+A/545/A96), given also in Table 6. Table 6: Physical parameters for the 630 young stellar cluster candidates identified via their mid-infrared emission in the star-forming disk of M33. For each YSCC we list the type of source, the identification numbers of the molecular clouds associated with it (if any) and the corresponding cloud classes. In addition, for each YSCC we give the celestial coordinates, the bolometric, total infrared, FUV and Halpha luminosities, the estimated mass and age, the visual extinction, the galactocentric radius, the source size, and its flux at 24μm. (2 data files).
Dense cloud cores revealed by CO in the low metallicity dwarf galaxy WLM.
Rubio, Monica; Elmegreen, Bruce G; Hunter, Deidre A; Brinks, Elias; Cortés, Juan R; Cigan, Phil
2015-09-10
Understanding stellar birth requires observations of the clouds in which stars form. These clouds are dense and self-gravitating, and in all existing observations they are molecular, with H2 the dominant species and carbon monoxide (CO) the best available tracer. When the abundances of carbon and oxygen are low compared with that of hydrogen, and the opacity from dust is also low, as in primeval galaxies and local dwarf irregular galaxies, CO forms slowly and is easily destroyed, so it is difficult for it to accumulate inside dense clouds. Here we report interferometric observations of CO clouds in the Local Group dwarf irregular galaxy Wolf-Lundmark-Melotte (WLM), which has a metallicity that is 13 per cent of the solar value and 50 per cent lower than the previous CO detection threshold. The clouds are tiny compared to the surrounding atomic and H2 envelopes, but they have typical densities and column densities for CO clouds in the Milky Way. The normal CO density explains why star clusters forming in dwarf irregulars have similar densities to star clusters in giant spiral galaxies. The low cloud masses suggest that these clusters will also be low mass, unless some galaxy-scale compression occurs, such as an impact from a cosmic cloud or other galaxy. If the massive metal-poor globular clusters in the halo of the Milky Way formed in dwarf galaxies, as is commonly believed, then they were probably triggered by such an impact.
NASA Astrophysics Data System (ADS)
Morikawa, Y.; Murata, K. T.; Watari, S.; Kato, H.; Yamamoto, K.; Inoue, S.; Tsubouchi, K.; Fukazawa, K.; Kimura, E.; Tatebe, O.; Shimojo, S.
2010-12-01
The main methodologies of Solar-Terrestrial Physics (STP) so far have been theoretical, experimental and observational, and computer simulation approaches. Recently, "informatics" has emerged as a new (fourth) approach to STP studies. Informatics is a methodology for analyzing large-scale data (observation data and computer simulation data) to obtain new findings using a variety of data processing techniques. At NICT (National Institute of Information and Communications Technology, Japan) we are now developing a new research environment named "OneSpaceNet". OneSpaceNet is a cloud-computing environment specialized for scientific work, which connects many researchers through a high-speed network (JGN: Japan Gigabit Network). The JGN is a wide-area backbone network operated by NICT; it provides a 10G network and many access points (AP) across Japan. OneSpaceNet also provides rich computing resources for research, such as supercomputers, a large-scale data storage area, licensed applications, visualization devices (such as a tiled display wall: TDW), database/DBMS, cluster computers (4-8 nodes) for data processing, and communication devices. A notable advantage of the science cloud is that a user only needs a low-cost terminal PC: once the PC is connected to JGN2plus, the user can make full use of the rich resources of the science cloud. Using communication devices such as video-conference systems, streaming and reflector servers, and media players, the users on OneSpaceNet can conduct research communications as if they belonged to a single laboratory: they are members of a virtual laboratory. The computer resources on OneSpaceNet are specified as follows. The data storage developed so far totals almost 1 PB, and the number of files managed on the cloud storage is growing and now exceeds 40,000,000. Notably, the disks forming the large-scale storage are distributed across 5 data centers over Japan, but the storage system performs as one disk. There are three supercomputers allocated on the cloud, one in Tokyo, one in Osaka and the other in Nagoya. Simulation job data from any of the supercomputers are saved to the same directory on the cloud data storage; it is a kind of virtual computing environment. The tiled display wall has 36 panels acting as one display; its pixel (resolution) size is as large as 18000x4300. This is enough to preview or analyze large-scale computer simulation data, and it also allows many researchers together to view multiple images (e.g., 100 pictures) on one screen. In our talk we also present a brief report of initial results using OneSpaceNet for Global MHD simulations as an example of successful use of our science cloud: (i) ultra-high time resolution visualization of Global MHD simulations on the large-scale storage and parallel processing system on the cloud, (ii) a database of real-time Global MHD simulations and statistical analyses of the data, and (iii) a 3D Web service of Global MHD simulations.
Washington photometry of 14 intermediate-age to old star clusters in the Small Magellanic Cloud
NASA Astrophysics Data System (ADS)
Piatti, Andrés E.; Clariá, Juan J.; Bica, Eduardo; Geisler, Doug; Ahumada, Andrea V.; Girardi, Léo
2011-10-01
We present CCD photometry in the Washington system C, T1 and T2 passbands down to T1˜ 23 in the fields of L3, L28, HW 66, L100, HW 79, IC 1708, L106, L108, L109, NGC 643, L112, HW 84, HW 85 and HW 86, 14 Small Magellanic Cloud (SMC) clusters, most of them poorly studied objects. We measured T1 magnitudes and C-T1 and T1-T2 colours for a total of 213 516 stars spread throughout cluster areas of 14.7 × 14.7 arcmin2 each. We carried out an in-depth analysis of the field star contamination of the colour-magnitude diagrams (CMDs) and statistically cleaned the cluster CMDs. Based on the best fits of isochrones computed by the Padova group to the (T1, C-T1) CMDs, as well as from the δ(T1) index and the standard giant branch procedure, we derived ages and metallicities for the cluster sample. With the exception of IC 1708, a relatively metal-poor Hyades-age cluster, the remaining 13 objects are between intermediate and old age (from 1.0 to 6.3 Gyr), their [Fe/H] values ranging from -1.4 to -0.7 dex. By combining these results with others available in the literature, we compiled a sample of 43 well-known SMC clusters older than 1 Gyr, with which we produced a revised age distribution. We found that the present clusters' age distribution reveals two primary excesses of clusters at t˜ 2 and 5 Gyr, which engraves the SMC with clear signs of enhanced formation episodes at both ages. In addition, we found that from the birth of the SMC cluster system until approximately the first 4 Gyr of its lifetime, the cluster formation resembles that of a constant formation rate scenario.
Running Neuroimaging Applications on Amazon Web Services: How, When, and at What Cost?
Madhyastha, Tara M.; Koh, Natalie; Day, Trevor K. M.; Hernández-Fernández, Moises; Kelley, Austin; Peterson, Daniel J.; Rajan, Sabreena; Woelfer, Karl A.; Wolf, Jonathan; Grabowski, Thomas J.
2017-01-01
The contribution of this paper is to identify and describe current best practices for using Amazon Web Services (AWS) to execute neuroimaging workflows “in the cloud.” Neuroimaging offers a vast set of techniques by which to interrogate the structure and function of the living brain. However, many of the scientists for whom neuroimaging is an extremely important tool have limited training in parallel computation. At the same time, the field is experiencing a surge in computational demands, driven by a combination of data-sharing efforts, improvements in scanner technology that allow acquisition of images with higher image resolution, and by the desire to use statistical techniques that stress processing requirements. Most neuroimaging workflows can be executed as independent parallel jobs and are therefore excellent candidates for running on AWS, but the overhead of learning to do so and determining whether it is worth the cost can be prohibitive. In this paper we describe how to identify neuroimaging workloads that are appropriate for running on AWS, how to benchmark execution time, and how to estimate cost of running on AWS. By benchmarking common neuroimaging applications, we show that cloud computing can be a viable alternative to on-premises hardware. We present guidelines that neuroimaging labs can use to provide a cluster-on-demand type of service that should be familiar to users, and scripts to estimate cost and create such a cluster. PMID:29163119
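The cost estimation the paper advocates reduces to simple arithmetic over instance counts, hourly prices, and per-job runtimes; the sketch below shows that back-of-envelope calculation with placeholder numbers, not values from the paper.

```python
# Back-of-envelope cost estimate for N independent neuroimaging jobs on AWS.
# Instance type, hourly price and per-job runtime are placeholders, not
# values from the paper.

def estimate_cost(n_jobs, minutes_per_job, jobs_per_instance,
                  price_per_hour, n_instances):
    """Return (wall-clock hours, on-demand cost) for an embarrassingly parallel workload."""
    jobs_per_wave = n_instances * jobs_per_instance
    waves = -(-n_jobs // jobs_per_wave)          # ceiling division
    hours = waves * minutes_per_job / 60.0
    return hours, hours * n_instances * price_per_hour

hours, dollars = estimate_cost(n_jobs=400, minutes_per_job=90,
                               jobs_per_instance=4,    # e.g. 4 subjects per node
                               price_per_hour=0.68,    # placeholder hourly rate
                               n_instances=25)
print(f"~{hours:.1f} h wall clock, ~${dollars:.0f} on-demand")
```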
Towards a Multi-Mission, Airborne Science Data System Environment
NASA Astrophysics Data System (ADS)
Crichton, D. J.; Hardman, S.; Law, E.; Freeborn, D.; Kay-Im, E.; Lau, G.; Oswald, J.
2011-12-01
NASA earth science instruments are increasingly relying on airborne missions. However, traditionally, there has been limited common infrastructure support available to principal investigators in the area of science data systems. As a result, each investigator has been required to develop their own computing infrastructures for the science data system. Typically there is little software reuse and many projects lack sufficient resources to provide a robust infrastructure to capture, process, distribute and archive the observations acquired from airborne flights. At NASA's Jet Propulsion Laboratory (JPL), we have been developing a multi-mission data system infrastructure for airborne instruments called the Airborne Cloud Computing Environment (ACCE). ACCE encompasses the end-to-end lifecycle covering planning, provisioning of data system capabilities, and support for scientific analysis in order to improve the quality, cost effectiveness, and capabilities to enable new scientific discovery and research in earth observation. This includes improving data system interoperability across each instrument. A principal characteristic is being able to provide an agile infrastructure that is architected to allow for a variety of configurations of the infrastructure from locally installed compute and storage services to provisioning those services via the "cloud" from cloud computer vendors such as Amazon.com. Investigators often have different needs that require a flexible configuration. The data system infrastructure is built on the Apache's Object Oriented Data Technology (OODT) suite of components which has been used for a number of spaceborne missions and provides a rich set of open source software components and services for constructing science processing and data management systems. In 2010, a partnership was formed between the ACCE team and the Carbon in Arctic Reservoirs Vulnerability Experiment (CARVE) mission to support the data processing and data management needs. A principal goal is to provide support for the Fourier Transform Spectrometer (FTS) instrument which will produce over 700,000 soundings over the life of their three-year mission. The cost to purchase and operate a cluster-based system in order to generate Level 2 Full Physics products from this data was prohibitive. Through an evaluation of cloud computing solutions, Amazon's Elastic Compute Cloud (EC2) was selected for the CARVE deployment. As the ACCE infrastructure is developed and extended to form an infrastructure for airborne missions, the experience of working with CARVE has provided a number of lessons learned and has proven to be important in reinforcing the unique aspects of airborne missions and the importance of the ACCE infrastructure in developing a cost effective, flexible multi-mission capability that leverages emerging capabilities in cloud computing, workflow management, and distributed computing.
Future of Department of Defense Cloud Computing Amid Cultural Confusion
2013-03-01
enterprise cloud-computing environment and transition to a public cloud service provider. Services have started the development of individual cloud-computing environments... endorsing cloud computing. It addresses related issues in matters of service culture changes and how strategic leaders will dictate the future of cloud... through data center consolidation and individual Service-provided cloud computing.
A curvature-based weighted fuzzy c-means algorithm for point clouds de-noising
NASA Astrophysics Data System (ADS)
Cui, Xin; Li, Shipeng; Yan, Xiutian; He, Xinhua
2018-04-01
In order to remove noise from three-dimensional scattered point clouds and smooth the data without damaging sharp geometric features, a novel algorithm is proposed in this paper. A feature-preserving weight is added to the fuzzy c-means algorithm, yielding a curvature-weighted fuzzy c-means clustering algorithm. Firstly, large-scale outliers are removed using statistics of the neighboring points within radius r. Then, the algorithm estimates the curvature of the point cloud data using a conicoid (parabolic) fitting method and calculates the curvature feature value. Finally, the proposed clustering algorithm is applied to calculate the weighted cluster centers, which are taken as the new, de-noised points. The experimental results show that this approach handles noise of different scales and intensities in point clouds with high precision while preserving features, and that it is robust to different noise models.
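A compact sketch of one curvature-weighted fuzzy c-means variant is given below; the per-point weight simply scales each point's influence on the cluster centres, and neither the paper's exact weighting scheme nor its conicoid-fitting curvature estimate is reproduced.

```python
# Sketch of one curvature-weighted fuzzy c-means variant: the per-point
# weight scales each point's influence on the cluster centres.  Neither the
# paper's exact weighting scheme nor its conicoid-fitting curvature estimate
# is reproduced here.
import numpy as np

def weighted_fcm(points, curvature, n_clusters=5, m=2.0, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), n_clusters, replace=False)]
    w = curvature / (curvature.sum() + 1e-12)             # per-point feature weight
    for _ in range(iters):
        d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        u = 1.0 / np.sum((d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1)), axis=2)
        um = (u ** m) * w[:, None]                         # curvature-weighted memberships
        centers = (um.T @ points) / um.sum(axis=0)[:, None]
    return u, centers                                      # centres act as the de-noised points

# toy usage: random points stand in for a scanned patch, random values for curvature
rng = np.random.default_rng(1)
pts = rng.normal(size=(300, 3))
curv = np.abs(rng.normal(size=300))
u, c = weighted_fcm(pts, curv)
print(c.shape, u.argmax(axis=1)[:10])
```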
THE JCMT GOULD BELT SURVEY: DENSE CORE CLUSTERS IN ORION A
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lane, J.; Kirk, H.; Johnstone, D.
The Orion A molecular cloud is one of the most well-studied nearby star-forming regions, and includes regions of both highly clustered and more dispersed star formation across its full extent. Here, we analyze dense, star-forming cores identified in the 850 and 450 μm SCUBA-2 maps from the JCMT Gould Belt Legacy Survey. We identify dense cores in a uniform manner across the Orion A cloud and analyze their clustering properties. Using two independent lines of analysis, we find evidence that clusters of dense cores tend to be mass segregated, suggesting that stellar clusters may have some amount of primordial mass segregation already imprinted in them at an early stage. We also demonstrate that the dense core clusters have a tendency to be elongated, perhaps indicating a formation mechanism linked to the filamentary structure within molecular clouds.
Modeling Jet and Outflow Feedback during Star Cluster Formation
NASA Astrophysics Data System (ADS)
Federrath, Christoph; Schrön, Martin; Banerjee, Robi; Klessen, Ralf S.
2014-08-01
Powerful jets and outflows are launched from the protostellar disks around newborn stars. These outflows carry enough mass and momentum to transform the structure of their parent molecular cloud and to potentially control star formation itself. Despite their importance, we have not been able to fully quantify the impact of jets and outflows during the formation of a star cluster. The main problem lies in limited computing power. We would have to resolve the magnetic jet-launching mechanism close to the protostar and at the same time follow the evolution of a parsec-size cloud for a million years. Current computer power and codes fall orders of magnitude short of achieving this. In order to overcome this problem, we implement a subgrid-scale (SGS) model for launching jets and outflows, which demonstrably converges and reproduces the mass, linear and angular momentum transfer, and the speed of real jets, with ~1000 times lower resolution than would be required without the SGS model. We apply the new SGS model to turbulent, magnetized star cluster formation and show that jets and outflows (1) eject about one-fourth of their parent molecular clump in high-speed jets, quickly reaching distances of more than a parsec, (2) reduce the star formation rate by about a factor of two, and (3) lead to the formation of ~1.5 times as many stars compared to the no-outflow case. Most importantly, we find that jets and outflows reduce the average star mass by a factor of ~ three and may thus be essential for understanding the characteristic mass of the stellar initial mass function.
NASA Astrophysics Data System (ADS)
Ogloblina, Daria; Schmidt, Steffen J.; Adams, Nikolaus A.
2018-06-01
Cavitation is a process where a liquid evaporates due to a pressure drop and re-condenses violently. Noise, material erosion and altered system dynamics characterize such a process, for which shock waves, rarefaction waves and vapor generation are typical phenomena. The current paper presents novel results for collapsing vapour-bubble clusters in a liquid environment close to a wall obtained by computational fluid mechanics (CFD) simulations. The driving pressure in the liquid is initially 10 MPa. Computations are carried out using a fully compressible single-fluid flow model in combination with a conservative finite volume method (FVM). The investigated bubble clusters (referred to as "clouds") differ in their initial vapor volume fractions, initial stand-off distances to the wall and initial bubble radii. The effects of collapse focusing due to bubble-bubble interaction are analysed by investigating the intensities and positions of individual bubble collapses, as well as the resulting shock-induced pressure field at the wall. Stronger interaction of the bubbles leads to an intensification of the collapse strength for individual bubbles, collapse focusing towards the center of the cloud and enhanced re-evaporation. The obtained results reveal collapse features which are common to all cases, as well as case-specific differences during collapse-rebound cycles. Simultaneous measurements of maximum pressures at the wall and within the flow field and of the vapor volume evolution show that not only the primary collapse but also subsequent collapses are potentially relevant for erosion.
NASA Technical Reports Server (NTRS)
Lin, Douglas N. C.; Murray, Stephen D.
1991-01-01
Based upon the observed properties of globular clusters and dwarf galaxies in the Local Group, we present important theoretical constraints on star formation in these systems. These constraints indicate that protoglobular cluster clouds had long dormant periods and a brief epoch of violent star formation. Collisions between protocluster clouds triggered fragmentation into individual stars. Most protocluster clouds dispersed into the Galactic halo during the star formation epoch. In contrast, the large spread in stellar metallicity in dwarf galaxies suggests that star formation in their progenitors was self-regulated: we propose that the protocluster clouds formed from thermal instability in the protogalactic clouds and show that a population of massive stars is needed to provide sufficient UV flux to prevent the collapsing protogalactic clouds from fragmenting into individual stars. Based upon these constraints, we propose a unified scenario to describe the early epochs of star formation in the Galactic halo as well as the thick and thin components of the Galactic disk.
A two-step initial mass function:. Consequences of clustered star formation for binary properties
NASA Astrophysics Data System (ADS)
Durisen, R. H.; Sterzik, M. F.; Pickett, B. K.
2001-06-01
If stars originate in transient bound clusters of moderate size, these clusters will decay due to dynamic interactions in which a hard binary forms and ejects most or all the other stars. When the cluster members are chosen at random from a reasonable initial mass function (IMF), the resulting binary characteristics do not match current observations. We find a significant improvement in the trends of binary properties from this scenario when an additional constraint is taken into account, namely that there is a distribution of total cluster masses set by the masses of the cloud cores from which the clusters form. Two distinct steps then determine final stellar masses - the choice of a cluster mass and the formation of the individual stars. We refer to this as a ``two-step'' IMF. Simple statistical arguments are used in this paper to show that a two-step IMF, combined with typical results from dynamic few-body system decay, tends to give better agreement between computed binary characteristics and observations than a one-step mass selection process.
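The "two-step" selection can be illustrated with a toy Monte Carlo: draw a cluster (cloud-core) mass first, then draw stellar masses from an IMF until that mass is used up, with the two most massive members standing in for the surviving hard binary. The power-law slopes and mass limits below are placeholders, not the values adopted in the paper.

```python
# Toy Monte Carlo of the "two-step" mass selection: draw a cluster
# (cloud-core) mass, then draw stellar masses from an IMF until that mass is
# used up.  Slopes and mass limits are placeholders, not the paper's values.
import numpy as np

rng = np.random.default_rng(42)

def sample_powerlaw(alpha, m_lo, m_hi):
    """Inverse-transform sample from dN/dm proportional to m**(-alpha) on [m_lo, m_hi]."""
    u = rng.random()
    a = 1.0 - alpha
    return (m_lo**a + u * (m_hi**a - m_lo**a)) ** (1.0 / a)

def two_step_cluster(alpha_core=1.7, alpha_imf=2.35):
    m_cluster = sample_powerlaw(alpha_core, 1.0, 50.0)     # step 1: cluster (core) mass
    stars, total = [], 0.0
    while total < m_cluster or len(stars) < 2:             # step 2: fill with stars
        m = sample_powerlaw(alpha_imf, 0.08, 10.0)
        stars.append(m)
        total += m
    return np.sort(stars)[::-1]

# the two most massive members stand in for the surviving hard binary
for _ in range(3):
    masses = two_step_cluster()
    print(f"N={len(masses):3d}  binary ~ {masses[0]:.2f} + {masses[1]:.2f} Msun")
```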
The Czech National Grid Infrastructure
NASA Astrophysics Data System (ADS)
Chudoba, J.; Křenková, I.; Mulač, M.; Ruda, M.; Sitera, J.
2017-10-01
The Czech National Grid Infrastructure is operated by MetaCentrum, a CESNET department responsible for coordinating and managing activities related to distributed computing. CESNET, as the Czech National Research and Education Network (NREN), provides many e-infrastructure services, which are used by 94% of the scientific and research community in the Czech Republic. Computing and storage resources owned by different organizations are connected by a sufficiently fast network to provide transparent access to all resources. We describe in more detail the computing infrastructure, which is based on several different technologies and covers grid, cloud and map-reduce environments. While the largest share of CPUs is still accessible via distributed TORQUE servers providing an environment for long batch jobs, part of the infrastructure is available via standard EGI tools, a subset of NGI resources is provided to the EGI FedCloud environment with a cloud interface, and a Hadoop cluster is provided by the same e-infrastructure. A broad spectrum of computing servers is offered; users can choose from standard 2-CPU servers to large SMP machines with up to 6 TB of RAM or servers with GPU cards. Different groups have different priorities on various resources, and resource owners can even have exclusive access. The software is distributed via AFS. Storage servers offering up to tens of terabytes of disk space to individual users are connected via NFS4 on top of GPFS, and access to long-term HSM storage with petabyte capacity is also provided. An overview of available resources and recent usage statistics will be given.
A Cloud-Computing Service for Environmental Geophysics and Seismic Data Processing
NASA Astrophysics Data System (ADS)
Heilmann, B. Z.; Maggi, P.; Piras, A.; Satta, G.; Deidda, G. P.; Bonomi, E.
2012-04-01
Cloud computing is establishing itself worldwide as a new high-performance computing paradigm that offers formidable possibilities to industry and science. The presented cloud-computing portal, part of the Grida3 project, provides an innovative approach to seismic data processing by combining open-source state-of-the-art processing software and cloud-computing technology, making possible the effective use of distributed computation and data management with administratively distant resources. We replaced demanding user-side hardware and software requirements with remote access to high-performance grid-computing facilities. As a result, data processing can be done quasi in real time, ubiquitously controlled via the Internet through a user-friendly web-browser interface. Besides the obvious advantages over locally installed seismic-processing packages, the presented cloud-computing solution creates completely new possibilities for scientific education, collaboration, and presentation of reproducible results. The web-browser interface of our portal is based on the commercially supported grid portal EnginFrame, an open framework based on Java, XML, and Web Services. We selected the hosted applications with the objective of allowing the construction of typical 2D time-domain seismic-imaging workflows as used for environmental studies and, originally, for hydrocarbon exploration. For data visualization and pre-processing, we chose the free software package Seismic Un*x. We ported tools for trace balancing, amplitude gaining, muting, frequency filtering, dip filtering, deconvolution and rendering, with a customized choice of options, as services onto the cloud-computing portal. For structural imaging and velocity-model building, we developed a grid version of the Common-Reflection-Surface stack, a data-driven imaging method that requires no user interaction at run time such as manual picking in prestack volumes or velocity spectra. Due to its high level of automation, CRS stacking can benefit largely from the hardware parallelism provided by the cloud deployment. The resulting output, consisting of post-stack section, coherence, and NMO-velocity panels, is used to generate a smooth migration-velocity model. Residual static corrections are calculated as a by-product of the stack and can be applied iteratively. As a final step, a time-migrated subsurface image is obtained by a parallelized Kirchhoff time migration scheme. Processing can be done step-by-step or using a graphical workflow editor that can launch a series of pipelined tasks. The status of the submitted jobs is monitored by a dedicated service. All results are stored in project directories, where they can be downloaded or viewed directly in the browser. Currently, the portal has access to three research clusters with a total of 70 nodes with 4 cores each. They are shared with four other cloud-computing applications bundled within the GRIDA3 project. To demonstrate the functionality of our "seismic cloud lab", we will present results obtained for three different types of data, all taken from hydrogeophysical studies: (1) a seismic reflection data set, made of compressional waves from explosive sources, recorded in Muravera, Sardinia; (2) a shear-wave data set from Sardinia; (3) a multi-offset Ground-Penetrating-Radar data set from Larreule, France. The presented work was funded by the government of the Autonomous Region of Sardinia and by the Italian Ministry of Research and Education.
Rotation in young massive star clusters
NASA Astrophysics Data System (ADS)
Mapelli, Michela
2017-05-01
Hydrodynamical simulations of turbulent molecular clouds show that star clusters form from the hierarchical merger of several sub-clumps. We run smoothed-particle hydrodynamics simulations of turbulence-supported molecular clouds with mass ranging from 1700 to 43 000 M⊙. We study the kinematic evolution of the main cluster that forms in each cloud. We find that the parent gas acquires significant rotation, because of large-scale torques during the process of hierarchical assembly. The stellar component of the embedded star cluster inherits the rotation signature from the parent gas. Only star clusters with final mass < few × 100 M⊙ do not show any clear indication of rotation. Our simulated star clusters have high ellipticity (˜0.4-0.5 at t = 4 Myr) and are subvirial (Qvir ≲ 0.4). The signature of rotation is stronger than radial motions due to subvirial collapse. Our results suggest that rotation is common in embedded massive (≳1000 M⊙) star clusters. This might provide a key observational test for the hierarchical assembly scenario.
Cloud Computing for radiologists
Kharat, Amit T; Safvi, Amjad; Thind, SS; Singh, Amarjit
2012-01-01
Cloud computing is a concept wherein a computer grid is created using the Internet with the sole purpose of utilizing shared resources such as computer software and hardware on a pay-per-use model. Using Cloud computing, radiology users can efficiently manage multimodality imaging units with the latest software and hardware without paying huge upfront costs. Cloud computing systems usually work on public, private, hybrid, or community models. Using the various components of a Cloud, such as applications, clients, infrastructure, storage, services, and processing power, Cloud computing can help imaging units rapidly scale and descale operations and avoid heavy spending on the maintenance of costly applications and storage. Cloud computing allows flexibility in imaging: it frees radiology from the confines of the hospital and creates a virtual mobile office. The downsides of Cloud computing involve security and privacy issues, which need to be addressed to ensure its success in the future. PMID:23599560
NASA Astrophysics Data System (ADS)
Jeon, Young-Beom; Nemec, James M.; Walker, Alistair R.; Kunder, Andrea M.
2014-06-01
Homogeneous B, V photometry is presented for 19,324 stars in and around 5 Magellanic Cloud globular clusters: NGC 1466, NGC 1841, NGC 2210, NGC 2257, and Reticulum. The photometry is derived from eight nights of CCD imaging with the Cerro Tololo Inter-American Observatory 0.9 m SMARTS telescope. Instrumental magnitudes were transformed to the Johnson B, V system using accurate calibration relations based on a large sample of Landolt-Stetson equatorial standard stars, which were observed on the same nights as the cluster stars. Residual analysis of the equatorial standards used for the calibration, and validation of the new photometry using Stetson's sample of secondary standards in the vicinities of the five Large Magellanic Cloud clusters, shows excellent agreement with our values in both magnitudes and colors. Color-magnitude diagrams reaching to the main-sequence turnoffs at V ~ 22 mag, sigma-magnitude diagrams, and various other summaries are presented for each cluster to illustrate the range and quality of the new photometry. The photometry should prove useful for future studies of the Magellanic Cloud globular clusters, particularly studies of their variable stars.
Multi-Scale Voxel Segmentation for Terrestrial Lidar Data within Marshes
NASA Astrophysics Data System (ADS)
Nguyen, C. T.; Starek, M. J.; Tissot, P.; Gibeaut, J. C.
2016-12-01
The resilience of marshes to a rising sea is dependent on their elevation response. Terrestrial laser scanning (TLS) is a detailed topographic approach for accurate, dense surface measurement with high potential for monitoring of marsh surface elevation response. The dense point cloud provides a 3D representation of the surface, which includes both terrain and non-terrain objects. Extraction of topographic information requires filtering of the data into like groups or classes; therefore, methods must be incorporated to identify structure in the data prior to creation of an end product. A voxel representation of three-dimensional space provides quantitative visualization and analysis for pattern recognition. The objectives of this study are threefold: 1) apply a multi-scale voxel approach to effectively extract geometric features from the TLS point cloud data, 2) investigate the utility of K-means and Self-Organizing Map (SOM) clustering algorithms for segmentation, and 3) utilize a variety of validity indices to measure the quality of the result. TLS data were collected at a marsh site along the central Texas Gulf Coast using a Riegl VZ-400 TLS. The site consists of both exposed and vegetated surface regions. To characterize the structure of the point cloud, octree segmentation is applied to create a tree data structure of voxels containing the points. The flexibility of voxels in size and point density makes this algorithm a promising candidate to locally extract statistical and geometric features of the terrain, including surface normal and curvature. The characteristics of the voxel itself, such as volume and point density, are also computed and assigned to each point, as are laser pulse characteristics. The features extracted from the voxelization are then used as input for clustering of the points using the K-means and SOM clustering algorithms. The optimal number of clusters is then determined based on evaluation of cluster separability criteria. Results for different combinations of the feature space vector and differences between K-means and SOM clustering will be presented. The developed method provides a novel approach for compressing TLS scene complexity in marshes, such as for vegetation biomass studies or erosion monitoring.
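A simplified sketch of the voxel-feature plus clustering idea is shown below, using fixed-size voxels instead of the octree of the study and K-means only (the SOM alternative and validity-index selection are omitted); feature choices and voxel size are illustrative.

```python
# Simplified voxel-feature extraction plus K-means segmentation for a TLS
# point cloud.  Fixed-size voxels replace the octree of the study; the SOM
# alternative and validity-index selection of cluster number are omitted.
import numpy as np
from sklearn.cluster import KMeans

def voxel_features(xyz, voxel=0.25):
    """Per-point features from the occupied voxel: density, planarity, surface orientation."""
    keys = np.floor(xyz / voxel).astype(int)
    feats = np.zeros((len(xyz), 3))
    _, inv, counts = np.unique(keys, axis=0, return_inverse=True, return_counts=True)
    for v in range(counts.size):
        idx = np.where(inv == v)[0]
        feats[idx, 0] = counts[v] / voxel**3                   # point density
        if len(idx) >= 3:
            evals, evecs = np.linalg.eigh(np.cov(xyz[idx].T))  # ascending eigenvalues
            feats[idx, 1] = (evals[1] - evals[0]) / (evals[2] + 1e-12)  # planarity
            feats[idx, 2] = abs(evecs[2, 0])                   # |z| component of local normal
    return feats

# toy usage: a flat "mud" layer plus a taller, diffuse "vegetation" layer
rng = np.random.default_rng(0)
xyz = np.vstack([rng.uniform(0, 10, (2000, 3)) * [1, 1, 0.05],
                 rng.uniform(0, 10, (2000, 3)) * [1, 1, 0.6]])
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(voxel_features(xyz))
print(np.bincount(labels))
```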
NASA Technical Reports Server (NTRS)
Xu, Kuan-Man
2016-01-01
During inactive phases of the Madden-Julian oscillation (MJO), there are plenty of deep but small convective systems and far fewer deep and large ones. During active phases of the MJO, an increase in the occurrence of large and deep cloud clusters results from an amplification of large-scale motions by stronger convective heating. This study is designed to quantitatively examine the roles of small and large cloud clusters during the MJO life cycle. We analyze the cloud object data from Aqua CERES observations for tropical deep convective (DC) and cirrostratus (CS) cloud object types according to the real-time multivariate MJO index. A cloud object is a contiguous region of the earth with a single dominant cloud-system type. The size distributions, defined as the footprint numbers as a function of cloud object diameter, for particular MJO phases depart greatly from the combined (8-phase) distribution at large cloud-object diameters due to the reduced/increased numbers of cloud objects related to changes in the large-scale environments. The median diameter corresponding to the combined distribution is determined and used to partition all cloud objects into "small" and "large" groups of a particular phase. The two groups corresponding to the combined distribution have nearly equal numbers of footprints. The median diameters are 502 km for DC and 310 km for cirrostratus. The range of the variation between two extreme phases (typically, the most active and depressed phases) for the small group is 6-11% in terms of the numbers of cloud objects and the total footprint numbers. The corresponding range for the large group is 19-44%. In terms of the probability density functions of radiative and cloud physical properties, there are virtually no differences between the MJO phases for the small group, but there are significant differences for the large group for both DC and CS types. These results suggest that the intraseasonal variation signals reside in the large cloud clusters, while the small cloud clusters represent background noise resulting from various types of tropical waves with different wavenumbers and propagation directions/speeds.
Bao, Riyue; Hernandez, Kyle; Huang, Lei; Kang, Wenjun; Bartom, Elizabeth; Onel, Kenan; Volchenboum, Samuel; Andrade, Jorge
2015-01-01
Whole exome sequencing has facilitated the discovery of causal genetic variants associated with human diseases at deep coverage and low cost. In particular, the detection of somatic mutations from tumor/normal pairs has provided insights into the cancer genome. Although there is an abundance of publicly available software for the detection of germline and somatic variants, concordance is generally limited among variant callers and alignment algorithms. Successful integration of variants detected by multiple methods requires in-depth knowledge of the software, access to high-performance computing resources, and advanced programming techniques. We present ExScalibur, a set of fully automated, highly scalable and modular pipelines for whole exome data analysis. The suite integrates multiple alignment and variant calling algorithms for the accurate detection of germline and somatic mutations with close to 99% sensitivity and specificity. ExScalibur implements streamlined execution of analytical modules, real-time monitoring of pipeline progress, robust handling of errors and intuitive documentation that allows for increased reproducibility and sharing of results and workflows. It runs on local computers, high-performance computing clusters and cloud environments. In addition, we provide a data analysis report utility to facilitate visualization of the results that offers interactive exploration of quality control files, read alignments and variant calls, assisting downstream customization of potential disease-causing mutations. ExScalibur is open source and is also available as a public image on the Amazon cloud.
NASA Astrophysics Data System (ADS)
Corsaro, Enrico; Lee, Yueh-Ning; García, Rafael A.; Hennebelle, Patrick; Mathur, Savita; Beck, Paul G.; Mathis, Stephane; Stello, Dennis; Bouvier, Jérôme
2017-10-01
Stars originate from the gravitational collapse of a turbulent molecular cloud out of a diffuse medium, and are often observed to form clusters. Stellar clusters therefore play an important role in our understanding of star formation and of the dynamical processes at play. However, investigating cluster formation is difficult because the density of the molecular cloud undergoes a change of many orders of magnitude. Hierarchical-step approaches that decompose the problem into different stages are therefore required, as well as reliable assumptions on the initial conditions in the clouds. We report for the first time the use of the full potential of NASA Kepler asteroseismic observations, coupled with 3D numerical simulations, to put strong constraints on the early formation stages of open clusters. Thanks to a Bayesian peak-bagging analysis of about 50 red giant members of NGC 6791 and NGC 6819, the two most populated open clusters observed in the nominal Kepler mission, we derive a complete set of detailed oscillation mode properties for each star, with thousands of oscillation modes characterized. We then show how these asteroseismic properties lead us to a discovery about the rotation history of stellar clusters. Finally, our observational findings will be compared with hydrodynamical simulations of stellar cluster formation to constrain the physical processes of turbulence, rotation, and magnetic fields that are in action during the collapse of the progenitor cloud into a proto-cluster.
Application of the SRI cloud-tracking technique to rapid-scan GOES observations
NASA Technical Reports Server (NTRS)
Wolf, D. E.; Endlich, R. M.
1980-01-01
An automatic cloud tracking system was applied to multilayer clouds associated with severe storms. The method was tested using rapid scan observations of Hurricane Eloise obtained by the GOES satellite on 22 September 1975. Cloud tracking was performed using clustering based either on visible or infrared data. The clusters were tracked using two different techniques. At 4 km and 8 km resolution, the automatic system yielded results comparable in accuracy and coverage to those obtained by NASA analysts using the Atmospheric and Oceanographic Information Processing System.
Design Patterns to Achieve 300x Speedup for Oceanographic Analytics in the Cloud
NASA Astrophysics Data System (ADS)
Jacob, J. C.; Greguska, F. R., III; Huang, T.; Quach, N.; Wilson, B. D.
2017-12-01
We describe how we achieve super-linear speedup over standard approaches for oceanographic analytics on a cluster computer and the Amazon Web Services (AWS) cloud. NEXUS is an open source platform for big data analytics in the cloud that enables this performance through a combination of horizontally scalable data parallelism with Apache Spark and rapid data search, subset, and retrieval with tiled array storage in cloud-aware NoSQL databases like Solr and Cassandra. NEXUS is the engine behind several public portals at NASA and OceanWorks is a newly funded project for the ocean community that will mature and extend this capability for improved data discovery, subset, quality screening, analysis, matchup of satellite and in situ measurements, and visualization. We review the Python language API for Spark and how to use it to quickly convert existing programs to use Spark to run with cloud-scale parallelism, and discuss strategies to improve performance. We explain how partitioning the data over space, time, or both leads to algorithmic design patterns for Spark analytics that can be applied to many different algorithms. We use NEXUS analytics as examples, including area-averaged time series, time averaged map, and correlation map.
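The partition-and-reduce pattern behind an area-averaged time series can be sketched in PySpark as follows; load_tile is a stand-in for the tile store (Solr/Cassandra in NEXUS) and is not the NEXUS API.

```python
# PySpark sketch of the partition-and-reduce pattern behind an area-averaged
# time series.  load_tile is a stand-in for the tile store (Solr/Cassandra in
# NEXUS) and is not the NEXUS API.
import numpy as np
from pyspark.sql import SparkSession

def load_tile(tile_id):
    """Placeholder: return (time step, 2-D variable array) for one spatial tile."""
    rng = np.random.default_rng(tile_id)
    return tile_id % 365, rng.normal(20.0, 2.0, (32, 32))

def partial_sum(tile):
    t, data = tile
    return t, (float(np.nansum(data)), int(np.isfinite(data).sum()))

if __name__ == "__main__":
    spark = SparkSession.builder.appName("area-averaged-time-series").getOrCreate()
    sc = spark.sparkContext
    tile_ids = range(10_000)                             # tiles covering the region and period
    series = (sc.parallelize(tile_ids, numSlices=64)     # horizontal data parallelism
                .map(load_tile)
                .map(partial_sum)
                .reduceByKey(lambda a, b: (a[0] + b[0], a[1] + b[1]))   # combine per time step
                .mapValues(lambda s: s[0] / s[1])        # sum / count -> area mean
                .sortByKey()
                .collect())
    print(series[:5])
    spark.stop()
```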
2012-05-01
NASA Nebula Platform: cloud computing pilot program at NASA Ames; integrates open-source components into a seamless, self-...; mission support; education and public outreach (NASA Nebula, 2010). NSF Supported Cloud Research: support for cloud computing in... References: Mell, P. & Grance, T. (2011). The NIST Definition of Cloud Computing. NIST Special Publication 800-145; NASA Nebula (2010). Retrieved from ...
Fast "swarm of detectors" and their application in cosmic rays
NASA Astrophysics Data System (ADS)
Shoziyoev, G. P.; Shoziyoev, Sh. P.
2017-06-01
New opportunities in science have appeared with the latest technology of the 21st century. This paper proposes a new architecture for detection systems with different characteristics in astrophysics and geophysics, using the latest technologies related to multicopter cluster systems, alternative energy sources, cluster technologies, cloud computing and big data. The idea of a quick-deployable, scalable, dynamic system of controlled drones carrying a small set of different detectors for measuring various components of extensive air showers in cosmic rays and for geophysics is very attractive. The development of this type of new system can also provide a multiplier effect for the development of various sciences and of research methods for observing natural phenomena.
Monte Carlo verification of radiotherapy treatments with CloudMC.
Miras, Hector; Jiménez, Rubén; Perales, Álvaro; Terrón, José Antonio; Bertolet, Alejandro; Ortiz, Antonio; Macías, José
2018-06-27
A new implementation has been made on CloudMC, a cloud-based platform presented in a previous work, in order to provide services for radiotherapy treatment verification by means of Monte Carlo in a fast, easy and economical way. A description of the architecture of the application and the new developments implemented is presented together with the results of the tests carried out to validate its performance. CloudMC has been developed over Microsoft Azure cloud. It is based on a map/reduce implementation for Monte Carlo calculations distribution over a dynamic cluster of virtual machines in order to reduce calculation time. CloudMC has been updated with new methods to read and process the information related to radiotherapy treatment verification: CT image set, treatment plan, structures and dose distribution files in DICOM format. Some tests have been designed in order to determine, for the different tasks, the most suitable type of virtual machines from those available in Azure. Finally, the performance of Monte Carlo verification in CloudMC is studied through three real cases that involve different treatment techniques, linac models and Monte Carlo codes. Considering computational and economic factors, D1_v2 and G1 virtual machines were selected as the default type for the Worker Roles and the Reducer Role respectively. Calculation times up to 33 min and costs of 16 € were achieved for the verification cases presented when a statistical uncertainty below 2% (2σ) was required. The costs were reduced to 3-6 € when uncertainty requirements are relaxed to 4%. Advantages like high computational power, scalability, easy access and pay-per-usage model, make Monte Carlo cloud-based solutions, like the one presented in this work, an important step forward to solve the long-lived problem of truly introducing the Monte Carlo algorithms in the daily routine of the radiotherapy planning process.
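The map/reduce splitting of a Monte Carlo run can be mimicked locally as below: each "worker" simulates a share of the histories with its own seed and the "reduce" step sums the dose grids. A toy random-walk pencil beam stands in for a real MC engine; this is not the CloudMC/Azure implementation.

```python
# Local mimic of the map/reduce split of a Monte Carlo dose run: each
# "worker" simulates a share of the histories with its own seed; the
# "reduce" step sums the dose grids.  A toy random-walk pencil beam stands in
# for a real MC engine; this is not the CloudMC/Azure implementation.
import numpy as np
from multiprocessing import Pool

GRID = (50, 50)

def simulate(args):
    seed, n_histories = args
    rng = np.random.default_rng(seed)
    dose = np.zeros(GRID)
    for _ in range(n_histories):
        x, y = GRID[0] // 2, 0
        while 0 <= x < GRID[0] and y < GRID[1]:
            dose[x, y] += 1.0                 # unit energy deposit per step
            x += rng.integers(-1, 2)          # lateral scatter: -1, 0 or +1
            y += 1
    return dose

def run(total_histories=200_000, workers=8):
    share = total_histories // workers
    with Pool(workers) as pool:                               # "map" phase
        partial = pool.map(simulate, [(seed, share) for seed in range(workers)])
    return sum(partial)                                       # "reduce" phase

if __name__ == "__main__":
    print("max dose voxel:", run().max())
```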
Prototype methodology for obtaining cloud seeding guidance from HRRR model data
NASA Astrophysics Data System (ADS)
Dawson, N.; Blestrud, D.; Kunkel, M. L.; Waller, B.; Ceratto, J.
2017-12-01
Weather model data, along with real time observations, are critical to determine whether atmospheric conditions are prime for super-cooled liquid water during cloud seeding operations. Cloud seeding groups can either use operational forecast models, or run their own model on a computer cluster. A custom weather model provides the most flexibility, but is also expensive. For programs with smaller budgets, openly-available operational forecasting models are the de facto method for obtaining forecast data. The new High-Resolution Rapid Refresh (HRRR) model (3 x 3 km grid size), developed by the Earth System Research Laboratory (ESRL), provides hourly model runs with 18 forecast hours per run. While the model cannot be fine-tuned for a specific area or edited to provide cloud-seeding-specific output, model output is openly available on a near-real-time basis. This presentation focuses on a prototype methodology for using HRRR model data to create maps which aid in near-real-time cloud seeding decision making. The R programming language is utilized to run a script on a Windows® desktop/laptop computer either on a schedule (such as every half hour) or manually. The latest HRRR model run is downloaded from NOAA's Operational Model Archive and Distribution System (NOMADS). A GRIB-filter service, provided by NOMADS, is used to obtain surface and mandatory pressure level data for a subset domain which greatly cuts down on the amount of data transfer. Then, a set of criteria, identified by the Idaho Power Atmospheric Science Group, is used to create guidance maps. These criteria include atmospheric stability (lapse rates), dew point depression, air temperature, and wet bulb temperature. The maps highlight potential areas where super-cooled liquid water may exist, reasons as to why cloud seeding should not be attempted, and wind speed at flight level.
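As an illustration of how such criteria turn model fields into a guidance map, the sketch below applies simple thresholds (dew point depression, lapse rate, wet-bulb temperature, flight-level wind) to gridded arrays. It is a Python analogue of the workflow, not the authors' R script, and all threshold values are invented placeholders rather than the Idaho Power criteria.

```python
# Sketch of the guidance-map step: apply threshold criteria to gridded HRRR-like fields
# and return a mask of potentially seedable areas. All thresholds are hypothetical.
import numpy as np

def seeding_guidance(t2m_c, td2m_c, t700_c, t850_c, wetbulb_c, wind_flight_ms):
    """Boolean grid marking where supercooled liquid water is plausible (illustrative)."""
    dewpoint_depression = t2m_c - td2m_c
    lapse_rate = (t850_c - t700_c) / 1.5                   # deg C per km between ~850 and ~700 hPa
    cold_enough = (wetbulb_c < 0.0) & (wetbulb_c > -15.0)  # assumed supercooled range
    moist_enough = dewpoint_depression < 3.0               # assumed near-saturated low levels
    unstable = lapse_rate > 5.5                            # assumed conditional instability
    safe_winds = wind_flight_ms < 30.0                     # assumed flight-level wind limit
    return cold_enough & moist_enough & unstable & safe_winds

# Tiny fake 3 x 3 grid standing in for a decoded GRIB subset from NOMADS.
shape = (3, 3)
mask = seeding_guidance(
    t2m_c=np.full(shape, -2.0), td2m_c=np.full(shape, -3.0),
    t700_c=np.full(shape, -14.0), t850_c=np.full(shape, -5.0),
    wetbulb_c=np.full(shape, -4.0), wind_flight_ms=np.full(shape, 20.0),
)
print(mask)
```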
From Head to Sword: The Clustering Properties of Stars in Orion
NASA Astrophysics Data System (ADS)
Gomez, Mercedes; Lada, Charles J.
1998-04-01
We investigate the structure in the spatial distributions of optically selected samples of young stars in the Head (lambda Orionis) and in the Sword (Orion A) regions of the constellation of Orion with the aid of stellar surface density maps and the two-point angular correlation function. The distributions of young stars in both regions are found to be nonrandom and highly clustered. Stellar surface density maps reveal three distinct clusters in the lambda Ori region. The two-point correlation function displays significant features at angular scales that correspond to the radii and separations of the three clusters identified in the surface density maps. Most young stars in the lambda Ori region (~80%) are presently found within these three clusters, consistent with the idea that the majority of young stars in this region were formed in dense protostellar clusters that have significantly expanded since their formation. Over a scale of ~0.05d-0.5d the correlation function is well described by a single power law that increases smoothly with decreasing angular scale. This suggests that, within the clusters, the stars either are themselves hierarchically clustered or have a volume density distribution that falls steeply with radius. The relative lack of Hα emission-line stars in the one cluster in this region that contains OB stars suggests a timescale for emission-line activity of less than 4 Myr around late-type stars in the cluster and may indicate that the lifetimes of protoplanetary disks around young stellar objects are reduced in clusters containing O stars. The spatial distribution of young stars in the Orion A region is considerably more complex. The angular correlation function of the OB stars (which are mostly foreground to the Orion A molecular cloud) is very similar to that of the Hα stars (which are located mostly within the molecular cloud) and significantly different from that of the young stars in the lambda Ori region. This suggests that, although spatially separated, both populations in the Orion A region may have originated from a similar fragmentation process. Stellar surface density maps and modeling of the angular correlation function suggest that somewhat less than half of the OB and Hα stars in the Orion A cloud are presently within well-defined stellar clusters. Although all the OB stars could have originated in rich clusters, a significant fraction of the Hα stars appear to have formed outside such clusters in a more spatially dispersed manner. The close similarity of the angular correlation functions of the OB and Hα stars toward the molecular cloud, in conjunction with the earlier indications of a relatively high star formation rate and high gas pressure in this cloud, is consistent with the idea that older, foreground OB stars triggered the current episode of star formation in the Orion A cloud. One of the OB clusters (Upper Sword) that is foreground to the cloud does not appear to be associated with any of the clusterings of emission-line stars, again suggesting a timescale (<4 Myr) for emission-line activity and disk lifetimes around late-type stars born in OB clusters.
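For readers unfamiliar with the statistic, a minimal sketch of a two-point angular correlation function estimator is given below, using the simple DD/RR - 1 form on a small flat field with a random comparison catalogue. The estimator choice, clumpy toy data, and binning are illustrative assumptions, not the analysis actually performed in the paper.

```python
# Toy two-point angular correlation function with the simple DD/RR - 1 estimator.
# Small angles, flat-field approximation; the clumpy "stars" are synthetic.
import numpy as np

def pair_count_hist(x1, y1, x2, y2, bins, autocorr=False):
    """Histogram of pairwise separations (degrees) between two point sets."""
    d = np.hypot(x1[:, None] - x2[None, :], y1[:, None] - y2[None, :])
    if autocorr:
        d = d[np.triu_indices_from(d, k=1)]          # unique pairs only
    return np.histogram(d.ravel(), bins=bins)[0]

def angular_correlation(x, y, bins, n_random=2000, rng=None):
    rng = rng or np.random.default_rng(1)
    xr = rng.uniform(x.min(), x.max(), n_random)      # random catalogue over the same field
    yr = rng.uniform(y.min(), y.max(), n_random)
    dd = pair_count_hist(x, y, x, y, bins, autocorr=True)
    rr = pair_count_hist(xr, yr, xr, yr, bins, autocorr=True)
    norm = (n_random * (n_random - 1)) / (len(x) * (len(x) - 1))
    return dd / rr * norm - 1.0

rng = np.random.default_rng(0)
centers = [(0.0, 0.0), (1.0, 0.5), (-0.8, 0.7)]       # three clumps in a toy field
x = np.concatenate([rng.normal(cx, 0.1, 200) for cx, _ in centers])
y = np.concatenate([rng.normal(cy, 0.1, 200) for _, cy in centers])
bins = np.logspace(-2, 0.5, 12)                       # ~0.01 to ~3 degrees
print(np.round(angular_correlation(x, y, bins), 2))
```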
Do Clouds Compute? A Framework for Estimating the Value of Cloud Computing
NASA Astrophysics Data System (ADS)
Klems, Markus; Nimis, Jens; Tai, Stefan
On-demand provisioning of scalable and reliable compute services, along with a cost model that charges consumers based on actual service usage, has been an objective in distributed computing research and industry for a while. Cloud Computing promises to deliver on this objective: consumers are able to rent infrastructure in the Cloud as needed, deploy applications and store data, and access them via Web protocols on a pay-per-use basis. The acceptance of Cloud Computing, however, depends on the ability for Cloud Computing providers and consumers to implement a model for business value co-creation. Therefore, a systematic approach to measure costs and benefits of Cloud Computing is needed. In this paper, we discuss the need for valuation of Cloud Computing, identify key components, and structure these components in a framework. The framework assists decision makers in estimating Cloud Computing costs and to compare these costs to conventional IT solutions. We demonstrate by means of representative use cases how our framework can be applied to real world scenarios.
Influence of gravity on inertial particle clustering in turbulence
NASA Astrophysics Data System (ADS)
Lu, J.; Nordsiek, H.; Saw, E. W.; Fugal, J. P.; Shaw, R. A.
2008-11-01
We report results from experiments aimed at studying inertial particles in homogeneous, isotropic turbulence, under the influence of gravitational settling. Conditions are selected to investigate the transition from negligible role of gravity to gravitationally dominated, as is expected to occur in atmospheric clouds. We measure droplet clustering, relative velocities, and the distribution of collision angles in this range. The experiments are carried out in a laboratory chamber with nearly homogeneous, isotropic turbulence. The turbulence is characterized using LDV and 2-frame holographic particle tracking velocimetry. We seed the flow with particles of various Stokes and Froude numbers and use digital holography to obtain 3D particle positions and velocities. From particle positions, we investigate the impact of gravity on inertial clustering through the calculation of the radial distribution function and we compare to computational results and other recent experiments.
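The clustering measure mentioned above, the radial distribution function g(r), can be sketched in a few lines for particles in a periodic box; the box size, particle count, and binning below are illustrative.

```python
# Radial distribution function g(r) for particles in a periodic cubic box.
import numpy as np

def radial_distribution(pos, box, r_bins):
    """g(r) from 3-D positions; values near 1 mean no clustering at that separation."""
    n = len(pos)
    d = pos[:, None, :] - pos[None, :, :]
    d -= box * np.round(d / box)                            # minimum-image convention
    r = np.sqrt((d ** 2).sum(-1))[np.triu_indices(n, k=1)]
    counts, edges = np.histogram(r, bins=r_bins)
    shell_vol = 4.0 / 3.0 * np.pi * (edges[1:] ** 3 - edges[:-1] ** 3)
    expected = (n * (n - 1) / 2.0) * shell_vol / box ** 3   # uniform-pair expectation
    return counts / expected

rng = np.random.default_rng(2)
uniform = rng.uniform(0.0, 1.0, size=(800, 3))              # non-clustered reference sample
r_bins = np.linspace(0.01, 0.3, 20)
print(np.round(radial_distribution(uniform, 1.0, r_bins), 2))
```

For a uniform (non-clustered) sample g(r) stays near 1; inertial clustering shows up as g(r) > 1 at small separations.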
Cloud Computing and Its Applications in GIS
NASA Astrophysics Data System (ADS)
Kang, Cao
2011-12-01
Cloud computing is a novel computing paradigm that offers highly scalable and highly available distributed computing services. The objectives of this research are to: 1. analyze and understand cloud computing and its potential for GIS; 2. discover the feasibilities of migrating truly spatial GIS algorithms to distributed computing infrastructures; 3. explore a solution to host and serve large volumes of raster GIS data efficiently and speedily. These objectives thus form the basis for three professional articles. The first article is entitled "Cloud Computing and Its Applications in GIS". This paper introduces the concept, structure, and features of cloud computing. Features of cloud computing such as scalability, parallelization, and high availability make it a very capable computing paradigm. Unlike High Performance Computing (HPC), cloud computing uses inexpensive commodity computers. The uniform administration systems in cloud computing make it easier to use than GRID computing. Potential advantages of cloud-based GIS systems such as lower barrier to entry are consequently presented. Three cloud-based GIS system architectures are proposed: public cloud- based GIS systems, private cloud-based GIS systems and hybrid cloud-based GIS systems. Public cloud-based GIS systems provide the lowest entry barriers for users among these three architectures, but their advantages are offset by data security and privacy related issues. Private cloud-based GIS systems provide the best data protection, though they have the highest entry barriers. Hybrid cloud-based GIS systems provide a compromise between these extremes. The second article is entitled "A cloud computing algorithm for the calculation of Euclidian distance for raster GIS". Euclidean distance is a truly spatial GIS algorithm. Classical algorithms such as the pushbroom and growth ring techniques require computational propagation through the entire raster image, which makes it incompatible with the distributed nature of cloud computing. This paper presents a parallel Euclidean distance algorithm that works seamlessly with the distributed nature of cloud computing infrastructures. The mechanism of this algorithm is to subdivide a raster image into sub-images and wrap them with a one pixel deep edge layer of individually computed distance information. Each sub-image is then processed by a separate node, after which the resulting sub-images are reassembled into the final output. It is shown that while any rectangular sub-image shape can be used, those approximating squares are computationally optimal. This study also serves as a demonstration of this subdivide and layer-wrap strategy, which would enable the migration of many truly spatial GIS algorithms to cloud computing infrastructures. However, this research also indicates that certain spatial GIS algorithms such as cost distance cannot be migrated by adopting this mechanism, which presents significant challenges for the development of cloud-based GIS systems. The third article is entitled "A Distributed Storage Schema for Cloud Computing based Raster GIS Systems". This paper proposes a NoSQL Database Management System (NDDBMS) based raster GIS data storage schema. NDDBMS has good scalability and is able to use distributed commodity computers, which make it superior to Relational Database Management Systems (RDBMS) in a cloud computing environment. In order to provide optimized data service performance, the proposed storage schema analyzes the nature of commonly used raster GIS data sets. 
It discriminates two categories of commonly used data sets, and then designs corresponding data storage models for both categories. As a result, the proposed storage schema is capable of hosting and serving enormous volumes of raster GIS data speedily and efficiently on cloud computing infrastructures. In addition, the scheme also takes advantage of the data compression characteristics of Quadtrees, thus promoting efficient data storage. Through this assessment of cloud computing technology, the exploration of the challenges and solutions to the migration of GIS algorithms to cloud computing infrastructures, and the examination of strategies for serving large amounts of GIS data in a cloud computing infrastructure, this dissertation lends support to the feasibility of building a cloud-based GIS system. However, there are still challenges that need to be addressed before a full-scale functional cloud-based GIS system can be successfully implemented. (Abstract shortened by UMI.)
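The subdivide-and-layer-wrap strategy can be sketched as follows, with two simplifications that differ from the paper: a chamfer (3-4) approximation stands in for the exact Euclidean transform, and the per-tile passes (each tile wrapped with a one-pixel layer of its neighbours' current distance values, and processable on a separate node) are simply repeated until nothing changes.

```python
# Sketch of the subdivide-and-layer-wrap idea with a chamfer (3-4) distance approximation.
import numpy as np

INF = 10 ** 9

def chamfer_sweeps(d):
    """Forward and backward chamfer sweeps over one (haloed) tile, in place."""
    h, w = d.shape
    fwd = ((-1, 0, 3), (0, -1, 3), (-1, -1, 4), (-1, 1, 4))
    bwd = ((1, 0, 3), (0, 1, 3), (1, 1, 4), (1, -1, 4))
    for order, offs in (((range(h), range(w)), fwd),
                        ((range(h - 1, -1, -1), range(w - 1, -1, -1)), bwd)):
        for i in order[0]:
            for j in order[1]:
                for di, dj, c in offs:
                    ii, jj = i + di, j + dj
                    if 0 <= ii < h and 0 <= jj < w:
                        d[i, j] = min(d[i, j], d[ii, jj] + c)
    return d

def tiled_distance(sources, tile=8, max_iter=50):
    """Distance (chamfer units / 3 ~ pixels) to the nearest True pixel in `sources`."""
    d = np.where(sources, 0, INF).astype(np.int64)
    h, w = d.shape
    for _ in range(max_iter):
        before = d.copy()
        for r0 in range(0, h, tile):
            for c0 in range(0, w, tile):
                r1, c1 = min(r0 + tile, h), min(c0 + tile, w)
                # wrap the tile with a one-pixel layer of neighbouring distance values
                rr0, cc0 = max(r0 - 1, 0), max(c0 - 1, 0)
                rr1, cc1 = min(r1 + 1, h), min(c1 + 1, w)
                block = chamfer_sweeps(d[rr0:rr1, cc0:cc1].copy())
                d[r0:r1, c0:c1] = block[r0 - rr0:block.shape[0] - (rr1 - r1),
                                        c0 - cc0:block.shape[1] - (cc1 - c1)]
        if np.array_equal(before, d):   # stop when no tile changed anything
            break
    return d / 3.0

src = np.zeros((20, 20), dtype=bool)
src[3, 4] = src[15, 16] = True
print(np.round(tiled_distance(src), 1)[0, :8])   # distances along the top row
```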
IBM Cloud Computing Powering a Smarter Planet
NASA Astrophysics Data System (ADS)
Zhu, Jinzy; Fang, Xing; Guo, Zhe; Niu, Meng Hua; Cao, Fan; Yue, Shuang; Liu, Qin Yu
With the increasing need for intelligent systems supporting the world's businesses, Cloud Computing has emerged as a dominant trend to provide a dynamic infrastructure to make such intelligence possible. This article introduces how to build a smarter planet with cloud computing technology. First, it explains why we need the cloud and reviews the evolution of cloud technology. Second, it analyzes the value of cloud computing and how to apply cloud technology. Finally, it predicts the future of the cloud in the smarter planet.
Cloud Computing Security Issue: Survey
NASA Astrophysics Data System (ADS)
Kamal, Shailza; Kaur, Rajpreet
2011-12-01
Cloud computing has been a growing field in the IT industry since it was proposed by IBM in 2007; other companies such as Google, Amazon, and Microsoft provide further cloud computing products. Cloud computing is Internet-based computing that shares resources and information on demand. It provides services such as SaaS, IaaS and PaaS. The services and resources are shared through virtualization, which runs multiple applications on the cloud infrastructure. This discussion surveys the security challenges that arise in cloud computing and describes some standards and protocols that show how security can be managed.
NASA Astrophysics Data System (ADS)
Yin, Gang; Zhang, Yingtang; Fan, Hongbo; Ren, Guoquan; Li, Zhining
2017-12-01
We have developed a method for automatically detecting UXO-like targets based on magnetic anomaly inversion and self-adaptive fuzzy c-means clustering. Magnetic anomaly inversion methods are used to estimate the initial locations of multiple UXO-like sources. Although these initial locations have some errors with respect to the real positions, they form dense clouds around the actual positions of the magnetic sources. Then we use the self-adaptive fuzzy c-means clustering algorithm to cluster these initial locations. The estimated number of cluster centroids represents the number of targets and the cluster centroids are regarded as the locations of magnetic targets. The effectiveness of the method has been demonstrated using synthetic datasets. Computational results show that the proposed method can be applied to the case of several UXO-like targets randomly scattered within a confined, shallow subsurface volume. A field test was carried out to test the validity of the proposed method and the experimental results show that the prearranged magnets can be detected unambiguously and located precisely.
REEF: Retainable Evaluator Execution Framework
Weimer, Markus; Chen, Yingda; Chun, Byung-Gon; Condie, Tyson; Curino, Carlo; Douglas, Chris; Lee, Yunseong; Majestro, Tony; Malkhi, Dahlia; Matusevych, Sergiy; Myers, Brandon; Narayanamurthy, Shravan; Ramakrishnan, Raghu; Rao, Sriram; Sears, Russell; Sezgin, Beysim; Wang, Julia
2015-01-01
Resource Managers like Apache YARN have emerged as a critical layer in the cloud computing system stack, but the developer abstractions for leasing cluster resources and instantiating application logic are very low-level. This flexibility comes at a high cost in terms of developer effort, as each application must repeatedly tackle the same challenges (e.g., fault-tolerance, task scheduling and coordination) and re-implement common mechanisms (e.g., caching, bulk-data transfers). This paper presents REEF, a development framework that provides a control-plane for scheduling and coordinating task-level (data-plane) work on cluster resources obtained from a Resource Manager. REEF provides mechanisms that facilitate resource re-use for data caching, and state management abstractions that greatly ease the development of elastic data processing work-flows on cloud platforms that support a Resource Manager service. REEF is being used to develop several commercial offerings such as the Azure Stream Analytics service. Furthermore, we demonstrate REEF development of a distributed shell application, a machine learning algorithm, and a port of the CORFU [4] system. REEF is also currently an Apache Incubator project that has attracted contributors from several institutions.
T-Check in System-of-Systems Technologies: Cloud Computing
2010-09-01
T-Check in System-of-Systems Technologies: Cloud Computing Harrison D. Strowd Grace A. Lewis September 2010 TECHNICAL NOTE CMU/SEI-2010... Cloud Computing 1 1.2 Types of Cloud Computing 2 1.3 Drivers and Barriers to Cloud Computing Adoption 5 2 Using the T-Check Method 7 2.1 T-Check...Hypothesis 3 25 3.4.2 Deployment View of the Solution for Testing Hypothesis 3 27 3.5 Selecting Cloud Computing Providers 30 3.6 Implementing the T-Check
The nature, origin and evolution of embedded star clusters
NASA Technical Reports Server (NTRS)
Lada, Charles J.; Lada, Elizabeth A.
1991-01-01
The recent development of imaging infrared array cameras has enabled the first systematic studies of embedded protoclusters in the galaxy. Initial investigations suggest that rich embedded clusters are quite numerous and that a significant fraction of all stars formed in the galaxy may begin their lives in such stellar systems. These clusters contain extremely young stellar objects and are important laboratories for star formation research. However, observational and theoretical considerations suggest that most embedded clusters do not survive emergence from molecular clouds as bound clusters. Understanding the origin, nature, and evolution of embedded clusters requires understanding the intimate physical relation between embedded clusters and the dense molecular cloud cores from which they form.
2010-07-01
Cloud computing , an emerging form of computing in which users have access to scalable, on-demand capabilities that are provided through Internet... cloud computing , (2) the information security implications of using cloud computing services in the Federal Government, and (3) federal guidance and...efforts to address information security when using cloud computing . The complete report is titled Information Security: Federal Guidance Needed to
Disruption of Giant Molecular Clouds by Massive Star Clusters
NASA Astrophysics Data System (ADS)
Harper-Clark, Elizabeth
The lifetime of a Giant Molecular Cloud (GMC) and the total mass of stars that form within it are crucial to the understanding of star formation rates across a whole galaxy. In particular, the stars within a GMC may dictate its disruption and the quenching of further star formation. Indeed, observations show that the Milky Way contains GMCs with extensive expanding bubbles while the most massive stars are still alive. Simulating entire GMCs is challenging, due to the large variety of physics that needs to be included, and the computational power required to accurately simulate a GMC over tens of millions of years. Using the radiative-magneto-hydrodynamic code Enzo, I have run many simulations of GMCs. I obtain robust results for the fraction of gas converted into stars and the lifetimes of the GMCs: (A) In simulations with no stellar outputs (or "feedback''), clusters form at a rate of 30% of GMC mass per free fall time; the GMCs were not disrupted but contained forming stars. (B) Including ionization gas pressure or radiation pressure into the simulations, both separately and together, the star formation was quenched at between 5% and 21% of the original GMC mass. The clouds were fully disrupted within two dynamical times after the first cluster formed. The radiation pressure contributed the most to the disruption of the GMC and fully quenched star formation even without ionization. (C) Simulations that included supernovae showed that they are not dynamically important to GMC disruption and have only minor effects on subsequent star formation. (D) The inclusion of a few micro Gauss magnetic field across the cloud slightly reduced the star formation rate but accelerated GMC disruption by reducing bubble shell disruption and leaking. These simulations show that new born stars quench further star formation and completely disrupt the parent GMC. The low star formation rate and the short lifetimes of GMCs shown here can explain the low star formation rate across the whole galaxy.
Retrieval of cloud cover parameters from multispectral satellite images
NASA Technical Reports Server (NTRS)
Arking, A.; Childs, J. D.
1985-01-01
A technique is described for extracting cloud cover parameters from multispectral satellite radiometric measurements. Utilizing three channels from the AVHRR (Advanced Very High Resolution Radiometer) on NOAA polar orbiting satellites, it is shown that one can retrieve four parameters for each pixel: cloud fraction within the FOV, optical thickness, cloud-top temperature and a microphysical model parameter. The last parameter is an index representing the properties of the cloud particle and is determined primarily by the radiance at 3.7 microns. The other three parameters are extracted from the visible and 11 micron infrared radiances, utilizing the information contained in the two-dimensional scatter plot of the measured radiances. The solution is essentially one in which the distributions of optical thickness and cloud-top temperature are maximally clustered for each region, with cloud fraction for each pixel adjusted to achieve maximal clustering.
Risk in the Clouds?: Security Issues Facing Government Use of Cloud Computing
NASA Astrophysics Data System (ADS)
Wyld, David C.
Cloud computing is poised to become one of the most important and fundamental shifts in how computing is consumed and used. Forecasts show that government will play a lead role in adopting cloud computing - for data storage, applications, and processing power, as IT executives seek to maximize their returns on limited procurement budgets in these challenging economic times. After an overview of the cloud computing concept, this article explores the security issues facing public sector use of cloud computing and looks to the risk and benefits of shifting to cloud-based models. It concludes with an analysis of the challenges that lie ahead for government use of cloud resources.
A Review Study on Cloud Computing Issues
NASA Astrophysics Data System (ADS)
Kanaan Kadhim, Qusay; Yusof, Robiah; Sadeq Mahdi, Hamid; Al-shami, Sayed Samer Ali; Rahayu Selamat, Siti
2018-05-01
Cloud computing is the most promising current implementation of utility computing in the business world, because it provides some key features over classic utility computing, such as elasticity, which allows clients to dynamically scale resources up and down at execution time. Nevertheless, cloud computing is still at a premature stage and suffers from a lack of standardization. Security issues are the main challenges to cloud computing adoption. Thus, critical industries such as government organizations (ministries) are reluctant to trust cloud computing for fear of losing their sensitive data, as the data reside in the cloud with no knowledge of their location and a lack of transparency about the mechanisms Cloud Service Providers (CSPs) use to secure data and applications, which has created a barrier against adopting this agile computing paradigm. This study aims to review and classify the issues that surround the implementation of cloud computing, which is an active area that needs to be addressed by future research.
CO near the Pleiades: Encounter of a star cluster with a small molecular cloud
NASA Technical Reports Server (NTRS)
Bally, J.; White, R. E.
1986-01-01
Although there is a large amount of interstellar matter near the Pleiades star cluster, the observed dust and gas is not a remnant of the placental molecular cloud from which the star cluster was formed. Carbon monoxide (CO) associated with the visible reflection nebulae was discovered by Cohen (1975). Its radial velocity differs from that of the cluster by many times the cluster escape velocity, which implies that the cloud-cluster association is the result of a chance encounter. This circumstance and the proximity of the Pleiades to the sun create a unique opportunity for study of interstellar processes at high spatial resolution. To study the molecular component of the gas, a 1.7 square degree field was mapped with the AT&T Bell Laboratories 7-meter antenna (1.7' beam) on a 1' grid in the J=1-0 C(12)O line, obtaining over 6,000 spectra with 50 kHz resolution. The cloud core was mapped in the J=1-0 line of C(13)O. Further observations include an unsuccessful search for CS (J=2-1) at AT&T BL, and some C(12)O J=2-1 spectra obtained at the Millimeter Wave Observatory of the University of Texas.
Did the Infant R136 and NGC 3603 Clusters Undergo Residual Gas Expulsion?
NASA Astrophysics Data System (ADS)
Banerjee, Sambaran; Kroupa, Pavel
2013-02-01
Based on kinematic data observed for very young, massive clusters that appear to be in dynamical equilibrium, it has recently been argued that such young systems are examples of where the early residual gas expulsion did not happen or had no dynamical effect. The intriguing scenario of a star cluster forming through a single starburst has thereby been challenged. Choosing the case of the R136 cluster of the Large Magellanic Cloud, the most cited one in this context, we perform direct N-body computations that mimic the early evolution of this cluster including the gas-removal phase (on a thermal timescale). Our calculations show that under plausible initial conditions which are consistent with observational data, a large fraction (>60%) of a gas-expelled, expanding R136-like cluster is bound to regain dynamical equilibrium by its current age. Therefore, the recent measurements of velocity dispersion in the inner regions of R136, which indicate that the cluster is in dynamical equilibrium, are consistent with an earlier substantial gas expulsion of R136 followed by a rapid re-virialization (in ≈1 Myr). Additionally, we find that the less massive Galactic NGC 3603 Young Cluster (NYC), with a substantially longer re-virialization time, is likely to be found to have deviated from dynamical equilibrium at its present age (≈1 Myr). The recently obtained stellar proper motions in the central part of the NYC indeed suggest this and are consistent with the computed models. This work significantly extends previous models of the Orion Nebula Cluster which already demonstrated that the re-virialization time of young post-gas-expulsion clusters decreases with increasing pre-expulsion density.
Federal Register 2010, 2011, 2012, 2013, 2014
2013-09-04
...--Intersection of Cloud Computing and Mobility Forum and Workshop AGENCY: National Institute of Standards and.../intersection-of-cloud-and-mobility.cfm . SUPPLEMENTARY INFORMATION: NIST hosted six prior Cloud Computing Forum... interoperability, portability, and security, discuss the Federal Government's experience with cloud computing...
Embracing the Cloud: Six Ways to Look at the Shift to Cloud Computing
ERIC Educational Resources Information Center
Ullman, David F.; Haggerty, Blake
2010-01-01
Cloud computing is the latest paradigm shift for the delivery of IT services. Where previous paradigms (centralized, decentralized, distributed) were based on fairly straightforward approaches to technology and its management, cloud computing is radical in comparison. The literature on cloud computing, however, suffers from many divergent…
Promoting Interests in Atmospheric Science at a Liberal Arts Institution
NASA Astrophysics Data System (ADS)
Roussev, S.; Sherengos, P. M.; Limpasuvan, V.; Xue, M.
2007-12-01
Coastal Carolina University (CCU) students in Computer Science participated in a project to set up an operational weather forecast for the local community. The project involved the construction of two computing clusters and the automation of daily forecasting. Funded by NSF-MRI, two high-performance clusters were successfully established to run the University of Oklahoma's Advanced Regional Prediction System (ARPS). Daily weather predictions are made over South Carolina and North Carolina at 3-km horizontal resolution (roughly 1.9 miles) using initial and boundary condition data provided by UNIDATA. At this high resolution, the model is cloud-resolving, thus providing a detailed picture of heavy thunderstorms and precipitation. Forecast results are displayed on CCU's website (https://marc.coastal.edu/HPC) to complement observations at the National Weather Service in Wilmington, N.C. Present efforts include providing forecasts at 1-km resolution (or finer), comparisons with other models like the Weather Research and Forecasting (WRF) model, and the examination of local phenomena (like water spouts and tornadoes). Through these activities the students learn about shell scripting, cluster operating systems, and web design. More importantly, students are introduced to Atmospheric Science, the processes involved in making weather forecasts, and the interpretation of their forecasts. Simulations generated by the forecasts will be integrated into the contents of CCU courses such as Fluid Dynamics, Atmospheric Sciences, Atmospheric Physics, and Remote Sensing. Operated jointly between the departments of Applied Physics and Computer Science, the clusters are expected to be used by CCU faculty and students for future research and inquiry-based projects in Computer Science, Applied Physics, and Marine Science.
NASA Astrophysics Data System (ADS)
Yagi, Masafumi; Yoshida, Michitoshi; Komiyama, Yutaka; Kashikawa, Nobunari; Furusawa, Hisanori; Okamura, Sadanori; Graham, Alister W.; Miller, Neal A.; Carter, David; Mobasher, Bahram; Jogee, Shardha
2010-12-01
We present images of extended Hα clouds associated with 14 member galaxies in the Coma cluster obtained from deep narrowband imaging observations with the Suprime-Cam at the Subaru Telescope. The parent galaxies of the extended Hα clouds are distributed farther than 0.2 Mpc from the peak of the X-ray emission of the cluster. Most of the galaxies are bluer than g - r ≈ 0.5 and they account for 57% of the blue (g - r < 0.5) bright (r < 17.8 mag) galaxies in the central region of the Coma cluster. They reside near the red- and blueshifted edges of the radial velocity distribution of Coma cluster member galaxies. Our findings suggest that most of the parent galaxies were recently captured by the Coma cluster potential and are now infalling toward the cluster center with their disk gas being stripped off and producing the observed Hα clouds. Based on data collected at the Subaru Telescope, which is operated by the National Astronomical Observatory of Japan.
The Research of the Parallel Computing Development from the Angle of Cloud Computing
NASA Astrophysics Data System (ADS)
Peng, Zhensheng; Gong, Qingge; Duan, Yanyu; Wang, Yun
2017-10-01
Cloud computing is the development of parallel computing, distributed computing and grid computing. The development of cloud computing brings parallel computing into people's lives. Firstly, this paper expounds the concept of cloud computing and introduces several traditional parallel programming models. Secondly, it analyzes and studies the principles, advantages and disadvantages of OpenMP, MPI and MapReduce respectively. Finally, it compares the MPI and OpenMP models with MapReduce from the angle of cloud computing. The results of this paper are intended to provide a reference for the development of parallel computing.
Magnetic fields in the Perseus Spiral Arm and in Infrared Dark Clouds
NASA Astrophysics Data System (ADS)
Hoq, Sadia
2017-04-01
The magnetic (B) field is ubiquitous throughout the Milky Way. Several fundamental questions about the B-field in the cool, star-forming interstellar medium (ISM) remain unanswered. In this dissertation, near-infrared (NIR) polarimetric observations are used to study the large-scale Galactic B-field in the cool ISM in a spiral arm and to determine the role of B-fields in the formation of Infrared Dark Clouds (IRDCs). NIR polarimetry of 31 star clusters, located in and around the Perseus spiral arm, was obtained to determine the orientation of the plane-of-sky B-field in the outer Galaxy, and whether the presence of a spiral arm influenced B-field properties. Cluster distances, which provide upper limits to the B-field probed by observations, were estimated by developing a maximum likelihood method to fit theoretical stellar isochrones to stars in cluster color-magnitude diagrams (CMDs). Using the distance estimates, the cluster locations relative to the Perseus arm were found. The cluster polarization percentages and orientations were compared between clusters foreground to the arm and clusters inside or behind the arm. The cluster polarization orientations are predominantly parallel to the Galactic plane. Clusters inside and behind the arm have larger polarization percentages, likely a result of more polarizing material along the line of sight. The cluster polarization data were also compared to optical, inner Galaxy NIR, and Planck submm polarimetry data, and showed agreement with all three data sets. The polarimetric properties of one IRDC, G28.23, were determined using deep NIR observations. The polarization orientations relative to the cloud major axis were found to change directions with distance from the cloud axis. The B-field strength was estimated to be 10 to 100 μG. Despite these large inferred B-field strengths, the B-field was found not to be the dominant force in the formation of the IRDC, though the B-field morphology was influenced by the cloud. Using NIR observations, the B-fields of 27 IRDCs were studied. The relative polarization orientations with respect to the cloud major axes were found. No preferential relative orientation was found, implying that the B-field did not greatly influence the formation of this sample of IRDCs.
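The maximum-likelihood isochrone fit mentioned above can be illustrated with a drastically simplified sketch: a toy linear "isochrone" and Gaussian photometric errors, with only the distance modulus as a free parameter. The isochrone shape, error model, and data below are assumptions for illustration, not the dissertation's actual method or data.

```python
# Toy maximum-likelihood fit of a cluster distance modulus from a colour-magnitude diagram.
import numpy as np
from scipy.optimize import minimize_scalar

def isochrone_mag(color):
    """Toy absolute-magnitude main sequence as a function of colour (stand-in for a real isochrone)."""
    return 2.0 + 8.0 * color

def neg_log_likelihood(dist_mod, color, mag, sigma=0.1):
    model = isochrone_mag(color) + dist_mod          # shift isochrone to apparent magnitudes
    return 0.5 * np.sum(((mag - model) / sigma) ** 2)

rng = np.random.default_rng(3)
true_dm = 12.4                                        # roughly 3 kpc
color = rng.uniform(0.2, 0.9, 150)
mag = isochrone_mag(color) + true_dm + rng.normal(0, 0.1, 150)

fit = minimize_scalar(neg_log_likelihood, bounds=(5, 18), args=(color, mag), method="bounded")
print("fitted distance modulus:", round(fit.x, 2),
      "-> distance [pc]:", round(10 ** (fit.x / 5 + 1)))
```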
Cloud computing basics for librarians.
Hoy, Matthew B
2012-01-01
"Cloud computing" is the name for the recent trend of moving software and computing resources to an online, shared-service model. This article briefly defines cloud computing, discusses different models, explores the advantages and disadvantages, and describes some of the ways cloud computing can be used in libraries. Examples of cloud services are included at the end of the article. Copyright © Taylor & Francis Group, LLC
The remote sensing image segmentation mean shift algorithm parallel processing based on MapReduce
NASA Astrophysics Data System (ADS)
Chen, Xi; Zhou, Liqing
2015-12-01
With the development of satellite remote sensing technology and the growth of remote sensing image data, traditional remote sensing image segmentation techniques cannot meet the processing and storage requirements of massive remote sensing imagery. This article applies cloud computing and parallel computing technology to the remote sensing image segmentation process and builds a cheap and efficient computer cluster system that uses parallel processing to implement the MeanShift segmentation algorithm based on the MapReduce model. This not only ensures the quality of remote sensing image segmentation but also improves segmentation speed and better meets real-time requirements. The parallel MeanShift remote sensing image segmentation algorithm based on MapReduce is therefore of clear practical significance and value.
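The map/reduce decomposition described above can be sketched with scikit-learn's mean shift run independently on image tiles (the "map" step) and the per-tile labels stitched back together (the "reduce" step). This is an illustrative stand-in, not the paper's Hadoop implementation, and a production version would also reconcile segment labels across tile borders.

```python
# Tile-parallel mean-shift segmentation in MapReduce style (run locally here).
import numpy as np
from sklearn.cluster import MeanShift

def map_segment_tile(args):
    (r0, c0), tile = args
    features = tile.reshape(-1, tile.shape[-1]).astype(float)   # per-pixel colour features
    labels = MeanShift(bandwidth=0.2, bin_seeding=True).fit_predict(features)
    return (r0, c0), labels.reshape(tile.shape[:2])

def reduce_stitch(shape, tile_results):
    out = np.zeros(shape, dtype=int)
    next_label = 0
    for (r0, c0), labels in tile_results:
        out[r0:r0 + labels.shape[0], c0:c0 + labels.shape[1]] = labels + next_label
        next_label += labels.max() + 1               # keep labels unique across tiles
    return out

rng = np.random.default_rng(4)
img = rng.random((64, 64, 3))
img[:32, :, :] *= 0.2                                # two crude "regions" in a synthetic image
ts = 32
tiles = [((r, c), img[r:r + ts, c:c + ts]) for r in range(0, 64, ts) for c in range(0, 64, ts)]
results = map(map_segment_tile, tiles)               # stand-in for distributed mappers
segmented = reduce_stitch(img.shape[:2], results)
print("tiles:", len(tiles), "segments:", segmented.max() + 1)
```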
Survey of MapReduce frame operation in bioinformatics.
Zou, Quan; Li, Xu-Bin; Jiang, Wen-Rui; Lin, Zi-Yu; Li, Gui-Lin; Chen, Ke
2014-07-01
Bioinformatics is challenged by the fact that traditional analysis tools have difficulty in processing large-scale data from high-throughput sequencing. The open source Apache Hadoop project, which adopts the MapReduce framework and a distributed file system, has recently given bioinformatics researchers an opportunity to achieve scalable, efficient and reliable computing performance on Linux clusters and on cloud computing services. In this article, we present MapReduce framework-based applications that can be employed in next-generation sequencing and other biological domains. In addition, we discuss the challenges faced by this field as well as future work on parallel computing in bioinformatics.
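As a minimal illustration of the MapReduce pattern in a sequencing context, the toy example below counts k-mers with a mapper that emits (k-mer, 1) pairs and a reducer that sums them; it runs locally here, but the same two functions are what a Hadoop or Spark job would distribute. The reads and k value are invented.

```python
# Toy k-mer counting written in MapReduce style.
from collections import defaultdict
from itertools import chain

READS = ["ACGTACGT", "CGTACGTT", "TTACGTAC"]   # stand-ins for sequencing reads
K = 4

def mapper(read):
    """Emit (k-mer, 1) for every k-mer in one read."""
    for i in range(len(read) - K + 1):
        yield read[i:i + K], 1

def reducer(pairs):
    """Sum the counts emitted by all mappers."""
    counts = defaultdict(int)
    for kmer, one in pairs:
        counts[kmer] += one
    return counts

counts = reducer(chain.from_iterable(mapper(r) for r in READS))
print(sorted(counts.items(), key=lambda kv: -kv[1])[:5])
```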
Multi-wavelength study of NGC 281 A
NASA Technical Reports Server (NTRS)
Henning, TH.; Martin, K.; Reimann, H.-G.; Launhardt, R.; Leisawitz, D.; Zinnecker, H.
1994-01-01
We present a study of the molecular cloud NGC 281 A and the associated compact and young star cluster NGC 281 (AS 179). Optical photometry leads to a new distance of 3500 pc for the star cluster which is in good agreement with the kinematical distance of the adjacent molecular cloud NGC 281 A. The exciting star HD 5005 of the optical nebulosity is a Trapezium system with O6 III as photometric spectral type for the component HD 5005 AB. For the age of the star cluster we estimated a value of about 3 x 10(exp 6) yr. The (12)CO (2 to 1), (13)CO (2 to 1), and (12)CO (3 to 2) emission shows that the molecular cloud NGC 281 A consists of two cloud fragments. The western fragment is more compact and massive than the eastern fragment and contains an NH3 core. This core is associated with the IRAS source 00494+5617, an H2O maser, and 1.3 millimeter dust continuum radiation. Both cloud fragments contain altogether 22 IRAS point sources which mostly share the properties of young stellar objects. They have luminosities between 150 and 8800 solar luminosity. The maxima of the 60 and 100 micrometers HIRES maps correspond to the maxima of the (12)CO (3 to 2) emission. The NGC 281 A region shares many properties with the Orion Trapezium-BN/KL region the main differences being a larger separation between the cluster centroid and the new site of star formation as well as a lower mass and luminosity of the molecular cloud and the infrared cluster.
NASA Astrophysics Data System (ADS)
Pichardo, Bárbara; Moreno, Edmundo; Allen, Christine; Bedin, Luigi R.; Bellini, Andrea; Pasquini, Luca
2012-03-01
Using the most recent proper-motion determination of the old, solar-metallicity, Galactic open cluster M67 in orbital computations in a non-axisymmetric model of the Milky Way, including a bar and three-dimensional spiral arms, we explore the possibility that the Sun once belonged to this cluster. We have performed Monte Carlo numerical simulations to generate the present-day orbital conditions of the Sun and M67, and all the parameters in the Galactic model. We compute 3.5 × 10^5 pairs of orbits Sun-M67 looking for close encounters in the past with a minimum distance approach within the tidal radius of M67. In these encounters we find that the relative velocity between the Sun and M67 is larger than 20 km s^-1. If the Sun had been ejected from M67 with this high velocity by means of a three-body encounter, this interaction would have either destroyed an initial circumstellar disk around the Sun or dispersed its already formed planets. We also find a very low probability, much lower than 10^-7, that the Sun was ejected from M67 by an encounter of this cluster with a giant molecular cloud. This study also excludes the possibility that the Sun and M67 were born in the same molecular cloud. Our dynamical results convincingly demonstrate that M67 could not have been the birth cluster of our solar system. This work relies partly on observations of the Large Binocular Telescope (LBT). The LBT is an international collaboration among institutions in the United States, Italy, and Germany. LBT Corporation partners are The Ohio State University; The University of Arizona on behalf of the Arizona university system; Istituto Nazionale di Astrofisica, Italy; LBT Beteiligungsgesellschaft, Germany, representing the Max-Planck Society, the Astrophysical Institute Potsdam, and Heidelberg University; and The Research Corporation, on behalf of The University of Notre Dame, University of Minnesota, and University of Virginia.
NASA Technical Reports Server (NTRS)
Churchill, Dean D.; Houze, Robert A., Jr.
1991-01-01
A two-dimensional kinematic model has been used to diagnose the thermodynamic, water vapor, and hydrometeor fields of the stratiform clouds associated with a mesoscale tropical cloud cluster. The model incorporates ice- and water-cloud microphysics, visible and infrared radiation, and convective adjustment. It is intended to determine the relative contributions of radiation, microphysics, and turbulence to diabatic heating, and the effects that radiation has on the water budget of the cluster in the absence of dynamical interactions. The model has been initialized with thermodynamic fields and wind velocities diagnosed from a GATE tropical squall line. It is found that radiation does not directly affect the water budget of the stratiform region, and any radiative effect on hydrometeors must involve interaction with dynamics.
NASA Technical Reports Server (NTRS)
Xu, Kuan-Man
2015-01-01
During inactive phases of the Madden-Julian Oscillation (MJO), there are plenty of deep but small convective systems and far fewer deep and large ones. During active phases of MJO, an increase in the occurrence of large and deep cloud clusters results from an amplification of large-scale motions by stronger convective heating. This study is designed to quantitatively examine the roles of small and large cloud clusters during the MJO life cycle. We analyze the cloud object data from Aqua CERES (Clouds and the Earth's Radiant Energy System) observations between July 2006 and June 2010 for tropical deep convective (DC) and cirrostratus (CS) cloud object types according to the real-time multivariate MJO index, which assigns the tropics to one of the eight MJO phases each day. The cloud object is a contiguous region of the earth with a single dominant cloud-system type. The criteria for defining these cloud types are overcast footprints and cloud top pressures less than 400 hPa, but DC has higher cloud optical depths (≥10) than those of CS (<10). The size distributions, defined as the footprint numbers as a function of cloud object diameters, for particular MJO phases depart greatly from the combined (8-phase) distribution at large cloud-object diameters due to the reduced/increased numbers of cloud objects related to changes in the large-scale environments. The median diameter corresponding to the combined distribution is determined and used to partition all cloud objects into "small" and "large" groups of a particular phase. The two groups corresponding to the combined distribution have nearly equal numbers of footprints. The median diameters are 502 km for DC and 310 km for cirrostratus. The range of the variation between two extreme phases (typically, the most active and depressed phases) for the small group is 6-11% in terms of the numbers of cloud objects and the total footprint numbers. The corresponding range for the large group is 19-44%. In terms of the probability density functions of radiative and cloud physical properties, there are virtually no differences between the MJO phases for the small group, but there are significant differences for the large groups for both DC and CS types. These results suggest that the intraseasonal variation signals reside in the large cloud clusters, while the small cloud clusters represent the background noise resulting from various types of tropical waves with different wavenumbers and propagation speeds/directions.
A Novel College Network Resource Management Method using Cloud Computing
NASA Astrophysics Data System (ADS)
Lin, Chen
At present, college information construction mainly involves the construction of campus networks and management information systems, and many problems arise during this process. Cloud computing is a development of distributed processing, parallel processing and grid computing, in which data are stored in the cloud, software and services are placed in the cloud and built on top of various standards and protocols, and resources can be accessed through all kinds of devices. This article introduces cloud computing and its functions, then analyzes the existing problems of college network resource management; finally, cloud computing technology and methods are applied to the construction of a college information sharing platform.
From cosmos to connectomes: the evolution of data-intensive science.
Burns, Randal; Vogelstein, Joshua T; Szalay, Alexander S
2014-09-17
The analysis of data requires computation: originally by hand and more recently by computers. Different models of computing are designed and optimized for different kinds of data. In data-intensive science, the scale and complexity of data exceeds the comfort zone of local data stores on scientific workstations. Thus, cloud computing emerges as the preeminent model, utilizing data centers and high-performance clusters, enabling remote users to access and query subsets of the data efficiently. We examine how data-intensive computational systems originally built for cosmology, the Sloan Digital Sky Survey (SDSS), are now being used in connectomics, at the Open Connectome Project. We list lessons learned and outline the top challenges we expect to face. Success in computational connectomics would drastically reduce the time between idea and discovery, as SDSS did in cosmology.
Towards a comprehensive knowledge of the star cluster population in the Small Magellanic Cloud
NASA Astrophysics Data System (ADS)
Piatti, A. E.
2018-07-01
The Small Magellanic Cloud (SMC) has recently been found to harbour an increase of more than 200 per cent in its known cluster population. Here, we provide solid evidence that this unprecedented number of clusters could be greatly overestimated. On the one hand, the fully automatic procedure used to identify such an enormous cluster candidate sample did not recover ˜50 per cent, on average, of the known relatively bright clusters located in the SMC main body. On the other hand, the number of new cluster candidates per time unit as a function of time is noticeably different from the intrinsic SMC cluster frequency (CF), which should not be the case if these new detections were genuine physical systems. We found additionally that the SMC CF varies spatially, in such a way that it resembles an outside-in process coupled with the effects of a relatively recent interaction with the Large Magellanic Cloud. By assuming that clusters and field stars share the same formation history, we showed for the first time that the cluster dissolution rate also depends on position in the galaxy. The cluster dissolution becomes higher as the concentration of galaxy mass increases or if external tidal forces are present.
Eleven quick tips for architecting biomedical informatics workflows with cloud computing.
Cole, Brian S; Moore, Jason H
2018-03-01
Cloud computing has revolutionized the development and operations of hardware and software across diverse technological arenas, yet academic biomedical research has lagged behind despite the numerous and weighty advantages that cloud computing offers. Biomedical researchers who embrace cloud computing can reap rewards in cost reduction, decreased development and maintenance workload, increased reproducibility, ease of sharing data and software, enhanced security, horizontal and vertical scalability, high availability, a thriving technology partner ecosystem, and much more. Despite these advantages that cloud-based workflows offer, the majority of scientific software developed in academia does not utilize cloud computing and must be migrated to the cloud by the user. In this article, we present 11 quick tips for architecting biomedical informatics workflows on compute clouds, distilling knowledge gained from experience developing, operating, maintaining, and distributing software and virtualized appliances on the world's largest cloud. Researchers who follow these tips stand to benefit immediately by migrating their workflows to cloud computing and embracing the paradigm of abstraction.
NASA Astrophysics Data System (ADS)
Dong, Yumin; Xiao, Shufen; Ma, Hongyang; Chen, Libo
2016-12-01
Cloud computing and big data have become the driving engines of current information technology (IT) as a result of its rapid development. However, security protection has become increasingly important for cloud computing and big data, and is a problem that must be solved for cloud computing to develop further. The theft of identity authentication information remains a serious threat to the security of cloud computing. In this process, attackers intrude into cloud computing services through identity authentication information, thereby threatening the security of data from multiple perspectives. Therefore, this study proposes a model for cloud computing protection and management based on quantum authentication, introduces the principle of quantum authentication, and deduces the quantum authentication process. In theory, quantum authentication technology can be applied in cloud computing for security protection. This technology cannot be cloned; thus, it is more secure and reliable than classical methods.
Performance Analysis of Cloud Computing Architectures Using Discrete Event Simulation
NASA Technical Reports Server (NTRS)
Stocker, John C.; Golomb, Andrew M.
2011-01-01
Cloud computing offers the economic benefit of on-demand resource allocation to meet changing enterprise computing needs. However, the flexibility of cloud computing is disadvantaged when compared to traditional hosting in providing predictable application and service performance. Cloud computing relies on resource scheduling in a virtualized network-centric server environment, which makes static performance analysis infeasible. We developed a discrete event simulation model to evaluate the overall effectiveness of organizations in executing their workflow in traditional and cloud computing architectures. The two part model framework characterizes both the demand using a probability distribution for each type of service request as well as enterprise computing resource constraints. Our simulations provide quantitative analysis to design and provision computing architectures that maximize overall mission effectiveness. We share our analysis of key resource constraints in cloud computing architectures and findings on the appropriateness of cloud computing in various applications.
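A generic miniature of that two-part model, written with Python's standard library rather than the authors' simulation framework, is sketched below: stochastic arrivals per request type feed a shared queue served by a fixed pool of virtual servers, and the output is the mean time-in-system per request type. All parameters are invented for illustration.

```python
# Minimal discrete-event simulation: per-type stochastic demand against a constrained server pool.
import heapq
import random
from collections import defaultdict

random.seed(5)
SERVERS = 4
REQUEST_TYPES = {"web": (0.5, 0.3), "batch": (2.0, 1.5)}   # (mean interarrival, mean service)

events = []                       # heap of (time, kind, request_type)
for rtype, (interarrival, _) in REQUEST_TYPES.items():
    t = 0.0
    while t < 200.0:
        t += random.expovariate(1.0 / interarrival)
        heapq.heappush(events, (t, "arrive", rtype))

busy, queue, done = 0, [], defaultdict(list)
while events:
    t, kind, rtype = heapq.heappop(events)
    if kind == "arrive":
        queue.append((t, rtype))
    else:                          # a departure frees one server
        busy -= 1
    while queue and busy < SERVERS:  # start service whenever capacity allows
        t_arr, q_type = queue.pop(0)
        busy += 1
        service = random.expovariate(1.0 / REQUEST_TYPES[q_type][1])
        heapq.heappush(events, (t + service, "depart", q_type))
        done[q_type].append(t + service - t_arr)

for rtype, times in done.items():
    print(rtype, "mean time in system:", round(sum(times) / len(times), 2))
```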
Emergency navigation without an infrastructure.
Gelenbe, Erol; Bi, Huibo
2014-08-18
Emergency navigation systems for buildings and other built environments, such as sport arenas or shopping centres, typically rely on simple sensor networks to detect emergencies and, then, provide automatic signs to direct the evacuees. The major drawbacks of such static wireless sensor network (WSN)-based emergency navigation systems are the very limited computing capacity, which makes adaptivity very difficult, and the restricted battery power, due to the low cost of sensor nodes for unattended operation. If static wireless sensor networks and cloud-computing can be integrated, then intensive computations that are needed to determine optimal evacuation routes in the presence of time-varying hazards can be offloaded to the cloud, but the disadvantages of limited battery life-time at the client side, as well as the high likelihood of system malfunction during an emergency still remain. By making use of the powerful sensing ability of smart phones, which are increasingly ubiquitous, this paper presents a cloud-enabled indoor emergency navigation framework to direct evacuees in a coordinated fashion and to improve the reliability and resilience for both communication and localization. By combining social potential fields (SPF) and a cognitive packet network (CPN)-based algorithm, evacuees are guided to exits in dynamic loose clusters. Rather than relying on a conventional telecommunications infrastructure, we suggest an ad hoc cognitive packet network (AHCPN)-based protocol to adaptively search optimal communication routes between portable devices and the network egress nodes that provide access to cloud servers, in a manner that spares the remaining battery power of smart phones and minimizes the time latency. Experimental results through detailed simulations indicate that smart human motion and smart network management can increase the survival rate of evacuees and reduce the number of drained smart phones in an evacuation process.
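The social-potential-field part of the approach can be sketched as below: each evacuee feels an attraction toward the nearest exit plus short-range neighbour repulsion and weak medium-range attraction, so groups drift toward exits in loose clusters. This is an illustrative toy, not the authors' SPF/CPN algorithm, and the force constants, room geometry and exits are invented.

```python
# Toy social-potential-field evacuation: exit attraction plus pairwise neighbour forces.
import numpy as np

def step(positions, exits, dt=0.1):
    # unit attraction toward each evacuee's nearest exit
    d_exit = np.linalg.norm(positions[:, None, :] - exits[None, :, :], axis=2)
    to_exit = exits[np.argmin(d_exit, axis=1)] - positions
    to_exit /= np.linalg.norm(to_exit, axis=1, keepdims=True) + 1e-9

    # neighbour force: repulsion when close, weak attraction at medium range
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=2) + np.eye(len(positions))   # avoid self division
    neighbour = ((0.3 / dist ** 3 - 0.05 / dist ** 2)[:, :, None] * diff).sum(axis=1)

    return positions + dt * (to_exit + neighbour)

rng = np.random.default_rng(6)
people = rng.uniform(2, 8, size=(30, 2))         # evacuees inside a 10 x 10 floor plan
exits = np.array([[0.0, 5.0], [10.0, 5.0]])      # two exits on opposite walls
for _ in range(200):
    people = step(people, exits)
d_final = np.min(np.linalg.norm(people[:, None, :] - exits[None, :, :], axis=2), axis=1)
print("mean distance to nearest exit:", round(float(d_final.mean()), 2))
```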
Establishing a Cloud Computing Success Model for Hospitals in Taiwan
Lian, Jiunn-Woei
2017-01-01
The purpose of this study is to understand the critical quality-related factors that affect cloud computing success of hospitals in Taiwan. In this study, private cloud computing is the major research target. The chief information officers participated in a questionnaire survey. The results indicate that the integration of trust into the information systems success model will have acceptable explanatory power to understand cloud computing success in the hospital. Moreover, information quality and system quality directly affect cloud computing satisfaction, whereas service quality indirectly affects the satisfaction through trust. In other words, trust serves as the mediator between service quality and satisfaction. This cloud computing success model will help hospitals evaluate or achieve success after adopting private cloud computing health care services. PMID:28112020
Implementation of cloud computing in higher education
NASA Astrophysics Data System (ADS)
Asniar; Budiawan, R.
2016-04-01
Cloud computing is a current trend in distributed computing research, in which service-based and SOA (Service Oriented Architecture) applications have been developed. The technology is particularly useful for higher education. This research studies the need for and feasibility of cloud computing in higher education and then proposes a model of cloud computing services for higher education in Indonesia that can be implemented to support academic activities. A literature study is used as the research methodology to derive the proposed model. Finally, SaaS and IaaS are the cloud computing services proposed for implementation in higher education in Indonesia, and a hybrid cloud is the recommended deployment model.
Research on Key Technologies of Cloud Computing
NASA Astrophysics Data System (ADS)
Zhang, Shufen; Yan, Hongcan; Chen, Xuebin
With the development of multi-core processors, virtualization, distributed storage, broadband Internet and automated management, a new computing model named cloud computing has emerged. It distributes computation tasks over a resource pool consisting of a massive number of computers, so that application systems can obtain computing power, storage space and software services according to their demand. It concentrates all computing resources and manages them automatically through software, without human intervention, which frees application providers from tedious operational details and lets them focus on their business; this favors innovation and reduces cost. The ultimate goal of cloud computing is to provide computation, services and applications as a public utility, so that people can use computing resources just as they use water, electricity, gas and telephone services. The understanding of cloud computing is still developing and changing, and there is as yet no unanimous definition. This paper describes the three main service forms of cloud computing (SaaS, PaaS and IaaS), compares the definitions of cloud computing given by Google, Amazon, IBM and other companies, summarizes the basic characteristics of cloud computing, and emphasizes key technologies such as data storage, data management, virtualization and the programming model.
The Many Colors and Shapes of Cloud
NASA Astrophysics Data System (ADS)
Yeh, James T.
While many enterprises and business entities are deploying and exploiting Cloud Computing, academic institutes and researchers are also busy trying to wrestle this beast and put a leash on this potentially paradigm-changing computing model. Many have argued that Cloud Computing is nothing more than Utility Computing under a new name; others have argued that it is a revolutionary change in computing architecture. It has therefore been difficult to draw a boundary between what is and is not Cloud Computing, and it is equally difficult to find a group of people who agree even on its definition. In actuality, all these arguments may be unnecessary, as Clouds come in many shapes and colors. In this presentation, the speaker illustrates that the shape and color of the cloud depend very much on the business goals one intends to achieve. This remains a rich territory both for businesses to take advantage of the benefits of Cloud Computing and for academia to integrate technology research with business research.
NASA Astrophysics Data System (ADS)
Panitkin, Sergey; Barreiro Megino, Fernando; Caballero Bejar, Jose; Benjamin, Doug; Di Girolamo, Alessandro; Gable, Ian; Hendrix, Val; Hover, John; Kucharczyk, Katarzyna; Medrano Llamas, Ramon; Love, Peter; Ohman, Henrik; Paterson, Michael; Sobie, Randall; Taylor, Ryan; Walker, Rodney; Zaytsev, Alexander; Atlas Collaboration
2014-06-01
The computing model of the ATLAS experiment was designed around the concept of grid computing and, since the start of data taking, this model has proven very successful. However, new cloud computing technologies bring attractive features to improve the operations and elasticity of scientific distributed computing. ATLAS sees grid and cloud computing as complementary technologies that will coexist at different levels of resource abstraction, and two years ago created an R&D working group to investigate the different integration scenarios. The ATLAS Cloud Computing R&D has been able to demonstrate the feasibility of offloading work from grid to cloud sites and, as of today, is able to integrate transparently various cloud resources into the PanDA workload management system. The ATLAS Cloud Computing R&D is operating various PanDA queues on private and public resources and has provided several hundred thousand CPU days to the experiment. As a result, the ATLAS Cloud Computing R&D group has gained a significant insight into the cloud computing landscape and has identified points that still need to be addressed in order to fully utilize this technology. This contribution will explain the cloud integration models that are being evaluated and will discuss ATLAS' learning during the collaboration with leading commercial and academic cloud providers.
The Education Value of Cloud Computing
ERIC Educational Resources Information Center
Katzan, Harry, Jr.
2010-01-01
Cloud computing is a technique for supplying computer facilities and providing access to software via the Internet. Cloud computing represents a contextual shift in how computers are provisioned and accessed. One of the defining characteristics of cloud software service is the transfer of control from the client domain to the service provider.…
Cloud Computing. Technology Briefing. Number 1
ERIC Educational Resources Information Center
Alberta Education, 2013
2013-01-01
Cloud computing is Internet-based computing in which shared resources, software and information are delivered as a service that computers or mobile devices can access on demand. Cloud computing is already used extensively in education. Free or low-cost cloud-based services are used daily by learners and educators to support learning, social…
Can cloud computing benefit health services? - a SWOT analysis.
Kuo, Mu-Hsing; Kushniruk, Andre; Borycki, Elizabeth
2011-01-01
In this paper, we discuss cloud computing, the current state of cloud computing in healthcare, and the challenges and opportunities of adopting cloud computing in healthcare. A Strengths, Weaknesses, Opportunities and Threats (SWOT) analysis was used to evaluate the feasibility of adopting this computing model in healthcare. The paper concludes that cloud computing could have huge benefits for healthcare but there are a number of issues that will need to be addressed before its widespread use in healthcare.
State of the Art of Network Security Perspectives in Cloud Computing
NASA Astrophysics Data System (ADS)
Oh, Tae Hwan; Lim, Shinyoung; Choi, Young B.; Park, Kwang-Roh; Lee, Heejo; Choi, Hyunsang
Cloud computing is now regarded as a social phenomenon that satisfies customers' needs. Customers' needs and the primary principle of economics - gaining maximum benefit from minimum investment - are reflected in the realization of cloud computing. We live in a connected society with a flood of information, and without computers connected to the Internet our daily activities and work would be impossible. Cloud computing can provide customers with custom-tailored application software and user environments based on their needs by adopting on-demand outsourcing of computing resources through the Internet. It also provides users with high-end computing power and expensive application software packages, with users accessing their data and applications on remote systems. Because cloud computing systems are connected to the Internet, addressing their network security issues is mandatory before real-world service. In this paper, a survey of network security issues in cloud computing is presented from the perspective of real-world service environments.
High-Performance Compute Infrastructure in Astronomy: 2020 Is Only Months Away
NASA Astrophysics Data System (ADS)
Berriman, B.; Deelman, E.; Juve, G.; Rynge, M.; Vöckler, J. S.
2012-09-01
By 2020, astronomy will be awash with as much as 60 PB of public data. Full scientific exploitation of such massive volumes of data will require high-performance computing on server farms co-located with the data. Development of this computing model will be a community-wide enterprise that has profound cultural and technical implications. Astronomers must be prepared to develop environment-agnostic applications that support parallel processing. The community must investigate the applicability and cost-benefit of emerging technologies such as cloud computing to astronomy, and must engage the Computer Science community to develop science-driven cyberinfrastructure such as workflow schedulers and optimizers. We report here the results of collaborations between a science center, IPAC, and a Computer Science research institute, ISI. These collaborations may be considered pathfinders in developing a high-performance compute infrastructure in astronomy. These collaborations investigated two exemplar large-scale science-driver workflow applications: 1) Calculation of an infrared atlas of the Galactic Plane at 18 different wavelengths by placing data from multiple surveys on a common plate scale and co-registering all the pixels; 2) Calculation of an atlas of periodicities present in the public Kepler data sets, which currently contain 380,000 light curves. These products have been generated with two workflow applications, written in C for performance and designed to support parallel processing on multiple environments and platforms, but with different compute resource needs: the Montage image mosaic engine is I/O-bound, and the NASA Star and Exoplanet Database periodogram code is CPU-bound. Our presentation will report cost and performance metrics and lessons-learned for continuing development. Applicability of Cloud Computing: Commercial Cloud providers generally charge for all operations, including processing, transfer of input and output data, and for storage of data, and so the costs of running applications vary widely according to how they use resources. The cloud is well suited to processing CPU-bound (and memory bound) workflows such as the periodogram code, given the relatively low cost of processing in comparison with I/O operations. I/O-bound applications such as Montage perform best on high-performance clusters with fast networks and parallel file-systems. Science-driven Cyberinfrastructure: Montage has been widely used as a driver application to develop workflow management services, such as task scheduling in distributed environments, designing fault tolerance techniques for job schedulers, and developing workflow orchestration techniques. Running Parallel Applications Across Distributed Cloud Environments: Data processing will eventually take place in parallel distributed across cyber infrastructure environments having different architectures. We have used the Pegasus Work Management System (WMS) to successfully run applications across three very different environments: TeraGrid, OSG (Open Science Grid), and FutureGrid. Provisioning resources across different grids and clouds (also referred to as Sky Computing), involves establishing a distributed environment, where issues of, e.g, remote job submission, data management, and security need to be addressed. This environment also requires building virtual machine images that can run in different environments. Usually, each cloud provides basic images that can be customized with additional software and services. 
In most of our work, we provisioned compute resources using a custom application, called Wrangler. Pegasus WMS abstracts the architectures of the compute environments away from the end-user, and can be considered a first-generation tool suitable for scientists to run their applications on disparate environments.
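As a rough illustration of the pay-per-use argument above, the sketch below implements a toy cost model in Python; all prices and workflow figures are hypothetical placeholders rather than numbers from the Montage or periodogram studies.

```python
# Illustrative sketch only: a toy pay-per-use cost model in the spirit of the
# comparison above. Prices and workflow numbers are hypothetical placeholders,
# not measurements from the Montage or periodogram work.
def cloud_cost(cpu_hours, gb_in, gb_out, gb_stored_month,
               price_cpu_hour=0.10, price_gb_transfer=0.09, price_gb_month=0.02):
    """Total cost = compute + data transfer + storage (all pay-per-use)."""
    compute = cpu_hours * price_cpu_hour
    transfer = (gb_in + gb_out) * price_gb_transfer
    storage = gb_stored_month * price_gb_month
    return compute + transfer + storage

# CPU-bound workflow (periodogram-like): many CPU hours, little data movement.
cpu_bound = cloud_cost(cpu_hours=2000, gb_in=5, gb_out=5, gb_stored_month=10)
# I/O-bound workflow (mosaic-like): modest CPU, heavy data transfer and storage.
io_bound = cloud_cost(cpu_hours=200, gb_in=2000, gb_out=500, gb_stored_month=1000)

print(f"CPU-bound workflow: ${cpu_bound:.2f}")   # dominated by compute charges
print(f"I/O-bound workflow: ${io_bound:.2f}")    # dominated by transfer/storage charges
```

The point of the toy model is only that the cost of an I/O-bound run is dominated by the transfer and storage terms, which is why such workflows tend to perform (and price) better on clusters with fast networks and parallel file systems.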
If It's in the Cloud, Get It on Paper: Cloud Computing Contract Issues
ERIC Educational Resources Information Center
Trappler, Thomas J.
2010-01-01
Much recent discussion has focused on the pros and cons of cloud computing. Some institutions are attracted to cloud computing benefits such as rapid deployment, flexible scalability, and low initial start-up cost, while others are concerned about cloud computing risks such as those related to data location, level of service, and security…
Introducing the Cloud in an Introductory IT Course
ERIC Educational Resources Information Center
Woods, David M.
2018-01-01
Cloud computing is a rapidly emerging topic, but should it be included in an introductory IT course? The magnitude of cloud computing use, especially cloud infrastructure, along with students' limited knowledge of the topic support adding cloud content to the IT curriculum. There are several arguments that support including cloud computing in an…
Enabling Earth Science Through Cloud Computing
NASA Technical Reports Server (NTRS)
Hardman, Sean; Riofrio, Andres; Shams, Khawaja; Freeborn, Dana; Springer, Paul; Chafin, Brian
2012-01-01
Cloud Computing holds tremendous potential for missions across the National Aeronautics and Space Administration. Several flight missions are already benefiting from an investment in cloud computing for mission critical pipelines and services through faster processing time, higher availability, and drastically lower costs available on cloud systems. However, these processes do not currently extend to general scientific algorithms relevant to earth science missions. The members of the Airborne Cloud Computing Environment task at the Jet Propulsion Laboratory have worked closely with the Carbon in Arctic Reservoirs Vulnerability Experiment (CARVE) mission to integrate cloud computing into their science data processing pipeline. This paper details the efforts involved in deploying a science data system for the CARVE mission, evaluating and integrating cloud computing solutions with the system and porting their science algorithms for execution in a cloud environment.
Enhancing Security by System-Level Virtualization in Cloud Computing Environments
NASA Astrophysics Data System (ADS)
Sun, Dawei; Chang, Guiran; Tan, Chunguang; Wang, Xingwei
Many trends are opening up the era of cloud computing, which will reshape the IT industry. Virtualization techniques have become an indispensable ingredient of almost all cloud computing systems. Through virtual environments, a cloud provider is able to run the variety of operating systems needed by each cloud user. Virtualization can improve the reliability, security, and availability of applications through consolidation, isolation, and fault tolerance, and live migration techniques make it possible to balance workloads. In this paper, the definition of cloud computing is given and the service and deployment models are introduced. Security issues and challenges in the implementation of cloud computing are analyzed. Moreover, a system-level virtualization case is established to enhance the security of cloud computing environments.
Cloud classification from satellite data using a fuzzy sets algorithm: A polar example
NASA Technical Reports Server (NTRS)
Key, J. R.; Maslanik, J. A.; Barry, R. G.
1988-01-01
Where spatial boundaries between phenomena are diffuse, classification methods which construct mutually exclusive clusters seem inappropriate. The Fuzzy c-means (FCM) algorithm assigns each observation to all clusters, with membership values as a function of distance to the cluster center. The FCM algorithm is applied to AVHRR data for the purpose of classifying polar clouds and surfaces. Careful analysis of the fuzzy sets can provide information on which spectral channels are best suited to the classification of particular features, and can help determine likely areas of misclassification. General agreement in the resulting classes and cloud fraction was found between the FCM algorithm, a manual classification, and an unsupervised maximum likelihood classifier.
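As an illustration of the clustering scheme described above, the following is a minimal NumPy sketch of the standard fuzzy c-means updates applied to synthetic two-channel data; it is not the authors' AVHRR processing code, and the cluster count, fuzziness exponent and iteration budget are arbitrary choices.

```python
# Minimal NumPy sketch of the standard fuzzy c-means updates (not the authors'
# AVHRR code). X is a (n_pixels, n_channels) array of synthetic "spectral"
# values; c, m and n_iter are illustrative choices.
import numpy as np

def fuzzy_c_means(X, c=3, m=2.0, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)              # memberships sum to 1 per pixel
    for _ in range(n_iter):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]          # weighted centroids
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        U = 1.0 / (d ** (2.0 / (m - 1.0)))                       # inverse-distance weights
        U /= U.sum(axis=1, keepdims=True)                        # renormalize memberships
    return centers, U

# Three synthetic "surface/cloud" classes in a two-channel feature space.
X = np.vstack([np.random.default_rng(1).normal(mu, 0.5, (200, 2))
               for mu in (0.0, 3.0, 6.0)])
centers, U = fuzzy_c_means(X)
print(centers)           # cluster centers in "channel" space
print(U[:5].round(2))    # soft memberships for the first five pixels
```

Each pixel keeps a membership value for every cluster, which is the property the abstract exploits to flag likely areas of misclassification.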
Military clouds: utilization of cloud computing systems at the battlefield
NASA Astrophysics Data System (ADS)
Sarıkürk, Süleyman; Karaca, Volkan; Kocaman, İbrahim; Şirzai, Ahmet
2012-05-01
Cloud computing is a novel information technology (IT) concept that involves facilitated and rapid access to networks, servers, data storage media, applications and services via the Internet with minimal hardware requirements. The use of information systems and technologies on the battlefield is not new. Information superiority is a force multiplier and is crucial to mission success. Recent advances in information systems and technologies provide decision makers and users with new means to gain information superiority, and these developments have led to a new term: network centric capability. Like network centric capable systems, cloud computing systems are operational today, and extensive use of military clouds on the battlefield is predicted in the near future. Integrating cloud computing logic into network centric applications will increase the flexibility, cost-effectiveness, efficiency and accessibility of network-centric capabilities. In this paper, cloud computing and network centric capability concepts are defined; some commercial cloud computing products and applications are mentioned; network centric capable applications are covered; and cloud-computing-supported battlefield applications are analyzed. The effects of cloud computing systems on network centric capability and on the information domain in future warfare are discussed, and the battlefield opportunities and novelties that cloud computing systems might bring to network centric capability are examined. The role of military clouds in future warfare is proposed. It is concluded that military clouds will be indispensable components of the future battlefield, with the potential to improve network centric capabilities, increase situational awareness on the battlefield and facilitate the establishment of information superiority.
NASA Astrophysics Data System (ADS)
Aneri, Parikh; Sumathy, S.
2017-11-01
Cloud computing provides services over the Internet, delivering application resources and data to users on demand. The basis of cloud computing is the consumer-provider model: the cloud provider offers resources that consumers access through the cloud computing model in order to build their applications according to their demands. A cloud data center is a bulk of resources organized as a shared pool for cloud users to access. Virtualization is the heart of the cloud computing model; it provides virtual machines with application-specific configurations, and applications are free to choose their own configuration. On one hand there is a huge number of resources, and on the other hand a huge number of requests must be served effectively. Therefore, the resource allocation and scheduling policies play a very important role in allocating and managing resources in this cloud computing model. This paper proposes a load balancing policy based on the Hungarian algorithm. The Hungarian algorithm provides a dynamic load balancing policy together with a monitor component, which helps increase cloud resource utilization by monitoring the algorithm's state and altering it based on artificial intelligence. CloudSim, an extensible toolkit that simulates a cloud computing environment, is used to evaluate the proposal.
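To illustrate the assignment step implied above, the following is a minimal sketch (not the paper's CloudSim implementation) that maps a batch of tasks onto virtual machines with the Hungarian algorithm, using estimated execution time as the cost; task lengths and VM speeds are invented for the example.

```python
# Minimal sketch (not the paper's CloudSim implementation): assign a batch of
# cloudlets to virtual machines with the Hungarian algorithm, using estimated
# execution time as the assignment cost. Task lengths and VM speeds are made up.
import numpy as np
from scipy.optimize import linear_sum_assignment

task_length = np.array([4000.0, 12000.0, 7000.0, 2500.0])   # task sizes (MI), illustrative
vm_mips     = np.array([1000.0, 2500.0, 1500.0, 500.0])     # VM speeds (MIPS), illustrative

# cost[i, j] = estimated runtime of task i on VM j
cost = task_length[:, None] / vm_mips[None, :]

rows, cols = linear_sum_assignment(cost)   # Hungarian / Kuhn-Munkres solution
for t, v in zip(rows, cols):
    print(f"task {t} -> vm {v} (est. {cost[t, v]:.1f} s)")
print("makespan of this batch:", cost[rows, cols].max())
```

In a dynamic policy of the kind the abstract describes, a monitor component would rebuild the cost matrix and re-run the assignment as new tasks arrive or as VM loads change.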
Star formation induced by cloud-cloud collisions and galactic giant molecular cloud evolution
NASA Astrophysics Data System (ADS)
Kobayashi, Masato I. N.; Kobayashi, Hiroshi; Inutsuka, Shu-ichiro; Fukui, Yasuo
2018-05-01
Recent millimeter/submillimeter observations towards nearby galaxies have started to map the whole disk and to identify giant molecular clouds (GMCs) even in the regions between galactic spiral structures. Observed variations of GMC mass functions in different galactic environments indicate that massive GMCs preferentially reside along galactic spiral structures, whereas inter-arm regions have many small GMCs. Based on the phase transition dynamics from magnetized warm neutral medium to molecular clouds, Kobayashi et al. (2017, ApJ, 836, 175) propose a semi-analytical evolutionary description for GMC mass functions including a cloud-cloud collision (CCC) process. Their results show that CCC is less dominant in shaping the mass function of GMCs than the accretion of dense H I gas driven by the propagation of supersonic shock waves. However, their formulation does not take into account the possible enhancement of star formation by CCC. Millimeter/submillimeter observations within the Milky Way indicate the importance of CCC in the formation of star clusters and massive stars. In this article, we reformulate the time-evolution equation of Kobayashi et al. (2017, ApJ, 836, 175), with substantial modifications, so that we additionally compute the star formation subsequently taking place in CCC clouds. Our results suggest that, although CCC events between smaller clouds are more frequent than those between massive GMCs, CCC-driven star formation is mostly driven by massive GMCs ≳10^5.5 M⊙ (where M⊙ is the solar mass). The resultant cumulative CCC-driven star formation may amount to a few tens of percent of the total star formation in the Milky Way and nearby galaxies.
NASA Astrophysics Data System (ADS)
Aiftimiei, D. C.; Antonacci, M.; Bagnasco, S.; Boccali, T.; Bucchi, R.; Caballer, M.; Costantini, A.; Donvito, G.; Gaido, L.; Italiano, A.; Michelotto, D.; Panella, M.; Salomoni, D.; Vallero, S.
2017-10-01
One of the challenges a scientific computing center has to face is to keep delivering well-consolidated computational frameworks (i.e. the batch computing farm) while conforming to modern computing paradigms. The aim is to ease system administration at all levels (from hardware to applications) and to provide a smooth end-user experience. Within the INDIGO-DataCloud project, we adopt two different approaches to implement a PaaS-level, on-demand Batch Farm Service based on HTCondor and Mesos. In the first approach, described in this paper, the various HTCondor daemons are packaged inside pre-configured Docker images and deployed as Long Running Services through Marathon, profiting from its health checks and failover capabilities. In the second approach, we are going to implement an ad hoc HTCondor framework for Mesos. Container-to-container communication and isolation have been addressed by exploring a solution based on overlay networks (based on the Calico Project). Finally, we have studied the possibility of deploying an HTCondor cluster that spans different sites, exploiting the Condor Connection Broker component, which allows communication across a private network boundary or firewall, as in the case of multi-site deployments. In this paper, we describe and motivate our implementation choices and show the results of the first tests performed.
NASA Astrophysics Data System (ADS)
Michaelis, A.; Ganguly, S.; Nemani, R. R.; Votava, P.; Wang, W.; Lee, T. J.; Dungan, J. L.
2014-12-01
Sharing community-valued codes, intermediary datasets and results from individual efforts with others who are not in a directly funded collaboration can be a challenge. Cross-organization collaboration is often impeded by infrastructure security constraints, rigid financial controls, bureaucracy, workforce nationalities, etc., which can force groups to work in a segmented fashion and/or through awkward and suboptimal web services. We show how a focused community may come together and share modeling and analysis codes, computing configurations, scientific results, knowledge and expertise on a public cloud platform; diverse groups of researchers working together at "arm's length". Through the OpenNEX experimental workshop, users can view short technical "how-to" videos and explore encapsulated working environments. Workshop participants can easily instantiate Amazon Machine Images (AMIs) or launch full cluster and data processing configurations within minutes. Enabling users to instantiate computing environments from configuration templates on large public cloud infrastructures, such as Amazon Web Services, may provide a mechanism for groups to easily use each other's work and collaborate indirectly. Moreover, using the public cloud for this workshop allowed a single group to host a large read-only data archive, making datasets of interest to the community widely available on the public cloud, enabling other groups to connect directly to the data and reducing the costs of the collaborative work by freeing individual groups from redundantly retrieving, integrating or financing the storage of the datasets of interest.
Stellar Clustering in the Dark Filament IRDC 321.706+0.066
NASA Astrophysics Data System (ADS)
Soto King, Piera
2017-06-01
We investigate the star formation process in the infrared dark cloud IRDC 321.706+0.066, which hosts three infrared clusters recently discovered by Barbá et al. (2015) in images of the VISTA Variables in the Vía Láctea public survey: La Serena 210, 211 and 212. The aim is to characterize the stellar content of the three clusters and to investigate the star formation sequence in a filamentary dark cloud. We present a new photometric analysis of VVV images, and we use data from other surveys. We confirmed the presence of the three VVV clusters. In addition, we propose a new cluster
Using Cloud Computing infrastructure with CloudBioLinux, CloudMan and Galaxy
Afgan, Enis; Chapman, Brad; Jadan, Margita; Franke, Vedran; Taylor, James
2012-01-01
Cloud computing has revolutionized availability and access to computing and storage resources; making it possible to provision a large computational infrastructure with only a few clicks in a web browser. However, those resources are typically provided in the form of low-level infrastructure components that need to be procured and configured before use. In this protocol, we demonstrate how to utilize cloud computing resources to perform open-ended bioinformatics analyses, with fully automated management of the underlying cloud infrastructure. By combining three projects, CloudBioLinux, CloudMan, and Galaxy into a cohesive unit, we have enabled researchers to gain access to more than 100 preconfigured bioinformatics tools and gigabytes of reference genomes on top of the flexible cloud computing infrastructure. The protocol demonstrates how to setup the available infrastructure and how to use the tools via a graphical desktop interface, a parallel command line interface, and the web-based Galaxy interface. PMID:22700313
Using cloud computing infrastructure with CloudBioLinux, CloudMan, and Galaxy.
Afgan, Enis; Chapman, Brad; Jadan, Margita; Franke, Vedran; Taylor, James
2012-06-01
Cloud computing has revolutionized availability and access to computing and storage resources, making it possible to provision a large computational infrastructure with only a few clicks in a Web browser. However, those resources are typically provided in the form of low-level infrastructure components that need to be procured and configured before use. In this unit, we demonstrate how to utilize cloud computing resources to perform open-ended bioinformatic analyses, with fully automated management of the underlying cloud infrastructure. By combining three projects, CloudBioLinux, CloudMan, and Galaxy, into a cohesive unit, we have enabled researchers to gain access to more than 100 preconfigured bioinformatics tools and gigabytes of reference genomes on top of the flexible cloud computing infrastructure. The protocol demonstrates how to set up the available infrastructure and how to use the tools via a graphical desktop interface, a parallel command-line interface, and the Web-based Galaxy interface.
Formation of the young compact cluster GM 24 triggered by a cloud-cloud collision
NASA Astrophysics Data System (ADS)
Fukui, Yasuo; Kohno, Mikito; Yokoyama, Keiko; Nishimura, Atsushi; Torii, Kazufumi; Hattori, Yusuke; Sano, Hidetoshi; Ohama, Akio; Yamamoto, Hiroaki; Tachihara, Kengo
2018-05-01
High-mass star formation is an important step which controls galactic evolution. GM 24 is a heavily obscured star cluster including a single O9 star with more than ˜100 lower-mass stars within a 0.3 pc radius toward (l, b) ˜ (350.5°, 0.96°), close to the Galactic mini-starburst NGC 6334. Through new observations of 12CO J = 2-1 emission, we found two velocity components associated with the cluster, whereas the cloud was previously considered to be single. The two components, separated by 5 km s-1, show a complementary distribution; the two fit well with each other if a relative displacement of 3 pc is applied along the Galactic plane. A position-velocity diagram of the GM 24 cloud is explained by a model based on numerical simulations of two colliding clouds, where an intermediate velocity component created by the collision is taken into account. We estimate the collision timescale to be of order a Myr, correcting for projection of a relative motion tilted to the line of sight by 45°. The results lend further support to cloud-cloud collisions as an important mechanism of high-mass star formation in the Carina-Sagittarius Arm.
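A back-of-the-envelope check of the quoted timescale, using only the displacement and velocity separation given above and assuming the stated 45° tilt enters as a simple geometric de-projection (an assumption of this sketch, not a statement from the paper):

```python
# Back-of-the-envelope check (geometry assumptions mine): collision timescale
# from the 3 pc displacement and 5 km/s velocity separation quoted above,
# de-projected for a relative motion tilted 45 degrees to the line of sight.
import math

PC_KM = 3.086e13          # kilometres per parsec
MYR_S = 3.156e13          # seconds per megayear

displacement_pc = 3.0     # observed displacement along the Galactic plane
v_sep_kms = 5.0           # observed line-of-sight velocity separation
tilt_deg = 45.0

true_sep_km = displacement_pc * PC_KM / math.sin(math.radians(tilt_deg))
true_vel_kms = v_sep_kms / math.cos(math.radians(tilt_deg))

t_myr = true_sep_km / true_vel_kms / MYR_S
print(f"collision timescale ~ {t_myr:.1f} Myr")   # a few x 0.1-1 Myr, consistent with the ~Myr quoted
```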
Identity-Based Authentication for Cloud Computing
NASA Astrophysics Data System (ADS)
Li, Hongwei; Dai, Yuanshun; Tian, Ling; Yang, Haomiao
Cloud computing is a recently developed technology for complex systems in which massive-scale services are shared among numerous users. Authentication of both users and services is therefore a significant issue for the trust and security of cloud computing. The SSL Authentication Protocol (SAP), once applied in cloud computing, becomes so complicated that users face a heavy load in both computation and communication. Based on the identity-based hierarchical model for cloud computing (IBHMCC) and its corresponding encryption and signature schemes, this paper presents a new identity-based authentication protocol for cloud computing and services. Simulation testing shows that the authentication protocol is more lightweight and efficient than SAP, especially on the user side. This merit, together with its scalability, makes the model well suited to massive-scale clouds.
Cloud Based Educational Systems and Its Challenges and Opportunities and Issues
ERIC Educational Resources Information Center
Paul, Prantosh Kr.; Lata Dangwal, Kiran
2014-01-01
Cloud Computing (CC) is a set of hardware, software, networks, storage, services and interfaces that combine to deliver aspects of computing as a service. Cloud Computing (CC) uses central remote servers to maintain data and applications. In practice, Cloud Computing (CC) is an extension of Grid computing with independency and…
NASA Technical Reports Server (NTRS)
Putman, William P.
2012-01-01
Using a high-resolution non-hydrostatic version of GEOS-5 with the cubed-sphere finite-volume dynamical core, the impact of spatial and temporal resolution on cloud properties will be evaluated. Examination of convective cluster development in high-resolution GEOS-5 forecasts indicates that the temporal resolution within the model may play as significant a role as the horizontal resolution. Comparing modeled convective cloud clusters against satellite observations of brightness temperature, we have found that improved temporal resolution in GEOS-5 accounts for a significant portion of the improvements in the statistical distribution of convective cloud clusters. Using satellite simulators in GEOS-5, we will compare the cloud optical properties of GEOS-5 at various spatial and temporal resolutions with those observed from MODIS. The potential impact of these results on tropical cyclone formation and intensity will be examined as well.
A scoping review of cloud computing in healthcare.
Griebel, Lena; Prokosch, Hans-Ulrich; Köpcke, Felix; Toddenroth, Dennis; Christoph, Jan; Leb, Ines; Engel, Igor; Sedlmayr, Martin
2015-03-19
Cloud computing is a recent and fast growing area of development in healthcare. Ubiquitous, on-demand access to virtually endless resources in combination with a pay-per-use model allows for new ways of developing, delivering and using services. Cloud computing is often used in an "OMICS-context", e.g. for computing in genomics, proteomics and molecular medicine, while other fields of application still seem to be underrepresented. Thus, the objective of this scoping review was to identify the current state and hot topics in research on cloud computing in healthcare beyond this traditional domain. MEDLINE was searched in July 2013 and in December 2014 for publications containing the terms "cloud computing" and "cloud-based". Each journal and conference article was categorized and summarized independently by two researchers who consolidated their findings. 102 publications have been analyzed and 6 main topics have been found: telemedicine/teleconsultation, medical imaging, public health and patient self-management, hospital management and information systems, therapy, and secondary use of data. Commonly used features are broad network access for sharing and accessing data and rapid elasticity to dynamically adapt to computing demands. Eight articles favor the pay-for-use characteristics of cloud-based services, which avoid upfront investments. Nevertheless, while 22 articles present very general potentials of cloud computing in the medical domain and 66 articles describe conceptual or prototypic projects, only 14 articles report on successful implementations. Further, in many articles cloud computing is seen as an analogy to internet-/web-based data sharing, and the characteristics of the particular cloud computing approach are unfortunately not really illustrated. Even though cloud computing in healthcare is of growing interest, only a few successful implementations exist so far, and many papers just use the term "cloud" synonymously for "using virtual machines" or "web-based" with no described benefit of the cloud paradigm. The biggest threat to adoption in the healthcare domain is caused by involving external cloud partners: many issues of data safety and security are still to be solved. Until then, cloud computing is favored more for singular, individual features such as elasticity, pay-per-use and broad network access, rather than for the cloud paradigm in its own right.
Huang, Lei; Kang, Wenjun; Bartom, Elizabeth; Onel, Kenan; Volchenboum, Samuel; Andrade, Jorge
2015-01-01
Whole exome sequencing has facilitated the discovery of causal genetic variants associated with human diseases at deep coverage and low cost. In particular, the detection of somatic mutations from tumor/normal pairs has provided insights into the cancer genome. Although there is an abundance of publicly-available software for the detection of germline and somatic variants, concordance is generally limited among variant callers and alignment algorithms. Successful integration of variants detected by multiple methods requires in-depth knowledge of the software, access to high-performance computing resources, and advanced programming techniques. We present ExScalibur, a set of fully automated, highly scalable and modulated pipelines for whole exome data analysis. The suite integrates multiple alignment and variant calling algorithms for the accurate detection of germline and somatic mutations with close to 99% sensitivity and specificity. ExScalibur implements streamlined execution of analytical modules, real-time monitoring of pipeline progress, robust handling of errors and intuitive documentation that allows for increased reproducibility and sharing of results and workflows. It runs on local computers, high-performance computing clusters and cloud environments. In addition, we provide a data analysis report utility to facilitate visualization of the results that offers interactive exploration of quality control files, read alignment and variant calls, assisting downstream customization of potential disease-causing mutations. ExScalibur is open-source and is also available as a public image on Amazon cloud. PMID:26271043
Modeling the Cloud to Enhance Capabilities for Crises and Catastrophe Management
2016-11-16
order for cloud computing infrastructures to be successfully deployed in real world scenarios as tools for crisis and catastrophe management, where...Statement of the Problem Studied As cloud computing becomes the dominant computational infrastructure[1] and cloud technologies make a transition to hosting...1. Formulate rigorous mathematical models representing technological capabilities and resources in cloud computing for performance modeling and
Automating NEURON Simulation Deployment in Cloud Resources.
Stockton, David B; Santamaria, Fidel
2017-01-01
Simulations in neuroscience are performed on local servers or High Performance Computing (HPC) facilities. Recently, cloud computing has emerged as a potential computational platform for neuroscience simulation. In this paper we compare and contrast HPC and cloud resources for scientific computation, then report how we deployed NEURON, a widely used simulator of neuronal activity, in three clouds: Chameleon Cloud, a hybrid private academic cloud for cloud technology research based on the OpenStack software; Rackspace, a public commercial cloud, also based on OpenStack; and Amazon Elastic Cloud Computing, based on Amazon's proprietary software. We describe the manual procedures and how to automate cloud operations. We describe extending our simulation automation software called NeuroManager (Stockton and Santamaria, Frontiers in Neuroinformatics, 2015), so that the user is capable of recruiting private cloud, public cloud, HPC, and local servers simultaneously with a simple common interface. We conclude by performing several studies in which we examine speedup, efficiency, total session time, and cost for sets of simulations of a published NEURON model.
Automating NEURON Simulation Deployment in Cloud Resources
Santamaria, Fidel
2016-01-01
Simulations in neuroscience are performed on local servers or High Performance Computing (HPC) facilities. Recently, cloud computing has emerged as a potential computational platform for neuroscience simulation. In this paper we compare and contrast HPC and cloud resources for scientific computation, then report how we deployed NEURON, a widely used simulator of neuronal activity, in three clouds: Chameleon Cloud, a hybrid private academic cloud for cloud technology research based on the Open-Stack software; Rackspace, a public commercial cloud, also based on OpenStack; and Amazon Elastic Cloud Computing, based on Amazon’s proprietary software. We describe the manual procedures and how to automate cloud operations. We describe extending our simulation automation software called NeuroManager (Stockton and Santamaria, Frontiers in Neuroinformatics, 2015), so that the user is capable of recruiting private cloud, public cloud, HPC, and local servers simultaneously with a simple common interface. We conclude by performing several studies in which we examine speedup, efficiency, total session time, and cost for sets of simulations of a published NEURON model. PMID:27655341
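As a hedged illustration of the kind of cloud provisioning both versions of this work automate, the snippet below launches a single Amazon EC2 instance with boto3 to run a batch job; it is not NeuroManager's interface, and the AMI ID, key pair and user-data commands are placeholders.

```python
# Hedged sketch (not NeuroManager's API): programmatically provision a single
# EC2 instance to run a batch simulation job. The AMI ID, key pair name and
# user-data commands are placeholders; charges accrue while the instance runs.
import boto3

user_data = """#!/bin/bash
pip install neuron                 # placeholder environment setup
python /opt/jobs/run_model.py      # placeholder simulation driver
shutdown -h now                    # stop the instance when the job finishes
"""

ec2 = boto3.resource("ec2", region_name="us-east-1")
instances = ec2.create_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder AMI with the simulator preinstalled
    InstanceType="c5.xlarge",
    MinCount=1,
    MaxCount=1,
    KeyName="my-keypair",              # placeholder key pair
    UserData=user_data,
)
print("launched:", instances[0].id)
```

The studies cited above layer scheduling, result retrieval and cost accounting on top of this kind of low-level provisioning call, and do so uniformly across OpenStack-based and proprietary clouds.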
Homomorphic encryption experiments on IBM's cloud quantum computing platform
NASA Astrophysics Data System (ADS)
Huang, He-Liang; Zhao, You-Wei; Li, Tan; Li, Feng-Guang; Du, Yu-Tao; Fu, Xiang-Qun; Zhang, Shuo; Wang, Xiang; Bao, Wan-Su
2017-02-01
Quantum computing has undergone rapid development in recent years. Owing to limitations on scalability, personal quantum computers still seem slightly unrealistic in the near future. The first practical quantum computer for ordinary users is likely to be on the cloud. However, the adoption of cloud computing is possible only if security is ensured. Homomorphic encryption is a cryptographic protocol that allows computation to be performed on encrypted data without decrypting them, so it is well suited to cloud computing. Here, we first applied homomorphic encryption on IBM's cloud quantum computer platform. In our experiments, we successfully implemented a quantum algorithm for linear equations while protecting our privacy. This demonstration opens a feasible path to the next stage of development of cloud quantum information technology.
Capture of the Sun's Oort cloud from stars in its birth cluster.
Levison, Harold F; Duncan, Martin J; Brasser, Ramon; Kaufmann, David E
2010-07-09
Oort cloud comets are currently believed to have formed in the Sun's protoplanetary disk and to have been ejected to large heliocentric orbits by the giant planets. Detailed models of this process fail to reproduce all of the available observational constraints, however. In particular, the Oort cloud appears to be substantially more populous than the models predict. Here we present numerical simulations that show that the Sun captured comets from other stars while it was in its birth cluster. Our results imply that a substantial fraction of the Oort cloud comets, perhaps exceeding 90%, are from the protoplanetary disks of other stars.
Mobile Cloud Learning for Higher Education: A Case Study of Moodle in the Cloud
ERIC Educational Resources Information Center
Wang, Minjuan; Chen, Yong; Khan, Muhammad Jahanzaib
2014-01-01
Mobile cloud learning, a combination of mobile learning and cloud computing, is a relatively new concept that holds considerable promise for future development and delivery in the education sectors. Cloud computing helps mobile learning overcome obstacles related to mobile computing. The main focus of this paper is to explore how cloud computing…
76 FR 13984 - Cloud Computing Forum & Workshop III
Federal Register 2010, 2011, 2012, 2013, 2014
2011-03-15
... DEPARTMENT OF COMMERCE National Institute of Standards and Technology Cloud Computing Forum... public workshop. SUMMARY: NIST announces the Cloud Computing Forum & Workshop III to be held on April 7... provide information on the NIST strategic and tactical Cloud Computing program, including progress on the...
NASA Astrophysics Data System (ADS)
Fukui, Yasuo; Torii, Kazufumi; Hattori, Yusuke; Nishimura, Atsushi; Ohama, Akio; Shimajiri, Yoshito; Shima, Kazuhiro; Habe, Asao; Sano, Hidetoshi; Kohno, Mikito; Yamamoto, Hiroaki; Tachihara, Kengo; Onishi, Toshikazu
2018-06-01
The Orion Nebula Cluster toward the H II region M42 is the most outstanding young cluster at the smallest distance (410 pc) among the rich high-mass stellar clusters. By newly analyzing the archival molecular data of the 12CO(J = 1–0) emission at 21″ resolution, we identified at least three pairs of complementary distributions between two velocity components at 8 and 13 km s‑1. We present a hypothesis that the two clouds collided with each other and triggered formation of the high-mass stars, mainly toward two regions including the nearly 10 O stars in M42 and the B star, NU Ori, in M43. The timescale of the collision is estimated to be ∼0.1 Myr by a ratio of the cloud size and velocity corrected for projection, which is consistent with the age of the youngest cluster members less than 0.1 Myr. The majority of the low-mass cluster members were formed prior to the collision in the last Myr. We discuss the implications of the present hypothesis and the scenario of high-mass star formation by comparing with the other eight cases of triggered O-star formation via cloud–cloud collision.
NASA Astrophysics Data System (ADS)
Marinos, Alexandros; Briscoe, Gerard
Cloud Computing is rising fast, with its data centres growing at an unprecedented rate. However, this has come with concerns over privacy, efficiency at the expense of resilience, and environmental sustainability, because of the dependence on Cloud vendors such as Google, Amazon and Microsoft. Our response is an alternative model for the Cloud conceptualisation, providing a paradigm for Clouds in the community, utilising networked personal computers for liberation from the centralised vendor model. Community Cloud Computing (C3) offers an alternative architecture, created by combining the Cloud with paradigms from Grid Computing, principles from Digital Ecosystems, and sustainability from Green Computing, while remaining true to the original vision of the Internet. It is more technically challenging than Cloud Computing, having to deal with distributed computing issues, including heterogeneous nodes, varying quality of service, and additional security constraints. However, these are not insurmountable challenges, and with the need to retain control over our digital lives and the potential environmental consequences, it is a challenge we must pursue.
Dynamic Collaboration Infrastructure for Hydrologic Science
NASA Astrophysics Data System (ADS)
Tarboton, D. G.; Idaszak, R.; Castillo, C.; Yi, H.; Jiang, F.; Jones, N.; Goodall, J. L.
2016-12-01
Data and modeling infrastructure is becoming increasingly accessible to water scientists. HydroShare is a collaborative environment that currently offers water scientists the ability to access modeling and data infrastructure in support of data intensive modeling and analysis. It supports the sharing of and collaboration around "resources" which are social objects defined to include both data and models in a structured standardized format. Users collaborate around these objects via comments, ratings, and groups. HydroShare also supports web services and cloud based computation for the execution of hydrologic models and analysis and visualization of hydrologic data. However, the quantity and variety of data and modeling infrastructure available that can be accessed from environments like HydroShare is increasing. Storage infrastructure can range from one's local PC to campus or organizational storage to storage in the cloud. Modeling or computing infrastructure can range from one's desktop to departmental clusters to national HPC resources to grid and cloud computing resources. How does one orchestrate this vast number of data and computing infrastructure without needing to correspondingly learn each new system? A common limitation across these systems is the lack of efficient integration between data transport mechanisms and the corresponding high-level services to support large distributed data and compute operations. A scientist running a hydrology model from their desktop may require processing a large collection of files across the aforementioned storage and compute resources and various national databases. To address these community challenges a proof-of-concept prototype was created integrating HydroShare with RADII (Resource Aware Data-centric collaboration Infrastructure) to provide software infrastructure to enable the comprehensive and rapid dynamic deployment of what we refer to as "collaborative infrastructure." In this presentation we discuss the results of this proof-of-concept prototype which enabled HydroShare users to readily instantiate virtual infrastructure marshaling arbitrary combinations, varieties, and quantities of distributed data and computing infrastructure in addressing big problems in hydrology.
Forming clusters within clusters: how 30 Doradus recollapsed and gave birth again
NASA Astrophysics Data System (ADS)
Rahner, Daniel; Pellegrini, Eric W.; Glover, Simon C. O.; Klessen, Ralf S.
2018-01-01
The 30 Doradus nebula in the Large Magellanic Cloud (LMC) contains the massive starburst cluster NGC 2070 with a massive and probably younger stellar sub clump at its centre: R136. It is not clear how such a massive inner cluster could form several million years after the older stars in NGC 2070, given that stellar feedback is usually thought to expel gas and inhibit further star formation. Using the recently developed 1D feedback scheme WARPFIELD to scan a large range of cloud and cluster properties, we show that an age offset of several million years between the stellar populations is in fact to be expected given the interplay between feedback and gravity in a giant molecular cloud with a density ≳500 cm-3 due to re-accretion of gas on to the older stellar population. Neither capture of field stars nor gas retention inside the cluster have to be invoked in order to explain the observed age offset in NGC 2070 as well as the structure of the interstellar medium around it.
Cloud computing task scheduling strategy based on improved differential evolution algorithm
NASA Astrophysics Data System (ADS)
Ge, Junwei; He, Qian; Fang, Yiqiu
2017-04-01
In order to optimize the cloud computing task scheduling scheme, an improved differential evolution algorithm for cloud computing task scheduling is proposed. First, a cloud computing task scheduling model is established and a fitness function is defined according to this model; the improved differential evolution algorithm is then used to optimize the fitness function, with a generation-dependent dynamic selection strategy and a dynamic mutation strategy ensuring both global and local search ability. Performance tests were carried out on the CloudSim simulation platform. The experimental results show that the improved differential evolution algorithm can reduce cloud computing task execution time and save user cost, achieving good optimal scheduling of cloud computing tasks.
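The sketch below shows a plain differential evolution loop over task-to-VM assignments that minimizes makespan; it is a generic DE (DE/rand/1/bin), not the paper's improved variant with dynamic selection and mutation, and the workload figures are illustrative. A real evaluation, as in the paper, would be carried out in CloudSim.

```python
# Generic differential evolution sketch (not the paper's improved variant):
# continuous vectors are decoded into task-to-VM assignments and evolved to
# minimise makespan. Task lengths and VM speeds are illustrative.
import numpy as np

rng = np.random.default_rng(0)
task_len = rng.uniform(1e3, 2e4, size=30)                  # task sizes (MI)
vm_mips = np.array([500.0, 1000.0, 1500.0, 2500.0])        # VM speeds (MIPS)
n_tasks, n_vms = len(task_len), len(vm_mips)

def makespan(x):
    assign = (np.clip(x, 0, 0.999) * n_vms).astype(int)    # decode genes to VM ids
    loads = np.zeros(n_vms)
    for t, v in zip(task_len, assign):
        loads[v] += t / vm_mips[v]
    return loads.max()                                     # fitness = longest VM runtime

NP, F, CR, GENS = 40, 0.6, 0.9, 200
pop = rng.random((NP, n_tasks))
fit = np.array([makespan(ind) for ind in pop])

for _ in range(GENS):
    for i in range(NP):
        a, b, c = pop[rng.choice([j for j in range(NP) if j != i], 3, replace=False)]
        mutant = a + F * (b - c)                           # DE/rand/1 mutation
        cross = rng.random(n_tasks) < CR
        trial = np.where(cross, mutant, pop[i])            # binomial crossover
        f = makespan(trial)
        if f < fit[i]:                                     # greedy selection
            pop[i], fit[i] = trial, f

print("best makespan (s):", fit.min())
```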
Globular cluster formation - The fossil record
NASA Technical Reports Server (NTRS)
Murray, Stephen D.; Lin, Douglas N. C.
1992-01-01
Properties of globular clusters which have remained unchanged since their formation are used to infer the internal pressures, cooling times, and dynamical times of the protocluster clouds immediately prior to the onset of star formation. For all globular clusters examined, it is found that the cooling times are much less than the dynamical times, implying that the protoclusters must have been maintained in thermal equilibrium by external heat sources, with fluxes consistent with those found in previous work, and giving the observed rho-T relation. Self-gravitating clouds cannot be stably heated, so that the Jeans mass forms an upper limit to the cluster masses. The observed dependence of protocluster pressure upon galactocentric position implies that the protocluster clouds were in hydrostatic equilibrium after their formation. The pressure dependence is well fitted by that expected for a quasi-statically evolving background hot gas, shock heated to its virial temperature. The observations and inferences are combined with previous theoretical work to construct a picture of globular cluster formation.
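For reference, the Jeans mass invoked above as the upper limit to the cluster masses can be written in one standard textbook form (a general result, not an expression quoted from the paper):

M_J = \left( \frac{5 k_B T}{G \mu m_H} \right)^{3/2} \left( \frac{3}{4 \pi \rho} \right)^{1/2},

where T is the cloud temperature, \rho its mass density, \mu the mean molecular weight, m_H the mass of hydrogen, k_B Boltzmann's constant and G the gravitational constant, so that M_J \propto T^{3/2} \rho^{-1/2}.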
75 FR 64258 - Cloud Computing Forum & Workshop II
Federal Register 2010, 2011, 2012, 2013, 2014
2010-10-19
... DEPARTMENT OF COMMERCE National Institute of Standards and Technology Cloud Computing Forum... workshop. SUMMARY: NIST announces the Cloud Computing Forum & Workshop II to be held on November 4 and 5, 2010. This workshop will provide information on a Cloud Computing Roadmap Strategy as well as provide...
76 FR 62373 - Notice of Public Meeting-Cloud Computing Forum & Workshop IV
Federal Register 2010, 2011, 2012, 2013, 2014
2011-10-07
...--Cloud Computing Forum & Workshop IV AGENCY: National Institute of Standards and Technology (NIST), Commerce. ACTION: Notice. SUMMARY: NIST announces the Cloud Computing Forum & Workshop IV to be held on... to help develop open standards in interoperability, portability and security in cloud computing. This...
Project #OA-FY14-0126, January 15, 2014. The EPA OIG is starting fieldwork on the Council of the Inspectors General on Integrity and Efficiency (CIGIE) Cloud Computing Initiative – Status of Cloud-Computing Environments Within the Federal Government.
Intelligent cloud computing security using genetic algorithm as a computational tools
NASA Astrophysics Data System (ADS)
Razuky AL-Shaikhly, Mazin H.
2018-05-01
An essential change has occurred in the field of information technology, represented by cloud computing: the cloud supplies virtual assets over the web, yet poses great difficulties in the areas of information security and privacy assurance. Currently, the main problem with cloud computing is how to improve its privacy and security, since security is critical for the cloud. This paper attempts to address cloud security by using an intelligent system with a genetic algorithm as a protective wall that keeps cloud data secure: every service provided by the cloud must detect who receives it and register that usage in order to build a list of trusted or untrusted users based on their behavior. The execution of the present proposal has shown good results.
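Purely as an illustration of using a genetic algorithm as a computational tool in this setting, the sketch below evolves a linear scoring rule that flags synthetic "behavior" records as trusted or untrusted; the data, features and fitness function are invented, and this is not the paper's system.

```python
# Generic GA skeleton for illustration only (not the paper's system): evolve a
# linear scoring rule over synthetic "behavior" features so that users can be
# flagged trusted/untrusted. Data, features and fitness are invented.
import numpy as np

rng = np.random.default_rng(42)
X = rng.random((300, 4))                                        # synthetic behavior features
y = (X @ np.array([0.9, -0.4, 0.7, -0.8]) > 0.2).astype(int)    # synthetic trusted labels

def fitness(chrom):
    w, thresh = chrom[:-1], chrom[-1]
    pred = (X @ w > thresh).astype(int)
    return (pred == y).mean()                  # classification accuracy as fitness

POP, GENS, MUT = 60, 100, 0.1
pop = rng.normal(size=(POP, 5))                # 4 weights + 1 threshold per chromosome

for _ in range(GENS):
    scores = np.array([fitness(c) for c in pop])
    parents = pop[np.argsort(scores)[::-1][:POP // 2]]          # truncation selection
    kids = []
    while len(kids) < POP - len(parents):
        p1, p2 = parents[rng.integers(len(parents), size=2)]
        cut = rng.integers(1, 5)
        child = np.concatenate([p1[:cut], p2[cut:]])            # one-point crossover
        child += rng.normal(scale=MUT, size=5)                  # Gaussian mutation
        kids.append(child)
    pop = np.vstack([parents, kids])

best = max(pop, key=fitness)
print("best accuracy:", round(fitness(best), 3))
```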
WE-B-BRD-01: Innovation in Radiation Therapy Planning II: Cloud Computing in RT
DOE Office of Scientific and Technical Information (OSTI.GOV)
Moore, K; Kagadis, G; Xing, L
As defined by the National Institute of Standards and Technology, cloud computing is “a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.” Despite the omnipresent role of computers in radiotherapy, cloud computing has yet to achieve widespread adoption in clinical or research applications, though the transition to such “on-demand” access is underway. As this transition proceeds, new opportunities for aggregate studies and efficient use of computational resources are set against new challenges in patient privacy protection, data integrity, and management of clinical informatics systems. In this Session, current and future applications of cloud computing and distributed computational resources will be discussed in the context of medical imaging, radiotherapy research, and clinical radiation oncology applications. Learning Objectives: 1. Understand basic concepts of cloud computing. 2. Understand how cloud computing could be used for medical imaging applications. 3. Understand how cloud computing could be employed for radiotherapy research. 4. Understand how clinical radiotherapy software applications would function in the cloud.
Cloud Computing with iPlant Atmosphere.
McKay, Sheldon J; Skidmore, Edwin J; LaRose, Christopher J; Mercer, Andre W; Noutsos, Christos
2013-10-15
Cloud Computing refers to distributed computing platforms that use virtualization software to provide easy access to physical computing infrastructure and data storage, typically administered through a Web interface. Cloud-based computing provides access to powerful servers, with specific software and virtual hardware configurations, while eliminating the initial capital cost of expensive computers and reducing the ongoing operating costs of system administration, maintenance contracts, power consumption, and cooling. This eliminates a significant barrier to entry into bioinformatics and high-performance computing for many researchers. This is especially true of free or modestly priced cloud computing services. The iPlant Collaborative offers a free cloud computing service, Atmosphere, which allows users to easily create and use instances on virtual servers preconfigured for their analytical needs. Atmosphere is a self-service, on-demand platform for scientific computing. This unit demonstrates how to set up, access and use cloud computing in Atmosphere. Copyright © 2013 John Wiley & Sons, Inc.
A 3D clustering approach for point clouds to detect and quantify changes at a rock glacier front
NASA Astrophysics Data System (ADS)
Micheletti, Natan; Tonini, Marj; Lane, Stuart N.
2016-04-01
Terrestrial Laser Scanners (TLS) are extensively used in geomorphology to remotely sense landforms and surfaces of any type and to derive digital elevation models (DEMs). Modern devices are able to collect many millions of points, so that working on the resulting dataset is often troublesome in terms of computational efforts. Indeed, it is not unusual that raw point clouds are filtered prior to DEM creation, so that only a subset of points is retained and the interpolation process becomes less of a burden. Whilst this procedure is in many cases necessary, it entails a considerable loss of valuable information. First, and even without eliminating points, the common interpolation of points to a regular grid causes a loss of potentially useful detail. Second, it inevitably causes the transition from 3D information to only 2.5D data where each (x,y) pair must have a unique z-value. Vector-based DEMs (e.g. triangulated irregular networks) partially mitigate these issues, but still require a set of parameters to be set and a considerable burden in terms of calculation and storage. Because of the reasons above, being able to perform geomorphological research directly on point clouds would be profitable. Here, we propose an approach to identify erosion and deposition patterns on a very active rock glacier front in the Swiss Alps to monitor sediment dynamics. The general aim is to set up a semiautomatic method to isolate mass movements using 3D-feature identification directly from LiDAR data. An ultra-long range LiDAR RIEGL VZ-6000 scanner was employed to acquire point clouds during three consecutive summers. In order to isolate single clusters of erosion and deposition we applied the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm, previously successfully employed by Tonini and Abellan (2014) in a similar case for rockfall detection. DBSCAN requires two input parameters, strongly influencing the number, shape and size of the detected clusters: the minimum number of points (i) at a maximum distance (ii) around each core-point. Under this condition, seed points are said to be density-reachable by a core point delimiting a cluster around it. A chain of intermediate seed-points can connect contiguous clusters allowing clusters of arbitrary shape to be defined. The novelty of the proposed approach consists in the implementation of the DBSCAN 3D-module, where the xyz-coordinates identify each point and the density of points within a sphere is considered. This allows detecting volumetric features with a higher accuracy, depending only on actual sampling resolution. The approach is truly 3D and exploits all TLS measurements without the need of interpolation or data reduction. Using this method, enhanced geomorphological activity was observed during the summer of 2015 with respect to the previous two years. We attribute this result to the exceptionally high temperatures of that summer, which we deem responsible for accelerating the melting process at the rock glacier front and probably also increasing creep velocities. References: - Tonini, M. and Abellan, A. (2014). Rockfall detection from terrestrial LiDAR point clouds: A clustering approach using R. Journal of Spatial Information Sciences. Number 8, pp. 95-110 - Hennig, C. Package fpc: Flexible procedures for clustering. https://cran.r-project.org/web/packages/fpc/index.html, 2015. Accessed 2016-01-12.
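The study above used the 3D DBSCAN module of the R package fpc. As a language-neutral illustration of the same idea, the sketch below clusters xyz coordinates directly with scikit-learn's DBSCAN; the synthetic points and the eps/min_samples values are placeholder assumptions, not the authors' settings.

    # Minimal sketch of density-based clustering directly on xyz point-cloud
    # coordinates, in the spirit of the 3D DBSCAN approach described above.
    # eps and min_samples are placeholders; the data are synthetic.
    import numpy as np
    from sklearn.cluster import DBSCAN

    # Hypothetical change points (x, y, z) in metres, e.g. surface differences
    # between two TLS epochs that exceed a detection threshold.
    rng = np.random.default_rng(0)
    cluster_a = rng.normal(loc=[0.0, 0.0, 0.0], scale=0.05, size=(200, 3))
    cluster_b = rng.normal(loc=[2.0, 1.0, 0.5], scale=0.05, size=(150, 3))
    noise = rng.uniform(low=-1.0, high=3.0, size=(50, 3))
    points = np.vstack([cluster_a, cluster_b, noise])

    # eps = radius of the sphere around each core point (maximum distance),
    # min_samples = minimum number of points required inside that sphere.
    labels = DBSCAN(eps=0.2, min_samples=10).fit_predict(points)

    for label in sorted(set(labels)):
        tag = "noise" if label == -1 else f"cluster {label}"
        print(tag, "->", int(np.sum(labels == label)), "points")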
Energy Consumption Management of Virtual Cloud Computing Platform
NASA Astrophysics Data System (ADS)
Li, Lin
2017-11-01
Research on energy consumption management for virtual cloud computing platforms requires a deeper understanding of how energy is consumed both by virtual machines and by the platform as a whole; only then can the problems facing energy consumption management be solved. The key problem is the high energy consumption of data centers, which calls for new technical approaches. Virtualization technology and cloud computing have become powerful tools in everyday life, work, and production because of their many strengths, above all their high resource utilization rates, and both are developing rapidly. Their presence is therefore indispensable in the constantly developing information age. This paper summarizes, explains, and further analyzes the energy consumption management issues of virtual cloud computing platforms, giving the reader a clearer understanding of the topic and of its relevance to many aspects of daily life and work.
NASA Astrophysics Data System (ADS)
Nishimura, Atsushi; Minamidani, Tetsuhiro; Umemoto, Tomofumi; Fujita, Shinji; Matsuo, Mitsuhiro; Hattori, Yusuke; Kohno, Mikito; Yamagishi, Mitsuyoshi; Tsuda, Yuya; Kuriki, Mika; Kuno, Nario; Torii, Kazufumi; Tsutsumi, Daichi; Okawa, Kazuki; Sano, Hidetoshi; Tachihara, Kengo; Ohama, Akio; Fukui, Yasuo
2018-05-01
We present ^12CO (J = 1-0), ^13CO (J = 1-0), and C^18O (J = 1-0) images of the M 17 giant molecular clouds obtained as part of the FUGIN (FOREST Ultra-wide Galactic Plane Survey In Nobeyama) project. The observations cover the entire area of the M 17 SW and M 17 N clouds at the highest angular resolution (˜19″) to date, which corresponds to ˜0.18 pc at the distance of 2.0 kpc. We find that the region consists of four different velocity components: a very low velocity (VLV) clump, a low velocity component (LVC), a main velocity component (MVC), and a high velocity component (HVC). The LVC and the HVC have cavities. Ultraviolet photons radiated from the NGC 6618 cluster penetrate into the N cloud up to ˜5 pc through the cavities and interact with molecular gas. This interaction is correlated with the distribution of young stellar objects in the N cloud. The LVC and the HVC are distributed complementarily after the HVC is displaced by 0.8 pc toward the east-southeast direction, suggesting that collision of the LVC and the HVC created the cavities in both clouds. The collision velocity and timescale are estimated to be 9.9 km s^{-1} and 1.1 × 10^5 yr, respectively. The high collision velocity can provide a mass accretion rate of up to 10^{-3} M_⊙ yr^{-1}, and the high column density (4 × 10^{23} cm^{-2}) might result in massive cluster formation. The scenario of cloud-cloud collision likely explains well the stellar population and the formation history of the NGC 6618 cluster proposed by Hoffmeister et al. (2008, ApJ, 686, 310).
Cloud-free resolution element statistics program
NASA Technical Reports Server (NTRS)
Liley, B.; Martin, C. D.
1971-01-01
A computer program computes the number of cloud-free elements in the field-of-view and the percentage of the total field-of-view occupied by clouds. The human error inherent in visually estimating cloud statistics from aerial photographs is thereby eliminated.
NASA Astrophysics Data System (ADS)
Covey, Kevin R.; Cottaar, M.; Foster, J. B.; Nidever, D. L.; Meyer, M.; Tan, J.; Da Rio, N.; Flaherty, K. M.; Stassun, K.; Frinchaboy, P. M.; Majewski, S.; APOGEE IN-SYNC Team
2014-01-01
Demographic studies of stellar clusters indicate that relatively few persist as bound structures for 100 Myrs or longer. If cluster dispersal is a 'violent' process, it could strongly influence the formation and early evolution of stellar binaries and planetary systems. Unfortunately, measuring the dynamical state of 'typical' (i.e., ~300-1000 member) young star clusters has been difficult, particularly for clusters still embedded within their parental molecular cloud. The near-infrared spectrograph for the Apache Point Observatory Galactic Evolution Experiment (APOGEE), which can measure precise radial velocities for 230 cluster stars simultaneously, is uniquely suited to diagnosing the dynamics of Galactic star formation regions. We give an overview of the INfrared Survey of Young Nebulous Clusters (IN-SYNC), an APOGEE ancillary science program that is carrying out a comparative study of young clusters in the Perseus molecular cloud: NGC 1333, a heavily embedded cluster, and IC 348, which has begun to disperse its surrounding molecular gas. These observations appear to rule out a significantly super-virial velocity dispersion in IC 348, contrary to predictions of models where a cluster's dynamics is strongly influenced by the dispersal of its primordial gas. We also summarize the properties of two newly identified spectroscopic binaries; binary systems such as these play a key role in the dynamical evolution of young clusters, and introduce velocity offsets that must be accounted for in measuring cluster velocity dispersions.
Cloud and aerosol studies using combined CPL and MAS data
NASA Astrophysics Data System (ADS)
Vaughan, Mark A.; Rodier, Sharon; Hu, Yongxiang; McGill, Matthew J.; Holz, Robert E.
2004-11-01
Current uncertainties in the role of aerosols and clouds in the Earth's climate system limit our abilities to model the climate system and predict climate change. These limitations are due primarily to difficulties of adequately measuring aerosols and clouds on a global scale. The A-train satellites (Aqua, CALIPSO, CloudSat, PARASOL, and Aura) will provide an unprecedented opportunity to address these uncertainties. The various active and passive sensors of the A-train will use a variety of measurement techniques to provide comprehensive observations of the multi-dimensional properties of clouds and aerosols. However, to fully achieve the potential of this ensemble requires a robust data analysis framework to optimally and efficiently map these individual measurements into a comprehensive set of cloud and aerosol physical properties. In this work we introduce the Multi-Instrument Data Analysis and Synthesis (MIDAS) project, whose goal is to develop a suite of physically sound and computationally efficient algorithms that will combine active and passive remote sensing data in order to produce improved assessments of aerosol and cloud radiative and microphysical properties. These algorithms include (a) the development of an intelligent feature detection algorithm that combines inputs from both active and passive sensors, and (b) identifying recognizable multi-instrument signatures related to aerosol and cloud type derived from clusters of image pixels and the associated vertical profile information. Classification of these signatures will lead to the automated identification of aerosol and cloud types. Testing of these new algorithms is done using currently existing and readily available active and passive measurements from the Cloud Physics Lidar and the MODIS Airborne Simulator, which simulate, respectively, the CALIPSO and MODIS A-train instruments.
Large and Small Magellanic Clouds age-metallicity relationships
NASA Astrophysics Data System (ADS)
Perren, G. I.; Piatti, A. E.; Vázquez, R. A.
2017-10-01
We present a new determination of the age-metallicity relation for both Magellanic Clouds, estimated through the homogeneous analysis of 239 observed star clusters. All clusters in our set were observed with the filters of the Washington photometric system. The Automated Stellar Cluster Analysis package (ASteCA) was employed to derive the clusters' fundamental parameters, in particular their ages and metallicities, through an unassisted process. We find that our age-metallicity relations (AMRs) cannot be fully matched to any of the estimations found in twelve previous works, and are better explained by a combination of several of them in different age intervals.
Research on Influence of Cloud Environment on Traditional Network Security
NASA Astrophysics Data System (ADS)
Ming, Xiaobo; Guo, Jinhua
2018-02-01
Cloud computing is emblematic of the progress of the modern information network: it brings great convenience to Internet users, but it also exposes them to considerable risk. Moreover, one of the main reasons users choose cloud computing is its strong network security performance, which is also the cornerstone of cloud computing applications. This paper briefly explores the impact of the cloud environment on traditional network security and puts forward corresponding solutions.
77 FR 26509 - Notice of Public Meeting-Cloud Computing Forum & Workshop V
Federal Register 2010, 2011, 2012, 2013, 2014
2012-05-04
...--Cloud Computing Forum & Workshop V AGENCY: National Institute of Standards & Technology (NIST), Commerce. ACTION: Notice. SUMMARY: NIST announces the Cloud Computing Forum & Workshop V to be held on Tuesday... workshop. This workshop will provide information on the U.S. Government (USG) Cloud Computing Technology...
National electronic medical records integration on cloud computing system.
Mirza, Hebah; El-Masri, Samir
2013-01-01
Few healthcare providers have reached an advanced level of Electronic Medical Record (EMR) adoption; others have a low level, and most have no EMR at all. Cloud computing is an emerging technology that has been used in other industries with great success, yet despite its attractive features it has not been widely utilized in the healthcare industry. This study presents an innovative healthcare cloud computing system for integrating Electronic Health Records (EHR). The proposed system applies cloud computing technology to the EHR system to provide a comprehensive, integrated EHR environment.
Observations of cloud cluster hierarchies over the tropical western Pacific
NASA Technical Reports Server (NTRS)
Lau, K. M.; Nakazawa, T.; Sui, C. H.
1991-01-01
The structure and propagation of tropical cloud clusters are investigated during two contrasting periods over the tropical western Pacific in order to determine possible similarities or differences and to compare with previous studies. Three fundamental periodicities are found in tropical convection in the region: 1 day, 2-3 days, and 10-15 days. It is noted that the 10-15-day time scale is closely related to the intraseasonal oscillations propagating from the Indian Ocean to the western Pacific. Large convective complexes, super cloud clusters (SCCs), are found to organize on this time scale. Each SCC is made up of several cloud clusters generated at 2-3-day intervals. The diurnal variation is found to be most pronounced over the maritime continent, and the amplitude of the diurnal cycle is shown to be modulated by the 2-3-day and 10-15-day oscillations.
Big Data Clustering via Community Detection and Hyperbolic Network Embedding in IoT Applications.
Karyotis, Vasileios; Tsitseklis, Konstantinos; Sotiropoulos, Konstantinos; Papavassiliou, Symeon
2018-04-15
In this paper, we present a novel data clustering framework for big sensory data produced by IoT applications. Based on a network representation of the relations among multi-dimensional data, data clustering is mapped to node clustering over the produced data graphs. To address the potential very large scale of such datasets/graphs that test the limits of state-of-the-art approaches, we map the problem of data clustering to a community detection one over the corresponding data graphs. Specifically, we propose a novel computational approach for enhancing the traditional Girvan-Newman (GN) community detection algorithm via hyperbolic network embedding. The data dependency graph is embedded in the hyperbolic space via Rigel embedding, allowing more efficient computation of edge-betweenness centrality needed in the GN algorithm. This allows for more efficient clustering of the nodes of the data graph in terms of modularity, without sacrificing considerable accuracy. In order to study the operation of our approach with respect to enhancing GN community detection, we employ various representative types of artificial complex networks, such as scale-free, small-world and random geometric topologies, and frequently-employed benchmark datasets for demonstrating its efficacy in terms of data clustering via community detection. Furthermore, we provide a proof-of-concept evaluation by applying the proposed framework over multi-dimensional datasets obtained from an operational smart-city/building IoT infrastructure provided by the Federated Interoperable Semantic IoT/cloud Testbeds and Applications (FIESTA-IoT) testbed federation. It is shown that the proposed framework can indeed be used for community detection/data clustering and exploited in various other IoT applications, such as performing more energy-efficient smart-city/building sensing.
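For orientation, the sketch below runs the baseline Girvan-Newman step with networkx on a small synthetic graph and picks the partition with the best modularity. The hyperbolic (Rigel) embedding that the paper uses to approximate edge betweenness is not reproduced here; networkx computes exact betweenness, which is precisely the cost the embedding is meant to reduce. The graph parameters are placeholders.

    # Baseline Girvan-Newman community detection on a synthetic data graph.
    import networkx as nx
    from networkx.algorithms.community import girvan_newman, modularity

    # Hypothetical data-dependency graph: two noisy communities.
    graph = nx.planted_partition_graph(l=2, k=20, p_in=0.4, p_out=0.02, seed=1)

    best_partition, best_q = None, float("-inf")
    for partition in girvan_newman(graph):
        communities = [set(c) for c in partition]
        q = modularity(graph, communities)
        if q > best_q:
            best_partition, best_q = communities, q
        if len(communities) >= 6:      # stop after a few splits for the demo
            break

    print(f"best modularity {best_q:.3f} with {len(best_partition)} communities")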
Cloud computing applications for biomedical science: A perspective.
Navale, Vivek; Bourne, Philip E
2018-06-01
Biomedical research has become a digital data-intensive endeavor, relying on secure and scalable computing, storage, and network infrastructure, which has traditionally been purchased, supported, and maintained locally. For certain types of biomedical applications, cloud computing has emerged as an alternative to locally maintained traditional computing approaches. Cloud computing offers users pay-as-you-go access to services such as hardware infrastructure, platforms, and software for solving common biomedical computational problems. Cloud computing services offer secure on-demand storage and analysis and are differentiated from traditional high-performance computing by their rapid availability and scalability of services. As such, cloud services are engineered to address big data problems and enhance the likelihood of data and analytics sharing, reproducibility, and reuse. Here, we provide an introductory perspective on cloud computing to help the reader determine its value to their own research.
Research on OpenStack of open source cloud computing in colleges and universities’ computer room
NASA Astrophysics Data System (ADS)
Wang, Lei; Zhang, Dandan
2017-06-01
In recent years cloud computing technology has developed rapidly, especially open source cloud computing, which has attracted a large user base through the advantages of openness and low cost and is now being promoted and applied on a large scale. In this paper we first briefly introduce the main functions and architecture of the open source cloud platform OpenStack, and then discuss in depth the core problems of computer labs in colleges and universities. Building on this analysis, we describe the concrete application and deployment of OpenStack in a university computer room. The experimental results show that OpenStack can deploy a computer-room cloud efficiently and conveniently, with stable performance and good practical value.
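The paper does not publish its deployment scripts, so the following is only a hedged sketch of provisioning a lab virtual machine with the OpenStack SDK (openstacksdk). The cloud name, image, flavor and network names are placeholders, and credentials are assumed to be configured in clouds.yaml.

    # Hedged sketch: create one student desktop VM via openstacksdk.
    import openstack

    conn = openstack.connect(cloud="campus-lab")          # entry in clouds.yaml (assumed)

    image = conn.compute.find_image("ubuntu-22.04")       # hypothetical image name
    flavor = conn.compute.find_flavor("m1.small")         # hypothetical flavor
    network = conn.network.find_network("lab-net")        # hypothetical lab network

    server = conn.compute.create_server(
        name="student-desktop-01",
        image_id=image.id,
        flavor_id=flavor.id,
        networks=[{"uuid": network.id}],
    )
    server = conn.compute.wait_for_server(server)         # block until ACTIVE
    print("server status:", server.status)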
Charlebois, Kathleen; Palmour, Nicole; Knoppers, Bartha Maria
2016-01-01
This study aims to understand the influence of the ethical and legal issues on cloud computing adoption in the field of genomics research. To do so, we adapted Diffusion of Innovation (DoI) theory to enable understanding of how key stakeholders manage the various ethical and legal issues they encounter when adopting cloud computing. Twenty semi-structured interviews were conducted with genomics researchers, patient advocates and cloud service providers. Thematic analysis generated five major themes: 1) Getting comfortable with cloud computing; 2) Weighing the advantages and the risks of cloud computing; 3) Reconciling cloud computing with data privacy; 4) Maintaining trust and 5) Anticipating the cloud by creating the conditions for cloud adoption. Our analysis highlights the tendency among genomics researchers to gradually adopt cloud technology. Efforts made by cloud service providers to promote cloud computing adoption are confronted by researchers’ perpetual cost and security concerns, along with a lack of familiarity with the technology. Further underlying those fears are researchers’ legal responsibility with respect to the data that is stored on the cloud. Alternative consent mechanisms aimed at increasing patients’ control over the use of their data also provide a means to circumvent various institutional and jurisdictional hurdles that restrict access by creating siloed databases. However, the risk of creating new, cloud-based silos may run counter to the goal in genomics research to increase data sharing on a global scale. PMID:27755563
Cloud Computing for Complex Performance Codes.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Appel, Gordon John; Hadgu, Teklu; Klein, Brandon Thorin
This report describes the use of cloud computing services for running complex public domain performance assessment problems. The work consisted of two phases: Phase 1 demonstrated that complex codes, on several differently configured servers, could run and compute trivial small-scale problems in a commercial cloud infrastructure. Phase 2 focused on proving that non-trivial large-scale problems could be computed in the commercial cloud environment. The cloud computing effort was successfully applied using codes of interest to the geohydrology and nuclear waste disposal modeling community.
Cloud Fingerprinting: Using Clock Skews To Determine Co Location Of Virtual Machines
2016-09-01
Cloud computing has quickly revolutionized the computing practices of organizations, to include the Department of Defense. However, security concerns...
Evidence for impulsive solar wind plasma penetration through the dayside magnetopause
NASA Astrophysics Data System (ADS)
Lundin, R.; Sauvaud, J.-A.; Rème, H.; Balogh, A.; Dandouras, I.; Bosqued, J. M.; Carlson, C.; Parks, G. K.; Möbius, E.; Kistler, L. M.; Klecker, B.; Amata, E.; Formisano, V.; Dunlop, M.; Eliasson, L.; Korth, A.; Lavraud, B.; McCarthy, M.
2003-02-01
This paper presents in situ observational evidence from the Cluster Ion Spectrometer (CIS) on Cluster of injected solar wind "plasma clouds" protruding into the day-side high-latitude magnetopause. The plasma clouds, presumably injected by a transient process through the day-side magnetopause, show characteristics implying a generation mechanism denoted impulsive penetration (Lemaire and Roth, 1978).
Cloudbus Toolkit for Market-Oriented Cloud Computing
NASA Astrophysics Data System (ADS)
Buyya, Rajkumar; Pandey, Suraj; Vecchiola, Christian
This keynote paper: (1) presents the 21st century vision of computing and identifies various IT paradigms promising to deliver computing as a utility; (2) defines the architecture for creating market-oriented Clouds and computing atmosphere by leveraging technologies such as virtual machines; (3) provides thoughts on market-based resource management strategies that encompass both customer-driven service management and computational risk management to sustain SLA-oriented resource allocation; (4) presents the work carried out as part of our new Cloud Computing initiative, called Cloudbus: (i) Aneka, a Platform as a Service software system containing SDK (Software Development Kit) for construction of Cloud applications and deployment on private or public Clouds, in addition to supporting market-oriented resource management; (ii) internetworking of Clouds for dynamic creation of federated computing environments for scaling of elastic applications; (iii) creation of 3rd party Cloud brokering services for building content delivery networks and e-Science applications and their deployment on capabilities of IaaS providers such as Amazon along with Grid mashups; (iv) CloudSim supporting modelling and simulation of Clouds for performance studies; (v) Energy Efficient Resource Allocation Mechanisms and Techniques for creation and management of Green Clouds; and (vi) pathways for future research.
TWO-STAGE FRAGMENTATION FOR CLUSTER FORMATION: ANALYTICAL MODEL AND OBSERVATIONAL CONSIDERATIONS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bailey, Nicole D.; Basu, Shantanu, E-mail: nwityk@uwo.ca, E-mail: basu@uwo.ca
2012-12-10
Linear analysis of the formation of protostellar cores in planar magnetic interstellar clouds shows that molecular clouds exhibit a preferred length scale for collapse that depends on the mass-to-flux ratio and neutral-ion collision time within the cloud. We extend this linear analysis to the context of clustered star formation. By combining the results of the linear analysis with a realistic ionization profile for the cloud, we find that a molecular cloud may evolve through two fragmentation events in the evolution toward the formation of stars. Our model suggests that the initial fragmentation into clumps occurs for a transcritical cloud on parsec scales while the second fragmentation can occur for transcritical and supercritical cores on subparsec scales. Comparison of our results with several star-forming regions (Perseus, Taurus, Pipe Nebula) shows support for a two-stage fragmentation model.
Augmenting Satellite Precipitation Estimation with Lightning Information
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mahrooghy, Majid; Anantharaj, Valentine G; Younan, Nicolas H.
2013-01-01
We have used lightning information to augment the Precipitation Estimation from Remotely Sensed Imagery using an Artificial Neural Network - Cloud Classification System (PERSIANN-CCS). Co-located lightning data are used to segregate cloud patches, segmented from GOES-12 infrared data, into either electrified (EL) or non-electrified (NEL) patches. A set of features is extracted separately for the EL and NEL cloud patches. The features for the EL cloud patches include new features based on the lightning information. The cloud patches are classified and clustered using self-organizing maps (SOM). Then brightness temperature and rain rate (T-R) relationships are derived for the different clusters. Rain rates are estimated for the cloud patches based on their representative T-R relationship. The Equitable Threat Score (ETS) for daily precipitation estimates is improved by almost 12% for the winter season. In the summer, no significant improvements in ETS are noted.
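The per-cluster T-R fitting step can be pictured with the small sketch below. The exponential functional form, the synthetic temperature/rain-rate samples, and the two cluster names are assumptions for illustration only; the paper's SOM clustering and PERSIANN-CCS feature extraction are not reproduced.

    # Illustrative derivation of per-cluster T-R (brightness temperature vs.
    # rain rate) relationships by fitting log(R) = log(a) - b*T per cluster.
    import numpy as np

    rng = np.random.default_rng(3)

    def synthetic_patch_samples(a, b, n=200):
        """Fake co-located (T, R) pairs following R = a * exp(-b * T) plus noise."""
        temps = rng.uniform(190.0, 250.0, n)            # IR brightness temperature [K]
        rates = a * np.exp(-b * temps) * rng.lognormal(0.0, 0.2, n)
        return temps, rates

    # Pretend clusters (e.g. electrified vs. non-electrified patch classes).
    clusters = {"electrified": synthetic_patch_samples(5e6, 0.06),
                "non-electrified": synthetic_patch_samples(1e6, 0.06)}

    for name, (temps, rates) in clusters.items():
        slope, intercept = np.polyfit(temps, np.log(rates), 1)
        a_fit, b_fit = np.exp(intercept), -slope
        print(f"{name}: R(T) ~ {a_fit:.3g} * exp(-{b_fit:.3f} * T)  [mm/h]")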
Clustered star formation and the origin of stellar masses.
Pudritz, Ralph E
2002-01-04
Star clusters are ubiquitous in galaxies of all types and at all stages of their evolution. We also observe them to be forming in a wide variety of environments, ranging from nearby giant molecular clouds to the supergiant molecular clouds found in starburst and merging galaxies. The typical star in our galaxy and probably in others formed as a member of a star cluster, so star formation is an intrinsically clustered and not an isolated phenomenon. The greatest challenge regarding clustered star formation is to understand why stars have a mass spectrum that appears to be universal. This review examines the observations and models that have been proposed to explain these fundamental issues in stellar formation.
Processing Shotgun Proteomics Data on the Amazon Cloud with the Trans-Proteomic Pipeline*
Slagel, Joseph; Mendoza, Luis; Shteynberg, David; Deutsch, Eric W.; Moritz, Robert L.
2015-01-01
Cloud computing, where scalable, on-demand compute cycles and storage are available as a service, has the potential to accelerate mass spectrometry-based proteomics research by providing simple, expandable, and affordable large-scale computing to all laboratories regardless of location or information technology expertise. We present new cloud computing functionality for the Trans-Proteomic Pipeline, a free and open-source suite of tools for the processing and analysis of tandem mass spectrometry datasets. Enabled with Amazon Web Services cloud computing, the Trans-Proteomic Pipeline now accesses large scale computing resources, limited only by the available Amazon Web Services infrastructure, for all users. The Trans-Proteomic Pipeline runs in an environment fully hosted on Amazon Web Services, where all software and data reside on cloud resources to tackle large search studies. In addition, it can also be run on a local computer with computationally intensive tasks launched onto the Amazon Elastic Compute Cloud service to greatly decrease analysis times. We describe the new Trans-Proteomic Pipeline cloud service components, compare the relative performance and costs of various Elastic Compute Cloud service instance types, and present on-line tutorials that enable users to learn how to deploy cloud computing technology rapidly with the Trans-Proteomic Pipeline. We provide tools for estimating the necessary computing resources and costs given the scale of a job and demonstrate the use of the cloud-enabled Trans-Proteomic Pipeline by running over 1100 tandem mass spectrometry files through four proteomic search engines in 9 h and at a very low cost. PMID:25418363
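The kind of resource/cost estimation mentioned above can be reduced to simple arithmetic, as in the sketch below. All numbers (files processed per node-hour, hourly instance price) are placeholders, not real AWS pricing or measured Trans-Proteomic Pipeline throughput.

    # Back-of-the-envelope estimate of cluster size and cost for a cloud search job.
    def estimate(n_files, files_per_node_hour, hourly_price, wall_clock_hours):
        node_hours = n_files / files_per_node_hour
        nodes = -(-node_hours // wall_clock_hours)        # ceiling division
        cost = nodes * wall_clock_hours * hourly_price
        return int(nodes), cost

    nodes, cost = estimate(n_files=1100, files_per_node_hour=4.0,
                           hourly_price=0.50, wall_clock_hours=9.0)
    print(f"~{nodes} instances for 9 h, estimated cost ${cost:.2f}")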
Proposal for a security management in cloud computing for health care.
Haufe, Knut; Dzombeta, Srdan; Brandis, Knud
2014-01-01
Cloud computing is actually one of the most popular themes of information systems research. Considering the nature of the processed information especially health care organizations need to assess and treat specific risks according to cloud computing in their information security management system. Therefore, in this paper we propose a framework that includes the most important security processes regarding cloud computing in the health care sector. Starting with a framework of general information security management processes derived from standards of the ISO 27000 family the most important information security processes for health care organizations using cloud computing will be identified considering the main risks regarding cloud computing and the type of information processed. The identified processes will help a health care organization using cloud computing to focus on the most important ISMS processes and establish and operate them at an appropriate level of maturity considering limited resources.
ERIC Educational Resources Information Center
Kaestner, Rich
2012-01-01
Most school business officials have heard the term "cloud computing" bandied about and may have some idea of what the term means. In fact, they likely already leverage a cloud-computing solution somewhere within their district. But what does cloud computing really mean? This brief article puts a bit of definition behind the term and helps one…
Cloud Computing in Higher Education Sector for Sustainable Development
ERIC Educational Resources Information Center
Duan, Yuchao
2016-01-01
Cloud computing is considered a new frontier in the field of computing, as this technology comprises three major entities namely: software, hardware and network. The collective nature of all these entities is known as the Cloud. This research aims to examine the impacts of various aspects namely: cloud computing, sustainability, performance…
Federal Register 2010, 2011, 2012, 2013, 2014
2011-11-01
...-1659-01] Request for Comments on NIST Special Publication 500-293, US Government Cloud Computing... Publication 500-293, US Government Cloud Computing Technology Roadmap, Release 1.0 (Draft). This document is... (USG) agencies to accelerate their adoption of cloud computing. The roadmap has been developed through...
Reviews on Security Issues and Challenges in Cloud Computing
NASA Astrophysics Data System (ADS)
An, Y. Z.; Zaaba, Z. F.; Samsudin, N. F.
2016-11-01
Cloud computing is an Internet-based computing service, provided by third parties, that allows resources and data to be shared among devices. It is widely used in many organizations and is becoming more popular because it changes how an organization's Information Technology (IT) is organized and managed. It offers many benefits, such as simplicity and lower costs, almost unlimited storage, minimal maintenance, easy utilization, backup and recovery, continuous availability, quality of service, automated software integration, scalability, flexibility and reliability, easy access to information, elasticity, quick deployment, and a lower barrier to entry. As the use of cloud computing services increases, however, their security issues become a challenge: cloud computing must be safe and secure enough to ensure user privacy. This paper first outlines the architecture of cloud computing, then discusses the most common security issues in using the cloud and some solutions to them, since security is one of the most critical aspects of cloud computing owing to the sensitivity of users' data.
An Objective Classification of Saturn Cloud Features from Cassini ISS Images
NASA Technical Reports Server (NTRS)
Del Genio, Anthony D.; Barbara, John M.
2016-01-01
A k-means clustering algorithm is applied to Cassini Imaging Science Subsystem continuum and methane band images of Saturn's northern hemisphere to objectively classify regional albedo features and aid in their dynamical interpretation. The procedure is based on a technique applied previously to visible-infrared images of Earth. It provides a new perspective on giant planet cloud morphology and its relationship to the dynamics and a meteorological context for the analysis of other types of simultaneous Saturn observations. The method identifies 6 clusters that exhibit distinct morphology, vertical structure, and preferred latitudes of occurrence. These correspond to areas dominated by deep convective cells; low contrast areas, some including thinner and thicker clouds possibly associated with baroclinic instability; regions with possible isolated thin cirrus clouds; darker areas due to thinner low level clouds or clearer skies due to downwelling, or due to absorbing particles; and fields of relatively shallow cumulus clouds. The spatial associations among these cloud types suggest that dynamically, there are three distinct types of latitude bands on Saturn: deep convectively disturbed latitudes in cyclonic shear regions poleward of the eastward jets; convectively suppressed regions near and surrounding the westward jets; and baroclinically unstable latitudes near eastward jet cores and in the anticyclonic regions equatorward of them. These are roughly analogous to some of the features of Earth's tropics, subtropics, and midlatitudes, respectively. This classification may be more useful for dynamics purposes than the traditional belt-zone partitioning. Temporal variations of feature contrast and cluster occurrence suggest that the upper tropospheric haze in the northern hemisphere may have thickened by 2014. The results suggest that routine use of clustering may be a worthwhile complement to many different types of planetary atmospheric data analysis.
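The clustering step itself is standard and can be sketched with scikit-learn: each pixel is described by a small feature vector (here two brightness features standing in for the continuum and methane-band images) and assigned to one of six clusters, matching the cluster count reported above. The synthetic image and the exact feature definitions are illustrative assumptions.

    # Minimal k-means sketch: classify pixels of a two-band synthetic image
    # into six clusters and report the areal fraction of each cluster.
    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(7)
    height, width = 120, 240
    continuum = rng.normal(0.5, 0.15, size=(height, width))
    methane = 0.6 * continuum + rng.normal(0.0, 0.05, size=(height, width))

    features = np.column_stack([continuum.ravel(), methane.ravel()])
    labels = KMeans(n_clusters=6, n_init=10, random_state=0).fit_predict(features)
    cluster_map = labels.reshape(height, width)

    for k in range(6):
        frac = np.mean(cluster_map == k)
        print(f"cluster {k}: {frac:.1%} of pixels")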
Slow Cooling in Low Metallicity Clouds: An Origin of Globular Cluster Bimodality?
NASA Astrophysics Data System (ADS)
Fernandez, Ricardo; Bryan, Greg L.
2018-05-01
We explore the relative role of small-scale fragmentation and global collapse in low-metallicity clouds, pointing out that in such clouds the cooling time may be longer than the dynamical time, allowing the cloud to collapse globally before it can fragment. This, we suggest, may help to explain the formation of the low-metallicity globular cluster population, since such dense stellar systems need a large amount of gas to be collected in a small region (without significant feedback during the collapse). To explore this further, we carry out numerical simulations of low-metallicity Bonnor-Ebert stable gas clouds, demonstrating that there exists a critical metallicity (between 0.001 and 0.01 Z⊙) below which the cloud collapses globally without fragmentation. We also run simulations including a background radiative heating source, showing that this can also produce clouds that do not fragment, and that the critical metallicity, which can exceed the no-radiation case, increases with the heating rate.
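The argument rests on comparing the gas cooling time with the dynamical (free-fall) time. A standard, simplified form of that criterion is sketched below; the exact cooling function and normalisations used by the authors may differ.

    % Free-fall time of a cloud of mean mass density rho:
    t_{\rm ff} = \sqrt{\frac{3\pi}{32\,G\,\rho}}
    % Cooling time for gas of number density n, temperature T,
    % and metallicity-dependent cooling function Lambda(T, Z):
    t_{\rm cool} = \frac{\tfrac{3}{2}\, n\, k_{\rm B} T}{n^{2}\,\Lambda(T, Z)}
    % Global collapse without fragmentation is expected when
    t_{\rm cool} \gtrsim t_{\rm ff}

Since metal-line cooling scales roughly as Λ ∝ Z at fixed temperature, lowering Z lengthens t_cool until the condition is met, which is one way to read the critical metallicity found in the simulations.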
A Comprehensive Review of Existing Risk Assessment Models in Cloud Computing
NASA Astrophysics Data System (ADS)
Amini, Ahmad; Jamil, Norziana
2018-05-01
Cloud computing is a popular paradigm in information technology and computing as it offers numerous advantages in terms of economical saving and minimal management effort. Although elasticity and flexibility bring tremendous benefits, cloud computing still raises many information security issues due to the unique characteristics that allow ubiquitous computing. Therefore, the vulnerabilities and threats in cloud computing have to be identified and a proper risk assessment mechanism has to be in place for better cloud computing management. Various quantitative and qualitative risk assessment models have been proposed but, to our knowledge, none of them is suitable for the cloud computing environment. In this paper, we compare and analyse the strengths and weaknesses of existing risk assessment models. We then propose a new risk assessment model that sufficiently addresses all the characteristics of cloud computing, which the existing models do not.
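The abstract does not spell out the proposed model, so the sketch below shows only a generic quantitative risk score (likelihood x impact, weighted per cloud characteristic) of the kind such models typically compare. All threats, characteristics, weights and scales are hypothetical.

    # Generic risk-scoring sketch; not the paper's actual model.
    RISKS = [
        # (threat, affected characteristic, likelihood 1-5, impact 1-5)
        ("tenant data leakage", "multi-tenancy", 3, 5),
        ("API credential theft", "ubiquitous network access", 4, 4),
        ("vendor lock-in", "resource pooling", 2, 3),
        ("sudden cost escalation", "rapid elasticity", 3, 2),
    ]

    CHARACTERISTIC_WEIGHT = {
        "multi-tenancy": 1.5,
        "ubiquitous network access": 1.2,
        "resource pooling": 1.0,
        "rapid elasticity": 0.8,
    }

    def score(threat):
        name, characteristic, likelihood, impact = threat
        return CHARACTERISTIC_WEIGHT[characteristic] * likelihood * impact

    for threat in sorted(RISKS, key=score, reverse=True):
        print(f"{threat[0]:<25} risk score = {score(threat):.1f}")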
Mass to Luminosity Ratios of Some Clusters in the Large Magellanic Cloud
NASA Astrophysics Data System (ADS)
Sohn, Young-Jong; Chun, Mun-Suk
1990-12-01
Luminosity profiles and dynamical parameters of 12 globular clusters in the Large Magellanic Cloud (SB(s)m) are obtained from concentric-aperture photoelectric photometry of 3 clusters of different ages and the collected photometric data of 9 clusters. The total masses of the globular clusters are then calculated using the equation M = r_t^3(4ω^2 − k^2)/G, which is derived from the theoretical rotation curve for the exponential disk (Chun 1987). These masses lie between 0.3 × 10^4 and 15.8 × 10^4 M_⊙. From the determined total masses and luminosities, mass-to-luminosity (M/L) ratios are also derived. The M/L ratio of a cluster increases with cluster age, from about 0.03 for the youngest clusters to about 0.24 for the oldest clusters in the SWB classification. There is a difference in M/L by a factor of 10 between the Galactic globular clusters and the old globular clusters in the LMC.
Filamentary flow and magnetic geometry in evolving cluster-forming molecular cloud clumps
NASA Astrophysics Data System (ADS)
Klassen, Mikhail; Pudritz, Ralph E.; Kirk, Helen
2017-02-01
We present an analysis of the relationship between the orientation of magnetic fields and filaments that form in 3D magnetohydrodynamic simulations of cluster-forming, turbulent molecular cloud clumps. We examine simulated cloud clumps with size scales of L ˜ 2-4 pc and densities of n ˜ 400-1000 cm^{-3} with Alfvén Mach numbers near unity. We simulated two cloud clumps of different masses, one in virial equilibrium, the other strongly gravitationally bound, but with the same initial turbulent velocity field and similar mass-to-flux ratio. We apply various techniques to analyse the filamentary and magnetic structure of the resulting cloud, including the DISPERSE filament-finding algorithm in 3D. The largest structure that forms is a 1-2 parsec-long filament, with smaller connecting sub-filaments. We find that in our simulated clouds, wherein magnetic forces and turbulence are comparable, the coherent orientation of the magnetic field depends on the virial parameter. Sub-virial clumps undergo strong gravitational collapse and magnetic field lines are dragged with the accretion flow. We see evidence of filament-aligned flow and accretion flow on to the filament in the sub-virial cloud. Magnetic fields are oriented more parallel to the filaments in the sub-virial cloud and more perpendicular in the denser, marginally bound cloud. Radiative feedback from a 16 M⊙ star forming in a cluster in one of our simulations ultimately results in the destruction of the main filament, the formation of an H II region, and the sweeping up of magnetic fields within an expanding shell at the edges of the H II region.
Astrophysical properties of star clusters in the Magellanic Clouds homogeneously estimated by ASteCA
NASA Astrophysics Data System (ADS)
Perren, G. I.; Piatti, A. E.; Vázquez, R. A.
2017-06-01
Aims: We seek to produce a homogeneous catalog of astrophysical parameters of 239 resolved star clusters, located in the Small and Large Magellanic Clouds, observed in the Washington photometric system. Methods: The cluster sample was processed with the recently introduced Automated Stellar Cluster Analysis (ASteCA) package, which ensures both an automatized and a fully reproducible treatment, together with a statistically based analysis of their fundamental parameters and associated uncertainties. The fundamental parameters determined for each cluster with this tool, via a color-magnitude diagram (CMD) analysis, are metallicity, age, reddening, distance modulus, and total mass. Results: We generated a homogeneous catalog of structural and fundamental parameters for the studied cluster sample and performed a detailed internal error analysis along with a thorough comparison with values taken from 26 published articles. We studied the distribution of cluster fundamental parameters in both Clouds and obtained their age-metallicity relationships. Conclusions: The ASteCA package can be applied to an unsupervised determination of fundamental cluster parameters, which is a task of increasing relevance as more data becomes available through upcoming surveys. A table with the estimated fundamental parameters for the 239 clusters analyzed is only available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (http://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/602/A89
Impacts and Opportunities for Engineering in the Era of Cloud Computing Systems
2012-01-31
A Report to the U.S. Department... Executive Summary: This report discusses the impact of cloud computing and the broader revolution in computing on systems, on the disciplines of...
Cool neutral hydrogen in the direction of an anonymous OB association
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bania, T.M.
1983-08-01
H I self-absorption is seen in the direction l = 55°.6, probably physically associated with an anonymous OB association which has the Cepheid GY Sagittae as a member. The cool H I is in two clouds at least 15 pc in diameter located 3.25 kpc from the Sun. If their temperature is ≈50 K, the cloud masses are ≈10^3 M_⊙. The neutral atomic hydrogen clouds are probably warm envelopes surrounding cold molecular cloud cores, because CO observations in this region show two molecular clouds nearly coincident with the absorbing H I gas. Since the OB association is only ≈10^7 years old, these clouds are likely to be part of the original cloud complex from which the stellar cluster formed. The H I clouds are part of the larger Arecibo survey of self-absorption, which suggests that many of the Arecibo clouds are associated with heretofore unidentified star clusters. Even if this is generally not the case, the Arecibo objects have accurate kinematic distances and thus provide a new sample of cool H I clouds whose thermodynamic properties can be studied.
Moisture structure of tropical cloud systems as inferred from SSM/I
NASA Technical Reports Server (NTRS)
Robertson, Franklin R.
1989-01-01
The structure of tropical cloud systems was examined using data obtained by the Special Sensor Microwave/Imager on vertically-integrated vapor, ice, and liquid water (including precipitable water) in a cloud cluster associated with a Pacific easterly wave. The cloud cluster provided a sample of the varying signatures of bulk microphysical processes in organized tropical convection. Composition techniques were used to interpret this variability and its significance in terms of the response of convection to its thermodynamic environment. The relative intensities of the ice and liquid-water signatures should provide insight on the relative contribution of stratiform vs convective rain and the characteristics of the water budgets of mesoscale convective systems.
Cloud Computing Value Chains: Understanding Businesses and Value Creation in the Cloud
NASA Astrophysics Data System (ADS)
Mohammed, Ashraf Bany; Altmann, Jörn; Hwang, Junseok
Based on the promising developments in Cloud Computing technologies in recent years, commercial computing resource services (e.g. Amazon EC2) or software-as-a-service offerings (e.g. Salesforce.com) came into existence. However, the relatively weak business exploitation, participation, and adoption of other Cloud Computing services remain the main challenges. The vague value structures seem to be hindering business adoption and the creation of sustainable business models around the technology. Using an extensive analysis of existing Cloud business models, Cloud services, stakeholder relations, market configurations and value structures, this Chapter develops a reference model for value chains in the Cloud. Although this model is theoretically based on Porter's value chain theory, the proposed Cloud value chain model is upgraded to fit the diversity of business service scenarios in the Cloud computing markets. Using this model, different service scenarios are explained. Our findings suggest new services, business opportunities, and policy practices for realizing more adoption and value creation paths in the Cloud.
Virtualization and cloud computing in dentistry.
Chow, Frank; Muftu, Ali; Shorter, Richard
2014-01-01
The use of virtualization and cloud computing has changed the way we use computers. Virtualization is a method of placing software called a hypervisor on the hardware of a computer or a host operating system. It allows a guest operating system to run on top of the physical computer with a virtual machine (i.e., virtual computer). Virtualization allows multiple virtual computers to run on top of one physical computer and to share its hardware resources, such as printers, scanners, and modems. This increases the efficient use of the computer by decreasing costs (e.g., hardware, electricity, administration, and management) since only one physical computer is needed and running. This virtualization platform is the basis for cloud computing. It has expanded into areas of server and storage virtualization. One of the commonly used dental storage systems is cloud storage. Patient information is encrypted as required by the Health Insurance Portability and Accountability Act (HIPAA) and stored on off-site private cloud services for a monthly service fee. As computer costs continue to increase, so too will the need for more storage and processing power. Virtual and cloud computing will be a method for dentists to minimize costs and maximize computer efficiency in the near future. This article will provide some useful information on current uses of cloud computing.
Global Software Development with Cloud Platforms
NASA Astrophysics Data System (ADS)
Yara, Pavan; Ramachandran, Ramaseshan; Balasubramanian, Gayathri; Muthuswamy, Karthik; Chandrasekar, Divya
Offshore and outsourced distributed software development models and processes are facing challenges, previously unknown, with respect to computing capacity, bandwidth, storage, security, complexity, reliability, and business uncertainty. Clouds promise to address these challenges by adopting recent advances in virtualization, parallel and distributed systems, utility computing, and software services. In this paper, we envision a cloud-based platform that addresses some of these core problems. We outline a generic cloud architecture, its design and our first implementation results for three cloud forms - a compute cloud, a storage cloud and a cloud-based software service - in the context of global distributed software development (GSD). Our "compute cloud" provides computational services such as continuous code integration and a compile server farm, the "storage cloud" offers storage (block or file-based) services with an on-line virtual storage service, whereas the on-line virtual labs represent a useful cloud service. We note some of the use cases for clouds in GSD, the lessons learned with our prototypes and identify challenges that must be conquered before realizing the full business benefits. We believe that in the future, software practitioners will focus more on these cloud computing platforms and see clouds as a means of supporting an ecosystem of clients, developers and other key stakeholders.
Cloud Based Applications and Platforms (Presentation)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brodt-Giles, D.
2014-05-15
Presentation to the Cloud Computing East 2014 Conference, where we are highlighting our cloud computing strategy, describing the platforms on the cloud (including Smartgrid.gov), and defining our process for implementing cloud based applications.
Ages of Extragalactic Intermediate-Age Star Clusters
NASA Technical Reports Server (NTRS)
Flower, P. J.
1983-01-01
A dating technique for faint, distant star clusters observable in the local group of galaxies with the space telescope is discussed. Color-magnitude diagrams of Magellanic Cloud clusters are mentioned along with the metallicity of star clusters.
Improving ATLAS grid site reliability with functional tests using HammerCloud
NASA Astrophysics Data System (ADS)
Elmsheuser, Johannes; Legger, Federica; Medrano Llamas, Ramon; Sciacca, Gianfranco; van der Ster, Dan
2012-12-01
With the exponential growth of LHC (Large Hadron Collider) data in 2011, and more coming in 2012, distributed computing has become the established way to analyse collider data. The ATLAS grid infrastructure includes almost 100 sites worldwide, ranging from large national computing centers to smaller university clusters. These facilities are used for data reconstruction and simulation, which are centrally managed by the ATLAS production system, and for distributed user analysis. To ensure the smooth operation of such a complex system, regular tests of all sites are necessary to validate the site capability of successfully executing user and production jobs. We report on the development, optimization and results of an automated functional testing suite using the HammerCloud framework. Functional tests are short lightweight applications covering typical user analysis and production schemes, which are periodically submitted to all ATLAS grid sites. Results from those tests are collected and used to evaluate site performances. Sites that fail or are unable to run the tests are automatically excluded from the PanDA brokerage system, thus preventing user or production jobs from being sent to problematic sites.
Cloud-based Predictive Modeling System and its Application to Asthma Readmission Prediction
Chen, Robert; Su, Hang; Khalilia, Mohammed; Lin, Sizhe; Peng, Yue; Davis, Tod; Hirsh, Daniel A; Searles, Elizabeth; Tejedor-Sojo, Javier; Thompson, Michael; Sun, Jimeng
2015-01-01
The predictive modeling process is time consuming and requires clinical researchers to handle complex electronic health record (EHR) data in restricted computational environments. To address this problem, we implemented a cloud-based predictive modeling system via a hybrid setup combining a secure private server with the Amazon Web Services (AWS) Elastic MapReduce platform. EHR data is preprocessed on a private server and the resulting de-identified event sequences are hosted on AWS. Based on user-specified modeling configurations, an on-demand web service launches a cluster of Elastic Compute Cloud (EC2) instances on AWS to perform feature selection and classification algorithms in a distributed fashion. Afterwards, the secure private server aggregates results and displays them via interactive visualization. We tested the system on a pediatric asthma readmission task on a de-identified EHR dataset of 2,967 patients. We also conducted a larger-scale experiment on the CMS Linkable 2008–2010 Medicare Data Entrepreneurs’ Synthetic Public Use File dataset of 2 million patients, which achieved over 25-fold speedup compared to sequential execution. PMID:26958172
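As a rough illustration of the kind of on-demand cluster launch described above (not the authors' actual implementation), the following sketch uses the boto3 AWS SDK to request a small Elastic MapReduce cluster of EC2 instances that runs one distributed modeling step and then terminates. The cluster name, instance types, counts, S3 paths and region are placeholder assumptions.

```python
# Hedged sketch: launching an on-demand EMR cluster of EC2 instances with boto3.
# All names (cluster name, bucket, instance types/counts) are illustrative placeholders.
import boto3

emr = boto3.client("emr", region_name="us-east-1")

response = emr.run_job_flow(
    Name="predictive-modeling-demo",           # hypothetical cluster name
    ReleaseLabel="emr-6.10.0",                 # any current EMR release would do
    Applications=[{"Name": "Hadoop"}, {"Name": "Spark"}],
    Instances={
        "MasterInstanceType": "m5.xlarge",
        "SlaveInstanceType": "m5.xlarge",
        "InstanceCount": 4,                    # 1 master + 3 workers
        "KeepJobFlowAliveWhenNoSteps": False,  # terminate when the step finishes
    },
    Steps=[{
        "Name": "feature-selection-and-classification",
        "ActionOnFailure": "TERMINATE_CLUSTER",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": ["spark-submit", "s3://example-bucket/jobs/train_model.py"],
        },
    }],
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
)
print("Launched cluster:", response["JobFlowId"])
```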
NASA Astrophysics Data System (ADS)
Wang, Q. Daniel; Dong, Hui; Lang, Cornelia
2006-09-01
The Galactic centre (GC) provides a unique laboratory for a detailed examination of the interplay between massive star formation and the nuclear environment of our Galaxy. Here, we present a 100-ks Chandra Advanced CCD Imaging Spectrometer (ACIS) observation of the Arches and Quintuplet star clusters. We also report on a complementary mapping of the dense molecular gas near the Arches cluster made with the Owens Valley Millimeter Array. We present a catalogue of 244 point-like X-ray sources detected in the observation. Their number-flux relation indicates an overpopulation of relatively bright X-ray sources, which are apparently associated with the clusters. The sources in the core of the Arches and Quintuplet clusters are most likely extreme colliding wind massive star binaries. The diffuse X-ray emission from the core of the Arches cluster has a spectrum showing a 6.7-keV emission line and a surface intensity profile declining steeply with radius, indicating an origin in a cluster wind. In the outer regions near the Arches cluster, the overall diffuse X-ray enhancement demonstrates a bow shock morphology and is prominent in the Fe Kα 6.4-keV line emission with an equivalent width of ~1.4 keV. Much of this enhancement may result from an ongoing collision between the cluster and the adjacent molecular cloud, which have a relative velocity >~120 km s^-1. The older and less-compact Quintuplet cluster contains much weaker X-ray sources and diffuse emission, probably originating from low-mass stellar objects as well as a cluster wind. However, the overall population of these objects, constrained by the observed total diffuse X-ray luminosities, is substantially smaller than expected for both clusters, if they have normal Miller & Scalo initial mass functions. This deficiency of low-mass objects may be a manifestation of the unique star formation environment of the GC, where high-velocity cloud-cloud and cloud-cluster collisions are frequent.
Federal Register 2010, 2011, 2012, 2013, 2014
2011-08-22
... explored in this series is cloud computing. The workshop on this topic will be held in Gaithersburg, MD on October 21, 2011. Assertion: ``Current implementations of cloud computing indicate a new approach to security'' Implementations of cloud computing have provided new ways of thinking about how to secure data...
77 FR 74829 - Notice of Public Meeting-Cloud Computing and Big Data Forum and Workshop
Federal Register 2010, 2011, 2012, 2013, 2014
2012-12-18
...--Cloud Computing and Big Data Forum and Workshop AGENCY: National Institute of Standards and Technology... Standards and Technology (NIST) announces a Cloud Computing and Big Data Forum and Workshop to be held on... followed by a one-day hands-on workshop. The NIST Cloud Computing and Big Data Forum and Workshop will...
ERIC Educational Resources Information Center
Tweel, Abdeneaser
2012-01-01
High uncertainties related to cloud computing adoption may hinder IT managers from making solid decisions about adopting cloud computing. The problem addressed in this study was the lack of understanding of the relationship between factors related to the adoption of cloud computing and IT managers' interest in adopting this technology. In…
When cloud computing meets bioinformatics: a review.
Zhou, Shuigeng; Liao, Ruiqi; Guan, Jihong
2013-10-01
In the past decades, with the rapid development of high-throughput technologies, biology research has generated an unprecedented amount of data. In order to store and process such a great amount of data, cloud computing and MapReduce were applied to many fields of bioinformatics. In this paper, we first introduce the basic concepts of cloud computing and MapReduce, and their applications in bioinformatics. We then highlight some problems challenging the applications of cloud computing and MapReduce to bioinformatics. Finally, we give a brief guideline for using cloud computing in biology research.
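To make the MapReduce idea concrete for readers new to it, here is a minimal, single-machine sketch of the map/shuffle/reduce pattern applied to a toy bioinformatics task (k-mer counting). This is only an illustration of the programming model; a real deployment would run the same logic on Hadoop or a comparable framework across many nodes.

```python
# Minimal illustration of the MapReduce pattern on a toy k-mer counting task.
from collections import defaultdict

def map_phase(read, k=3):
    """Emit (k-mer, 1) pairs for one sequencing read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]

def shuffle(pairs):
    """Group values by key, as the framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Sum the counts for one k-mer."""
    return key, sum(values)

reads = ["ACGTAC", "GTACGT", "ACGTGT"]
mapped = [pair for read in reads for pair in map_phase(read)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(counts)
```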
THE LOCATION, CLUSTERING, AND PROPAGATION OF MASSIVE STAR FORMATION IN GIANT MOLECULAR CLOUDS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ochsendorf, Bram B.; Meixner, Margaret; Chastenet, Jérémy
Massive stars are key players in the evolution of galaxies, yet their formation pathway remains unclear. In this work, we use data from several galaxy-wide surveys to build an unbiased data set of ∼600 massive young stellar objects, ∼200 giant molecular clouds (GMCs), and ∼100 young (<10 Myr) optical stellar clusters (SCs) in the Large Magellanic Cloud. We employ this data to quantitatively study the location and clustering of massive star formation and its relation to the internal structure of GMCs. We reveal that massive stars do not typically form at the highest column densities nor centers of their parent GMCs at the ∼6 pc resolution of our observations. Massive star formation clusters over multiple generations and on size scales much smaller than the size of the parent GMC. We find that massive star formation is significantly boosted in clouds near SCs. However, whether a cloud is associated with an SC does not depend on either the cloud’s mass or global surface density. These results reveal a connection between different generations of massive stars on timescales up to 10 Myr. We compare our work with Galactic studies and discuss our findings in terms of GMC collapse, triggered star formation, and a potential dichotomy between low- and high-mass star formation.
NASA Astrophysics Data System (ADS)
Getman, Konstantin V.; Feigelson, Eric; Kuhn, Michael A.; Broos, Patrick S; Townsley, Leisa K.; Naylor, Tim; Povich, Matthew S.; Luhman, Kevin; Garmire, Gordon
2014-08-01
The MYStIX (Massive Young Star-Forming Complex Study in Infrared and X-ray) project seeks to characterize 20 OB-dominated young star forming regions (SFRs) at distances <4 kpc using photometric catalogs from the Chandra X-ray Observatory, Spitzer Space Telescope, UKIRT and 2MASS surveys. As part of the MYStIX project, we developed a new stellar chronometer that employs near-infrared and X-ray photometry data, AgeJX. Computing AgeJX averaged over MYStIX (sub)clusters reveals previously unknown age gradients across most of the MYStIX regions as well as within some individual rich clusters. Within the SFRs, the inferred AgeJX ages are youngest in obscured locations in molecular clouds, intermediate in revealed stellar clusters, and oldest in distributed stellar populations. Noticeable intra-cluster gradients are seen in the NGC 2024 (Flame Nebula) star cluster and the Orion Nebula Cluster (ONC): stars in cluster cores appear younger and thus were formed later than stars in cluster halos. The latter result has two important implications for the formation of young stellar clusters. Clusters likely form slowly: they do not arise from a single nearly-instantaneous burst of star formation. The simple models where clusters form inside-out are likely incorrect, and more complex models are needed. We provide several star formation scenarios that alone or in combination may lead to the observed core-halo age gradients.
NASA Astrophysics Data System (ADS)
Yu, Xiaoyuan; Yuan, Jian; Chen, Shi
2013-03-01
Cloud computing is one of the most popular topics in the IT industry and is being adopted by many companies. It has four deployment models: public cloud, community cloud, hybrid cloud and private cloud. Among these, a private cloud can be implemented within a private network and delivers some of the benefits of cloud computing without its pitfalls. This paper compares typical open source platforms with which a private cloud can be implemented. After this comparison, we choose Eucalyptus and Wavemaker for a case study on the private cloud. We also present performance estimates of cloud platform services and the development of prototype software delivered as cloud services.
Cloud4Psi: cloud computing for 3D protein structure similarity searching.
Mrozek, Dariusz; Małysiak-Mrozek, Bożena; Kłapciński, Artur
2014-10-01
Popular methods for 3D protein structure similarity searching, especially those that generate high-quality alignments such as Combinatorial Extension (CE) and Flexible structure Alignment by Chaining Aligned fragment pairs allowing Twists (FATCAT) are still time consuming. As a consequence, performing similarity searching against large repositories of structural data requires increased computational resources that are not always available. Cloud computing provides huge amounts of computational power that can be provisioned on a pay-as-you-go basis. We have developed the cloud-based system that allows scaling of the similarity searching process vertically and horizontally. Cloud4Psi (Cloud for Protein Similarity) was tested in the Microsoft Azure cloud environment and provided good, almost linearly proportional acceleration when scaled out onto many computational units. Cloud4Psi is available as Software as a Service for testing purposes at: http://cloud4psi.cloudapp.net/. For source code and software availability, please visit the Cloud4Psi project home page at http://zti.polsl.pl/dmrozek/science/cloud4psi.htm. © The Author 2014. Published by Oxford University Press.
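To illustrate the "scale out" idea in general terms (this is not Cloud4Psi's actual Azure implementation), the sketch below distributes pairwise structure comparisons across worker processes; adding workers corresponds to adding computational units. The comparison function is a placeholder standing in for a real aligner such as CE or FATCAT.

```python
# Illustration of scaling a similarity search horizontally across workers.
# `compare_structures` is a placeholder, not a real structural aligner.
from multiprocessing import Pool

def compare_structures(pair):
    query, target = pair
    return target, abs(hash(query + target)) % 100 / 100.0   # fake similarity score

def search(query, repository, workers=4):
    with Pool(workers) as pool:                 # increase `workers` to scale out
        return dict(pool.map(compare_structures, [(query, t) for t in repository]))

if __name__ == "__main__":
    repo = [f"PDB{i:04d}" for i in range(100)]
    scores = search("1ABC", repo)
    print(sorted(scores.items(), key=lambda kv: -kv[1])[:5])
```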
Cloud4Psi: cloud computing for 3D protein structure similarity searching
Mrozek, Dariusz; Małysiak-Mrozek, Bożena; Kłapciński, Artur
2014-01-01
Summary: Popular methods for 3D protein structure similarity searching, especially those that generate high-quality alignments such as Combinatorial Extension (CE) and Flexible structure Alignment by Chaining Aligned fragment pairs allowing Twists (FATCAT) are still time consuming. As a consequence, performing similarity searching against large repositories of structural data requires increased computational resources that are not always available. Cloud computing provides huge amounts of computational power that can be provisioned on a pay-as-you-go basis. We have developed the cloud-based system that allows scaling of the similarity searching process vertically and horizontally. Cloud4Psi (Cloud for Protein Similarity) was tested in the Microsoft Azure cloud environment and provided good, almost linearly proportional acceleration when scaled out onto many computational units. Availability and implementation: Cloud4Psi is available as Software as a Service for testing purposes at: http://cloud4psi.cloudapp.net/. For source code and software availability, please visit the Cloud4Psi project home page at http://zti.polsl.pl/dmrozek/science/cloud4psi.htm. Contact: dariusz.mrozek@polsl.pl PMID:24930141
NASA Astrophysics Data System (ADS)
Arko, S. A.; Hogenson, R.; Geiger, A.; Herrmann, J.; Buechler, B.; Hogenson, K.
2016-12-01
In the coming years there will be an unprecedented amount of SAR data available on a free and open basis to research and operational users around the globe. The Alaska Satellite Facility (ASF) DAAC hosts, through an international agreement, data from the Sentinel-1 spacecraft and will be hosting data from the upcoming NASA-ISRO SAR (NISAR) mission. To more effectively manage and exploit these vast datasets, ASF DAAC has begun moving portions of the archive to the cloud and utilizing cloud services to provide higher-level processing on the data. The Hybrid Pluggable Processing Pipeline (HyP3) project is designed to support higher-level data processing in the cloud and extend the capabilities of researchers to larger scales. Built upon a set of core Amazon cloud services, the HyP3 system allows users to request data processing using a number of canned algorithms or their own algorithms once they have been uploaded to the cloud. The HyP3 system automatically accesses the ASF cloud-based archive through the DAAC RESTful application programming interface and processes the data on Amazon's Elastic Compute Cloud (EC2). Final products are distributed through Amazon's Simple Storage Service (S3) and are available for user download. This presentation will provide an overview of ASF DAAC's activities moving the Sentinel-1 archive into the cloud and developing the integrated HyP3 system, covering both the benefits and difficulties of working in the cloud. Additionally, we will focus on the utilization of HyP3 for higher-level processing of SAR data. Two example algorithms, for sea-ice tracking and change detection, will be discussed as well as the mechanism for integrating new algorithms into the pipeline for community use.
Flexible services for the support of research.
Turilli, Matteo; Wallom, David; Williams, Chris; Gough, Steve; Curran, Neal; Tarrant, Richard; Bretherton, Dan; Powell, Andy; Johnson, Matt; Harmer, Terry; Wright, Peter; Gordon, John
2013-01-28
Cloud computing has been increasingly adopted by users and providers to promote a flexible, scalable and tailored access to computing resources. Nonetheless, the consolidation of this paradigm has uncovered some of its limitations. Initially devised by corporations with direct control over large amounts of computational resources, cloud computing is now being endorsed by organizations with limited resources or with a more articulated, less direct control over these resources. The challenge for these organizations is to leverage the benefits of cloud computing while dealing with limited and often widely distributed computing resources. This study focuses on the adoption of cloud computing by higher education institutions and addresses two main issues: flexible and on-demand access to a large amount of storage resources, and scalability across a heterogeneous set of cloud infrastructures. The proposed solutions leverage a federated approach to cloud resources in which users access multiple and largely independent cloud infrastructures through a highly customizable broker layer. This approach allows for a uniform authentication and authorization infrastructure, a fine-grained policy specification and the aggregation of accounting and monitoring. Within a loosely coupled federation of cloud infrastructures, users can access vast amounts of data without copying them across cloud infrastructures and can scale their resource provisions when the local cloud resources become insufficient.
FOSS GIS on the GFZ HPC cluster: Towards a service-oriented Scientific Geocomputation Environment
NASA Astrophysics Data System (ADS)
Loewe, P.; Klump, J.; Thaler, J.
2012-12-01
High performance compute clusters can be used as geocomputation workbenches. Their wealth of resources enables us to take on geocomputation tasks which exceed the limitations of smaller systems. These general capabilities can be harnessed via tools such as Geographic Information Systems (GIS), provided they are able to utilize the available cluster configuration/architecture and provide a sufficient degree of user friendliness to allow for wide application. While server-level computing is clearly not sufficient for the growing number of data- or computation-intensive tasks undertaken, these tasks also do not come close to the requirements needed for access to "top shelf" national cluster facilities. Until recently, this kind of geocomputation research was therefore effectively barred by a lack of access to adequate resources. In this paper we report on the experiences gained by providing GRASS GIS as a software service on an HPC compute cluster at the German Research Centre for Geosciences using Platform Computing's Load Sharing Facility (LSF). GRASS GIS is the oldest and largest Free and Open Source (FOSS) GIS project. During ramp-up in 2011, multiple versions of GRASS GIS (v 6.4.2, 6.5 and 7.0) were installed on the HPC compute cluster, which currently consists of 234 nodes with 480 CPUs providing 3084 cores. Nineteen different processing queues with varying hardware capabilities and priorities are provided, allowing for fine-grained scheduling and load balancing. After successful initial testing, mechanisms were developed to deploy scripted geocomputation tasks onto dedicated processing queues. The mechanisms are based on earlier work by NETELER et al. (2008) and allow all 3084 cores to be used for GRASS-based geocomputation work. In practice, however, applications are limited to the resources assigned to their respective queue. Applications of the new GIS functionality so far comprise hydrological analysis, remote sensing and the generation of maps of simulated tsunamis in the Mediterranean Sea for the Tsunami Atlas of the FP-7 TRIDEC Project (www.tridec-online.eu). This included the processing of complex problems requiring significant amounts of processing time, up to 20 CPU days. This GRASS GIS-based service is provided as a research utility in the sense of "Software as a Service" (SaaS) and is a first step towards a GFZ corporate cloud service.
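A minimal sketch of the kind of queue-based job deployment described above, assuming an LSF `bsub` command is available on the cluster and that `analysis.sh` wraps the scripted GRASS GIS task. The queue name, core count and script path are placeholders, not the GFZ configuration.

```python
# Hedged sketch: submitting a scripted geocomputation task to an LSF queue.
# Queue name, resource request and script path are illustrative assumptions.
import subprocess

def submit_grass_job(script="analysis.sh", queue="geocomp", cores=8):
    """Submit a wrapper script (which runs the GRASS GIS commands) via bsub."""
    cmd = [
        "bsub",
        "-q", queue,              # dedicated processing queue
        "-n", str(cores),         # number of cores requested
        "-J", "grass_task",       # job name
        "-o", "grass_task.%J.out",
        script,
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.strip()  # e.g. "Job <12345> is submitted to queue <geocomp>."

if __name__ == "__main__":
    print(submit_grass_job())
```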
The emerging role of cloud computing in molecular modelling.
Ebejer, Jean-Paul; Fulle, Simone; Morris, Garrett M; Finn, Paul W
2013-07-01
There is a growing recognition of the importance of cloud computing for large-scale and data-intensive applications. The distinguishing features of cloud computing and their relationship to other distributed computing paradigms are described, as are the strengths and weaknesses of the approach. We review the use made to date of cloud computing for molecular modelling projects and the availability of front ends for molecular modelling applications. Although the use of cloud computing technologies for molecular modelling is still in its infancy, we demonstrate its potential by presenting several case studies. Rapid growth can be expected as more applications become available and costs continue to fall; cloud computing can make a major contribution not just in terms of the availability of on-demand computing power, but could also spur innovation in the development of novel approaches that utilize that capacity in more effective ways. Copyright © 2013 Elsevier Inc. All rights reserved.
Highly efficient star formation in NGC 5253 possibly from stream-fed accretion.
Turner, J L; Beck, S C; Benford, D J; Consiglio, S M; Ho, P T P; Kovács, A; Meier, D S; Zhao, J-H
2015-03-19
Gas clouds in present-day galaxies are inefficient at forming stars. Low star-formation efficiency is a critical parameter in galaxy evolution: it is why stars are still forming nearly 14 billion years after the Big Bang and why star clusters generally do not survive their births, instead dispersing to form galactic disks or bulges. Yet the existence of ancient massive bound star clusters (globular clusters) in the Milky Way suggests that efficiencies were higher when they formed ten billion years ago. A local dwarf galaxy, NGC 5253, has a young star cluster that provides an example of highly efficient star formation. Here we report the detection of the J = 3→2 rotational transition of CO at the location of the massive cluster. The gas cloud is hot, dense, quiescent and extremely dusty. Its gas-to-dust ratio is lower than the Galactic value, which we attribute to dust enrichment by the embedded star cluster. Its star-formation efficiency exceeds 50 per cent, tenfold that of clouds in the Milky Way. We suggest that high efficiency results from the force-feeding of star formation by a streamer of gas falling into the galaxy.
Challenges in Securing the Interface Between the Cloud and Pervasive Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lagesse, Brent J
2011-01-01
Cloud computing presents an opportunity for pervasive systems to leverage computational and storage resources to accomplish tasks that would not normally be possible on such resource-constrained devices. Cloud computing can enable hardware designers to build lighter systems that last longer and are more mobile. Despite the advantages cloud computing offers to the designers of pervasive systems, there are some limitations of leveraging cloud computing that must be addressed. We take the position that cloud-based pervasive systems must be secured holistically and discuss ways this might be accomplished. In this paper, we discuss a pervasive system utilizing cloud computing resources and issues that must be addressed in such a system. In this system, the user's mobile device cannot always have network access to leverage resources from the cloud, so it must make intelligent decisions about what data should be stored locally and what processes should be run locally. As a result of these decisions, the user becomes vulnerable to attacks while interfacing with the pervasive system.
An Architecture for Cross-Cloud System Management
NASA Astrophysics Data System (ADS)
Dodda, Ravi Teja; Smith, Chris; van Moorsel, Aad
The emergence of the cloud computing paradigm promises flexibility and adaptability through on-demand provisioning of compute resources. As the utilization of cloud resources extends beyond a single provider, for business as well as technical reasons, the issue of effectively managing such resources comes to the fore. Different providers expose different interfaces to their compute resources utilizing varied architectures and implementation technologies. This heterogeneity poses a significant system management problem, and can limit the extent to which the benefits of cross-cloud resource utilization can be realized. We address this problem through the definition of an architecture to facilitate the management of compute resources from different cloud providers in an homogenous manner. This preserves the flexibility and adaptability promised by the cloud computing paradigm, whilst enabling the benefits of cross-cloud resource utilization to be realized. The practical efficacy of the architecture is demonstrated through an implementation utilizing compute resources managed through different interfaces on the Amazon Elastic Compute Cloud (EC2) service. Additionally, we provide empirical results highlighting the performance differential of these different interfaces, and discuss the impact of this performance differential on efficiency and profitability.
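The core idea of a homogeneous management layer can be sketched as a small provider-neutral interface with provider-specific adapters behind it. This is only an illustration of the design pattern, not the architecture proposed in the paper, and the adapter shown is an in-memory stub rather than a real SDK binding.

```python
# Sketch of a provider-neutral interface for cross-cloud resource management.
# The adapter is a stub; a real system would call each provider's SDK or API.
from abc import ABC, abstractmethod

class ComputeProvider(ABC):
    """Uniform operations exposed to the management layer."""

    @abstractmethod
    def launch(self, image: str, size: str) -> str: ...
    @abstractmethod
    def terminate(self, instance_id: str) -> None: ...
    @abstractmethod
    def list_instances(self) -> list[str]: ...

class InMemoryProvider(ComputeProvider):
    """Stand-in adapter used here purely for illustration."""

    def __init__(self, name: str):
        self.name, self._instances = name, {}

    def launch(self, image, size):
        instance_id = f"{self.name}-{len(self._instances) + 1}"
        self._instances[instance_id] = (image, size)
        return instance_id

    def terminate(self, instance_id):
        self._instances.pop(instance_id, None)

    def list_instances(self):
        return list(self._instances)

# The management layer treats every provider identically.
providers = [InMemoryProvider("ec2"), InMemoryProvider("private-cloud")]
for p in providers:
    p.launch(image="ubuntu-22.04", size="small")
print({p.name: p.list_instances() for p in providers})
```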
'Cloud computing' and clinical trials: report from an ECRIN workshop.
Ohmann, Christian; Canham, Steve; Danielyan, Edgar; Robertshaw, Steve; Legré, Yannick; Clivio, Luca; Demotes, Jacques
2015-07-29
Growing use of cloud computing in clinical trials prompted the European Clinical Research Infrastructures Network, a European non-profit organisation established to support multinational clinical research, to organise a one-day workshop on the topic to clarify potential benefits and risks. The issues that arose in that workshop are summarised and include the following: the nature of cloud computing and the cloud computing industry; the risks in using cloud computing services now; the lack of explicit guidance on this subject, both generally and with reference to clinical trials; and some possible ways of reducing risks. There was particular interest in developing and using a European 'community cloud' specifically for academic clinical trial data. It was recognised that the day-long workshop was only the start of an ongoing process. Future discussion needs to include clarification of trial-specific regulatory requirements for cloud computing and involve representatives from the relevant regulatory bodies.
NASA Astrophysics Data System (ADS)
Lin, Yuxin; Liu, Hauyu Baobab; Li, Di; Zhang, Zhi-Yu; Ginsburg, Adam; Pineda, Jaime E.; Qian, Lei; Galván-Madrid, Roberto; McLeod, Anna Faye; Rosolowsky, Erik; Dale, James E.; Immer, Katharina; Koch, Eric; Longmore, Steve; Walker, Daniel; Testi, Leonardo
2016-09-01
We have developed an iterative procedure to systematically combine the millimeter and submillimeter images of OB cluster-forming molecular clouds, which were taken by ground-based (CSO, JCMT, APEX, and IRAM-30 m) and space telescopes (Herschel and Planck). For the seven luminous (L > 10^6 L_⊙) Galactic OB cluster-forming molecular clouds selected for our analyses, namely W49A, W43-Main, W43-South, W33, G10.6-0.4, G10.2-0.3, and G10.3-0.1, we have performed single-component, modified blackbody fits to each pixel of the combined (sub)millimeter images, and the Herschel PACS and SPIRE images at shorter wavelengths. The ˜10″ resolution dust column density and temperature maps of these sources revealed dramatically different morphologies, indicating very different modes of OB cluster formation, or parent molecular cloud structures in different evolutionary stages. The molecular clouds W49A, W33, and G10.6-0.4 show centrally concentrated massive molecular clumps that are connected with approximately radially orientated molecular gas filaments. The W43-Main and W43-South molecular cloud complexes, which are located at the intersection of the Galactic near 3 kpc (or Scutum) arm and the Galactic bar, show a widely scattered distribution of dense molecular clumps/cores over the observed ˜10 pc spatial scale. The relatively evolved sources G10.2-0.3 and G10.3-0.1 appear to be affected by stellar feedback, and show a complicated cloud morphology embedded with abundant dense molecular clumps/cores. We find that with the high angular resolution we achieved, our visual classification of cloud morphology can be linked to the systematically derived statistical quantities (i.e., the enclosed mass profile, the column density probability distribution function (N-PDF), the two-point correlation function of column density, and the probability distribution function of clump/core separations). In particular, the massive molecular gas clumps located at the centers of G10.6-0.4 and W49A, which contribute a considerable fraction of their overall cloud masses, may be special OB cluster-forming environments as a direct consequence of global cloud collapse. These centralized massive molecular gas clumps also uniquely occupy much higher column densities than what is determined by the overall fit of the power-law N-PDF. We have made efforts to archive the derived statistical quantities of individual target sources, to permit comparisons with theoretical frameworks, numerical simulations, and other observations in the future.
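For readers unfamiliar with the fitting step, the sketch below shows a single-component modified blackbody (optically thin greybody) fit to the fluxes of one pixel using scipy. The dust emissivity index, reference frequency and example fluxes are generic assumptions, not the values adopted in the paper.

```python
# Sketch: fit a single-component modified blackbody (optically thin) to one pixel.
# The emissivity index beta, reference frequency and sample fluxes are assumptions.
import numpy as np
from scipy.optimize import curve_fit
from scipy.constants import h, c, k

BETA = 1.8          # assumed dust emissivity index
NU0 = 1000e9        # reference frequency (1 THz), arbitrary choice

def planck(nu, T):
    """Planck function B_nu(T) in SI units."""
    return 2 * h * nu**3 / c**2 / np.expm1(h * nu / (k * T))

def modified_blackbody(nu, T, tau0):
    """Optically thin greybody: S_nu ~ tau0 * (nu/NU0)^beta * B_nu(T)."""
    return tau0 * (nu / NU0) ** BETA * planck(nu, T)

# Example: synthetic fluxes at 160-850 um bands with 5% noise.
wavelengths_um = np.array([160.0, 250.0, 350.0, 500.0, 850.0])
nu = c / (wavelengths_um * 1e-6)
fluxes = modified_blackbody(nu, 25.0, 1e-3) * (1 + 0.05 * np.random.randn(nu.size))

(T_fit, tau_fit), _ = curve_fit(modified_blackbody, nu, fluxes, p0=(20.0, 1e-4))
print(f"T_dust = {T_fit:.1f} K, tau(1 THz) = {tau_fit:.2e}")
```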
Cloud Computing - A Unified Approach for Surveillance Issues
NASA Astrophysics Data System (ADS)
Rachana, C. R.; Banu, Reshma, Dr.; Ahammed, G. F. Ali, Dr.; Parameshachari, B. D., Dr.
2017-08-01
Cloud computing describes highly scalable resources provided as an external service via the Internet on a pay-per-use basis. From the economic point of view, the main attractiveness of cloud computing is that users only use what they need, and only pay for what they actually use. Resources are available for access from the cloud at any time, and from any location, through networks. Cloud computing is gradually replacing the traditional Information Technology Infrastructure. Securing data is one of the leading concerns and biggest issues for cloud computing. Privacy of information is always a crucial point, especially when an individual's personal or sensitive information is being stored in the organization. It is indeed true that today's cloud authorization systems are not robust enough. This paper presents a unified approach for analyzing the various security issues and techniques to overcome the challenges in the cloud environment.
Research on the application in disaster reduction for using cloud computing technology
NASA Astrophysics Data System (ADS)
Tao, Liang; Fan, Yida; Wang, Xingling
Cloud computing technology has recently been applied rapidly in different domains, promoting the informatization of those domains. Based on an analysis of the application requirements in disaster reduction and the characteristics of cloud computing technology, we present research on the application of cloud computing technology in disaster reduction. First of all, we give the architecture of the disaster reduction cloud, which consists of disaster reduction infrastructure as a service (IaaS), a disaster reduction cloud application platform as a service (PaaS) and disaster reduction software as a service (SaaS). Secondly, we discuss the standard system of disaster reduction in five aspects. Thirdly, we describe the security system of the disaster reduction cloud. Finally, we conclude that the use of cloud computing technology will help us to solve the problems of disaster reduction and promote the development of disaster reduction.
NASA Astrophysics Data System (ADS)
Roth, A.; Schneider, J.; Klimach, T.; Mertes, S.; van Pinxteren, D.; Herrmann, H.; Borrmann, S.
2016-01-01
Cloud residues and out-of-cloud aerosol particles with diameters between 150 and 900 nm were analysed by online single particle aerosol mass spectrometry during the 6-week study Hill Cap Cloud Thuringia (HCCT)-2010 in September-October 2010. The measurement location was the mountain Schmücke (937 m a.s.l.) in central Germany. More than 160 000 bipolar mass spectra from out-of-cloud aerosol particles and more than 13 000 bipolar mass spectra from cloud residual particles were obtained and were classified using a fuzzy c-means clustering algorithm. Analysis of the uncertainty of the sorting algorithm was conducted on a subset of the data by comparing the clustering output with particle-by-particle inspection and classification by the operator. This analysis yielded a false classification probability between 13 and 48 %. Additionally, particle types were identified by specific marker ions. The results from the ambient aerosol analysis show that 63 % of the analysed particles belong to clusters having a diurnal variation, suggesting that local or regional sources dominate the aerosol, especially for particles containing soot and biomass burning particles. In the cloud residues, the relative percentage of large soot-containing particles and particles containing amines was found to be increased compared to the out-of-cloud aerosol, while, in general, organic particles were less abundant in the cloud residues. In the case of amines, this can be explained by the high solubility of the amines, while the large soot-containing particles were found to be internally mixed with inorganics, which explains their activation as cloud condensation nuclei. Furthermore, the results show that during cloud processing, both sulfate and nitrate are added to the residual particles, thereby changing the mixing state and increasing the fraction of particles with nitrate and/or sulfate. This is expected to lead to higher hygroscopicity after cloud evaporation, and therefore to an increase of the particles' ability to act as cloud condensation nuclei after their cloud passage.
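As an illustration of the fuzzy c-means step used to sort the particle mass spectra (not the authors' exact implementation), the following numpy sketch clusters toy feature vectors with the standard membership and centroid update rules of the algorithm.

```python
# Minimal fuzzy c-means sketch (fuzzifier m=2) on toy spectra-like feature vectors.
import numpy as np

def fuzzy_c_means(X, n_clusters=3, m=2.0, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), n_clusters))
    U /= U.sum(axis=1, keepdims=True)            # memberships sum to 1 per sample
    for _ in range(n_iter):
        W = U ** m
        centers = W.T @ X / W.sum(axis=0)[:, None]          # weighted centroids
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        # u_ij = 1 / sum_k (d_ij / d_ik)^(2/(m-1))
        U = 1.0 / (d ** (2 / (m - 1)) * np.sum(d ** (-2 / (m - 1)), axis=1, keepdims=True))
    return centers, U

X = np.vstack([np.random.randn(50, 4) + offset for offset in (0, 5, 10)])
centers, memberships = fuzzy_c_means(X)
print(np.round(memberships[:3], 2))              # soft assignments of the first samples
```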
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shin, Dongwan; Claycomb, William R.; Urias, Vincent E.
Cloud computing is a paradigm rapidly being embraced by government and industry as a solution for cost-savings, scalability, and collaboration. While a multitude of applications and services are available commercially for cloud-based solutions, research in this area has yet to fully embrace the full spectrum of potential challenges facing cloud computing. This tutorial aims to provide researchers with a fundamental understanding of cloud computing, with the goals of identifying a broad range of potential research topics, and inspiring a new surge in research to address current issues. We will also discuss real implementations of research-oriented cloud computing systems for both academia and government, including configuration options, hardware issues, challenges, and solutions.
Millstone: software for multiplex microbial genome analysis and engineering
DOE Office of Scientific and Technical Information (OSTI.GOV)
Goodman, Daniel B.; Kuznetsov, Gleb; Lajoie, Marc J.
Inexpensive DNA sequencing and advances in genome editing have made computational analysis a major rate-limiting step in adaptive laboratory evolution and microbial genome engineering. Here, we describe Millstone, a web-based platform that automates genotype comparison and visualization for projects with up to hundreds of genomic samples. To enable iterative genome engineering, Millstone allows users to design oligonucleotide libraries and create successive versions of reference genomes. Millstone is open source and easily deployable to a cloud platform, local cluster, or desktop, making it a scalable solution for any lab.
Millstone: software for multiplex microbial genome analysis and engineering.
Goodman, Daniel B; Kuznetsov, Gleb; Lajoie, Marc J; Ahern, Brian W; Napolitano, Michael G; Chen, Kevin Y; Chen, Changping; Church, George M
2017-05-25
Inexpensive DNA sequencing and advances in genome editing have made computational analysis a major rate-limiting step in adaptive laboratory evolution and microbial genome engineering. We describe Millstone, a web-based platform that automates genotype comparison and visualization for projects with up to hundreds of genomic samples. To enable iterative genome engineering, Millstone allows users to design oligonucleotide libraries and create successive versions of reference genomes. Millstone is open source and easily deployable to a cloud platform, local cluster, or desktop, making it a scalable solution for any lab.
Millstone: software for multiplex microbial genome analysis and engineering
Goodman, Daniel B.; Kuznetsov, Gleb; Lajoie, Marc J.; ...
2017-05-25
Inexpensive DNA sequencing and advances in genome editing have made computational analysis a major rate-limiting step in adaptive laboratory evolution and microbial genome engineering. Here, we describe Millstone, a web-based platform that automates genotype comparison and visualization for projects with up to hundreds of genomic samples. To enable iterative genome engineering, Millstone allows users to design oligonucleotide libraries and create successive versions of reference genomes. Millstone is open source and easily deployable to a cloud platform, local cluster, or desktop, making it a scalable solution for any lab.
ERIC Educational Resources Information Center
Conn, Samuel S.; Reichgelt, Han
2013-01-01
Cloud computing represents an architecture and paradigm of computing designed to deliver infrastructure, platforms, and software as constructible computing resources on demand to networked users. As campuses are challenged to better accommodate academic needs for applications and computing environments, cloud computing can provide an accommodating…
Challenges and Security in Cloud Computing
NASA Astrophysics Data System (ADS)
Chang, Hyokyung; Choi, Euiin
People want to solve problems as soon as they arise. Ubiquitous computing is an IT technology intended to make such situations easier to handle, and cloud computing is a technology that makes it even better and more powerful. Cloud computing, however, is still at an early stage of implementation and use, and it faces many technical challenges and security issues. This paper examines cloud computing security.
Making Cloud Computing Available For Researchers and Innovators (Invited)
NASA Astrophysics Data System (ADS)
Winsor, R.
2010-12-01
High Performance Computing (HPC) facilities exist in most academic institutions but are almost invariably over-subscribed. Access is allocated based on academic merit, the only practical method of assigning valuable finite compute resources. Cloud computing on the other hand, and particularly commercial clouds, draw flexibly on an almost limitless resource as long as the user has sufficient funds to pay the bill. How can the commercial cloud model be applied to scientific computing? Is there a case to be made for a publicly available research cloud and how would it be structured? This talk will explore these themes and describe how Cybera, a not-for-profit non-governmental organization in Alberta Canada, aims to leverage its high speed research and education network to provide cloud computing facilities for a much wider user base.
Big data mining analysis method based on cloud computing
NASA Astrophysics Data System (ADS)
Cai, Qing Qiu; Cui, Hong Gang; Tang, Hao
2017-08-01
In the era of information explosion, the extremely large, discrete, and unstructured or semi-structured nature of big data has gone far beyond what traditional data management approaches can handle. With the arrival of the cloud computing era, cloud computing provides a new technical way to analyze and mine massive data, which can effectively solve the problem that traditional data mining methods cannot adapt to massive data. This paper introduces the meaning and characteristics of cloud computing, analyzes the advantages of using cloud computing technology to realize data mining, designs an association rule mining algorithm based on the MapReduce parallel processing architecture, and carries out experimental verification. The parallel association rule mining algorithm based on a cloud computing platform can greatly improve the execution speed of data mining.
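A minimal sketch of the counting core of MapReduce-style association rule mining: mappers emit candidate item pairs per transaction, reducers sum support counts, and pairs below a minimum support are dropped. This illustrates the general idea only and is not the specific parallel algorithm designed in the paper.

```python
# Sketch: support counting for candidate 2-itemsets in MapReduce style.
from collections import defaultdict
from itertools import combinations

def map_transaction(transaction):
    """Emit ((item_a, item_b), 1) for every candidate pair in one transaction."""
    return [(pair, 1) for pair in combinations(sorted(set(transaction)), 2)]

def reduce_counts(mapped_pairs):
    """Sum the counts per candidate pair, as the reducers would."""
    counts = defaultdict(int)
    for pair, one in mapped_pairs:
        counts[pair] += one
    return counts

transactions = [["milk", "bread", "eggs"], ["milk", "bread"],
                ["bread", "eggs"], ["milk", "eggs"]]
mapped = [kv for t in transactions for kv in map_transaction(t)]
support = reduce_counts(mapped)
min_support = 2
frequent_pairs = {pair: n for pair, n in support.items() if n >= min_support}
print(frequent_pairs)
```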
Horizontal branch stars, and galactic and magellanic cloud globular clusters
NASA Technical Reports Server (NTRS)
Deboer, K. S.
1981-01-01
Seven blue horizontal branch stars in the field were observed and a few HB stars were isolated in globular clusters. Energy distributions are compared to assess possible differences and also used in comparison with model atmospheres. Observed energy distributions of HB stars in NGC 6397 are used to estimate the total number of HB stars which produced the integrated fluxes as observed by ANS. Preliminary results are given for colors of globular clusters observed in the Magellanic Clouds and for their extent, based on the Washburn IUE extraction.
clubber: removing the bioinformatics bottleneck in big data analyses.
Miller, Maximilian; Zhu, Chengsheng; Bromberg, Yana
2017-06-13
With the advent of modern day high-throughput technologies, the bottleneck in biological discovery has shifted from the cost of doing experiments to that of analyzing results. clubber is our automated cluster-load balancing system developed for optimizing these "big data" analyses. Its plug-and-play framework encourages re-use of existing solutions for bioinformatics problems. clubber's goals are to reduce computation times and to facilitate use of cluster computing. The first goal is achieved by automating the balance of parallel submissions across available high performance computing (HPC) resources. Notably, the latter can be added on demand, including cloud-based resources, and/or featuring heterogeneous environments. The second goal of making HPCs user-friendly is facilitated by an interactive web interface and a RESTful API, allowing for job monitoring and result retrieval. We used clubber to speed up our pipeline for annotating molecular functionality of metagenomes. Here, we analyzed the Deepwater Horizon oil-spill study data to quantitatively show that the beach sands have not yet entirely recovered. Further, our analysis of the CAMI-challenge data revealed that microbiome taxonomic shifts do not necessarily correlate with functional shifts. These examples (21 metagenomes processed in 172 min) clearly illustrate the importance of clubber in the everyday computational biology environment.
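The abstract mentions a RESTful API for job monitoring and result retrieval; a hedged sketch of how such an interaction typically looks is given below. The base URL, routes and JSON fields are hypothetical placeholders, not clubber's documented API.

```python
# Hedged sketch of polling a RESTful job-monitoring API like the one described.
# The base URL, routes and response fields are hypothetical placeholders.
import time
import requests

BASE_URL = "https://clubber.example.org/api"   # placeholder endpoint

def wait_for_job(job_id, poll_seconds=30):
    """Poll a job's status until it finishes, then fetch its results."""
    while True:
        status = requests.get(f"{BASE_URL}/jobs/{job_id}").json()  # assumed: {"state": ...}
        if status.get("state") in ("finished", "failed"):
            break
        time.sleep(poll_seconds)
    if status.get("state") == "finished":
        return requests.get(f"{BASE_URL}/jobs/{job_id}/results").json()
    raise RuntimeError(f"Job {job_id} failed")

# Usage (requires a running service): results = wait_for_job("metagenome-42")
```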
clubber: removing the bioinformatics bottleneck in big data analyses
Miller, Maximilian; Zhu, Chengsheng; Bromberg, Yana
2018-01-01
With the advent of modern day high-throughput technologies, the bottleneck in biological discovery has shifted from the cost of doing experiments to that of analyzing results. clubber is our automated cluster-load balancing system developed for optimizing these “big data” analyses. Its plug-and-play framework encourages re-use of existing solutions for bioinformatics problems. clubber’s goals are to reduce computation times and to facilitate use of cluster computing. The first goal is achieved by automating the balance of parallel submissions across available high performance computing (HPC) resources. Notably, the latter can be added on demand, including cloud-based resources, and/or featuring heterogeneous environments. The second goal of making HPCs user-friendly is facilitated by an interactive web interface and a RESTful API, allowing for job monitoring and result retrieval. We used clubber to speed up our pipeline for annotating molecular functionality of metagenomes. Here, we analyzed the Deepwater Horizon oil-spill study data to quantitatively show that the beach sands have not yet entirely recovered. Further, our analysis of the CAMI-challenge data revealed that microbiome taxonomic shifts do not necessarily correlate with functional shifts. These examples (21 metagenomes processed in 172 min) clearly illustrate the importance of clubber in the everyday computational biology environment. PMID:28609295
Calibration of LOFAR data on the cloud
NASA Astrophysics Data System (ADS)
Sabater, J.; Sánchez-Expósito, S.; Best, P.; Garrido, J.; Verdes-Montenegro, L.; Lezzi, D.
2017-04-01
New scientific instruments are starting to generate an unprecedented amount of data. The Low Frequency Array (LOFAR), one of the Square Kilometre Array (SKA) pathfinders, is already producing data on a petabyte scale. The calibration of these data presents a huge challenge for final users: (a) extensive storage and computing resources are required; (b) the installation and maintenance of the software required for the processing is not trivial; and (c) the requirements of calibration pipelines, which are experimental and under development, are quickly evolving. After encountering some limitations in classical infrastructures like dedicated clusters, we investigated the viability of cloud infrastructures as a solution. We found that the installation and operation of LOFAR data calibration pipelines is not only possible, but can also be efficient in cloud infrastructures. The main advantages were: (1) the ease of software installation and maintenance, and the availability of standard APIs and tools, widely used in the industry; this reduces the requirement for significant manual intervention, which can have a highly negative impact in some infrastructures; (2) the flexibility to adapt the infrastructure to the needs of the problem, especially as those demands change over time; (3) the on-demand consumption of (shared) resources. We found that a critical factor (also in other infrastructures) is the availability of scratch storage areas of an appropriate size. We found no significant impediments associated with the speed of data transfer, the use of virtualization, the use of external block storage, or the memory available (provided a minimum threshold is reached). Finally, we considered the cost-effectiveness of a commercial cloud like Amazon Web Services. While a cloud solution is more expensive than the operation of a large, fully-utilized cluster completely dedicated to LOFAR data reduction, we found that its costs are competitive if the number of datasets to be analysed is not high, or if the costs of maintaining a system capable of calibrating LOFAR data become high. Coupled with the advantages discussed above, this suggests that a cloud infrastructure may be favourable for many users.
Charting a Security Landscape in the Clouds: Data Protection and Collaboration in Cloud Storage
2016-07-01
Cloud computing is perhaps the most revolutionary force in the information technology industry today. This field encompasses many different domains... characteristic shared by all cloud computing tasks is that they involve storing data in the cloud. In this report, we therefore aim to describe and rank the... CONCLUSION: The advent of cloud computing has caused government organizations to rethink their IT architectures so that they can take advantage of the
Zao, John K.; Gan, Tchin-Tze; You, Chun-Kai; Chung, Cheng-En; Wang, Yu-Te; Rodríguez Méndez, Sergio José; Mullen, Tim; Yu, Chieh; Kothe, Christian; Hsiao, Ching-Teng; Chu, San-Liang; Shieh, Ce-Kuen; Jung, Tzyy-Ping
2014-01-01
EEG-based Brain-computer interfaces (BCI) are facing basic challenges in real-world applications. The technical difficulties in developing truly wearable BCI systems that are capable of making reliable real-time prediction of users' cognitive states in dynamic real-life situations may seem almost insurmountable at times. Fortunately, recent advances in miniature sensors, wireless communication and distributed computing technologies offered promising ways to bridge these chasms. In this paper, we report an attempt to develop a pervasive on-line EEG-BCI system using state-of-art technologies including multi-tier Fog and Cloud Computing, semantic Linked Data search, and adaptive prediction/classification models. To verify our approach, we implement a pilot system by employing wireless dry-electrode EEG headsets and MEMS motion sensors as the front-end devices, Android mobile phones as the personal user interfaces, compact personal computers as the near-end Fog Servers and the computer clusters hosted by the Taiwan National Center for High-performance Computing (NCHC) as the far-end Cloud Servers. We succeeded in conducting synchronous multi-modal global data streaming in March and then running a multi-player on-line EEG-BCI game in September, 2013. We are currently working with the ARL Translational Neuroscience Branch to use our system in real-life personal stress monitoring and the UCSD Movement Disorder Center to conduct in-home Parkinson's disease patient monitoring experiments. We shall proceed to develop the necessary BCI ontology and introduce automatic semantic annotation and progressive model refinement capability to our system. PMID:24917804
Zao, John K; Gan, Tchin-Tze; You, Chun-Kai; Chung, Cheng-En; Wang, Yu-Te; Rodríguez Méndez, Sergio José; Mullen, Tim; Yu, Chieh; Kothe, Christian; Hsiao, Ching-Teng; Chu, San-Liang; Shieh, Ce-Kuen; Jung, Tzyy-Ping
2014-01-01
EEG-based Brain-computer interfaces (BCI) are facing basic challenges in real-world applications. The technical difficulties in developing truly wearable BCI systems that are capable of making reliable real-time prediction of users' cognitive states in dynamic real-life situations may seem almost insurmountable at times. Fortunately, recent advances in miniature sensors, wireless communication and distributed computing technologies offered promising ways to bridge these chasms. In this paper, we report an attempt to develop a pervasive on-line EEG-BCI system using state-of-art technologies including multi-tier Fog and Cloud Computing, semantic Linked Data search, and adaptive prediction/classification models. To verify our approach, we implement a pilot system by employing wireless dry-electrode EEG headsets and MEMS motion sensors as the front-end devices, Android mobile phones as the personal user interfaces, compact personal computers as the near-end Fog Servers and the computer clusters hosted by the Taiwan National Center for High-performance Computing (NCHC) as the far-end Cloud Servers. We succeeded in conducting synchronous multi-modal global data streaming in March and then running a multi-player on-line EEG-BCI game in September, 2013. We are currently working with the ARL Translational Neuroscience Branch to use our system in real-life personal stress monitoring and the UCSD Movement Disorder Center to conduct in-home Parkinson's disease patient monitoring experiments. We shall proceed to develop the necessary BCI ontology and introduce automatic semantic annotation and progressive model refinement capability to our system.
Introducing Cloud Computing Topics in Curricula
ERIC Educational Resources Information Center
Chen, Ling; Liu, Yang; Gallagher, Marcus; Pailthorpe, Bernard; Sadiq, Shazia; Shen, Heng Tao; Li, Xue
2012-01-01
The demand for graduates with exposure in Cloud Computing is on the rise. For many educational institutions, the challenge is to decide on how to incorporate appropriate cloud-based technologies into their curricula. In this paper, we describe our design and experiences of integrating Cloud Computing components into seven third/fourth-year…
Autonomic Management of Application Workflows on Hybrid Computing Infrastructure
Kim, Hyunjoo; el-Khamra, Yaakoub; Rodero, Ivan; ...
2011-01-01
In this paper, we present a programming and runtime framework that enables the autonomic management of complex application workflows on hybrid computing infrastructures. The framework is designed to address system and application heterogeneity and dynamics to ensure that application objectives and constraints are satisfied. The need for such autonomic system and application management is becoming critical as computing infrastructures become increasingly heterogeneous, integrating different classes of resources from high-end HPC systems to commodity clusters and clouds. For example, the framework presented in this paper can be used to provision the appropriate mix of resources based on application requirements and constraints. The framework also monitors the system/application state and adapts the application and/or resources to respond to changing requirements or environment. To demonstrate the operation of the framework and to evaluate its ability, we employ a workflow used to characterize an oil reservoir executing on a hybrid infrastructure composed of TeraGrid nodes and Amazon EC2 instances of various types. Specifically, we show how different application objectives such as acceleration, conservation and resilience can be effectively achieved while satisfying deadline and budget constraints, using an appropriate mix of dynamically provisioned resources. Our evaluations also demonstrate that public clouds can be used to complement and reinforce the scheduling and usage of traditional high performance computing infrastructure.
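The resource-mix decision described above can be illustrated with a deliberately simplified sketch: given deadline and budget constraints and per-resource cost/throughput figures (all hypothetical), choose how many cloud instances to add to a fixed HPC allocation. This is only a toy model, not the paper's scheduler.

```python
# Toy sketch: choose how many cloud instances to add to fixed HPC nodes so that
# a workload meets a deadline within budget. All numbers are hypothetical.
def plan_mix(tasks, deadline_h, budget, hpc_nodes, hpc_rate, cloud_rate, cloud_cost_h):
    """Return the smallest cloud instance count that meets the deadline and budget."""
    for cloud_nodes in range(0, 1000):
        throughput = hpc_nodes * hpc_rate + cloud_nodes * cloud_rate   # tasks per hour
        hours = tasks / throughput
        cost = cloud_nodes * cloud_cost_h * hours                      # HPC assumed pre-paid
        if hours <= deadline_h and cost <= budget:
            return cloud_nodes, round(hours, 1), round(cost, 2)
    return None  # infeasible under these constraints

print(plan_mix(tasks=10000, deadline_h=24, budget=500.0,
               hpc_nodes=16, hpc_rate=10, cloud_rate=8, cloud_cost_h=0.50))
```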
Capturing and analyzing wheelchair maneuvering patterns with mobile cloud computing.
Fu, Jicheng; Hao, Wei; White, Travis; Yan, Yuqing; Jones, Maria; Jan, Yih-Kuen
2013-01-01
Power wheelchairs have been widely used to provide independent mobility to people with disabilities. Despite great advancements in power wheelchair technology, research shows that wheelchair related accidents occur frequently. To ensure safe maneuverability, capturing wheelchair maneuvering patterns is fundamental to enable other research, such as safe robotic assistance for wheelchair users. In this study, we propose to record, store, and analyze wheelchair maneuvering data by means of mobile cloud computing. Specifically, the accelerometer and gyroscope sensors in smart phones are used to record wheelchair maneuvering data in real-time. Then, the recorded data are periodically transmitted to the cloud for storage and analysis. The analyzed results are then made available to various types of users, such as mobile phone users, traditional desktop users, etc. The combination of mobile computing and cloud computing leverages the advantages of both techniques and extends the smart phone's capabilities of computing and data storage via the Internet. We performed a case study to implement the mobile cloud computing framework using Android smart phones and Google App Engine, a popular cloud computing platform. Experimental results demonstrated the feasibility of the proposed mobile cloud computing framework.
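A hedged sketch of the record-buffer-upload loop described in the study: sensor samples are buffered locally and periodically POSTed to a cloud endpoint. The endpoint URL and payload format are placeholders, and the sensor read is simulated; on a phone the samples would come from the accelerometer and gyroscope APIs.

```python
# Hedged sketch of buffering motion-sensor samples and periodically uploading them.
# The cloud endpoint and payload schema are placeholders; sensor values are simulated.
import time
import random
import requests

UPLOAD_URL = "https://wheelchair-cloud.example.org/upload"  # hypothetical endpoint

def read_sensors():
    """Stand-in for reading the phone's accelerometer and gyroscope."""
    return {"t": time.time(),
            "accel": [random.gauss(0, 1) for _ in range(3)],
            "gyro": [random.gauss(0, 0.1) for _ in range(3)]}

def run(sample_hz=50, upload_every_s=10):
    buffer = []
    last_upload = time.time()
    while True:
        buffer.append(read_sensors())
        if time.time() - last_upload >= upload_every_s:
            requests.post(UPLOAD_URL, json={"samples": buffer}, timeout=5)
            buffer, last_upload = [], time.time()
        time.sleep(1.0 / sample_hz)

# run()  # uncomment to start streaming (needs a reachable endpoint)
```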
Bootstrapping and Maintaining Trust in the Cloud
2016-12-01
simultaneous cloud nodes. 1. INTRODUCTION The proliferation and popularity of infrastructure-as-a-service (IaaS) cloud computing services such as... Amazon Web Services and Google Compute Engine means more cloud tenants are hosting sensitive, private, and business-critical data and applications in the... thousands of IaaS resources as they are elastically instantiated and terminated. Prior cloud trusted computing solutions address a subset of these features
Motions in Nearby Galaxy Cluster Reveal Presence of Hidden Superstructure
NASA Astrophysics Data System (ADS)
2004-09-01
A nearby galaxy cluster is facing an intergalactic headwind as it is pulled by an underlying superstructure of dark matter, according to new evidence from NASA's Chandra X-ray Observatory. Astronomers think that most of the matter in the universe is concentrated in long, large filaments of dark matter and that galaxy clusters are formed where these filaments intersect. A Chandra survey of the Fornax galaxy cluster revealed a vast, swept-back cloud of hot gas near the center of the cluster. This geometry indicates that the hot gas cloud, which is several hundred thousand light years in length, is moving rapidly through a larger, less dense cloud of gas. The motion of the core gas cloud, together with optical observations of a group of galaxies racing inward on a collision course with it, suggests that an unseen, large structure is collapsing and drawing everything toward a common center of gravity. [X-ray image of Fornax with labels] "At a relatively nearby distance of about 60 million light years, the Fornax cluster represents a crucial laboratory for studying the interplay of galaxies, hot gas and dark matter as the cluster evolves," said Caleb Scharf of Columbia University in New York, NY, lead author of a paper describing the Chandra survey that was presented at an American Astronomical Society meeting in New Orleans, LA. "What we are seeing could be associated directly with the intergalactic gas surrounding a very large scale structure that stretches over millions of light years." The infalling galaxy group, whose motion was detected by Michael Drinkwater of the University of Melbourne in Australia, and colleagues, is about 3 million light years from the cluster core, so a collision with the core will not occur for a few billion years. Insight as to how this collision will look is provided by the elliptical galaxy NGC 1404, which is plunging into the core of the cluster for the first time. As discussed by Scharf and another group led by Marie Machacek of the Harvard-Smithsonian Center for Astrophysics in Cambridge, Mass., the hot gas cloud surrounding this galaxy has a sharp leading edge and a trailing tail of gas being stripped from the galaxy. [Illustration of Fornax Cluster] "One thing that makes what we see in Fornax rather compelling is that it looks a lot like some of the latest computer simulations," added Scharf. "The Fornax picture, with infalling galaxies, and the swept back geometry of the cluster gas - seen only with the Chandra resolution and the proximity of Fornax - is one of the best matches to date with these high-resolution simulations." Over the course of hundreds of millions of years, NGC 1404's orbit will take it through the cluster core several times, most of the gas it contains will be stripped away, and the formation of new stars will cease. In contrast, galaxies that remain outside the core will retain their gas, and new stars can continue to form. Indeed, Scharf and colleagues found that galaxies located in regions outside the core were more likely to show X-ray activity which could be associated with active star formation. [Dissolve from optical to X-ray view of Fornax, animation] The wide-field and deep X-ray view around Fornax was obtained through ten Chandra pointings, each lasting about 14 hours.
Other members of the research team were David Zurek of the American Museum of Natural History, New York, NY, and Martin Bureau, a Hubble Fellow currently at Columbia. NASA's Marshall Space Flight Center, Huntsville, Ala., manages the Chandra program for NASA's Office of Space Science, Washington. Northrop Grumman of Redondo Beach, Calif., formerly TRW, Inc., was the prime development contractor for the observatory. The Smithsonian Astrophysical Observatory controls science and flight operations from the Chandra X-ray Center in Cambridge, Mass. Additional information and images are available at: http://chandra.harvard.edu and http://chandra.nasa.gov
Study on the application of mobile internet cloud computing platform
NASA Astrophysics Data System (ADS)
Gong, Songchun; Fu, Songyin; Chen, Zheng
2012-04-01
The innovative development of computer technology promotes the application of the cloud computing platform, which is essentially a substitution and exchange of resource service models that meets users' needs for different resources after adjustments in multiple aspects. Cloud computing offers advantages in many respects: it not only reduces the difficulty of operating the system but also makes it easy for users to search, acquire and process resources. Accordingly, the author takes the management of digital libraries as the research focus of this paper and analyzes the key technologies of the mobile internet cloud computing platform in the operation process. The popularization and promotion of computer technology drive people to create digital library models, whose core idea is to strengthen the management of library resource information through computers and to construct a high-performance inquiry and search platform that allows users to access the necessary information resources at any time. Cloud computing, moreover, distributes computations across a large number of distributed computers and hence implements a connection service over multiple computers. Digital libraries, as a typical representative of cloud computing applications, can therefore be used to analyze the key technologies of cloud computing.
SPARCCS - Smartphone-Assisted Readiness, Command and Control System
2012-06-01
... and database needs. By doing this, SPARCCS takes advantage of all the capabilities cloud computing has to offer, especially that of disbursed data ... Microsoft. (2011). Cloud Computing. Retrieved September 24, 2011, http://www.microsoft.com/industry/government/guides/cloud_computing/ ... Command, and Control System) to address these issues. We use smartphones in conjunction with cloud computing to extend the benefits of collaborative ...
Future Naval Use of COTS Networking Infrastructure
2009-07-01
... user to benefit from Google’s vast databases and computational resources. Obviously, the ability to harness the full power of the Cloud could be ... and definition of Cloud Computing. While Cloud Computing is developing in many variations – including Infrastructure as a Service (IaaS), Platform as ...
The application of cloud computing to scientific workflows: a study of cost and performance.
Berriman, G Bruce; Deelman, Ewa; Juve, Gideon; Rynge, Mats; Vöckler, Jens-S
2013-01-28
The current model of transferring data from data centres to desktops for analysis will soon be rendered impractical by the accelerating growth in the volume of science datasets. Processing will instead often take place on high-performance servers co-located with data. Evaluations of how new technologies such as cloud computing would support such a new distributed computing model are urgently needed. Cloud computing is a new way of purchasing computing and storage resources on demand through virtualization technologies. We report here the results of investigations of the applicability of commercial cloud computing to scientific computing, with an emphasis on astronomy, including investigations of what types of applications can be run cheaply and efficiently on the cloud, and an example of an application well suited to the cloud: processing a large dataset to create a new science product.
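A back-of-the-envelope cost model of the kind used when weighing cloud against local processing can make the trade-off concrete; the hourly rates, egress fee and storage rate below are illustrative placeholders, not the paper's actual figures.

```python
# Rough cloud cost estimate for a scientific workflow run: compute + data
# egress + storage. All rates are assumptions for illustration only.
def cloud_run_cost(cpu_hours, instance_rate=0.68, cores_per_instance=8,
                   data_out_gb=100.0, egress_per_gb=0.09,
                   storage_gb_month=500.0, storage_rate=0.10):
    instance_hours = cpu_hours / cores_per_instance
    compute = instance_hours * instance_rate
    transfer = data_out_gb * egress_per_gb
    storage = storage_gb_month * storage_rate
    return {"compute": compute, "transfer": transfer,
            "storage": storage, "total": compute + transfer + storage}

print(cloud_run_cost(cpu_hours=2000))
```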
Use of cloud computing in biomedicine.
Sobeslav, Vladimir; Maresova, Petra; Krejcar, Ondrej; Franca, Tanos C C; Kuca, Kamil
2016-12-01
Nowadays, biomedicine is characterised by a growing need for processing large amounts of data in real time. This leads to new requirements for information and communication technologies (ICT). Cloud computing offers a solution to these requirements and provides many advantages, such as cost savings and the elasticity and scalability of ICT use. The aim of this paper is to explore the concept of cloud computing and its use in the area of biomedicine. The authors offer a comprehensive analysis of the implementation of the cloud computing approach in biomedical research, decomposed into infrastructure, platform and service layers, and a recommendation for processing large amounts of data in biomedicine. Firstly, the paper describes the appropriate forms and technological solutions of cloud computing. Secondly, the aspects of cloud computing as a high-end computing paradigm are analysed. Finally, the potential and current use of this technology in biomedical scientific research is discussed.
A resource management architecture based on complex network theory in cloud computing federation
NASA Astrophysics Data System (ADS)
Zhang, Zehua; Zhang, Xuejie
2011-10-01
Cloud Computing Federation is a main trend of Cloud Computing. Resource management has a significant effect on the design, realization, and efficiency of a Cloud Computing Federation. A Cloud Computing Federation has the typical characteristics of a complex system; therefore, we propose a resource management architecture based on complex network theory for Cloud Computing Federation (abbreviated as RMABC) in this paper, with a detailed design of the resource discovery and resource announcement mechanisms. Compared with existing resource management mechanisms in distributed computing systems, a Task Manager in RMABC can use historical information and current state data obtained from other Task Managers for the evolution of the complex network composed of Task Managers, and thus has advantages in resource discovery speed, fault tolerance and adaptive ability. The results of the model experiment confirm the advantage of RMABC in resource discovery performance.
Gas expulsion vs gas retention in young stellar clusters II: effects of cooling and mass segregation
NASA Astrophysics Data System (ADS)
Silich, Sergiy; Tenorio-Tagle, Guillermo
2018-05-01
Gas expulsion or gas retention is a central issue in most of the models for multiple stellar populations and light element anti-correlations in globular clusters. The success of the residual matter expulsion, or its retention within young stellar clusters, is also of fundamental importance for understanding how star formation proceeds in present-day and ancient star-forming galaxies and whether proto-globular clusters with multiple stellar populations are formed in the present epoch. It is usually suggested either that the residual gas is rapidly ejected from star-forming clouds by stellar winds and supernova explosions, or that the enrichment of the residual gas and the formation of the second stellar generation occur so rapidly that the negative stellar feedback is not significant. Here we continue our study of the early development of star clusters in extreme environments and discuss the restrictions that strong radiative cooling and stellar mass segregation place on the gas expulsion from dense star-forming clouds. A large range of physical initial conditions in star-forming clouds, including the star-forming cloud mass, compactness, gas metallicity, star formation efficiency and the effects of massive star segregation, is discussed. It is shown that in sufficiently massive and compact clusters hot shocked winds around individual massive stars may cool before merging with their neighbors. This dramatically reduces the negative stellar feedback and prevents the development of the global star cluster wind and the expulsion of the residual and the processed matter into the ambient interstellar medium. The critical lines which separate the gas expulsion and the gas retention regimes are obtained.
CSNS computing environment Based on OpenStack
NASA Astrophysics Data System (ADS)
Li, Yakang; Qi, Fazhi; Chen, Gang; Wang, Yanming; Hong, Jianshu
2017-10-01
Cloud computing allows for more flexible configuration of IT resources and optimized hardware utilization, and it can provide computing services according to real need. We are applying this computing model to the China Spallation Neutron Source (CSNS) computing environment. Firstly, the CSNS experiment and its computing scenarios and requirements are introduced in this paper. Secondly, the design and practice of a cloud computing platform based on OpenStack are demonstrated from the aspects of the cloud computing system framework, network, storage and so on. Thirdly, some improvements we made to OpenStack are discussed further. Finally, the current status of the CSNS cloud computing environment is summarized at the end of the paper.
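As a concrete illustration of the kind of provisioning such an OpenStack-based platform performs, the sketch below boots a worker node with the openstacksdk client; the cloud name, image, flavor, and network are assumptions for illustration, not the CSNS configuration.

```python
# Minimal sketch of booting a batch worker on an OpenStack cloud with
# openstacksdk; the "csns-private" entry is assumed to exist in clouds.yaml.
import openstack

conn = openstack.connect(cloud="csns-private")        # hypothetical cloud name

image = conn.compute.find_image("SL7-worker")          # assumed image name
flavor = conn.compute.find_flavor("m1.xlarge")         # assumed flavor
network = conn.network.find_network("batch-net")       # assumed network

server = conn.compute.create_server(
    name="csns-worker-01",
    image_id=image.id,
    flavor_id=flavor.id,
    networks=[{"uuid": network.id}],
)
server = conn.compute.wait_for_server(server)          # block until ACTIVE
print(server.name, server.status)
```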
COMBAT: mobile-Cloud-based cOmpute/coMmunications infrastructure for BATtlefield applications
NASA Astrophysics Data System (ADS)
Soyata, Tolga; Muraleedharan, Rajani; Langdon, Jonathan; Funai, Colin; Ames, Scott; Kwon, Minseok; Heinzelman, Wendi
2012-05-01
The amount of data processed annually over the Internet has crossed the zettabyte boundary, yet this Big Data cannot be efficiently processed or stored using today's mobile devices. Parallel to this explosive growth in data, a substantial increase in mobile compute-capability and the advances in cloud computing have brought the state-of-the-art in mobile-cloud computing to an inflection point, where the right architecture may allow mobile devices to run applications utilizing Big Data and intensive computing. In this paper, we propose the MObile Cloud-based Hybrid Architecture (MOCHA), which formulates a solution to permit mobile-cloud computing applications such as object recognition in the battlefield by introducing a mid-stage compute- and storage-layer, called the cloudlet. MOCHA is built on the key observation that many mobile-cloud applications have the following characteristics: 1) they are compute-intensive, requiring the compute-power of a supercomputer, and 2) they use Big Data, requiring a communications link to cloud-based database sources in near-real-time. In this paper, we describe the operation of MOCHA in battlefield applications, by formulating the aforementioned mobile and cloudlet to be housed within a soldier's vest and inside a military vehicle, respectively, and enabling access to the cloud through high-latency satellite links. We provide simulations using the traditional mobile-cloud approach as well as utilizing MOCHA with a mid-stage cloudlet to quantify the utility of this architecture. We show that the MOCHA platform for mobile-cloud computing promises a future for critical battlefield applications that access Big Data, which is currently not possible using existing technology.
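A toy decision rule captures the cloudlet trade-off described here: estimate the response time for each tier from link latency, bandwidth and relative compute speed, then pick the fastest. All numbers below are illustrative assumptions, not MOCHA's measured parameters.

```python
# Toy offloading decision: mobile vs cloudlet vs cloud, based on estimated
# response time. Latencies, bandwidths and speedups are assumptions.
TIERS = {
    # round-trip latency (s), bandwidth (MB/s), relative compute speed
    "mobile":   {"rtt": 0.0,  "bw": None, "speedup": 1.0},
    "cloudlet": {"rtt": 0.01, "bw": 20.0, "speedup": 10.0},
    "cloud":    {"rtt": 0.6,  "bw": 2.0,  "speedup": 100.0},  # satellite link
}

def best_tier(local_compute_s, payload_mb):
    est = {}
    for name, t in TIERS.items():
        transfer = 0.0 if t["bw"] is None else t["rtt"] + payload_mb / t["bw"]
        est[name] = transfer + local_compute_s / t["speedup"]
    return min(est, key=est.get), est

# e.g. an object-recognition task: 30 s on the phone, 5 MB of imagery to ship
print(best_tier(local_compute_s=30.0, payload_mb=5.0))
```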
Hybrid cloud: bridging of private and public cloud computing
NASA Astrophysics Data System (ADS)
Aryotejo, Guruh; Kristiyanto, Daniel Y.; Mufadhol
2018-05-01
Cloud computing has quickly emerged as a promising paradigm in recent years, especially for the business sector. Through cloud service providers, cloud computing is widely used by Information Technology (IT) based startup companies to grow their business. However, most businesses' awareness of data security issues is low, since some Cloud Service Providers (CSPs) could decrypt their data. The Hybrid Cloud Deployment Model (HCDM) is open source, which makes it one of the more secure cloud computing models, and thus HCDM may solve data security issues. The objective of this study is to design, deploy and evaluate an HCDM as Infrastructure as a Service (IaaS). In the implementation, the Metal as a Service (MAAS) engine was used as a base to build an actual server and node, followed by installing the vsftpd application, which serves as the FTP server. For comparison with HCDM, a public cloud was accessed through its public cloud interface. As a result, the design and deployment of HCDM were conducted successfully; besides offering good security, HCDM was able to transfer data significantly faster than the public cloud. To the best of our knowledge, the Hybrid Cloud Deployment Model is one of the more secure cloud computing models due to its open-source character. Furthermore, this study will serve as a base for future studies on the Hybrid Cloud Deployment Model, which may be relevant for solving the security issues of IT-based startup companies, especially in Indonesia.
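The transfer-speed comparison reported here can be reproduced schematically with a timed FTP upload against the two deployments; the hostnames, credentials and test file below are placeholders, not the study's actual setup.

```python
# Time an FTP upload against the hybrid-cloud vsftpd server and a
# public-cloud endpoint; hosts and credentials are placeholders.
import time
from ftplib import FTP

def timed_upload(host, user, password, local_path, remote_name):
    start = time.time()
    with FTP(host) as ftp:
        ftp.login(user, password)
        with open(local_path, "rb") as f:
            ftp.storbinary(f"STOR {remote_name}", f)
    return time.time() - start

for label, host in [("hybrid (MAAS/vsftpd)", "hcdm.example.local"),
                    ("public cloud", "ftp.public-cloud.example.com")]:
    elapsed = timed_upload(host, "tester", "secret", "sample_100MB.bin", "sample.bin")
    print(f"{label}: {elapsed:.1f} s")
```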
NASA Astrophysics Data System (ADS)
Wagner-Kaiser, R.; Mackey, Dougal; Sarajedini, Ata; Cohen, Roger E.; Geisler, Doug; Yang, Soung-Chul; Grocholski, Aaron J.; Cummings, Jeffrey D.
2018-03-01
We leverage new high-quality data from Hubble Space Telescope program GO-14164 to explore the variation in horizontal branch morphology among globular clusters in the Large Magellanic Cloud (LMC). Our new observations lead to photometry with a precision commensurate with that available for the Galactic globular cluster population. Our analysis indicates that, once metallicity is accounted for, clusters in the LMC largely share similar horizontal branch morphologies regardless of their location within the system. Furthermore, the LMC clusters possess, on average, slightly redder morphologies than most of the inner halo Galactic population; we find, instead, that their characteristics tend to be more similar to those exhibited by clusters in the outer Galactic halo. Our results are consistent with previous studies, showing a correlation between horizontal branch morphology and age.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pete Beckman and Ian Foster
Chicago Matters: Beyond Burnham (WTTW). Chicago has become a world center of "cloud computing." Argonne experts Pete Beckman and Ian Foster explain what "cloud computing" is and how you probably already use it on a daily basis.
Transitioning ISR architecture into the cloud
NASA Astrophysics Data System (ADS)
Lash, Thomas D.
2012-06-01
Emerging cloud computing platforms offer an ideal opportunity for Intelligence, Surveillance, and Reconnaissance (ISR) intelligence analysis. Cloud computing platforms help overcome challenges and limitations of traditional ISR architectures. Modern ISR architectures can benefit from examining commercial cloud applications, especially as they relate to user experience, usage profiling, and transformational business models. This paper outlines legacy ISR architectures and their limitations, presents an overview of cloud technologies and their applications to the ISR intelligence mission, and presents an idealized ISR architecture implemented with cloud computing.
Hyperfine excitation of C2H in collisions with ortho- and para-H2
NASA Astrophysics Data System (ADS)
Dagdigian, Paul J.
2018-06-01
Accurate estimation of the abundance of the ethynyl (C2H) radical requires accurate radiative and collisional rate coefficients. Hyperfine-resolved rate coefficients for (de-)excitation of C2H in collisions with ortho- and para-H2 are presented in this work. These rate coefficients were computed in time-independent close-coupling quantum scattering calculations that employed a potential energy surface recently computed at the coupled-cluster level of theory that describes the interaction of C2H with H2. Rate coefficients for temperatures from 10 to 300 K were computed for all transitions among the first 40 hyperfine energy levels of C2H in collisions with ortho- and para-H2. These rate coefficients were employed in simple radiative transfer calculations to simulate the excitation of C2H in typical molecular clouds.
Using SIR (Scientific Information Retrieval System) for data management during a field program
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tichler, J.L.
As part of the US Department of Energy's program, PRocessing of Emissions by Clouds and Precipitation (PRECP), a team of scientists from four laboratories conducted a study in north central New York State, to characterize the chemical and physical processes occurring in winter storms. Sampling took place from three aircraft, two instrumented motor homes and a network of 26 surface precipitation sampling sites. Data management personnel were part of the field program, using a portable IBM PC-AT computer to enter information as it became available during the field study. Having the same database software on the field computer and on the cluster of VAX 11/785 computers in use aided database development and the transfer of data between machines. 2 refs., 3 figs., 5 tabs.
Clustering the Orion B giant molecular cloud based on its molecular emission.
Bron, Emeric; Daudon, Chloé; Pety, Jérôme; Levrier, François; Gerin, Maryvonne; Gratier, Pierre; Orkisz, Jan H; Guzman, Viviana; Bardeau, Sébastien; Goicoechea, Javier R; Liszt, Harvey; Öberg, Karin; Peretto, Nicolas; Sievers, Albrecht; Tremblin, Pascal
2018-02-01
Previous attempts at segmenting molecular line maps of molecular clouds have focused on using position-position-velocity data cubes of a single molecular line to separate the spatial components of the cloud. In contrast, wide field spectral imaging over a large spectral bandwidth in the (sub)mm domain now allows one to combine multiple molecular tracers to understand the different physical and chemical phases that constitute giant molecular clouds (GMCs). We aim at using multiple tracers (sensitive to different physical processes and conditions) to segment a molecular cloud into physically/chemically similar regions (rather than spatially connected components), thus disentangling the different physical/chemical phases present in the cloud. We use a machine learning clustering method, namely the Meanshift algorithm, to cluster pixels with similar molecular emission, ignoring spatial information. Clusters are defined around each maximum of the multidimensional Probability Density Function (PDF) of the line integrated intensities. Simple radiative transfer models were used to interpret the astrophysical information uncovered by the clustering analysis. A clustering analysis based only on the J = 1-0 lines of three isotopologues of CO proves sufficient to reveal distinct density/column density regimes (n_H ~ 100 cm^-3, ~500 cm^-3, and >1000 cm^-3), closely related to the usual definitions of diffuse, translucent and high-column-density regions. Adding two UV-sensitive tracers, the J = 1-0 line of HCO+ and the N = 1-0 line of CN, allows us to distinguish two clearly distinct chemical regimes, characteristic of UV-illuminated and UV-shielded gas. The UV-illuminated regime shows overbright HCO+ and CN emission, which we relate to a photochemical enrichment effect. We also find a tail of high CN/HCO+ intensity ratio in UV-illuminated regions. Finer distinctions in density classes (n_H ~ 7 × 10^3 cm^-3 to ~4 × 10^4 cm^-3) for the densest regions are also identified, likely related to the higher critical density of the CN and HCO+ (1-0) lines. These distinctions are only possible because the high-density regions are spatially resolved. Molecules are versatile tracers of GMCs because their line intensities bear the signature of the physics and chemistry at play in the gas. The association of simultaneous multi-line, wide-field mapping and powerful machine learning methods such as the Meanshift clustering algorithm reveals how to decode the complex information available in these molecular tracers.
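A schematic version of this clustering step is easy to sketch with scikit-learn's MeanShift implementation: stack the per-pixel integrated intensities of a few lines into feature vectors and cluster them while ignoring pixel positions. The random intensity cube below is stand-in data, not the actual Orion B maps.

```python
# Cluster pixels in line-intensity space with Meanshift (scikit-learn).
import numpy as np
from sklearn.cluster import MeanShift, estimate_bandwidth

# (ny, nx, n_lines) integrated-intensity maps, e.g. three CO isotopologue (1-0) lines
cube = np.random.lognormal(mean=0.0, sigma=1.0, size=(64, 64, 3))
pixels = cube.reshape(-1, cube.shape[-1])          # drop spatial information

bandwidth = estimate_bandwidth(pixels, quantile=0.1, n_samples=2000)
labels = MeanShift(bandwidth=bandwidth, bin_seeding=True).fit_predict(pixels)

# each cluster gathers pixels with similar multi-line emission,
# i.e. a candidate physical/chemical regime of the cloud
cluster_map = labels.reshape(cube.shape[:2])
print("number of regimes found:", len(np.unique(labels)))
```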
Bigdata Driven Cloud Security: A Survey
NASA Astrophysics Data System (ADS)
Raja, K.; Hanifa, Sabibullah Mohamed
2017-08-01
Cloud Computing (CC) is a fast-growing technology for performing massive-scale and complex computing. It eliminates the need to maintain expensive computing hardware, dedicated space, and software. Recently, massive growth has been observed in the scale of data, or big data, generated through cloud computing. CC consists of a front end, which includes the users' computers and the software required to access the cloud network, and a back end, which consists of the various computers, servers and database systems that create the cloud. The leading/traditional cloud ecosystem delivers its services through SaaS (Software as a Service, where end users utilize outsourced software), PaaS (Platform as a Service, where a platform is provided), IaaS (Infrastructure as a Service, where the physical environment is outsourced) and DaaS (Database as a Service, where data can be housed within a cloud), which together have become a powerful and popular architecture. Many challenges and issues concern security and threats, the most vital barrier for a cloud computing environment. The main barrier to the adoption of CC in health care relates to data security: when placing and transmitting data over public networks, cyber attacks in any form are anticipated in CC. Hence, cloud service users need to understand the risk of data breaches when adopting a service delivery model during deployment. This survey covers CC security issues (including data security in health care) in depth so that researchers can develop robust security application models using Big Data (BD) on CC. BD evaluation is driven by fast-growing cloud-based applications developed using virtualized technologies. In this purview, MapReduce [12] is a good example of big data processing in a cloud environment, and a model for cloud providers.
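For readers unfamiliar with the MapReduce model cited above, a minimal single-process sketch of the map and reduce phases (word counting) is shown below; production deployments distribute these phases across cloud workers, for example under Hadoop.

```python
# Minimal in-process illustration of the MapReduce pattern (word counting).
from collections import defaultdict
from itertools import chain

def map_phase(record):
    # emit (key, value) pairs for one input record
    return [(word.lower(), 1) for word in record.split()]

def reduce_phase(pairs):
    # sum values per key
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

records = ["cloud security survey", "big data on cloud", "cloud data security"]
counts = reduce_phase(chain.from_iterable(map_phase(r) for r in records))
print(counts["cloud"], counts["security"])   # -> 3 2
```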
NASA Astrophysics Data System (ADS)
Kuwata, Keith T.
Ionic clusters are useful as model systems for the study of fundamental processes in solution and in the atmosphere. Their structure and reactivity can be studied in detail using vibrational predissociation spectroscopy, in conjunction with high level ab initio calculations. This thesis presents the applications of infrared spectroscopy and computation to a variety of gas-phase cluster systems. A crucial component of the process of stratospheric ozone depletion is the action of polar stratospheric clouds (PSCs) to convert the reservoir species HCl and chlorine nitrate (ClONO2) to photochemically labile compounds. Quantum chemistry was used to explore one possible mechanism by which this activation is effected: Cl- + ClONO2 → Cl2 + NO3- (1). Correlated ab initio calculations predicted that the direct reaction of chloride ion with ClONO2 is facile, which was confirmed in an experimental kinetics study. In the reaction a weakly bound intermediate Cl2-NO3- is formed, with ~70% of the charge localized on the nitrate moiety. This enables the Cl2-NO3- cluster to be well solvated even in bulk solution, allowing (1) to be facile on PSCs. Quantum chemistry was also applied to the hydration of nitrosonium ion (NO+), an important process in the ionosphere. The calculations, in conjunction with an infrared spectroscopy experiment, revealed the structure of the gas-phase clusters NO+(H2O)n. The large degree of covalent interaction between NO+ and the lone pairs of the H2O ligands is contrasted with the weak electrostatic bonding between iodide ion and H2O. Finally, the competition between ion solvation and solvent self-association is explored for the gas-phase clusters Cl-(H2O)n and Cl-(NH3)n. For the case of water, vibrational predissociation spectroscopy reveals less hydrogen bonding among H2O ligands than predicted by ab initio calculations. Nevertheless, for n ≥ 5, cluster structure is dominated by water-water interactions, with Cl- only partially solvated by the water cluster. Preliminary infrared spectra and computations on Cl-(NH3)n indicate that NH3 preferentially binds to Cl- ion instead of forming inter-solvent networks.
Dynamic electronic institutions in agent oriented cloud robotic systems.
Nagrath, Vineet; Morel, Olivier; Malik, Aamir; Saad, Naufal; Meriaudeau, Fabrice
2015-01-01
The dot-com bubble burst in the year 2000, followed by a swift movement towards resource virtualization and the cloud computing business model. Cloud computing emerged not as a new form of computing or network technology but as a remoulding of existing technologies to suit a new business model. Cloud robotics is understood as the adaptation of cloud computing ideas for robotic applications. Current efforts in cloud robotics stress developing robots that utilize the computing and service infrastructure of the cloud, without debating the underlying business model. HTM5 is an OMG MDA-based meta-model for agent-oriented development of cloud robotic systems. The trade-view of HTM5 promotes peer-to-peer trade amongst software agents. HTM5 agents represent various cloud entities and implement their business logic on cloud interactions. Trade in a peer-to-peer cloud robotic system is based on relationships and contracts amongst several agent subsets. Electronic Institutions are associations of heterogeneous intelligent agents which interact with each other following predefined norms. In Dynamic Electronic Institutions, the process of formation, reformation and dissolution of institutions is automated, leading to run-time adaptations in groups of agents. DEIs in agent-oriented cloud robotic ecosystems bring order and group intellect. This article presents DEI implementations through the HTM5 methodology.
Libraries in the Cloud: Making a Case for Google and Amazon
ERIC Educational Resources Information Center
Buck, Stephanie
2009-01-01
As news outlets create headlines such as "A Cloud & A Prayer," "The Cloud Is the Computer," and "Leveraging Clouds to Make You More Efficient," many readers have been left with cloud confusion. Many definitions exist for cloud computing, and a uniform definition is hard to find. In its most basic form, cloud…
ERIC Educational Resources Information Center
Dulaney, Malik H.
2013-01-01
Emerging technologies challenge the management of information technology in organizations. Paradigm changing technologies, such as cloud computing, have the ability to reverse the norms in organizational management, decision making, and information technology governance. This study explores the effects of cloud computing on information technology…
Factors Influencing the Adoption of Cloud Computing by Decision Making Managers
ERIC Educational Resources Information Center
Ross, Virginia Watson
2010-01-01
Cloud computing is a growing field, addressing the market need for access to computing resources to meet organizational computing requirements. The purpose of this research is to evaluate the factors that influence an organization in their decision whether to adopt cloud computing as a part of their strategic information technology planning.…
A General Cross-Layer Cloud Scheduling Framework for Multiple IoT Computer Tasks.
Wu, Guanlin; Bao, Weidong; Zhu, Xiaomin; Zhang, Xiongtao
2018-05-23
The diversity of IoT services and applications brings enormous challenges to improving the performance of multiple computer tasks' scheduling in cross-layer cloud computing systems. Unfortunately, the commonly-employed frameworks fail to adapt to the new patterns on the cross-layer cloud. To solve this issue, we design a new computer task scheduling framework for multiple IoT services in cross-layer cloud computing systems. Specifically, we first analyze the features of the cross-layer cloud and computer tasks. Then, we design the scheduling framework based on the analysis and present detailed models to illustrate the procedures of using the framework. With the proposed framework, the IoT services deployed in cross-layer cloud computing systems can dynamically select suitable algorithms and use resources more effectively to finish computer tasks with different objectives. Finally, the algorithms are given based on the framework, and extensive experiments are also given to validate its effectiveness, as well as its superiority.
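The framework's central idea as described, selecting a scheduling policy per task according to its objective, can be sketched briefly; the policy names and task fields below are illustrative assumptions, not the paper's algorithms.

```python
# Hedged sketch: group IoT tasks by objective and order each group with a
# policy suited to that objective.
def earliest_deadline_first(tasks):
    return sorted(tasks, key=lambda t: t["deadline"])

def min_cost_first(tasks):
    return sorted(tasks, key=lambda t: t["cost"])

def shortest_job_first(tasks):
    return sorted(tasks, key=lambda t: t["runtime"])

POLICIES = {"latency": earliest_deadline_first,
            "budget": min_cost_first,
            "throughput": shortest_job_first}

def schedule(tasks):
    plan = {}
    for objective, policy in POLICIES.items():
        subset = [t for t in tasks if t["objective"] == objective]
        plan[objective] = policy(subset)
    return plan

tasks = [{"id": 1, "objective": "latency", "deadline": 5, "cost": 2, "runtime": 1},
         {"id": 2, "objective": "budget", "deadline": 50, "cost": 1, "runtime": 9},
         {"id": 3, "objective": "latency", "deadline": 2, "cost": 4, "runtime": 3}]
print(schedule(tasks))
```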
Design for Run-Time Monitor on Cloud Computing
NASA Astrophysics Data System (ADS)
Kang, Mikyung; Kang, Dong-In; Yun, Mira; Park, Gyung-Leen; Lee, Junghoon
Cloud computing is a new information technology trend that moves computing and data away from desktops and portable PCs into large data centers. The basic principle of cloud computing is to deliver applications as services over the Internet as well as infrastructure. A cloud is a type of parallel and distributed system consisting of a collection of inter-connected and virtualized computers that are dynamically provisioned and presented as one or more unified computing resources. The large-scale distributed applications on a cloud require adaptive service-based software, which has the capability of monitoring the system status change, analyzing the monitored information, and adapting its service configuration while considering tradeoffs among multiple QoS features simultaneously. In this paper, we design the Run-Time Monitor (RTM), which is system software to monitor application behavior at run-time, analyze the collected information, and optimize resources on cloud computing. RTM monitors application software through library instrumentation as well as the underlying hardware through performance counters, optimizing its computing configuration based on the analyzed data.
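The monitor-analyze-adapt loop described here can be outlined in a few lines; this is not the paper's RTM, just a skeleton that samples host-level counters with the psutil library (assumed installed) and calls a placeholder adaptation hook.

```python
# Skeleton of a run-time monitor loop: sample counters, then adapt when
# thresholds are exceeded. The adapt() hook is a placeholder.
import psutil

def adapt(cpu, mem):
    # placeholder: a real RTM would reconfigure resources / service settings here
    print(f"adapting: cpu={cpu:.0f}% mem={mem:.0f}%")

def run_time_monitor(period_s=5, cpu_limit=85.0, mem_limit=90.0, cycles=12):
    for _ in range(cycles):
        cpu = psutil.cpu_percent(interval=period_s)   # sampled over the period
        mem = psutil.virtual_memory().percent
        if cpu > cpu_limit or mem > mem_limit:
            adapt(cpu, mem)

run_time_monitor()
```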
Cool Star Beginnings: YSOs in the Perseus Molecular Cloud
NASA Astrophysics Data System (ADS)
Young, Kaisa E.; Young, Chadwick H.
2015-01-01
Nearby molecular clouds, where there is considerable evidence of ongoing star formation, provide the best opportunity to observe stars in the earliest stages of their formation. The Perseus molecular cloud contains two young clusters, IC 348 and NGC 1333, and several small dense cores of the type that produce only a few stars. Perseus is often cited as an intermediate case between quiescent low-mass and turbulent high-mass clouds, making it perhaps an ideal environment for studying "typical" low-mass star formation. We present an infrared study of the Perseus molecular cloud with data from the Spitzer Space Telescope as part of the "From Molecular Cores to Planet Forming Disks" (c2d) Legacy project. By comparing Spitzer's near- and mid-infrared maps, we identify and classify the young stellar objects (YSOs) in the cloud using updated extinction-corrected photometry. Virtually all of the YSOs in Perseus are forming in the clusters and other smaller associations at the east and west ends of the cloud, with very little evidence of star formation in the midsection, even in areas of high extinction.
Research on phone contacts online status based on mobile cloud computing
NASA Astrophysics Data System (ADS)
Wang, Wen-jing; Ge, Wei
2013-03-01
Because of the limited storage space and CPU processing power of mobile phones, it is difficult to realize complex applications on them. With the development of cloud computing, however, we can place the computing and storage in the cloud and provide users with rich cloud services; helping users complete various functions through the browser has become the trend for future mobile communication. This article takes the online status of mobile phone contacts as an example to analyze the development and application of mobile cloud computing.
Bootstrapping and Maintaining Trust in the Cloud
2016-12-01
... The proliferation and popularity of infrastructure-as-a-service (IaaS) cloud computing services such as Amazon Web Services and Google Compute Engine means ... IaaS trusted computing system: • Secure Bootstrapping – the system should enable the tenant to securely install an initial root secret into each cloud ... elastically instantiated and terminated. Prior cloud trusted computing solutions address a subset of these features, but none achieve all. Excalibur [31] sup...
NASA Astrophysics Data System (ADS)
Howard, Corey S.; Pudritz, Ralph E.; Harris, William E.
2017-09-01
The process of radiative feedback in giant molecular clouds (GMCs) is an important mechanism for limiting star cluster formation through the heating and ionization of the surrounding gas. We explore the degree to which radiative feedback affects early (≲5 Myr) cluster formation in GMCs having masses that range from 10^4 to 10^6 M⊙ using the FLASH code. The inclusion of radiative feedback lowers the efficiency of cluster formation by 20-50 per cent relative to hydrodynamic simulations. Two models in particular - 5 × 10^4 and 10^5 M⊙ - show the largest suppression of the cluster formation efficiency, corresponding to a factor of ~2. For these clouds only, the internal energy, a measure of the energy injected by radiative feedback, exceeds the gravitational potential for a significant amount of time. We find a clear relation between the maximum cluster mass, M_c,max, formed in a GMC and the mass of the GMC itself, M_GMC: M_c,max ∝ M_GMC^0.81. This scaling result suggests that young globular clusters at the necessary scale of 10^6 M⊙ form within host GMCs of masses near ~5 × 10^7 M⊙. We compare simulated cluster mass distributions to the observed embedded cluster mass function [d log(N)/d log(M) ∝ M^β where β = -1] and find good agreement (β = -0.99 ± 0.14) only for simulations including radiative feedback, indicating this process is important in controlling the growth of young clusters. However, the high star formation efficiencies, which range from 16 to 21 per cent, and high star formation rates compared to locally observed regions suggest other feedback mechanisms are also important during the formation and growth of stellar clusters.
NASA Astrophysics Data System (ADS)
Qian, Ling; Luo, Zhiguo; Du, Yujian; Guo, Leitao
In order to support the maximum number of users and elastic services with minimum resources, Internet service providers invented cloud computing. Within a few years, the emerging cloud computing became the hottest technology. From the publication of core papers by Google since 2003, to the commercialization of Amazon EC2 in 2006, and to the service offering of AT&T Synaptic Hosting, cloud computing has evolved from internal IT systems to a public service, from a cost-saving tool to a revenue generator, and from ISPs to telecoms. This paper introduces the concept, history, pros and cons of cloud computing as well as the value chain and standardization efforts.
Evaluating open-source cloud computing solutions for geosciences
NASA Astrophysics Data System (ADS)
Huang, Qunying; Yang, Chaowei; Liu, Kai; Xia, Jizhe; Xu, Chen; Li, Jing; Gui, Zhipeng; Sun, Min; Li, Zhenglong
2013-09-01
Many organizations start to adopt cloud computing for better utilizing computing resources by taking advantage of its scalability, cost reduction, and easy to access characteristics. Many private or community cloud computing platforms are being built using open-source cloud solutions. However, little has been done to systematically compare and evaluate the features and performance of open-source solutions in supporting Geosciences. This paper provides a comprehensive study of three open-source cloud solutions, including OpenNebula, Eucalyptus, and CloudStack. We compared a variety of features, capabilities, technologies and performances including: (1) general features and supported services for cloud resource creation and management, (2) advanced capabilities for networking and security, and (3) the performance of the cloud solutions in provisioning and operating the cloud resources as well as the performance of virtual machines initiated and managed by the cloud solutions in supporting selected geoscience applications. Our study found that: (1) no significant performance differences in central processing unit (CPU), memory and I/O of virtual machines created and managed by different solutions, (2) OpenNebula has the fastest internal network while both Eucalyptus and CloudStack have better virtual machine isolation and security strategies, (3) Cloudstack has the fastest operations in handling virtual machines, images, snapshots, volumes and networking, followed by OpenNebula, and (4) the selected cloud computing solutions are capable for supporting concurrent intensive web applications, computing intensive applications, and small-scale model simulations without intensive data communication.
Cloud Collaboration: Cloud-Based Instruction for Business Writing Class
ERIC Educational Resources Information Center
Lin, Charlie; Yu, Wei-Chieh Wayne; Wang, Jenny
2014-01-01
Cloud computing technologies, such as Google Docs, Adobe Creative Cloud, Dropbox, and Microsoft Windows Live, have become increasingly appreciated as next-generation digital learning tools. Cloud computing technologies encourage students' active engagement, collaboration, and participation in their learning, facilitate group work, and support…
RAPPORT: running scientific high-performance computing applications on the cloud.
Cohen, Jeremy; Filippis, Ioannis; Woodbridge, Mark; Bauer, Daniela; Hong, Neil Chue; Jackson, Mike; Butcher, Sarah; Colling, David; Darlington, John; Fuchs, Brian; Harvey, Matt
2013-01-28
Cloud computing infrastructure is now widely used in many domains, but one area where there has been more limited adoption is research computing, in particular for running scientific high-performance computing (HPC) software. The Robust Application Porting for HPC in the Cloud (RAPPORT) project took advantage of existing links between computing researchers and application scientists in the fields of bioinformatics, high-energy physics (HEP) and digital humanities, to investigate running a set of scientific HPC applications from these domains on cloud infrastructure. In this paper, we focus on the bioinformatics and HEP domains, describing the applications and target cloud platforms. We conclude that, while there are many factors that need consideration, there is no fundamental impediment to the use of cloud infrastructure for running many types of HPC applications and, in some cases, there is potential for researchers to benefit significantly from the flexibility offered by cloud platforms.
Security model for VM in cloud
NASA Astrophysics Data System (ADS)
Kanaparti, Venkataramana; Naveen K., R.; Rajani, S.; Padmvathamma, M.; Anitha, C.
2013-03-01
Cloud computing is a new approach that has emerged to meet the ever-increasing demand for computing resources and to reduce operational costs and capital expenditure for IT services. As this new way of computation allows data and applications to be stored away from the corporate server, it brings more security issues, such as virtualization security, distributed computing, application security, identity management, access control and authentication. Even though virtualization forms the basis for cloud computing, it poses many threats in securing the cloud. As most security threats lie at the virtualization layer in the cloud, we propose a new Security Model for Virtual Machines in Cloud (SMVC), in which every process is authenticated by a Trusted Agent (TA) in the hypervisor as well as in the VM. Our proposed model is designed to withstand attacks by unauthorized processes that pose a threat to applications related to data mining, OLAP systems and image processing, which require huge resources in the cloud deployed on one or more VMs.
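The trusted-agent idea can be illustrated with a minimal sketch (not the paper's protocol): the agent issues an HMAC token per process and verifies it before the process may touch a protected resource.

```python
# Illustrative per-process authentication by a trusted agent using HMAC tokens.
import hmac, hashlib, os

class TrustedAgent:
    def __init__(self):
        self._key = os.urandom(32)          # secret held by the hypervisor/VM agent

    def issue_token(self, process_id: str) -> str:
        return hmac.new(self._key, process_id.encode(), hashlib.sha256).hexdigest()

    def authenticate(self, process_id: str, token: str) -> bool:
        expected = self.issue_token(process_id)
        return hmac.compare_digest(expected, token)

agent = TrustedAgent()
token = agent.issue_token("olap-worker-42")
assert agent.authenticate("olap-worker-42", token)
assert not agent.authenticate("rogue-process", token)   # unauthorized process rejected
```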
Angiuoli, Samuel V; White, James R; Matalka, Malcolm; White, Owen; Fricke, W Florian
2011-01-01
The widespread popularity of genomic applications is threatened by the "bioinformatics bottleneck" resulting from uncertainty about the cost and infrastructure needed to meet increasing demands for next-generation sequence analysis. Cloud computing services have been discussed as potential new bioinformatics support systems but have not been evaluated thoroughly. We present benchmark costs and runtimes for common microbial genomics applications, including 16S rRNA analysis, microbial whole-genome shotgun (WGS) sequence assembly and annotation, WGS metagenomics and large-scale BLAST. Sequence dataset types and sizes were selected to correspond to outputs typically generated by small- to midsize facilities equipped with 454 and Illumina platforms, except for WGS metagenomics where sampling of Illumina data was used. Automated analysis pipelines, as implemented in the CloVR virtual machine, were used in order to guarantee transparency, reproducibility and portability across different operating systems, including the commercial Amazon Elastic Compute Cloud (EC2), which was used to attach real dollar costs to each analysis type. We found considerable differences in computational requirements, runtimes and costs associated with different microbial genomics applications. While all 16S analyses completed on a single-CPU desktop in under three hours, microbial genome and metagenome analyses utilized multi-CPU support of up to 120 CPUs on Amazon EC2, where each analysis completed in under 24 hours for less than $60. Representative datasets were used to estimate maximum data throughput on different cluster sizes and to compare costs between EC2 and comparable local grid servers. Although bioinformatics requirements for microbial genomics depend on dataset characteristics and the analysis protocols applied, our results suggest that smaller sequencing facilities (up to three Roche/454 or one Illumina GAIIx sequencer) invested in 16S rRNA amplicon sequencing, microbial single-genome and metagenomics WGS projects can achieve cost-efficient bioinformatics support using CloVR in combination with Amazon EC2 as an alternative to local computing centers.
Angiuoli, Samuel V.; White, James R.; Matalka, Malcolm; White, Owen; Fricke, W. Florian
2011-01-01
Background The widespread popularity of genomic applications is threatened by the “bioinformatics bottleneck” resulting from uncertainty about the cost and infrastructure needed to meet increasing demands for next-generation sequence analysis. Cloud computing services have been discussed as potential new bioinformatics support systems but have not been evaluated thoroughly. Results We present benchmark costs and runtimes for common microbial genomics applications, including 16S rRNA analysis, microbial whole-genome shotgun (WGS) sequence assembly and annotation, WGS metagenomics and large-scale BLAST. Sequence dataset types and sizes were selected to correspond to outputs typically generated by small- to midsize facilities equipped with 454 and Illumina platforms, except for WGS metagenomics where sampling of Illumina data was used. Automated analysis pipelines, as implemented in the CloVR virtual machine, were used in order to guarantee transparency, reproducibility and portability across different operating systems, including the commercial Amazon Elastic Compute Cloud (EC2), which was used to attach real dollar costs to each analysis type. We found considerable differences in computational requirements, runtimes and costs associated with different microbial genomics applications. While all 16S analyses completed on a single-CPU desktop in under three hours, microbial genome and metagenome analyses utilized multi-CPU support of up to 120 CPUs on Amazon EC2, where each analysis completed in under 24 hours for less than $60. Representative datasets were used to estimate maximum data throughput on different cluster sizes and to compare costs between EC2 and comparable local grid servers. Conclusions Although bioinformatics requirements for microbial genomics depend on dataset characteristics and the analysis protocols applied, our results suggest that smaller sequencing facilities (up to three Roche/454 or one Illumina GAIIx sequencer) invested in 16S rRNA amplicon sequencing, microbial single-genome and metagenomics WGS projects can achieve cost-efficient bioinformatics support using CloVR in combination with Amazon EC2 as an alternative to local computing centers. PMID:22028928
ULTRAVIOLET ESCAPE FRACTIONS FROM GIANT MOLECULAR CLOUDS DURING EARLY CLUSTER FORMATION
DOE Office of Scientific and Technical Information (OSTI.GOV)
Howard, Corey; Pudritz, Ralph; Klessen, Ralf
2017-01-01
The UV photon escape fraction from molecular clouds is a key parameter for understanding the ionization of the interstellar medium and extragalactic processes such as cosmic reionization. We present the ionizing photon flux and the corresponding photon escape fraction (f_esc) arising as a consequence of star cluster formation in a turbulent, 10^6 M⊙ giant molecular cloud, simulated using the code FLASH. We make use of sink particles to represent young, star-forming clusters coupled with a radiative transfer scheme to calculate the emergent UV flux. We find that the ionizing photon flux across the cloud boundary is highly variable in time and space due to the turbulent nature of the intervening gas. The escaping photon fraction remains at ~5% for the first 2.5 Myr, followed by two pronounced peaks at 3.25 and 3.8 Myr with a maximum f_esc of 30% and 37%, respectively. These peaks are due to the formation of large H II regions that expand into regions of lower density, some of which reach the cloud surface. However, these phases are short-lived, and f_esc drops sharply as the H II regions are quenched by the central cluster passing through high-density material due to the turbulent nature of the cloud. We find an average f_esc of 15% with factor-of-two variations over 1 Myr timescales. Our results suggest that assuming a single value for f_esc from a molecular cloud is in general a poor approximation, and that the dynamical evolution of the system leads to large temporal variation.
Cloud flexibility using DIRAC interware
NASA Astrophysics Data System (ADS)
Fernandez Albor, Víctor; Seco Miguelez, Marcos; Fernandez Pena, Tomas; Mendez Muñoz, Victor; Saborido Silva, Juan Jose; Graciani Diaz, Ricardo
2014-06-01
Communities at different locations run their computing jobs on dedicated infrastructures without the need to worry about software, hardware or even the site where their programs are going to be executed. Nevertheless, this usually implies that they are restricted to certain types or versions of an operating system, because either their software needs a specific version of a system library or a specific platform is required by the collaboration to which they belong. In this scenario, if a data center wants to serve software to incompatible communities, it has to split its physical resources among those communities. This splitting inevitably leads to an underuse of resources because the data centers are bound to have periods when one or more of their subclusters are idle. It is in this situation that cloud computing provides the flexibility and reduction in computational cost that data centers are searching for. This paper describes a set of realistic tests that we ran on one such implementation. The tests comprise software from three different HEP communities (Auger, LHCb and QCD phenomenologists) and the Parsec Benchmark Suite, running on one or more of three Linux flavors (SL5, Ubuntu 10.04 and Fedora 13). The implemented infrastructure has, at the cloud level, CloudStack, which manages the virtual machines (VMs) and the hosts on which they run, and, at the user level, the DIRAC framework along with a VM extension that submits, monitors and keeps track of the user jobs and also requests CloudStack to start or stop the necessary VMs. In this infrastructure, the community software is distributed via CernVM-FS, which has proven to be a reliable and scalable software distribution system. With the resulting infrastructure, users can send their jobs transparently to the data center. The main purpose of this system is the creation of a flexible, multiplatform cluster with a scalable method of software distribution for several VOs. Users from different communities do not need to care about the installation of the standard software that is available at the nodes, nor about the operating system of the host machine, which is transparent to the user.
Parallel Processing of Big Point Clouds Using Z-Order Partitioning
NASA Astrophysics Data System (ADS)
Alis, C.; Boehm, J.; Liu, K.
2016-06-01
As laser scanning technology improves and costs are coming down, the amount of point cloud data being generated can be prohibitively difficult and expensive to process on a single machine. This data explosion is not only limited to point cloud data. Voluminous amounts of high-dimensionality and quickly accumulating data, collectively known as Big Data, such as those generated by social media, Internet of Things devices and commercial transactions, are becoming more prevalent as well. New computing paradigms and frameworks are being developed to efficiently handle the processing of Big Data, many of which utilize a compute cluster composed of several commodity grade machines to process chunks of data in parallel. A central concept in many of these frameworks is data locality. By its nature, Big Data is large enough that the entire dataset would not fit on the memory and hard drives of a single node hence replicating the entire dataset to each worker node is impractical. The data must then be partitioned across worker nodes in a manner that minimises data transfer across the network. This is a challenge for point cloud data because there exist different ways to partition data and they may require data transfer. We propose a partitioning based on Z-order which is a form of locality-sensitive hashing. The Z-order or Morton code is computed by dividing each dimension to form a grid then interleaving the binary representation of each dimension. For example, the Z-order code for the grid square with coordinates (x = 1 = 01_2, y = 3 = 11_2) is 1011_2 = 11. The number of points in each partition is controlled by the number of bits per dimension: the more bits, the fewer the points. The number of bits per dimension also controls the level of detail with more bits yielding finer partitioning. We present this partitioning method by implementing it on Apache Spark and investigating how different parameters affect the accuracy and running time of the k nearest neighbour algorithm for a hemispherical and a triangular wave point cloud.
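The Z-order rule described above is simple to transcribe directly: grid each dimension, then interleave the bits, with the y bit ahead of the x bit at each level so that (x = 1, y = 3) maps to 1011_2 = 11 as in the example. The small sketch below is an illustration of the encoding, not the paper's Spark implementation.

```python
# Compute a 2-D Morton (Z-order) code by interleaving the bits of y and x.
def morton_code(x: int, y: int, bits_per_dim: int) -> int:
    code = 0
    for i in reversed(range(bits_per_dim)):
        code = (code << 1) | ((y >> i) & 1)   # y bit first (more significant)
        code = (code << 1) | ((x >> i) & 1)
    return code

assert morton_code(1, 3, bits_per_dim=2) == 0b1011 == 11   # matches the example in the text

def partition_key(px, py, cell_size, bits_per_dim):
    """Map a 2-D point to its Z-order partition; fewer bits -> coarser partitions."""
    return morton_code(int(px / cell_size), int(py / cell_size), bits_per_dim)
```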
NASA Astrophysics Data System (ADS)
Phillips, D. A.; Herring, T.; Melbourne, T. I.; Murray, M. H.; Szeliga, W. M.; Floyd, M.; Puskas, C. M.; King, R. W.; Boler, F. M.; Meertens, C. M.; Mattioli, G. S.
2017-12-01
The Geodesy Advancing Geosciences and EarthScope (GAGE) Facility, operated by UNAVCO, provides a diverse suite of geodetic data, derived products and cyberinfrastructure services to support community Earth science research and education. GPS data and products including decadal station position time series and velocities are provided for 2000+ continuous GPS stations from the Plate Boundary Observatory (PBO) and other networks distributed throughout the high Arctic, North America, and Caribbean regions. The position time series contain a multitude of signals in addition to the secular motions, including coseismic and postseismic displacements, interseismic strain accumulation, and transient signals associated with hydrologic and other processes. We present our latest velocity field solutions, new time series offset estimate products, and new time series examples associated with various phenomena. Position time series, and the signals they contain, are inherently dependent upon analysis parameters such as network scaling and reference frame realization. The estimation of scale changes for example, a common practice, has large impacts on vertical motion estimates. GAGE/PBO velocities and time series are currently provided in IGS (IGb08) and North America (NAM08, IGb08 rotated to a fixed North America Plate) reference frames. We are reprocessing all data (1996 to present) as part of the transition from IGb08 to IGS14 that began in 2017. New NAM14 and IGS14 data products are discussed. GAGE/PBO GPS data products are currently generated using onsite computing clusters. As part of an NSF funded EarthCube Building Blocks project called "Deploying MultiFacility Cyberinfrastructure in Commercial and Private Cloud-based Systems (GeoSciCloud)", we are investigating performance, cost, and efficiency differences between local computing resources and cloud based resources. Test environments include a commercial cloud provider (Amazon/AWS), NSF cloud-like infrastructures within XSEDE (TACC, the Texas Advanced Computing Center), and in-house cyberinfrastructures. Preliminary findings from this effort are presented. Web services developed by UNAVCO to facilitate the discovery, customization and dissemination of GPS data and products are also presented.
The pointing errors of geosynchronous satellites
NASA Technical Reports Server (NTRS)
Sikdar, D. N.; Das, A.
1971-01-01
A study of the correlation between cloud motion and wind field was initiated. Cloud heights and displacements were being obtained from a ceilometer and movie pictures, while winds were measured from pilot balloon observations on a near-simultaneous basis. Cloud motion vectors were obtained from time-lapse cloud pictures, using the WINDCO program, for 27, 28 July, 1969, in the Atlantic. The relationship between observed features of cloud clusters and the ambient wind field derived from cloud trajectories on a wide range of space and time scales is discussed.
Earth Science Data Fusion with Event Building Approach
NASA Technical Reports Server (NTRS)
Lukashin, C.; Bartle, Ar.; Callaway, E.; Gyurjyan, V.; Mancilla, S.; Oyarzun, R.; Vakhnin, A.
2015-01-01
Objectives of the NASA Information And Data System (NAIADS) project are to develop a prototype of a conceptually new middleware framework to modernize and significantly improve the efficiency of Earth Science data fusion, big data processing and analytics. The key components of NAIADS include: a Service Oriented Architecture (SOA) multi-lingual framework, a multi-sensor coincident data Predictor, fast in-memory data Staging, a multi-sensor data-Event Builder, complete data-Event streaming (a workflow with minimized I/O), and on-line data processing control and analytics services. The NAIADS project leverages the CLARA framework, developed at Jefferson Lab, integrated with the ZeroMQ messaging library. The science services are prototyped and incorporated into the system. Merging of SCIAMACHY Level-1 observations, MODIS/Terra Level-2 (Clouds and Aerosols) data products, and ECMWF re-analysis will be used for the NAIADS demonstration and performance tests in compute Cloud and Cluster environments.
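The event-building idea, grouping records from different sensors that coincide in time, can be illustrated with a small sketch; the record fields and the 300 s window below are assumptions for the example, not the NAIADS data model.

```python
from bisect import bisect_left, bisect_right

def build_events(records_a, records_b, window_s=300.0):
    """Pair records from two sensors whose timestamps agree within a window."""
    sorted_b = sorted(records_b, key=lambda rec: rec["time"])
    times_b = [rec["time"] for rec in sorted_b]
    events = []
    for rec in records_a:
        # Find all sensor-B records whose timestamps fall inside the window.
        lo = bisect_left(times_b, rec["time"] - window_s)
        hi = bisect_right(times_b, rec["time"] + window_s)
        for match in sorted_b[lo:hi]:
            events.append({"sensor_a": rec, "sensor_b": match})
    return events

# Example with invented records:
# events = build_events([{"time": 100.0, "obs": 1.2}], [{"time": 250.0, "obs": 3.4}])
```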
NASA Astrophysics Data System (ADS)
Berzano, D.; Blomer, J.; Buncic, P.; Charalampidis, I.; Ganis, G.; Meusel, R.
2015-12-01
During the last years, several Grid computing centres chose virtualization as a better way to manage diverse use cases with self-consistent environments on the same bare infrastructure. The maturity of control interfaces (such as OpenNebula and OpenStack) opened the possibility to easily change the amount of resources assigned to each use case by simply turning virtual machines on and off. Some of those private clouds use, in production, copies of the Virtual Analysis Facility, a fully virtualized and self-contained batch analysis cluster capable of expanding and shrinking automatically upon need: however, resource starvation occurs frequently, as expansion has to compete with other virtual machines running long-living batch jobs. Such batch nodes cannot relinquish their resources in a timely fashion: the more jobs they run, the longer it takes to drain and shut them off, and making one-job virtual machines introduces a non-negligible virtualization overhead. By improving several components of the Virtual Analysis Facility we have realized an experimental “Docked” Analysis Facility for ALICE, which leverages containers instead of virtual machines to provide performance and security isolation. We will present the techniques we have used to address practical problems, such as software provisioning through CVMFS, as well as our considerations on the maturity of containers for High Performance Computing. As the abstraction layer is thinner, our Docked Analysis Facilities may feature more fine-grained sizing, down to single-job node containers: we will show how this approach positively impacts automatic cluster resizing by deploying lightweight pilot containers.
SCEAPI: A unified Restful Web API for High-Performance Computing
NASA Astrophysics Data System (ADS)
Rongqiang, Cao; Haili, Xiao; Shasha, Lu; Yining, Zhao; Xiaoning, Wang; Xuebin, Chi
2017-10-01
The development of scientific computing is increasingly moving to collaborative web and mobile applications. All of these applications need high-quality programming interfaces for accessing heterogeneous computing resources consisting of clusters, grid computing or cloud computing. In this paper, we introduce our high-performance computing environment, which integrates computing resources from 16 HPC centers across China. We then present a bundle of web services called SCEAPI and describe how it can be used to access HPC resources over the HTTP or HTTPS protocols. We discuss SCEAPI from several aspects including architecture, implementation and security, and address specific challenges in designing compatible interfaces and protecting sensitive data. We describe the functions of SCEAPI, including authentication, file transfer, and job management for creating, submitting and monitoring jobs, and show how SCEAPI can be used in a straightforward way. Finally, we discuss how to quickly exploit more HPC resources for the ATLAS experiment by implementing a custom ARC compute element based on SCEAPI, and our work shows that SCEAPI is an easy-to-use and effective solution for extending opportunistic HPC resources.
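To illustrate how a RESTful HPC API of this kind is typically consumed, here is a hedged client sketch; the base URL, paths, and JSON fields are invented for the example and are not the documented SCEAPI interface.

```python
import requests

BASE_URL = "https://sceapi.example.org/api"  # hypothetical endpoint, not the real SCEAPI URL

def submit_job(token, job_spec):
    """Submit a job description over HTTPS; paths and payload fields are assumptions."""
    resp = requests.post(
        f"{BASE_URL}/jobs",
        json=job_spec,
        headers={"Authorization": f"Bearer {token}"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["jobId"]

def job_status(token, job_id):
    """Poll the status of a previously submitted job."""
    resp = requests.get(
        f"{BASE_URL}/jobs/{job_id}",
        headers={"Authorization": f"Bearer {token}"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["status"]
```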
ERIC Educational Resources Information Center
Islam, Muhammad Faysal
2013-01-01
Cloud computing offers the advantage of on-demand, reliable and cost efficient computing solutions without the capital investment and management resources to build and maintain in-house data centers and network infrastructures. Scalability of cloud solutions enable consumers to upgrade or downsize their services as needed. In a cloud environment,…
Intracluster age gradients in numerous young stellar clusters
NASA Astrophysics Data System (ADS)
Getman, K. V.; Feigelson, E. D.; Kuhn, M. A.; Bate, M. R.; Broos, P. S.; Garmire, G. P.
2018-05-01
The pace and pattern of star formation leading to rich young stellar clusters is quite uncertain. In this context, we analyse the spatial distribution of ages within 19 young (median t ≲ 3 Myr on the Siess et al. time-scale), morphologically simple, isolated, and relatively rich stellar clusters. Our analysis is based on young stellar object (YSO) samples from the Massive Young Star-Forming Complex Study in Infrared and X-ray and Star Formation in Nearby Clouds surveys, and a new estimator of pre-main sequence (PMS) stellar ages, AgeJX, derived from X-ray and near-infrared photometric data. Median cluster ages are computed within four annular subregions of the clusters. We confirm and extend the earlier result of Getman et al. (2014): 80 per cent of the clusters show age trends where stars in cluster cores are younger than in outer regions. Our cluster stacking analyses establish the existence of an age gradient to high statistical significance in several ways. Time-scales vary with the choice of PMS evolutionary model; the inferred median age gradient across the studied clusters ranges from 0.75 to 1.5 Myr pc⁻¹. The empirical finding reported in the present study, late or continuing formation of stars in the cores of star clusters with older stars dispersed in the outer regions, is strongly supported by other observational studies and by astrophysical models such as the global hierarchical collapse model of Vázquez-Semadeni et al.
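A minimal sketch of the median-age-per-annulus computation described above; the argument names and annulus boundaries are assumptions for the example, not the paper's adopted values.

```python
import numpy as np

def median_age_by_annulus(radius_pc, age_myr, edges_pc=(0.0, 0.5, 1.0, 1.5, 2.0)):
    """Median stellar age within annular subregions around a cluster centre.

    radius_pc: projected distance of each YSO from the cluster centre (pc).
    age_myr:   AgeJX-style age estimate for each YSO (Myr).
    edges_pc:  annulus boundaries; the four bins here are illustrative.
    """
    radius_pc = np.asarray(radius_pc)
    age_myr = np.asarray(age_myr)
    medians = []
    for r_in, r_out in zip(edges_pc[:-1], edges_pc[1:]):
        in_annulus = (radius_pc >= r_in) & (radius_pc < r_out)
        medians.append(np.median(age_myr[in_annulus]) if in_annulus.any() else np.nan)
    return medians
```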
Secure data sharing in public cloud
NASA Astrophysics Data System (ADS)
Venkataramana, Kanaparti; Naveen Kumar, R.; Tatekalva, Sandhya; Padmavathamma, M.
2012-04-01
Secure multi-party computation (SMC) protocols have been proposed for entities (organizations or individuals) that do not fully trust each other but need to share sensitive information. Many types of entities need to collect, analyze, and disseminate data rapidly and accurately without exposing sensitive information to unauthorized or untrusted parties. Solutions based on secure multi-party computation guarantee privacy and correctness, but at an extra communication and computation cost that is often too high to be practical. This high overhead motivates us to extend SMC to the cloud environment, which provides large computation and communication capacity and allows SMC to be used between multiple clouds (private, public, or hybrid). A cloud may encompass many high-capacity servers that act as hosts participating in the computation (IaaS and PaaS) of the final result, controlled by a Cloud Trusted Authority (CTA) for secret sharing within the cloud. Communication between two clouds is controlled by a High Level Trusted Authority (HLTA), one of the hosts in a cloud that provides MgaaS (Management as a Service). Because of the high security risk in clouds, the HLTA generates and distributes public and private keys using the Carmichael-R-Prime-RSA algorithm for the exchange of private data in SMC between itself and the clouds. Within a cloud, the CTA creates a group key for secure communication between hosts, based on keys sent by the HLTA, for the exchange of intermediate values and shares used to compute the final result. Since this scheme is extended to clouds (due to their high availability and scalability, which increase computation power), it becomes practical to implement SMC for privacy-preserving data mining at low cost for clients.
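The Carmichael-R-Prime-RSA key scheme itself is not reproduced here. As a generic stand-in that illustrates the secure multi-party computation idea, the sketch below uses plain additive secret sharing over a prime modulus, letting hosts compute a sum of private inputs without any single host seeing them; the modulus and values are illustrative.

```python
import secrets

PRIME = 2**61 - 1  # modulus for additive shares (illustrative choice)

def share(value, n_hosts):
    """Split a secret into n additive shares modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_hosts - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares

def reconstruct(shares):
    return sum(shares) % PRIME

# Each party shares its input; every host adds the shares it holds, and only
# the sum of all inputs is revealed when the per-host partial sums are combined.
inputs = [42, 17, 99]
shared = [share(v, n_hosts=3) for v in inputs]
partial_sums = [sum(col) % PRIME for col in zip(*shared)]
assert reconstruct(partial_sums) == sum(inputs) % PRIME
```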
Applications integration in a hybrid cloud computing environment: modelling and platform
NASA Astrophysics Data System (ADS)
Li, Qing; Wang, Ze-yuan; Li, Wei-hua; Li, Jun; Wang, Cheng; Du, Rui-yang
2013-08-01
With the development of application service providers and cloud computing, more and more small- and medium-sized business enterprises use software services and even infrastructure services provided by professional information service companies to replace all or part of their information systems (ISs). These information service companies provide applications, such as data storage, computing processes, document sharing and even management information system services, as public resources to support the business process management of their customers. However, no cloud computing service vendor can satisfy the full functional IS requirements of an enterprise. As a result, enterprises often have to simultaneously use systems distributed in different clouds along with their intra-enterprise ISs. Thus, this article presents a framework to integrate applications deployed in public clouds and intra-enterprise ISs. A run-time platform is developed, and a cross-computing-environment process modelling technique is also developed to improve the feasibility of ISs under hybrid cloud computing environments.
NASA Technical Reports Server (NTRS)
Maluf, David A.; Shetye, Sandeep D.; Chilukuri, Sri; Sturken, Ian
2012-01-01
Cloud computing can reduce cost significantly because businesses can share computing resources. In recent years Small and Medium Businesses (SMB) have used the Cloud effectively for cost saving and for sharing IT expenses. With the success of SMBs, many perceive that larger enterprises ought to move into the Cloud environment as well. Government agencies' stove-piped environments are being considered as candidates for potential use of the Cloud, either as an enterprise entity or as pockets of small communities. Cloud Computing is the delivery of computing as a service rather than as a product, whereby shared resources, software, and information are provided to computers and other devices as a utility over a network. Underneath the offered services there exists a modern infrastructure, the cost of which is often spread across its services or its investors. As NASA is considered an enterprise-class organization, like other enterprises a shift has been occurring in perceiving its IT services as candidates for Cloud services. This paper discusses market trends in cloud computing from an enterprise angle and then addresses the topic of Cloud Computing for NASA in two possible forms. First, in the form of a public Cloud to support it as an enterprise, as well as to share it with the commercial sector and the public at large. Second, as a private Cloud wherein the infrastructure is operated solely for NASA, whether managed internally or by a third party, and hosted internally or externally. The paper addresses the strengths and weaknesses of both the public and private Cloud paradigms, in both internally and externally operated settings. The content of the paper is from a NASA perspective but is applicable to any large enterprise with thousands of employees and contractors.
Cotes-Ruiz, Iván Tomás; Prado, Rocío P.; García-Galán, Sebastián; Muñoz-Expósito, José Enrique; Ruiz-Reyes, Nicolás
2017-01-01
Nowadays, the growing computational capabilities of Cloud systems rely on the reduction of the consumed power of their data centers to make them sustainable and economically profitable. The efficient management of computing resources is at the heart of any energy-aware data center, and of special relevance is the adaptation of its performance to workload. Intensive computing applications in diverse areas of science generate complex workloads called workflows, whose successful management in terms of energy saving is still at its beginning. WorkflowSim is currently one of the most advanced simulators for research on workflow processing, offering advanced features such as task clustering and failure policies. In this work, an expected power-aware extension of WorkflowSim is presented. This new tool integrates a power model based on a computing-plus-communication design to allow the optimization of new management strategies for energy saving considering computing, reconfiguration and network costs as well as quality of service, and it incorporates the preeminent strategy for on-host energy saving: Dynamic Voltage Frequency Scaling (DVFS). The simulator is designed to be consistent in different real scenarios and to include a wide repertory of DVFS governors. Results showing the validity of the simulator in terms of resource utilization, frequency and voltage scaling, power, energy and time saving are presented. Also, results achieved by the intra-host DVFS strategy with different governors are compared to those of the data center using a recent and successful DVFS-based inter-host scheduling strategy as a mechanism overlapped with the DVFS intra-host technique. PMID:28085932
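As a hedged illustration of the kind of relation a DVFS-aware power model builds on, the sketch below uses the textbook dynamic-power formula P = C·V²·f; the capacitance and operating points are placeholders, not WorkflowSim's calibrated parameters.

```python
def dynamic_power_watts(capacitance_f, voltage_v, frequency_hz):
    """Classic CMOS dynamic power: P = C * V^2 * f."""
    return capacitance_f * voltage_v**2 * frequency_hz

def task_energy_joules(cycles, capacitance_f, voltage_v, frequency_hz):
    """Energy for a task of a given cycle count at one DVFS operating point."""
    runtime_s = cycles / frequency_hz
    return dynamic_power_watts(capacitance_f, voltage_v, frequency_hz) * runtime_s

# Lowering frequency (and the voltage it permits) trades runtime for energy:
hi_point = task_energy_joules(2e12, 1e-9, 1.2, 2.4e9)  # placeholder operating point
lo_point = task_energy_joules(2e12, 1e-9, 0.9, 1.2e9)  # scaled-down point
assert lo_point < hi_point
```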
Securing the Data Storage and Processing in Cloud Computing Environment
ERIC Educational Resources Information Center
Owens, Rodney
2013-01-01
Organizations increasingly utilize cloud computing architectures to reduce costs and energy consumption both in the data warehouse and on mobile devices by better utilizing the computing resources available. However, the security and privacy issues with publicly available cloud computing infrastructures have not been studied to a sufficient depth…
A Comprehensive Toolset for General-Purpose Private Computing and Outsourcing
2016-12-08
...project and scientific advances made towards each of the research thrusts throughout the project duration. Project Objectives: Cloud computing enables... One of the possibilities that the cloud enables is computation outsourcing, where the client can utilize any necessary computing resources for its computational task... Security considerations, however, stand in the way of harnessing the benefits of cloud computing to the fullest extent and prevent clients from...
Chemical evolution of the Magellanic Clouds
NASA Astrophysics Data System (ADS)
Barbuy, B.; de Freitas Pacheco, J. A.; Idiart, T.
We have obtained integrated spectra for 14 clusters in the Magellanic Clouds, on which the spectral indices Hβ, Mg2, Fe5270, Fe5335 were measured. Selecting indices whose behaviour depends essentially on age and metallicity (Hβ and
Security Risks of Cloud Computing and Its Emergence as 5th Utility Service
NASA Astrophysics Data System (ADS)
Ahmad, Mushtaq
Cloud Computing is being projected by the major cloud service provider IT companies, such as IBM, Google, Yahoo, Amazon and others, as a fifth utility in which clients have access to processing for applications and software projects that require very high processing speed for compute-intensive problems and huge data capacity for scientific and engineering research, as well as for e-business and data content network applications. These services for different types of clients are provided under DASM (Direct Access Service Management), based on virtualization of hardware and software and very high bandwidth Internet (Web 2.0) communication. The paper reviews these developments in Cloud Computing and the hardware/software configuration of the cloud paradigm. The paper also examines the vital aspects of security risks projected by IT industry experts and cloud clients, and highlights the cloud providers' response to cloud security risks.
Angiuoli, Samuel V; Matalka, Malcolm; Gussman, Aaron; Galens, Kevin; Vangala, Mahesh; Riley, David R; Arze, Cesar; White, James R; White, Owen; Fricke, W Florian
2011-08-30
Next-generation sequencing technologies have decentralized sequence acquisition, increasing the demand for new bioinformatics tools that are easy to use, portable across multiple platforms, and scalable for high-throughput applications. Cloud computing platforms provide on-demand access to computing infrastructure over the Internet and can be used in combination with custom-built virtual machines that are distributed pre-packaged with pre-configured software. We describe the Cloud Virtual Resource, CloVR, a new desktop application for push-button automated sequence analysis that can utilize cloud computing resources. CloVR is implemented as a single portable virtual machine (VM) that provides several automated analysis pipelines for microbial genomics, including 16S, whole genome and metagenome sequence analysis. The CloVR VM runs on a personal computer, utilizes local computer resources and requires minimal installation, addressing key challenges in deploying bioinformatics workflows. In addition, CloVR supports use of remote cloud computing resources to improve performance for large-scale sequence processing. In a case study, we demonstrate the use of CloVR to automatically process next-generation sequencing data on multiple cloud computing platforms. The CloVR VM and associated architecture lower the barrier of entry for utilizing complex analysis protocols on both local single- and multi-core computers and cloud systems for high-throughput data processing.
NASA Astrophysics Data System (ADS)
Wan, Junwei; Chen, Hongyan; Zhao, Jing
2017-08-01
According to the requirements of real-time operation, reliability and safety for aerospace experiments, a single-center cloud computing technology application verification platform was constructed. At the IaaS level, the feasibility of applying cloud computing technology to the field of aerospace experiments is tested and verified. Based on the analysis of the test results, a preliminary conclusion is reached: the cloud computing platform can be applied to compute-intensive aerospace experiment workloads, while for I/O-intensive workloads the traditional physical machine is recommended.
Formal Specification and Analysis of Cloud Computing Management
2012-01-24
Cloud Computing in a Nutshell: We begin this introduction to Cloud Computing with a famous quote by Larry Ellison: “The interesting thing about...the wording of some of our ads.” — Larry Ellison, Oracle CEO [106]. In view of this statement, we summarize the essential aspects of Cloud Computing...
A Test-Bed of Secure Mobile Cloud Computing for Military Applications
2016-09-13
...searching databases. This kind of application is a typical example of mobile cloud computing (MCC). MCC has many applications in the military... Final Report (reporting period 1-Aug-2014 to 31-Jul-2016): A Test-bed of Secure Mobile Cloud Computing for Military Applications, Army Research Office, Research Triangle Park, NC. Keywords: test-bed, mobile cloud computing, security, military applications.
Cloud computing can simplify HIT infrastructure management.
Glaser, John
2011-08-01
Software as a Service (SaaS), built on cloud computing technology, is emerging as the forerunner in IT infrastructure because it helps healthcare providers reduce capital investments. Cloud computing leads to predictable, monthly, fixed operating expenses for hospital IT staff. Outsourced cloud computing facilities are state-of-the-art data centers boasting some of the most sophisticated networking equipment on the market. The SaaS model helps hospitals safeguard against technology obsolescence, minimizes maintenance requirements, and simplifies management.
A Weibull distribution accrual failure detector for cloud computing.
Liu, Jiaxi; Wu, Zhibo; Wu, Jin; Dong, Jian; Zhao, Yao; Wen, Dongxin
2017-01-01
Failure detectors are a fundamental component for building high-availability distributed systems. To meet the requirements of complicated large-scale distributed systems, accrual failure detectors that can adapt to multiple applications have been studied extensively. However, several implementations of accrual failure detectors do not adapt well to the cloud service environment. To solve this problem, a new accrual failure detector based on the Weibull distribution, called the Weibull Distribution Failure Detector, is proposed specifically for cloud computing. It can adapt to the dynamic and unexpected network conditions in cloud computing. The performance of the Weibull Distribution Failure Detector is evaluated and compared using public classical experiment data and cloud computing experiment data. The results show that the Weibull Distribution Failure Detector has better performance in terms of speed and accuracy in unstable scenarios, especially in cloud computing.
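A hedged sketch of the accrual idea with a Weibull fit: suspicion grows as the silence since the last heartbeat moves into the tail of a Weibull distribution fitted to past inter-arrival times. This is an illustration of the concept, not the paper's exact estimator; it assumes SciPy is available and that the threshold value is application-chosen.

```python
import math
from scipy.stats import weibull_min

class WeibullAccrualDetector:
    """Accrual failure detector sketch based on a Weibull fit of heartbeat
    inter-arrival times (illustrative only)."""

    def __init__(self, interarrival_samples):
        # Fix the location at 0 so only shape and scale are estimated.
        self.shape, _, self.scale = weibull_min.fit(interarrival_samples, floc=0)

    def phi(self, seconds_since_last_heartbeat):
        # Suspicion level: -log10 of the probability that a heartbeat would
        # still arrive this late under the fitted distribution.
        p_later = weibull_min.sf(seconds_since_last_heartbeat,
                                 self.shape, loc=0, scale=self.scale)
        return float("inf") if p_later <= 0 else -math.log10(p_later)

    def suspect(self, seconds_since_last_heartbeat, threshold=8.0):
        return self.phi(seconds_since_last_heartbeat) > threshold
```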
Fragment assignment in the cloud with eXpress-D
2013-01-01
Background: Probabilistic assignment of ambiguously mapped fragments produced by high-throughput sequencing experiments has been demonstrated to greatly improve accuracy in the analysis of RNA-Seq and ChIP-Seq, and is an essential step in many other sequence census experiments. A maximum likelihood method using the expectation-maximization (EM) algorithm for optimization is commonly used to solve this problem. However, batch EM-based approaches do not scale well with the size of sequencing datasets, which have been increasing dramatically over the past few years. Thus, current approaches to fragment assignment rely on heuristics or approximations for tractability. Results: We present an implementation of a distributed EM solution to the fragment assignment problem using Spark, a data analytics framework that can scale by leveraging compute clusters within datacenters ("the cloud"). We demonstrate that our implementation easily scales to billions of sequenced fragments, while providing the exact maximum likelihood assignment of ambiguous fragments. The accuracy of the method is shown to be an improvement over the most widely used tools available, and the method can be run in a constant amount of time when cluster resources are scaled linearly with the amount of input data. Conclusions: The cloud offers one solution for the difficulties faced in the analysis of massive high-throughput sequencing data, which continue to grow rapidly. Researchers in bioinformatics must follow developments in distributed systems, such as new frameworks like Spark, for ways to port existing methods to the cloud and help them scale to the datasets of the future. Our software, eXpress-D, is freely available at: http://github.com/adarob/express-d. PMID:24314033
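A toy, single-machine version of the EM fragment-assignment step may help make the idea concrete; it assumes uniform compatibility likelihoods and is not the eXpress-D implementation.

```python
from collections import defaultdict

def em_fragment_assignment(compat, n_targets, n_iters=50):
    """Toy EM for assigning ambiguous fragments to targets (e.g. transcripts).

    compat: list of lists; compat[f] holds the target indices that fragment f
            maps to. Uniform compatibility likelihoods are assumed, unlike
            the full eXpress-D model.
    """
    abundance = [1.0 / n_targets] * n_targets
    for _ in range(n_iters):
        expected = defaultdict(float)
        # E-step: split each fragment across its compatible targets in
        # proportion to the current abundance estimates.
        for targets in compat:
            total = sum(abundance[t] for t in targets)
            for t in targets:
                expected[t] += abundance[t] / total
        # M-step: re-estimate abundances from the expected fragment counts.
        n_frags = len(compat)
        abundance = [expected[t] / n_frags for t in range(n_targets)]
    return abundance
```

In the distributed setting described above, the E-step parallelizes naturally, since each fragment's expected assignment depends only on the current abundance estimates.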
RELATIVE PROPER MOTIONS IN THE RHO OPHIUCHI CLUSTER
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wilking, Bruce A.; Sullivan, Timothy; Vrba, Frederick J., E-mail: bwilking@umsl.edu, E-mail: tsullivan@umsl.edu, E-mail: fjv@nofs.navy.mil
2015-12-10
Near-infrared images optimized for astrometry have been obtained for four fields in the high-density L 1688 cloud core over a 12 year period. The targeted regions include deeply embedded young stellar objects (YSOs) and very low luminosity objects too faint and/or heavily veiled for spectroscopy. Relative proper motions in R.A. and decl. were computed for 111 sources and again for a subset of 65 YSOs, resulting in a mean proper motion of (0,0) for each field. Assuming each field has the same mean proper motion, YSOs in the four fields were combined to yield estimates of the velocity dispersions in R.A. and decl. that are consistent with 1.0 km s⁻¹. These values appear to be independent of the evolutionary state of the YSOs. The observed velocity dispersions are consistent with the dispersion in radial velocity derived for optically visible YSOs at the periphery of the cloud core and are consistent with virial equilibrium. The higher velocity dispersion of the YSOs in the plane of the sky relative to that of dense cores may be a consequence of stellar encounters due to dense cores and filaments fragmenting to form small groups of stars or the global collapse of the L 1688 cloud core. An analysis of the differential magnitudes of objects over the 12 year baseline has not only confirmed the near-infrared variability for 29 YSOs established by prior studies, but has also identified 18 new variability candidates. Four of these have not been previously identified as YSOs and may be newly identified cluster members.
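As a hedged sketch of how relative proper motions translate into a plane-of-sky velocity dispersion, the function below applies the standard conversion v [km/s] = 4.74 × μ [arcsec/yr] × d [pc]; the distance in the usage comment is an assumed round number for the region, not a value taken from the paper.

```python
import numpy as np

KMS_PER_ARCSEC_YR_PC = 4.74  # km/s corresponding to 1 arcsec/yr at a distance of 1 pc

def velocity_dispersion_kms(pm_mas_yr, distance_pc):
    """Plane-of-sky velocity dispersion (km/s) from relative proper motions.

    pm_mas_yr:   relative proper motions in one coordinate (mas/yr).
    distance_pc: adopted cluster distance (pc); treated as an input assumption.
    """
    pm_arcsec_yr = np.asarray(pm_mas_yr) / 1000.0
    v_kms = KMS_PER_ARCSEC_YR_PC * pm_arcsec_yr * distance_pc
    return np.std(v_kms, ddof=1)

# Usage with an assumed distance of roughly 140 pc:
# sigma = velocity_dispersion_kms(pm_mas_yr=measured_motions, distance_pc=140.0)
```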