Sample records for scientific computing platforms

  1. A high performance scientific cloud computing environment for materials simulations

    NASA Astrophysics Data System (ADS)

    Jorissen, K.; Vila, F. D.; Rehr, J. J.

    2012-09-01

    We describe the development of a scientific cloud computing (SCC) platform that offers high performance computation capability. The platform consists of a scientific virtual machine prototype containing a UNIX operating system and several materials science codes, together with essential interface tools (an SCC toolset) that offers functionality comparable to local compute clusters. In particular, our SCC toolset provides automatic creation of virtual clusters for parallel computing, including tools for execution and monitoring performance, as well as efficient I/O utilities that enable seamless connections to and from the cloud. Our SCC platform is optimized for the Amazon Elastic Compute Cloud (EC2). We present benchmarks for prototypical scientific applications and demonstrate performance comparable to local compute clusters. To facilitate code execution and provide user-friendly access, we have also integrated cloud computing capability in a JAVA-based GUI. Our SCC platform may be an alternative to traditional HPC resources for materials science or quantum chemistry applications.

  2. Using the High-Level Based Program Interface to Facilitate the Large Scale Scientific Computing

    PubMed Central

    Shang, Yizi; Shang, Ling; Gao, Chuanchang; Lu, Guiming; Ye, Yuntao; Jia, Dongdong

    2014-01-01

    This paper is to make further research on facilitating the large-scale scientific computing on the grid and the desktop grid platform. The related issues include the programming method, the overhead of the high-level program interface based middleware, and the data anticipate migration. The block based Gauss Jordan algorithm as a real example of large-scale scientific computing is used to evaluate those issues presented above. The results show that the high-level based program interface makes the complex scientific applications on large-scale scientific platform easier, though a little overhead is unavoidable. Also, the data anticipation migration mechanism can improve the efficiency of the platform which needs to process big data based scientific applications. PMID:24574931

  3. OMPC: an Open-Source MATLAB®-to-Python Compiler

    PubMed Central

    Jurica, Peter; van Leeuwen, Cees

    2008-01-01

    Free access to scientific information facilitates scientific progress. Open-access scientific journals are a first step in this direction; a further step is to make auxiliary and supplementary materials that accompany scientific publications, such as methodological procedures and data-analysis tools, open and accessible to the scientific community. To this purpose it is instrumental to establish a software base, which will grow toward a comprehensive free and open-source language of technical and scientific computing. Endeavors in this direction are met with an important obstacle. MATLAB®, the predominant computation tool in many fields of research, is a closed-source commercial product. To facilitate the transition to an open computation platform, we propose Open-source MATLAB®-to-Python Compiler (OMPC), a platform that uses syntax adaptation and emulation to allow transparent import of existing MATLAB® functions into Python programs. The imported MATLAB® modules will run independently of MATLAB®, relying on Python's numerical and scientific libraries. Python offers a stable and mature open source platform that, in many respects, surpasses commonly used, expensive commercial closed source packages. The proposed software will therefore facilitate the transparent transition towards a free and general open-source lingua franca for scientific computation, while enabling access to the existing methods and algorithms of technical computing already available in MATLAB®. OMPC is available at http://ompc.juricap.com. PMID:19225577

  4. Evaluation of Emerging Energy-Efficient Heterogeneous Computing Platforms for Biomolecular and Cellular Simulation Workloads.

    PubMed

    Stone, John E; Hallock, Michael J; Phillips, James C; Peterson, Joseph R; Luthey-Schulten, Zaida; Schulten, Klaus

    2016-05-01

    Many of the continuing scientific advances achieved through computational biology are predicated on the availability of ongoing increases in computational power required for detailed simulation and analysis of cellular processes on biologically-relevant timescales. A critical challenge facing the development of future exascale supercomputer systems is the development of new computing hardware and associated scientific applications that dramatically improve upon the energy efficiency of existing solutions, while providing increased simulation, analysis, and visualization performance. Mobile computing platforms have recently become powerful enough to support interactive molecular visualization tasks that were previously only possible on laptops and workstations, creating future opportunities for their convenient use for meetings, remote collaboration, and as head mounted displays for immersive stereoscopic viewing. We describe early experiences adapting several biomolecular simulation and analysis applications for emerging heterogeneous computing platforms that combine power-efficient system-on-chip multi-core CPUs with high-performance massively parallel GPUs. We present low-cost power monitoring instrumentation that provides sufficient temporal resolution to evaluate the power consumption of individual CPU algorithms and GPU kernels. We compare the performance and energy efficiency of scientific applications running on emerging platforms with results obtained on traditional platforms, identify hardware and algorithmic performance bottlenecks that affect the usability of these platforms, and describe avenues for improving both the hardware and applications in pursuit of the needs of molecular modeling tasks on mobile devices and future exascale computers.

  5. OMPC: an Open-Source MATLAB-to-Python Compiler.

    PubMed

    Jurica, Peter; van Leeuwen, Cees

    2009-01-01

    Free access to scientific information facilitates scientific progress. Open-access scientific journals are a first step in this direction; a further step is to make auxiliary and supplementary materials that accompany scientific publications, such as methodological procedures and data-analysis tools, open and accessible to the scientific community. To this purpose it is instrumental to establish a software base, which will grow toward a comprehensive free and open-source language of technical and scientific computing. Endeavors in this direction are met with an important obstacle. MATLAB((R)), the predominant computation tool in many fields of research, is a closed-source commercial product. To facilitate the transition to an open computation platform, we propose Open-source MATLAB((R))-to-Python Compiler (OMPC), a platform that uses syntax adaptation and emulation to allow transparent import of existing MATLAB((R)) functions into Python programs. The imported MATLAB((R)) modules will run independently of MATLAB((R)), relying on Python's numerical and scientific libraries. Python offers a stable and mature open source platform that, in many respects, surpasses commonly used, expensive commercial closed source packages. The proposed software will therefore facilitate the transparent transition towards a free and general open-source lingua franca for scientific computation, while enabling access to the existing methods and algorithms of technical computing already available in MATLAB((R)). OMPC is available at http://ompc.juricap.com.

  6. Evaluation of Emerging Energy-Efficient Heterogeneous Computing Platforms for Biomolecular and Cellular Simulation Workloads

    PubMed Central

    Stone, John E.; Hallock, Michael J.; Phillips, James C.; Peterson, Joseph R.; Luthey-Schulten, Zaida; Schulten, Klaus

    2016-01-01

    Many of the continuing scientific advances achieved through computational biology are predicated on the availability of ongoing increases in computational power required for detailed simulation and analysis of cellular processes on biologically-relevant timescales. A critical challenge facing the development of future exascale supercomputer systems is the development of new computing hardware and associated scientific applications that dramatically improve upon the energy efficiency of existing solutions, while providing increased simulation, analysis, and visualization performance. Mobile computing platforms have recently become powerful enough to support interactive molecular visualization tasks that were previously only possible on laptops and workstations, creating future opportunities for their convenient use for meetings, remote collaboration, and as head mounted displays for immersive stereoscopic viewing. We describe early experiences adapting several biomolecular simulation and analysis applications for emerging heterogeneous computing platforms that combine power-efficient system-on-chip multi-core CPUs with high-performance massively parallel GPUs. We present low-cost power monitoring instrumentation that provides sufficient temporal resolution to evaluate the power consumption of individual CPU algorithms and GPU kernels. We compare the performance and energy efficiency of scientific applications running on emerging platforms with results obtained on traditional platforms, identify hardware and algorithmic performance bottlenecks that affect the usability of these platforms, and describe avenues for improving both the hardware and applications in pursuit of the needs of molecular modeling tasks on mobile devices and future exascale computers. PMID:27516922

  7. Games as a Platform for Student Participation in Authentic Scientific Research

    ERIC Educational Resources Information Center

    Magnussen, Rikke; Hansen, Sidse Damgaard; Planke, Tilo; Sherson, Jacob Friis

    2014-01-01

    This paper presents results from the design and testing of an educational version of Quantum Moves, a Scientific Discovery Game that allows players to help solve authentic scientific challenges in the effort to develop a quantum computer. The primary aim of developing a game-based platform for student-research collaboration is to investigate if…

  8. Architecture and Initial Development of a Digital Library Platform for Computable Knowledge Objects for Health.

    PubMed

    Flynn, Allen J; Bahulekar, Namita; Boisvert, Peter; Lagoze, Carl; Meng, George; Rampton, James; Friedman, Charles P

    2017-01-01

    Throughout the world, biomedical knowledge is routinely generated and shared through primary and secondary scientific publications. However, there is too much latency between publication of knowledge and its routine use in practice. To address this latency, what is actionable in scientific publications can be encoded to make it computable. We have created a purpose-built digital library platform to hold, manage, and share actionable, computable knowledge for health called the Knowledge Grid Library. Here we present it with its system architecture.

  9. Feasibility study for the use of a YF-12 aircraft as a scientific instrument platform for observing the 1970 solar eclipse

    NASA Technical Reports Server (NTRS)

    Mercer, R. D.

    1973-01-01

    The scientific and engineering findings are presented of the feasibility study for the use of a YF-12 aircraft as a scientific instrument platform for observing the 1970 solar eclipse. Included in the report is the computer program documentation of the solar eclipse determination; summary data on SR-71A type aircraft capabilities and limitations as an observing platform for solar eclipses; and the recordings of an informal conference on observations of solar eclipses using SR-71A type aircraft.

  10. RAPPORT: running scientific high-performance computing applications on the cloud.

    PubMed

    Cohen, Jeremy; Filippis, Ioannis; Woodbridge, Mark; Bauer, Daniela; Hong, Neil Chue; Jackson, Mike; Butcher, Sarah; Colling, David; Darlington, John; Fuchs, Brian; Harvey, Matt

    2013-01-28

    Cloud computing infrastructure is now widely used in many domains, but one area where there has been more limited adoption is research computing, in particular for running scientific high-performance computing (HPC) software. The Robust Application Porting for HPC in the Cloud (RAPPORT) project took advantage of existing links between computing researchers and application scientists in the fields of bioinformatics, high-energy physics (HEP) and digital humanities, to investigate running a set of scientific HPC applications from these domains on cloud infrastructure. In this paper, we focus on the bioinformatics and HEP domains, describing the applications and target cloud platforms. We conclude that, while there are many factors that need consideration, there is no fundamental impediment to the use of cloud infrastructure for running many types of HPC applications and, in some cases, there is potential for researchers to benefit significantly from the flexibility offered by cloud platforms.

  11. The SCEC Broadband Platform: A Collaborative Open-Source Software Package for Strong Ground Motion Simulation and Validation

    NASA Astrophysics Data System (ADS)

    Silva, F.; Maechling, P. J.; Goulet, C.; Somerville, P.; Jordan, T. H.

    2013-12-01

    The Southern California Earthquake Center (SCEC) Broadband Platform is a collaborative software development project involving SCEC researchers, graduate students, and the SCEC Community Modeling Environment. The SCEC Broadband Platform is open-source scientific software that can generate broadband (0-100Hz) ground motions for earthquakes, integrating complex scientific modules that implement rupture generation, low and high-frequency seismogram synthesis, non-linear site effects calculation, and visualization into a software system that supports easy on-demand computation of seismograms. The Broadband Platform operates in two primary modes: validation simulations and scenario simulations. In validation mode, the Broadband Platform runs earthquake rupture and wave propagation modeling software to calculate seismograms of a historical earthquake for which observed strong ground motion data is available. Also in validation mode, the Broadband Platform calculates a number of goodness of fit measurements that quantify how well the model-based broadband seismograms match the observed seismograms for a certain event. Based on these results, the Platform can be used to tune and validate different numerical modeling techniques. During the past year, we have modified the software to enable the addition of a large number of historical events, and we are now adding validation simulation inputs and observational data for 23 historical events covering the Eastern and Western United States, Japan, Taiwan, Turkey, and Italy. In scenario mode, the Broadband Platform can run simulations for hypothetical (scenario) earthquakes. In this mode, users input an earthquake description, a list of station names and locations, and a 1D velocity model for their region of interest, and the Broadband Platform software then calculates ground motions for the specified stations. By establishing an interface between scientific modules with a common set of input and output files, the Broadband Platform facilitates the addition of new scientific methods, which are written by earth scientists in a number of languages such as C, C++, Fortran, and Python. The Broadband Platform's modular design also supports the reuse of existing software modules as building blocks to create new scientific methods. Additionally, the Platform implements a wrapper around each scientific module, converting input and output files to and from the specific formats required (or produced) by individual scientific codes. Working in close collaboration with scientists and research engineers, the SCEC software development group continues to add new capabilities to the Broadband Platform and to release new versions as open-source scientific software distributions that can be compiled and run on many Linux computer systems. Our latest release includes the addition of 3 new simulation methods and several new data products, such as map and distance-based goodness of fit plots. Finally, as the number and complexity of scenarios simulated using the Broadband Platform increase, we have added batching utilities to substantially improve support for running large-scale simulations on computing clusters.

  12. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lingerfelt, Eric J; Endeve, Eirik; Hui, Yawei

    Improvements in scientific instrumentation allow imaging at mesoscopic to atomic length scales, many spectroscopic modes, and now--with the rise of multimodal acquisition systems and the associated processing capability--the era of multidimensional, informationally dense data sets has arrived. Technical issues in these combinatorial scientific fields are exacerbated by computational challenges best summarized as a necessity for drastic improvement in the capability to transfer, store, and analyze large volumes of data. The Bellerophon Environment for Analysis of Materials (BEAM) platform provides material scientists the capability to directly leverage the integrated computational and analytical power of High Performance Computing (HPC) to perform scalablemore » data analysis and simulation and manage uploaded data files via an intuitive, cross-platform client user interface. This framework delivers authenticated, "push-button" execution of complex user workflows that deploy data analysis algorithms and computational simulations utilizing compute-and-data cloud infrastructures and HPC environments like Titan at the Oak Ridge Leadershp Computing Facility (OLCF).« less

  13. GISpark: A Geospatial Distributed Computing Platform for Spatiotemporal Big Data

    NASA Astrophysics Data System (ADS)

    Wang, S.; Zhong, E.; Wang, E.; Zhong, Y.; Cai, W.; Li, S.; Gao, S.

    2016-12-01

    Geospatial data are growing exponentially because of the proliferation of cost effective and ubiquitous positioning technologies such as global remote-sensing satellites and location-based devices. Analyzing large amounts of geospatial data can provide great value for both industrial and scientific applications. Data- and compute- intensive characteristics inherent in geospatial big data increasingly pose great challenges to technologies of data storing, computing and analyzing. Such challenges require a scalable and efficient architecture that can store, query, analyze, and visualize large-scale spatiotemporal data. Therefore, we developed GISpark - a geospatial distributed computing platform for processing large-scale vector, raster and stream data. GISpark is constructed based on the latest virtualized computing infrastructures and distributed computing architecture. OpenStack and Docker are used to build multi-user hosting cloud computing infrastructure for GISpark. The virtual storage systems such as HDFS, Ceph, MongoDB are combined and adopted for spatiotemporal data storage management. Spark-based algorithm framework is developed for efficient parallel computing. Within this framework, SuperMap GIScript and various open-source GIS libraries can be integrated into GISpark. GISpark can also integrated with scientific computing environment (e.g., Anaconda), interactive computing web applications (e.g., Jupyter notebook), and machine learning tools (e.g., TensorFlow/Orange). The associated geospatial facilities of GISpark in conjunction with the scientific computing environment, exploratory spatial data analysis tools, temporal data management and analysis systems make up a powerful geospatial computing tool. GISpark not only provides spatiotemporal big data processing capacity in the geospatial field, but also provides spatiotemporal computational model and advanced geospatial visualization tools that deals with other domains related with spatial property. We tested the performance of the platform based on taxi trajectory analysis. Results suggested that GISpark achieves excellent run time performance in spatiotemporal big data applications.

  14. Scientific Services on the Cloud

    NASA Astrophysics Data System (ADS)

    Chapman, David; Joshi, Karuna P.; Yesha, Yelena; Halem, Milt; Yesha, Yaacov; Nguyen, Phuong

    Scientific Computing was one of the first every applications for parallel and distributed computation. To this date, scientific applications remain some of the most compute intensive, and have inspired creation of petaflop compute infrastructure such as the Oak Ridge Jaguar and Los Alamos RoadRunner. Large dedicated hardware infrastructure has become both a blessing and a curse to the scientific community. Scientists are interested in cloud computing for much the same reason as businesses and other professionals. The hardware is provided, maintained, and administrated by a third party. Software abstraction and virtualization provide reliability, and fault tolerance. Graduated fees allow for multi-scale prototyping and execution. Cloud computing resources are only a few clicks away, and by far the easiest high performance distributed platform to gain access to. There may still be dedicated infrastructure for ultra-scale science, but the cloud can easily play a major part of the scientific computing initiative.

  15. Energy Consumption Management of Virtual Cloud Computing Platform

    NASA Astrophysics Data System (ADS)

    Li, Lin

    2017-11-01

    For energy consumption management research on virtual cloud computing platforms, energy consumption management of virtual computers and cloud computing platform should be understood deeper. Only in this way can problems faced by energy consumption management be solved. In solving problems, the key to solutions points to data centers with high energy consumption, so people are in great need to use a new scientific technique. Virtualization technology and cloud computing have become powerful tools in people’s real life, work and production because they have strong strength and many advantages. Virtualization technology and cloud computing now is in a rapid developing trend. It has very high resource utilization rate. In this way, the presence of virtualization and cloud computing technologies is very necessary in the constantly developing information age. This paper has summarized, explained and further analyzed energy consumption management questions of the virtual cloud computing platform. It eventually gives people a clearer understanding of energy consumption management of virtual cloud computing platform and brings more help to various aspects of people’s live, work and son on.

  16. Jenkins-CI, an Open-Source Continuous Integration System, as a Scientific Data and Image-Processing Platform.

    PubMed

    Moutsatsos, Ioannis K; Hossain, Imtiaz; Agarinis, Claudia; Harbinski, Fred; Abraham, Yann; Dobler, Luc; Zhang, Xian; Wilson, Christopher J; Jenkins, Jeremy L; Holway, Nicholas; Tallarico, John; Parker, Christian N

    2017-03-01

    High-throughput screening generates large volumes of heterogeneous data that require a diverse set of computational tools for management, processing, and analysis. Building integrated, scalable, and robust computational workflows for such applications is challenging but highly valuable. Scientific data integration and pipelining facilitate standardized data processing, collaboration, and reuse of best practices. We describe how Jenkins-CI, an "off-the-shelf," open-source, continuous integration system, is used to build pipelines for processing images and associated data from high-content screening (HCS). Jenkins-CI provides numerous plugins for standard compute tasks, and its design allows the quick integration of external scientific applications. Using Jenkins-CI, we integrated CellProfiler, an open-source image-processing platform, with various HCS utilities and a high-performance Linux cluster. The platform is web-accessible, facilitates access and sharing of high-performance compute resources, and automates previously cumbersome data and image-processing tasks. Imaging pipelines developed using the desktop CellProfiler client can be managed and shared through a centralized Jenkins-CI repository. Pipelines and managed data are annotated to facilitate collaboration and reuse. Limitations with Jenkins-CI (primarily around the user interface) were addressed through the selection of helper plugins from the Jenkins-CI community.

  17. Jenkins-CI, an Open-Source Continuous Integration System, as a Scientific Data and Image-Processing Platform

    PubMed Central

    Moutsatsos, Ioannis K.; Hossain, Imtiaz; Agarinis, Claudia; Harbinski, Fred; Abraham, Yann; Dobler, Luc; Zhang, Xian; Wilson, Christopher J.; Jenkins, Jeremy L.; Holway, Nicholas; Tallarico, John; Parker, Christian N.

    2016-01-01

    High-throughput screening generates large volumes of heterogeneous data that require a diverse set of computational tools for management, processing, and analysis. Building integrated, scalable, and robust computational workflows for such applications is challenging but highly valuable. Scientific data integration and pipelining facilitate standardized data processing, collaboration, and reuse of best practices. We describe how Jenkins-CI, an “off-the-shelf,” open-source, continuous integration system, is used to build pipelines for processing images and associated data from high-content screening (HCS). Jenkins-CI provides numerous plugins for standard compute tasks, and its design allows the quick integration of external scientific applications. Using Jenkins-CI, we integrated CellProfiler, an open-source image-processing platform, with various HCS utilities and a high-performance Linux cluster. The platform is web-accessible, facilitates access and sharing of high-performance compute resources, and automates previously cumbersome data and image-processing tasks. Imaging pipelines developed using the desktop CellProfiler client can be managed and shared through a centralized Jenkins-CI repository. Pipelines and managed data are annotated to facilitate collaboration and reuse. Limitations with Jenkins-CI (primarily around the user interface) were addressed through the selection of helper plugins from the Jenkins-CI community. PMID:27899692

  18. Evolution of the Virtualized HPC Infrastructure of Novosibirsk Scientific Center

    NASA Astrophysics Data System (ADS)

    Adakin, A.; Anisenkov, A.; Belov, S.; Chubarov, D.; Kalyuzhny, V.; Kaplin, V.; Korol, A.; Kuchin, N.; Lomakin, S.; Nikultsev, V.; Skovpen, K.; Sukharev, A.; Zaytsev, A.

    2012-12-01

    Novosibirsk Scientific Center (NSC), also known worldwide as Akademgorodok, is one of the largest Russian scientific centers hosting Novosibirsk State University (NSU) and more than 35 research organizations of the Siberian Branch of Russian Academy of Sciences including Budker Institute of Nuclear Physics (BINP), Institute of Computational Technologies, and Institute of Computational Mathematics and Mathematical Geophysics (ICM&MG). Since each institute has specific requirements on the architecture of computing farms involved in its research field, currently we've got several computing facilities hosted by NSC institutes, each optimized for a particular set of tasks, of which the largest are the NSU Supercomputer Center, Siberian Supercomputer Center (ICM&MG), and a Grid Computing Facility of BINP. A dedicated optical network with the initial bandwidth of 10 Gb/s connecting these three facilities was built in order to make it possible to share the computing resources among the research communities, thus increasing the efficiency of operating the existing computing facilities and offering a common platform for building the computing infrastructure for future scientific projects. Unification of the computing infrastructure is achieved by extensive use of virtualization technology based on XEN and KVM platforms. This contribution gives a thorough review of the present status and future development prospects for the NSC virtualized computing infrastructure and the experience gained while using it for running production data analysis jobs related to HEP experiments being carried out at BINP, especially the KEDR detector experiment at the VEPP-4M electron-positron collider.

  19. Implementing an Affordable High-Performance Computing for Teaching-Oriented Computer Science Curriculum

    ERIC Educational Resources Information Center

    Abuzaghleh, Omar; Goldschmidt, Kathleen; Elleithy, Yasser; Lee, Jeongkyu

    2013-01-01

    With the advances in computing power, high-performance computing (HPC) platforms have had an impact on not only scientific research in advanced organizations but also computer science curriculum in the educational community. For example, multicore programming and parallel systems are highly desired courses in the computer science major. However,…

  20. The SCEC Broadband Platform: A Collaborative Open-Source Software Package for Strong Ground Motion Simulation and Validation

    NASA Astrophysics Data System (ADS)

    Silva, F.; Maechling, P. J.; Goulet, C. A.; Somerville, P.; Jordan, T. H.

    2014-12-01

    The Southern California Earthquake Center (SCEC) Broadband Platform is a collaborative software development project involving geoscientists, earthquake engineers, graduate students, and the SCEC Community Modeling Environment. The SCEC Broadband Platform (BBP) is open-source scientific software that can generate broadband (0-100Hz) ground motions for earthquakes, integrating complex scientific modules that implement rupture generation, low and high-frequency seismogram synthesis, non-linear site effects calculation, and visualization into a software system that supports easy on-demand computation of seismograms. The Broadband Platform operates in two primary modes: validation simulations and scenario simulations. In validation mode, the Platform runs earthquake rupture and wave propagation modeling software to calculate seismograms for a well-observed historical earthquake. Then, the BBP calculates a number of goodness of fit measurements that quantify how well the model-based broadband seismograms match the observed seismograms for a certain event. Based on these results, the Platform can be used to tune and validate different numerical modeling techniques. In scenario mode, the Broadband Platform can run simulations for hypothetical (scenario) earthquakes. In this mode, users input an earthquake description, a list of station names and locations, and a 1D velocity model for their region of interest, and the Broadband Platform software then calculates ground motions for the specified stations. Working in close collaboration with scientists and research engineers, the SCEC software development group continues to add new capabilities to the Broadband Platform and to release new versions as open-source scientific software distributions that can be compiled and run on many Linux computer systems. Our latest release includes 5 simulation methods, 7 simulation regions covering California, Japan, and Eastern North America, the ability to compare simulation results against GMPEs, and several new data products, such as map and distance-based goodness of fit plots. As the number and complexity of scenarios simulated using the Broadband Platform increases, we have added batching utilities to substantially improve support for running large-scale simulations on computing clusters.

  1. User Inspired Management of Scientific Jobs in Grids and Clouds

    ERIC Educational Resources Information Center

    Withana, Eran Chinthaka

    2011-01-01

    From time-critical, real time computational experimentation to applications which process petabytes of data there is a continuing search for faster, more responsive computing platforms capable of supporting computational experimentation. Weather forecast models, for instance, process gigabytes of data to produce regional (mesoscale) predictions on…

  2. A Systematic Approach for Obtaining Performance on Matrix-Like Operations

    NASA Astrophysics Data System (ADS)

    Veras, Richard Michael

    Scientific Computation provides a critical role in the scientific process because it allows us ask complex queries and test predictions that would otherwise be unfeasible to perform experimentally. Because of its power, Scientific Computing has helped drive advances in many fields ranging from Engineering and Physics to Biology and Sociology to Economics and Drug Development and even to Machine Learning and Artificial Intelligence. Common among these domains is the desire for timely computational results, thus a considerable amount of human expert effort is spent towards obtaining performance for these scientific codes. However, this is no easy task because each of these domains present their own unique set of challenges to software developers, such as domain specific operations, structurally complex data and ever-growing datasets. Compounding these problems are the myriads of constantly changing, complex and unique hardware platforms that an expert must target. Unfortunately, an expert is typically forced to reproduce their effort across multiple problem domains and hardware platforms. In this thesis, we demonstrate the automatic generation of expert level high-performance scientific codes for Dense Linear Algebra (DLA), Structured Mesh (Stencil), Sparse Linear Algebra and Graph Analytic. In particular, this thesis seeks to address the issue of obtaining performance on many complex platforms for a certain class of matrix-like operations that span across many scientific, engineering and social fields. We do this by automating a method used for obtaining high performance in DLA and extending it to structured, sparse and scale-free domains. We argue that it is through the use of the underlying structure found in the data from these domains that enables this process. Thus, obtaining performance for most operations does not occur in isolation of the data being operated on, but instead depends significantly on the structure of the data.

  3. Uncover the Cloud for Geospatial Sciences and Applications to Adopt Cloud Computing

    NASA Astrophysics Data System (ADS)

    Yang, C.; Huang, Q.; Xia, J.; Liu, K.; Li, J.; Xu, C.; Sun, M.; Bambacus, M.; Xu, Y.; Fay, D.

    2012-12-01

    Cloud computing is emerging as the future infrastructure for providing computing resources to support and enable scientific research, engineering development, and application construction, as well as work force education. On the other hand, there is a lot of doubt about the readiness of cloud computing to support a variety of scientific research, development and educations. This research is a project funded by NASA SMD to investigate through holistic studies how ready is the cloud computing to support geosciences. Four applications with different computing characteristics including data, computing, concurrent, and spatiotemporal intensities are taken to test the readiness of cloud computing to support geosciences. Three popular and representative cloud platforms including Amazon EC2, Microsoft Azure, and NASA Nebula as well as a traditional cluster are utilized in the study. Results illustrates that cloud is ready to some degree but more research needs to be done to fully implemented the cloud benefit as advertised by many vendors and defined by NIST. Specifically, 1) most cloud platform could help stand up new computing instances, a new computer, in a few minutes as envisioned, therefore, is ready to support most computing needs in an on demand fashion; 2) the load balance and elasticity, a defining characteristic, is ready in some cloud platforms, such as Amazon EC2, to support bigger jobs, e.g., needs response in minutes, while some are not ready to support the elasticity and load balance well. All cloud platform needs further research and development to support real time application at subminute level; 3) the user interface and functionality of cloud platforms vary a lot and some of them are very professional and well supported/documented, such as Amazon EC2, some of them needs significant improvement for the general public to adopt cloud computing without professional training or knowledge about computing infrastructure; 4) the security is a big concern in cloud computing platform, with the sharing spirit of cloud computing, it is very hard to ensure higher level security, except a private cloud is built for a specific organization without public access, public cloud platform does not support FISMA medium level yet and may never be able to support FISMA high level; 5) HPC jobs needs of cloud computing is not well supported and only Amazon EC2 supports this well. The research is being taken by NASA and other agencies to consider cloud computing adoption. We hope the publication of the research would also benefit the public to adopt cloud computing.

  4. Footwear Physics.

    ERIC Educational Resources Information Center

    Blaser, Mark; Larsen, Jamie

    1996-01-01

    Presents five interactive, computer-based activities that mimic scientific tests used by sport researchers to help companies design high-performance athletic shoes, including impact tests, flexion tests, friction tests, video analysis, and computer modeling. Provides a platform for teachers to build connections between chemistry (polymer science),…

  5. Cancer Diagnosis Epigenomics Scientific Workflow Scheduling in the Cloud Computing Environment Using an Improved PSO Algorithm

    PubMed

    N, Sadhasivam; R, Balamurugan; M, Pandi

    2018-01-27

    Objective: Epigenetic modifications involving DNA methylation and histone statud are responsible for the stable maintenance of cellular phenotypes. Abnormalities may be causally involved in cancer development and therefore could have diagnostic potential. The field of epigenomics refers to all epigenetic modifications implicated in control of gene expression, with a focus on better understanding of human biology in both normal and pathological states. Epigenomics scientific workflow is essentially a data processing pipeline to automate the execution of various genome sequencing operations or tasks. Cloud platform is a popular computing platform for deploying large scale epigenomics scientific workflow. Its dynamic environment provides various resources to scientific users on a pay-per-use billing model. Scheduling epigenomics scientific workflow tasks is a complicated problem in cloud platform. We here focused on application of an improved particle swam optimization (IPSO) algorithm for this purpose. Methods: The IPSO algorithm was applied to find suitable resources and allocate epigenomics tasks so that the total cost was minimized for detection of epigenetic abnormalities of potential application for cancer diagnosis. Result: The results showed that IPSO based task to resource mapping reduced total cost by 6.83 percent as compared to the traditional PSO algorithm. Conclusion: The results for various cancer diagnosis tasks showed that IPSO based task to resource mapping can achieve better costs when compared to PSO based mapping for epigenomics scientific application workflow. Creative Commons Attribution License

  6. Harnessing the power of emerging petascale platforms

    NASA Astrophysics Data System (ADS)

    Mellor-Crummey, John

    2007-07-01

    As part of the US Department of Energy's Scientific Discovery through Advanced Computing (SciDAC-2) program, science teams are tackling problems that require computational simulation and modeling at the petascale. A grand challenge for computer science is to develop software technology that makes it easier to harness the power of these systems to aid scientific discovery. As part of its activities, the SciDAC-2 Center for Scalable Application Development Software (CScADS) is building open source software tools to support efficient scientific computing on the emerging leadership-class platforms. In this paper, we describe two tools for performance analysis and tuning that are being developed as part of CScADS: a tool for analyzing scalability and performance, and a tool for optimizing loop nests for better node performance. We motivate these tools by showing how they apply to S3D, a turbulent combustion code under development at Sandia National Laboratory. For S3D, our node performance analysis tool helped uncover several performance bottlenecks. Using our loop nest optimization tool, we transformed S3D's most costly loop nest to reduce execution time by a factor of 2.94 for a processor working on a 503 domain.

  7. Optimization of Sparse Matrix-Vector Multiplication on Emerging Multicore Platforms

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Williams, Samuel; Oliker, Leonid; Vuduc, Richard

    2008-10-16

    We are witnessing a dramatic change in computer architecture due to the multicore paradigm shift, as every electronic device from cell phones to supercomputers confronts parallelism of unprecedented scale. To fully unleash the potential of these systems, the HPC community must develop multicore specific-optimization methodologies for important scientific computations. In this work, we examine sparse matrix-vector multiply (SpMV) - one of the most heavily used kernels in scientific computing - across a broad spectrum of multicore designs. Our experimental platform includes the homogeneous AMD quad-core, AMD dual-core, and Intel quad-core designs, the heterogeneous STI Cell, as well as one ofmore » the first scientific studies of the highly multithreaded Sun Victoria Falls (a Niagara2 SMP). We present several optimization strategies especially effective for the multicore environment, and demonstrate significant performance improvements compared to existing state-of-the-art serial and parallel SpMV implementations. Additionally, we present key insights into the architectural trade-offs of leading multicore design strategies, in the context of demanding memory-bound numerical algorithms.« less

  8. Jungle Computing: Distributed Supercomputing Beyond Clusters, Grids, and Clouds

    NASA Astrophysics Data System (ADS)

    Seinstra, Frank J.; Maassen, Jason; van Nieuwpoort, Rob V.; Drost, Niels; van Kessel, Timo; van Werkhoven, Ben; Urbani, Jacopo; Jacobs, Ceriel; Kielmann, Thilo; Bal, Henri E.

    In recent years, the application of high-performance and distributed computing in scientific practice has become increasingly wide spread. Among the most widely available platforms to scientists are clusters, grids, and cloud systems. Such infrastructures currently are undergoing revolutionary change due to the integration of many-core technologies, providing orders-of-magnitude speed improvements for selected compute kernels. With high-performance and distributed computing systems thus becoming more heterogeneous and hierarchical, programming complexity is vastly increased. Further complexities arise because urgent desire for scalability and issues including data distribution, software heterogeneity, and ad hoc hardware availability commonly force scientists into simultaneous use of multiple platforms (e.g., clusters, grids, and clouds used concurrently). A true computing jungle.

  9. A Lightweight Remote Parallel Visualization Platform for Interactive Massive Time-varying Climate Data Analysis

    NASA Astrophysics Data System (ADS)

    Li, J.; Zhang, T.; Huang, Q.; Liu, Q.

    2014-12-01

    Today's climate datasets are featured with large volume, high degree of spatiotemporal complexity and evolving fast overtime. As visualizing large volume distributed climate datasets is computationally intensive, traditional desktop based visualization applications fail to handle the computational intensity. Recently, scientists have developed remote visualization techniques to address the computational issue. Remote visualization techniques usually leverage server-side parallel computing capabilities to perform visualization tasks and deliver visualization results to clients through network. In this research, we aim to build a remote parallel visualization platform for visualizing and analyzing massive climate data. Our visualization platform was built based on Paraview, which is one of the most popular open source remote visualization and analysis applications. To further enhance the scalability and stability of the platform, we have employed cloud computing techniques to support the deployment of the platform. In this platform, all climate datasets are regular grid data which are stored in NetCDF format. Three types of data access methods are supported in the platform: accessing remote datasets provided by OpenDAP servers, accessing datasets hosted on the web visualization server and accessing local datasets. Despite different data access methods, all visualization tasks are completed at the server side to reduce the workload of clients. As a proof of concept, we have implemented a set of scientific visualization methods to show the feasibility of the platform. Preliminary results indicate that the framework can address the computation limitation of desktop based visualization applications.

  10. Bringing Legacy Visualization Software to Modern Computing Devices via Application Streaming

    NASA Astrophysics Data System (ADS)

    Fisher, Ward

    2014-05-01

    Planning software compatibility across forthcoming generations of computing platforms is a problem commonly encountered in software engineering and development. While this problem can affect any class of software, data analysis and visualization programs are particularly vulnerable. This is due in part to their inherent dependency on specialized hardware and computing environments. A number of strategies and tools have been designed to aid software engineers with this task. While generally embraced by developers at 'traditional' software companies, these methodologies are often dismissed by the scientific software community as unwieldy, inefficient and unnecessary. As a result, many important and storied scientific software packages can struggle to adapt to a new computing environment; for example, one in which much work is carried out on sub-laptop devices (such as tablets and smartphones). Rewriting these packages for a new platform often requires significant investment in terms of development time and developer expertise. In many cases, porting older software to modern devices is neither practical nor possible. As a result, replacement software must be developed from scratch, wasting resources better spent on other projects. Enabled largely by the rapid rise and adoption of cloud computing platforms, 'Application Streaming' technologies allow legacy visualization and analysis software to be operated wholly from a client device (be it laptop, tablet or smartphone) while retaining full functionality and interactivity. It mitigates much of the developer effort required by other more traditional methods while simultaneously reducing the time it takes to bring the software to a new platform. This work will provide an overview of Application Streaming and how it compares against other technologies which allow scientific visualization software to be executed from a remote computer. We will discuss the functionality and limitations of existing application streaming frameworks and how a developer might prepare their software for application streaming. We will also examine the secondary benefits realized by moving legacy software to the cloud. Finally, we will examine the process by which a legacy Java application, the Integrated Data Viewer (IDV), is to be adapted for tablet computing via Application Streaming.

  11. Open Marketplace for Simulation Software on the Basis of a Web Platform

    NASA Astrophysics Data System (ADS)

    Kryukov, A. P.; Demichev, A. P.

    2016-02-01

    The focus in development of a new generation of middleware shifts from the global grid systems to building convenient and efficient web platforms for remote access to individual computing resources. Further line of their development, suggested in this work, is related not only with the quantitative increase in their number and with the expansion of scientific, engineering, and manufacturing areas in which they are used, but also with improved technology for remote deployment of application software on the resources interacting with the web platforms. Currently, the services for providers of application software in the context of scientific-oriented web platforms is not developed enough. The proposed in this work new web platforms of application software market should have all the features of the existing web platforms for submissions of jobs to remote resources plus the provision of specific web services for interaction on market principles between the providers and consumers of application packages. The suggested approach will be approved on the example of simulation applications in the field of nonlinear optics.

  12. Balloon platform for extended-life astronomy research

    NASA Technical Reports Server (NTRS)

    Ostwald, L. T.

    1974-01-01

    A configuration has been developed for a long-life balloon platform to carry pointing telescopes weighing as much as 80 pounds (36 kg) to point at selected celestial targets. A platform of this configuration weighs about 375 pounds (170 kg) gross and can be suspended from a high altitude super pressure balloon for a lifetime of several months. The balloon platform contains a solar array and storage batteries for electrical power, up and down link communications equipment, and navigational and attitude control systems for orienting the scientific instrument. A biaxial controller maintains the telescope attitude in response to look-angle data stored in an on-board computer memory which is updated periodically by ground command. Gimbal angles are computed by using location data derived by an on-board navigational receiver.

  13. Cloud Computing with iPlant Atmosphere.

    PubMed

    McKay, Sheldon J; Skidmore, Edwin J; LaRose, Christopher J; Mercer, Andre W; Noutsos, Christos

    2013-10-15

    Cloud Computing refers to distributed computing platforms that use virtualization software to provide easy access to physical computing infrastructure and data storage, typically administered through a Web interface. Cloud-based computing provides access to powerful servers, with specific software and virtual hardware configurations, while eliminating the initial capital cost of expensive computers and reducing the ongoing operating costs of system administration, maintenance contracts, power consumption, and cooling. This eliminates a significant barrier to entry into bioinformatics and high-performance computing for many researchers. This is especially true of free or modestly priced cloud computing services. The iPlant Collaborative offers a free cloud computing service, Atmosphere, which allows users to easily create and use instances on virtual servers preconfigured for their analytical needs. Atmosphere is a self-service, on-demand platform for scientific computing. This unit demonstrates how to set up, access and use cloud computing in Atmosphere. Copyright © 2013 John Wiley & Sons, Inc.

  14. Open-Phylo: a customizable crowd-computing platform for multiple sequence alignment

    PubMed Central

    2013-01-01

    Citizen science games such as Galaxy Zoo, Foldit, and Phylo aim to harness the intelligence and processing power generated by crowds of online gamers to solve scientific problems. However, the selection of the data to be analyzed through these games is under the exclusive control of the game designers, and so are the results produced by gamers. Here, we introduce Open-Phylo, a freely accessible crowd-computing platform that enables any scientist to enter our system and use crowds of gamers to assist computer programs in solving one of the most fundamental problems in genomics: the multiple sequence alignment problem. PMID:24148814

  15. The Globus Galaxies Platform. Delivering Science Gateways as a Service

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Madduri, Ravi; Chard, Kyle; Chard, Ryan

    We use public cloud computers to host sophisticated scientific data; software is then used to transform scientific practice by enabling broad access to capabilities previously available only to the few. The primary obstacle to more widespread use of public clouds to host scientific software (‘cloud-based science gateways’) has thus far been the considerable gap between the specialized needs of science applications and the capabilities provided by cloud infrastructures. We describe here a domain-independent, cloud-based science gateway platform, the Globus Galaxies platform, which overcomes this gap by providing a set of hosted services that directly address the needs of science gatewaymore » developers. The design and implementation of this platform leverages our several years of experience with Globus Genomics, a cloud-based science gateway that has served more than 200 genomics researchers across 30 institutions. Building on that foundation, we have also implemented a platform that leverages the popular Galaxy system for application hosting and workflow execution; Globus services for data transfer, user and group management, and authentication; and a cost-aware elastic provisioning model specialized for public cloud resources. We describe here the capabilities and architecture of this platform, present six scientific domains in which we have successfully applied it, report on user experiences, and analyze the economics of our deployments. Published 2015. This article is a U.S. Government work and is in the public domain in the USA.« less

  16. Optimization of sparse matrix-vector multiplication on emerging multicore platforms

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Williams, Samuel; Oliker, Leonid; Vuduc, Richard

    2007-01-01

    We are witnessing a dramatic change in computer architecture due to the multicore paradigm shift, as every electronic device from cell phones to supercomputers confronts parallelism of unprecedented scale. To fully unleash the potential of these systems, the HPC community must develop multicore specific optimization methodologies for important scientific computations. In this work, we examine sparse matrix-vector multiply (SpMV) - one of the most heavily used kernels in scientific computing - across a broad spectrum of multicore designs. Our experimental platform includes the homogeneous AMD dual-core and Intel quad-core designs, the heterogeneous STI Cell, as well as the first scientificmore » study of the highly multithreaded Sun Niagara2. We present several optimization strategies especially effective for the multicore environment, and demonstrate significant performance improvements compared to existing state-of-the-art serial and parallel SpMV implementations. Additionally, we present key insights into the architectural tradeoffs of leading multicore design strategies, in the context of demanding memory-bound numerical algorithms.« less

  17. Comparison of scientific computing platforms for MCNP4A Monte Carlo calculations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hendricks, J.S.; Brockhoff, R.C.

    1994-04-01

    The performance of seven computer platforms is evaluated with the widely used and internationally available MCNP4A Monte Carlo radiation transport code. All results are reproducible and are presented in such a way as to enable comparison with computer platforms not in the study. The authors observed that the HP/9000-735 workstation runs MCNP 50% faster than the Cray YMP 8/64. Compared with the Cray YMP 8/64, the IBM RS/6000-560 is 68% as fast, the Sun Sparc10 is 66% as fast, the Silicon Graphics ONYX is 90% as fast, the Gateway 2000 model 4DX2-66V personal computer is 27% as fast, and themore » Sun Sparc2 is 24% as fast. In addition to comparing the timing performance of the seven platforms, the authors observe that changes in compilers and software over the past 2 yr have resulted in only modest performance improvements, hardware improvements have enhanced performance by less than a factor of [approximately]3, timing studies are very problem dependent, MCNP4Q runs about as fast as MCNP4.« less

  18. Software Engineering for Scientific Computer Simulations

    NASA Astrophysics Data System (ADS)

    Post, Douglass E.; Henderson, Dale B.; Kendall, Richard P.; Whitney, Earl M.

    2004-11-01

    Computer simulation is becoming a very powerful tool for analyzing and predicting the performance of fusion experiments. Simulation efforts are evolving from including only a few effects to many effects, from small teams with a few people to large teams, and from workstations and small processor count parallel computers to massively parallel platforms. Successfully making this transition requires attention to software engineering issues. We report on the conclusions drawn from a number of case studies of large scale scientific computing projects within DOE, academia and the DoD. The major lessons learned include attention to sound project management including setting reasonable and achievable requirements, building a good code team, enforcing customer focus, carrying out verification and validation and selecting the optimum computational mathematics approaches.

  19. Singularity: Scientific containers for mobility of compute.

    PubMed

    Kurtzer, Gregory M; Sochat, Vanessa; Bauer, Michael W

    2017-01-01

    Here we present Singularity, software developed to bring containers and reproducibility to scientific computing. Using Singularity containers, developers can work in reproducible environments of their choosing and design, and these complete environments can easily be copied and executed on other platforms. Singularity is an open source initiative that harnesses the expertise of system and software engineers and researchers alike, and integrates seamlessly into common workflows for both of these groups. As its primary use case, Singularity brings mobility of computing to both users and HPC centers, providing a secure means to capture and distribute software and compute environments. This ability to create and deploy reproducible environments across these centers, a previously unmet need, makes Singularity a game changing development for computational science.

  20. Singularity: Scientific containers for mobility of compute

    PubMed Central

    Kurtzer, Gregory M.; Bauer, Michael W.

    2017-01-01

    Here we present Singularity, software developed to bring containers and reproducibility to scientific computing. Using Singularity containers, developers can work in reproducible environments of their choosing and design, and these complete environments can easily be copied and executed on other platforms. Singularity is an open source initiative that harnesses the expertise of system and software engineers and researchers alike, and integrates seamlessly into common workflows for both of these groups. As its primary use case, Singularity brings mobility of computing to both users and HPC centers, providing a secure means to capture and distribute software and compute environments. This ability to create and deploy reproducible environments across these centers, a previously unmet need, makes Singularity a game changing development for computational science. PMID:28494014

  1. Parallel processing for scientific computations

    NASA Technical Reports Server (NTRS)

    Alkhatib, Hasan S.

    1995-01-01

    The scope of this project dealt with the investigation of the requirements to support distributed computing of scientific computations over a cluster of cooperative workstations. Various experiments on computations for the solution of simultaneous linear equations were performed in the early phase of the project to gain experience in the general nature and requirements of scientific applications. A specification of a distributed integrated computing environment, DICE, based on a distributed shared memory communication paradigm has been developed and evaluated. The distributed shared memory model facilitates porting existing parallel algorithms that have been designed for shared memory multiprocessor systems to the new environment. The potential of this new environment is to provide supercomputing capability through the utilization of the aggregate power of workstations cooperating in a cluster interconnected via a local area network. Workstations, generally, do not have the computing power to tackle complex scientific applications, making them primarily useful for visualization, data reduction, and filtering as far as complex scientific applications are concerned. There is a tremendous amount of computing power that is left unused in a network of workstations. Very often a workstation is simply sitting idle on a desk. A set of tools can be developed to take advantage of this potential computing power to create a platform suitable for large scientific computations. The integration of several workstations into a logical cluster of distributed, cooperative, computing stations presents an alternative to shared memory multiprocessor systems. In this project we designed and evaluated such a system.

  2. Techniques and Tools for Performance Tuning of Parallel and Distributed Scientific Applications

    NASA Technical Reports Server (NTRS)

    Sarukkai, Sekhar R.; VanderWijngaart, Rob F.; Castagnera, Karen (Technical Monitor)

    1994-01-01

    Performance degradation in scientific computing on parallel and distributed computer systems can be caused by numerous factors. In this half-day tutorial we explain what are the important methodological issues involved in obtaining codes that have good performance potential. Then we discuss what are the possible obstacles in realizing that potential on contemporary hardware platforms, and give an overview of the software tools currently available for identifying the performance bottlenecks. Finally, some realistic examples are used to illustrate the actual use and utility of such tools.

  3. Heterogeneous High Throughput Scientific Computing with APM X-Gene and Intel Xeon Phi

    NASA Astrophysics Data System (ADS)

    Abdurachmanov, David; Bockelman, Brian; Elmer, Peter; Eulisse, Giulio; Knight, Robert; Muzaffar, Shahzad

    2015-05-01

    Electrical power requirements will be a constraint on the future growth of Distributed High Throughput Computing (DHTC) as used by High Energy Physics. Performance-per-watt is a critical metric for the evaluation of computer architectures for cost- efficient computing. Additionally, future performance growth will come from heterogeneous, many-core, and high computing density platforms with specialized processors. In this paper, we examine the Intel Xeon Phi Many Integrated Cores (MIC) co-processor and Applied Micro X-Gene ARMv8 64-bit low-power server system-on-a-chip (SoC) solutions for scientific computing applications. We report our experience on software porting, performance and energy efficiency and evaluate the potential for use of such technologies in the context of distributed computing systems such as the Worldwide LHC Computing Grid (WLCG).

  4. DB90: A Fortran Callable Relational Database Routine for Scientific and Engineering Computer Programs

    NASA Technical Reports Server (NTRS)

    Wrenn, Gregory A.

    2005-01-01

    This report describes a database routine called DB90 which is intended for use with scientific and engineering computer programs. The software is written in the Fortran 90/95 programming language standard with file input and output routines written in the C programming language. These routines should be completely portable to any computing platform and operating system that has Fortran 90/95 and C compilers. DB90 allows a program to supply relation names and up to 5 integer key values to uniquely identify each record of each relation. This permits the user to select records or retrieve data in any desired order.

  5. Cloud computing and validation of expandable in silico livers.

    PubMed

    Ropella, Glen E P; Hunt, C Anthony

    2010-12-03

    In Silico Livers (ISLs) are works in progress. They are used to challenge multilevel, multi-attribute, mechanistic hypotheses about the hepatic disposition of xenobiotics coupled with hepatic responses. To enhance ISL-to-liver mappings, we added discrete time metabolism, biliary elimination, and bolus dosing features to a previously validated ISL and initiated re-validated experiments that required scaling experiments to use more simulated lobules than previously, more than could be achieved using the local cluster technology. Rather than dramatically increasing the size of our local cluster we undertook the re-validation experiments using the Amazon EC2 cloud platform. So doing required demonstrating the efficacy of scaling a simulation to use more cluster nodes and assessing the scientific equivalence of local cluster validation experiments with those executed using the cloud platform. The local cluster technology was duplicated in the Amazon EC2 cloud platform. Synthetic modeling protocols were followed to identify a successful parameterization. Experiment sample sizes (number of simulated lobules) on both platforms were 49, 70, 84, and 152 (cloud only). Experimental indistinguishability was demonstrated for ISL outflow profiles of diltiazem using both platforms for experiments consisting of 84 or more samples. The process was analogous to demonstration of results equivalency from two different wet-labs. The results provide additional evidence that disposition simulations using ISLs can cover the behavior space of liver experiments in distinct experimental contexts (there is in silico-to-wet-lab phenotype similarity). The scientific value of experimenting with multiscale biomedical models has been limited to research groups with access to computer clusters. The availability of cloud technology coupled with the evidence of scientific equivalency has lowered the barrier and will greatly facilitate model sharing as well as provide straightforward tools for scaling simulations to encompass greater detail with no extra investment in hardware.

  6. The Common Data Acquisition Platform in the Helmholtz Association

    NASA Astrophysics Data System (ADS)

    Kaever, P.; Balzer, M.; Kopmann, A.; Zimmer, M.; Rongen, H.

    2017-04-01

    Various centres of the German Helmholtz Association (HGF) started in 2012 to develop a modular data acquisition (DAQ) platform, covering the entire range from detector readout to data transfer into parallel computing environments. This platform integrates generic hardware components like the multi-purpose HGF-Advanced Mezzanine Card or a smart scientific camera framework, adding user value with Linux drivers and board support packages. Technically the scope comprises the DAQ-chain from FPGA-modules to computing servers, notably frontend-electronics-interfaces, microcontrollers and GPUs with their software plus high-performance data transmission links. The core idea is a generic and component-based approach, enabling the implementation of specific experiment requirements with low effort. This so called DTS-platform will support standards like MTCA.4 in hard- and software to ensure compatibility with commercial components. Its capability to deploy on other crate standards or FPGA-boards with PCI express or Ethernet interfaces remains an essential feature. Competences of the participating centres are coordinated in order to provide a solid technological basis for both research topics in the Helmholtz Programme ``Matter and Technology'': ``Detector Technology and Systems'' and ``Accelerator Research and Development''. The DTS-platform aims at reducing costs and development time and will ensure access to latest technologies for the collaboration. Due to its flexible approach, it has the potential to be applied in other scientific programs.

  7. ArcGIS Framework for Scientific Data Analysis and Serving

    NASA Astrophysics Data System (ADS)

    Xu, H.; Ju, W.; Zhang, J.

    2015-12-01

    ArcGIS is a platform for managing, visualizing, analyzing, and serving geospatial data. Scientific data as part of the geospatial data features multiple dimensions (X, Y, time, and depth) and large volume. Multidimensional mosaic dataset (MDMD), a newly enhanced data model in ArcGIS, models the multidimensional gridded data (e.g. raster or image) as a hypercube and enables ArcGIS's capabilities to handle the large volume and near-real time scientific data. Built on top of geodatabase, the MDMD stores the dimension values and the variables (2D arrays) in a geodatabase table which allows accessing a slice or slices of the hypercube through a simple query and supports animating changes along time or vertical dimension using ArcGIS desktop or web clients. Through raster types, MDMD can manage not only netCDF, GRIB, and HDF formats but also many other formats or satellite data. It is scalable and can handle large data volume. The parallel geo-processing engine makes the data ingestion fast and easily. Raster function, definition of a raster processing algorithm, is a very important component in ArcGIS platform for on-demand raster processing and analysis. The scientific data analytics is achieved through the MDMD and raster function templates which perform on-demand scientific computation with variables ingested in the MDMD. For example, aggregating monthly average from daily data; computing total rainfall of a year; calculating heat index for forecasting data, and identifying fishing habitat zones etc. Addtionally, MDMD with the associated raster function templates can be served through ArcGIS server as image services which provide a framework for on-demand server side computation and analysis, and the published services can be accessed by multiple clients such as ArcMap, ArcGIS Online, JavaScript, REST, WCS, and WMS. This presentation will focus on the MDMD model and raster processing templates. In addtion, MODIS land cover, NDFD weather service, and HYCOM ocean model will be used to illustrate how ArcGIS platform and MDMD model can facilitate scientific data visualization and analytics and how the analysis results can be shared to more audience through ArcGIS Online and Portal.

  8. Flexible workflow sharing and execution services for e-scientists

    NASA Astrophysics Data System (ADS)

    Kacsuk, Péter; Terstyanszky, Gábor; Kiss, Tamas; Sipos, Gergely

    2013-04-01

    The sequence of computational and data manipulation steps required to perform a specific scientific analysis is called a workflow. Workflows that orchestrate data and/or compute intensive applications on Distributed Computing Infrastructures (DCIs) recently became standard tools in e-science. At the same time the broad and fragmented landscape of workflows and DCIs slows down the uptake of workflow-based work. The development, sharing, integration and execution of workflows is still a challenge for many scientists. The FP7 "Sharing Interoperable Workflow for Large-Scale Scientific Simulation on Available DCIs" (SHIWA) project significantly improved the situation, with a simulation platform that connects different workflow systems, different workflow languages, different DCIs and workflows into a single, interoperable unit. The SHIWA Simulation Platform is a service package, already used by various scientific communities, and used as a tool by the recently started ER-flow FP7 project to expand the use of workflows among European scientists. The presentation will introduce the SHIWA Simulation Platform and the services that ER-flow provides based on the platform to space and earth science researchers. The SHIWA Simulation Platform includes: 1. SHIWA Repository: A database where workflows and meta-data about workflows can be stored. The database is a central repository to discover and share workflows within and among communities . 2. SHIWA Portal: A web portal that is integrated with the SHIWA Repository and includes a workflow executor engine that can orchestrate various types of workflows on various grid and cloud platforms. 3. SHIWA Desktop: A desktop environment that provides similar access capabilities than the SHIWA Portal, however it runs on the users' desktops/laptops instead of a portal server. 4. Workflow engines: the ASKALON, Galaxy, GWES, Kepler, LONI Pipeline, MOTEUR, Pegasus, P-GRADE, ProActive, Triana, Taverna and WS-PGRADE workflow engines are already integrated with the execution engine of the SHIWA Portal. Other engines can be added when required. Through the SHIWA Portal one can define and run simulations on the SHIWA Virtual Organisation, an e-infrastructure that gathers computing and data resources from various DCIs, including the European Grid Infrastructure. The Portal via third party workflow engines provides support for the most widely used academic workflow engines and it can be extended with other engines on demand. Such extensions translate between workflow languages and facilitate the nesting of workflows into larger workflows even when those are written in different languages and require different interpreters for execution. Through the workflow repository and the portal lonely scientists and scientific collaborations can share and offer workflows for reuse and execution. Given the integrated nature of the SHIWA Simulation Platform the shared workflows can be executed online, without installing any special client environment and downloading workflows. The FP7 "Building a European Research Community through Interoperable Workflows and Data" (ER-flow) project disseminates the achievements of the SHIWA project and use these achievements to build workflow user communities across Europe. ER-flow provides application supports to research communities within and beyond the project consortium to develop, share and run workflows with the SHIWA Simulation Platform.

  9. Heterogeneous high throughput scientific computing with APM X-Gene and Intel Xeon Phi

    DOE PAGES

    Abdurachmanov, David; Bockelman, Brian; Elmer, Peter; ...

    2015-05-22

    Electrical power requirements will be a constraint on the future growth of Distributed High Throughput Computing (DHTC) as used by High Energy Physics. Performance-per-watt is a critical metric for the evaluation of computer architectures for cost- efficient computing. Additionally, future performance growth will come from heterogeneous, many-core, and high computing density platforms with specialized processors. In this paper, we examine the Intel Xeon Phi Many Integrated Cores (MIC) co-processor and Applied Micro X-Gene ARMv8 64-bit low-power server system-on-a-chip (SoC) solutions for scientific computing applications. As a result, we report our experience on software porting, performance and energy efficiency and evaluatemore » the potential for use of such technologies in the context of distributed computing systems such as the Worldwide LHC Computing Grid (WLCG).« less

  10. Design and Use Online Platforms to Learn Mathematics and the Use of Them in Simulations of Problems in Applied Sciences

    ERIC Educational Resources Information Center

    Méndez-Fragoso, Ricardo; Villavicencio-Torres, Mirna; Martínez-Moreno, Josué

    2017-01-01

    In this contribution, we show the practical use of the computer to visualise simple computational simulations to show phenomena that occur in everyday life, or require an abstract understanding for being unintuitive phenomena. The relationship of the mathematics to different scientific disciplines motivates us to devise different treatments to…

  11. GABBs: Cyberinfrastructure for Self-Service Geospatial Data Exploration, Computation, and Sharing

    NASA Astrophysics Data System (ADS)

    Song, C. X.; Zhao, L.; Biehl, L. L.; Merwade, V.; Villoria, N.

    2016-12-01

    Geospatial data are present everywhere today with the proliferation of location-aware computing devices. This is especially true in the scientific community where large amounts of data are driving research and education activities in many domains. Collaboration over geospatial data, for example, in modeling, data analysis and visualization, must still overcome the barriers of specialized software and expertise among other challenges. In addressing these needs, the Geospatial data Analysis Building Blocks (GABBs) project aims at building geospatial modeling, data analysis and visualization capabilities in an open source web platform, HUBzero. Funded by NSF's Data Infrastructure Building Blocks initiative, GABBs is creating a geospatial data architecture that integrates spatial data management, mapping and visualization, and interfaces in the HUBzero platform for scientific collaborations. The geo-rendering enabled Rappture toolkit, a generic Python mapping library, geospatial data exploration and publication tools, and an integrated online geospatial data management solution are among the software building blocks from the project. The GABBS software will be available through Amazon's AWS Marketplace VM images and open source. Hosting services are also available to the user community. The outcome of the project will enable researchers and educators to self-manage their scientific data, rapidly create GIS-enable tools, share geospatial data and tools on the web, and build dynamic workflows connecting data and tools, all without requiring significant software development skills, GIS expertise or IT administrative privileges. This presentation will describe the GABBs architecture, toolkits and libraries, and showcase the scientific use cases that utilize GABBs capabilities, as well as the challenges and solutions for GABBs to interoperate with other cyberinfrastructure platforms.

  12. Embracing the quantum limit in silicon computing.

    PubMed

    Morton, John J L; McCamey, Dane R; Eriksson, Mark A; Lyon, Stephen A

    2011-11-16

    Quantum computers hold the promise of massive performance enhancements across a range of applications, from cryptography and databases to revolutionary scientific simulation tools. Such computers would make use of the same quantum mechanical phenomena that pose limitations on the continued shrinking of conventional information processing devices. Many of the key requirements for quantum computing differ markedly from those of conventional computers. However, silicon, which plays a central part in conventional information processing, has many properties that make it a superb platform around which to build a quantum computer. © 2011 Macmillan Publishers Limited. All rights reserved

  13. Building A Community Focused Data and Modeling Collaborative platform with Hardware Virtualization Technology

    NASA Astrophysics Data System (ADS)

    Michaelis, A.; Wang, W.; Melton, F. S.; Votava, P.; Milesi, C.; Hashimoto, H.; Nemani, R. R.; Hiatt, S. H.

    2009-12-01

    As the length and diversity of the global earth observation data records grow, modeling and analyses of biospheric conditions increasingly requires multiple terabytes of data from a diversity of models and sensors. With network bandwidth beginning to flatten, transmission of these data from centralized data archives presents an increasing challenge, and costs associated with local storage and management of data and compute resources are often significant for individual research and application development efforts. Sharing community valued intermediary data sets, results and codes from individual efforts with others that are not in direct funded collaboration can also be a challenge with respect to time, cost and expertise. We purpose a modeling, data and knowledge center that houses NASA satellite data, climate data and ancillary data where a focused community may come together to share modeling and analysis codes, scientific results, knowledge and expertise on a centralized platform, named Ecosystem Modeling Center (EMC). With the recent development of new technologies for secure hardware virtualization, an opportunity exists to create specific modeling, analysis and compute environments that are customizable, “archiveable” and transferable. Allowing users to instantiate such environments on large compute infrastructures that are directly connected to large data archives may significantly reduce costs and time associated with scientific efforts by alleviating users from redundantly retrieving and integrating data sets and building modeling analysis codes. The EMC platform also provides the possibility for users receiving indirect assistance from expertise through prefabricated compute environments, potentially reducing study “ramp up” times.

  14. A Scientific Workflow Platform for Generic and Scalable Object Recognition on Medical Images

    NASA Astrophysics Data System (ADS)

    Möller, Manuel; Tuot, Christopher; Sintek, Michael

    In the research project THESEUS MEDICO we aim at a system combining medical image information with semantic background knowledge from ontologies to give clinicians fully cross-modal access to biomedical image repositories. Therefore joint efforts have to be made in more than one dimension: Object detection processes have to be specified in which an abstraction is performed starting from low-level image features across landmark detection utilizing abstract domain knowledge up to high-level object recognition. We propose a system based on a client-server extension of the scientific workflow platform Kepler that assists the collaboration of medical experts and computer scientists during development and parameter learning.

  15. iSPHERE - A New Approach to Collaborative Research and Cloud Computing

    NASA Astrophysics Data System (ADS)

    Al-Ubaidi, T.; Khodachenko, M. L.; Kallio, E. J.; Harry, A.; Alexeev, I. I.; Vázquez-Poletti, J. L.; Enke, H.; Magin, T.; Mair, M.; Scherf, M.; Poedts, S.; De Causmaecker, P.; Heynderickx, D.; Congedo, P.; Manolescu, I.; Esser, B.; Webb, S.; Ruja, C.

    2015-10-01

    The project iSPHERE (integrated Scientific Platform for HEterogeneous Research and Engineering) that has been proposed for Horizon 2020 (EINFRA-9- 2015, [1]) aims at creating a next generation Virtual Research Environment (VRE) that embraces existing and emerging technologies and standards in order to provide a versatile platform for scientific investigations and collaboration. The presentation will introduce the large project consortium, provide a comprehensive overview of iSPHERE's basic concepts and approaches and outline general user requirements that the VRE will strive to satisfy. An overview of the envisioned architecture will be given, focusing on the adapted Service Bus concept, i.e. the "Scientific Service Bus" as it is called in iSPHERE. The bus will act as a central hub for all communication and user access, and will be implemented in the course of the project. The agile approach [2] that has been chosen for detailed elaboration and documentation of user requirements, as well as for the actual implementation of the system, will be outlined and its motivation and basic structure will be discussed. The presentation will show which user communities will benefit and which concrete problems, scientific investigations are facing today, will be tackled by the system. Another focus of the presentation is iSPHERE's seamless integration of cloud computing resources and how these will benefit scientific modeling teams by providing a reliable and web based environment for cloud based model execution, storage of results, and comparison with measurements, including fully web based tools for data mining, analysis and visualization. Also the envisioned creation of a dedicated data model for experimental plasma physics will be discussed. It will be shown why the Scientific Service Bus provides an ideal basis to integrate a number of data models and communication protocols and to provide mechanisms for data exchange across multiple and even multidisciplinary platforms.

  16. The NCI High Performance Computing (HPC) and High Performance Data (HPD) Platform to Support the Analysis of Petascale Environmental Data Collections

    NASA Astrophysics Data System (ADS)

    Evans, B. J. K.; Pugh, T.; Wyborn, L. A.; Porter, D.; Allen, C.; Smillie, J.; Antony, J.; Trenham, C.; Evans, B. J.; Beckett, D.; Erwin, T.; King, E.; Hodge, J.; Woodcock, R.; Fraser, R.; Lescinsky, D. T.

    2014-12-01

    The National Computational Infrastructure (NCI) has co-located a priority set of national data assets within a HPC research platform. This powerful in-situ computational platform has been created to help serve and analyse the massive amounts of data across the spectrum of environmental collections - in particular the climate, observational data and geoscientific domains. This paper examines the infrastructure, innovation and opportunity for this significant research platform. NCI currently manages nationally significant data collections (10+ PB) categorised as 1) earth system sciences, climate and weather model data assets and products, 2) earth and marine observations and products, 3) geosciences, 4) terrestrial ecosystem, 5) water management and hydrology, and 6) astronomy, social science and biosciences. The data is largely sourced from the NCI partners (who include the custodians of many of the national scientific records), major research communities, and collaborating overseas organisations. By co-locating these large valuable data assets, new opportunities have arisen by harmonising the data collections, making a powerful transdisciplinary research platformThe data is accessible within an integrated HPC-HPD environment - a 1.2 PFlop supercomputer (Raijin), a HPC class 3000 core OpenStack cloud system and several highly connected large scale and high-bandwidth Lustre filesystems. New scientific software, cloud-scale techniques, server-side visualisation and data services have been harnessed and integrated into the platform, so that analysis is performed seamlessly across the traditional boundaries of the underlying data domains. Characterisation of the techniques along with performance profiling ensures scalability of each software component, all of which can either be enhanced or replaced through future improvements. A Development-to-Operations (DevOps) framework has also been implemented to manage the scale of the software complexity alone. This ensures that software is both upgradable and maintainable, and can be readily reused with complexly integrated systems and become part of the growing global trusted community tools for cross-disciplinary research.

  17. Curating Big Data Made Simple: Perspectives from Scientific Communities.

    PubMed

    Sowe, Sulayman K; Zettsu, Koji

    2014-03-01

    The digital universe is exponentially producing an unprecedented volume of data that has brought benefits as well as fundamental challenges for enterprises and scientific communities alike. This trend is inherently exciting for the development and deployment of cloud platforms to support scientific communities curating big data. The excitement stems from the fact that scientists can now access and extract value from the big data corpus, establish relationships between bits and pieces of information from many types of data, and collaborate with a diverse community of researchers from various domains. However, despite these perceived benefits, to date, little attention is focused on the people or communities who are both beneficiaries and, at the same time, producers of big data. The technical challenges posed by big data are as big as understanding the dynamics of communities working with big data, whether scientific or otherwise. Furthermore, the big data era also means that big data platforms for data-intensive research must be designed in such a way that research scientists can easily search and find data for their research, upload and download datasets for onsite/offsite use, perform computations and analysis, share their findings and research experience, and seamlessly collaborate with their colleagues. In this article, we present the architecture and design of a cloud platform that meets some of these requirements, and a big data curation model that describes how a community of earth and environmental scientists is using the platform to curate data. Motivation for developing the platform, lessons learnt in overcoming some challenges associated with supporting scientists to curate big data, and future research directions are also presented.

  18. A Computing Environment to Support Repeatable Scientific Big Data Experimentation of World-Wide Scientific Literature

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Schlicher, Bob G; Kulesz, James J; Abercrombie, Robert K

    A principal tenant of the scientific method is that experiments must be repeatable and relies on ceteris paribus (i.e., all other things being equal). As a scientific community, involved in data sciences, we must investigate ways to establish an environment where experiments can be repeated. We can no longer allude to where the data comes from, we must add rigor to the data collection and management process from which our analysis is conducted. This paper describes a computing environment to support repeatable scientific big data experimentation of world-wide scientific literature, and recommends a system that is housed at the Oakmore » Ridge National Laboratory in order to provide value to investigators from government agencies, academic institutions, and industry entities. The described computing environment also adheres to the recently instituted digital data management plan mandated by multiple US government agencies, which involves all stages of the digital data life cycle including capture, analysis, sharing, and preservation. It particularly focuses on the sharing and preservation of digital research data. The details of this computing environment are explained within the context of cloud services by the three layer classification of Software as a Service , Platform as a Service , and Infrastructure as a Service .« less

  19. Hermes: Seamless delivery of containerized bioinformatics workflows in hybrid cloud (HTC) environments

    NASA Astrophysics Data System (ADS)

    Kintsakis, Athanassios M.; Psomopoulos, Fotis E.; Symeonidis, Andreas L.; Mitkas, Pericles A.

    Hermes introduces a new "describe once, run anywhere" paradigm for the execution of bioinformatics workflows in hybrid cloud environments. It combines the traditional features of parallelization-enabled workflow management systems and of distributed computing platforms in a container-based approach. It offers seamless deployment, overcoming the burden of setting up and configuring the software and network requirements. Most importantly, Hermes fosters the reproducibility of scientific workflows by supporting standardization of the software execution environment, thus leading to consistent scientific workflow results and accelerating scientific output.

  20. Cloud computing and validation of expandable in silico livers

    PubMed Central

    2010-01-01

    Background In Silico Livers (ISLs) are works in progress. They are used to challenge multilevel, multi-attribute, mechanistic hypotheses about the hepatic disposition of xenobiotics coupled with hepatic responses. To enhance ISL-to-liver mappings, we added discrete time metabolism, biliary elimination, and bolus dosing features to a previously validated ISL and initiated re-validated experiments that required scaling experiments to use more simulated lobules than previously, more than could be achieved using the local cluster technology. Rather than dramatically increasing the size of our local cluster we undertook the re-validation experiments using the Amazon EC2 cloud platform. So doing required demonstrating the efficacy of scaling a simulation to use more cluster nodes and assessing the scientific equivalence of local cluster validation experiments with those executed using the cloud platform. Results The local cluster technology was duplicated in the Amazon EC2 cloud platform. Synthetic modeling protocols were followed to identify a successful parameterization. Experiment sample sizes (number of simulated lobules) on both platforms were 49, 70, 84, and 152 (cloud only). Experimental indistinguishability was demonstrated for ISL outflow profiles of diltiazem using both platforms for experiments consisting of 84 or more samples. The process was analogous to demonstration of results equivalency from two different wet-labs. Conclusions The results provide additional evidence that disposition simulations using ISLs can cover the behavior space of liver experiments in distinct experimental contexts (there is in silico-to-wet-lab phenotype similarity). The scientific value of experimenting with multiscale biomedical models has been limited to research groups with access to computer clusters. The availability of cloud technology coupled with the evidence of scientific equivalency has lowered the barrier and will greatly facilitate model sharing as well as provide straightforward tools for scaling simulations to encompass greater detail with no extra investment in hardware. PMID:21129207

  1. Shipping Science Worldwide with Open Source Containers

    NASA Astrophysics Data System (ADS)

    Molineaux, J. P.; McLaughlin, B. D.; Pilone, D.; Plofchan, P. G.; Murphy, K. J.

    2014-12-01

    Scientific applications often present difficult web-hosting needs. Their compute- and data-intensive nature, as well as an increasing need for high-availability and distribution, combine to create a challenging set of hosting requirements. In the past year, advancements in container-based virtualization and related tooling have offered new lightweight and flexible ways to accommodate diverse applications with all the isolation and portability benefits of traditional virtualization. This session will introduce and demonstrate an open-source, single-interface, Platform-as-a-Serivce (PaaS) that empowers application developers to seamlessly leverage geographically distributed, public and private compute resources to achieve highly-available, performant hosting for scientific applications.

  2. System Architecture Development for Energy and Water Infrastructure Data Management and Geovisual Analytics

    NASA Astrophysics Data System (ADS)

    Berres, A.; Karthik, R.; Nugent, P.; Sorokine, A.; Myers, A.; Pang, H.

    2017-12-01

    Building an integrated data infrastructure that can meet the needs of a sustainable energy-water resource management requires a robust data management and geovisual analytics platform, capable of cross-domain scientific discovery and knowledge generation. Such a platform can facilitate the investigation of diverse complex research and policy questions for emerging priorities in Energy-Water Nexus (EWN) science areas. Using advanced data analytics, machine learning techniques, multi-dimensional statistical tools, and interactive geovisualization components, such a multi-layered federated platform is being developed, the Energy-Water Nexus Knowledge Discovery Framework (EWN-KDF). This platform utilizes several enterprise-grade software design concepts and standards such as extensible service-oriented architecture, open standard protocols, event-driven programming model, enterprise service bus, and adaptive user interfaces to provide a strategic value to the integrative computational and data infrastructure. EWN-KDF is built on the Compute and Data Environment for Science (CADES) environment in Oak Ridge National Laboratory (ORNL).

  3. THE BERKELEY DATA ANALYSIS SYSTEM (BDAS): AN OPEN SOURCE PLATFORM FOR BIG DATA ANALYTICS

    DTIC Science & Technology

    2017-09-01

    Evan Sparks, Oliver Zahn, Michael J. Franklin, David A. Patterson, Saul Perlmutter. Scientific Computing Meets Big Data Technology: An Astronomy ...Processing Astronomy Imagery Using Big Data Technology. IEEE Transaction on Big Data, 2016. Approved for Public Release; Distribution Unlimited. 22 [93

  4. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vang, Leng; Prescott, Steven R; Smith, Curtis

    In collaborating scientific research arena it is important to have an environment where analysts have access to a shared of information documents, software tools and be able to accurately maintain and track historical changes in models. A new cloud-based environment would be accessible remotely from anywhere regardless of computing platforms given that the platform has available of Internet access and proper browser capabilities. Information stored at this environment would be restricted based on user assigned credentials. This report reviews development of a Cloud-based Architecture Capabilities (CAC) as a web portal for PRA tools.

  5. Computational Aspects of Data Assimilation and the ESMF

    NASA Technical Reports Server (NTRS)

    daSilva, A.

    2003-01-01

    The scientific challenge of developing advanced data assimilation applications is a daunting task. Independently developed components may have incompatible interfaces or may be written in different computer languages. The high-performance computer (HPC) platforms required by numerically intensive Earth system applications are complex, varied, rapidly evolving and multi-part systems themselves. Since the market for high-end platforms is relatively small, there is little robust middleware available to buffer the modeler from the difficulties of HPC programming. To complicate matters further, the collaborations required to develop large Earth system applications often span initiatives, institutions and agencies, involve geoscience, software engineering, and computer science communities, and cross national borders.The Earth System Modeling Framework (ESMF) project is a concerted response to these challenges. Its goal is to increase software reuse, interoperability, ease of use and performance in Earth system models through the use of a common software framework, developed in an open manner by leaders in the modeling community. The ESMF addresses the technical and to some extent the cultural - aspects of Earth system modeling, laying the groundwork for addressing the more difficult scientific aspects, such as the physical compatibility of components, in the future. In this talk we will discuss the general philosophy and architecture of the ESMF, focussing on those capabilities useful for developing advanced data assimilation applications.

  6. Northwest Trajectory Analysis Capability: A Platform for Enhancing Computational Biophysics Analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Peterson, Elena S.; Stephan, Eric G.; Corrigan, Abigail L.

    2008-07-30

    As computational resources continue to increase, the ability of computational simulations to effectively complement, and in some cases replace, experimentation in scientific exploration also increases. Today, large-scale simulations are recognized as an effective tool for scientific exploration in many disciplines including chemistry and biology. A natural side effect of this trend has been the need for an increasingly complex analytical environment. In this paper, we describe Northwest Trajectory Analysis Capability (NTRAC), an analytical software suite developed to enhance the efficiency of computational biophysics analyses. Our strategy is to layer higher-level services and introduce improved tools within the user’s familiar environmentmore » without preventing researchers from using traditional tools and methods. Our desire is to share these experiences to serve as an example for effectively analyzing data intensive large scale simulation data.« less

  7. IBM techexplorer and MathML: Interactive Multimodal Scientific Documents

    NASA Astrophysics Data System (ADS)

    Diaz, Angel

    2001-06-01

    The World Wide Web provides a standard publishing platform for disseminating scientific and technical articles, books, journals, courseware, or even homework on the internet; however, the transition from paper to web-based interactive content has brought new opportunities for creating interactive content. Students, scientists, and engineers are now faced with the task of rendering the 2D presentational structure of mathematics, harnessing the wealth of scientific and technical software, and creating truly accessible scientific portals across international boundaries and markets. The recent emergence of World Wide Web Consortium (W3C) standards such as the Mathematical Markup Language (MathML), Language (XSL), and Aural CSS (ACSS) provide a foundation whereby mathematics can be displayed, enlivened, computed, and audio formatted. With interoperability ensured by standards, software applications can be easily brought together to create extensible and interactive scientific content. In this presentation we will provide an overview of the IBM techexplorer Hypermedia Browser, a web browser plug-in and ActiveX control aimed at bringing interactive mathematics to the masses across platforms and applications. We will demonstrate "live" mathematics where documents that contain MathML expressions can be edited and computed right inside your favorite web browser. This demonstration will be generalized as we show how MathML can be used to enliven even PowerPoint presentations. Finally, we will close the loop by demonstrating a novel approach to spoken mathematics based on MathML, DOM, XSL, ACSS, techexplorer, and IBM ViaVoice. By making use of techexplorer as the glue that binds the rendered content to the web browser, the back-end computation software, the Java applets that augment the exposition, and voice-rendering systems such as ViaVoice, authors can indeed create truly extensible and interactive scientific content. For more information see: [http://www.software.ibm.com/techexplorer] [http://www.alphaworks.ibm.com] [http://www.w3.org

  8. A Group Intelligence-Based Asynchronous Argumentation Learning-Assistance Platform

    ERIC Educational Resources Information Center

    Huang, Chenn-Jung; Chang, Shun-Chih; Chen, Heng-Ming; Tseng, Jhe-Hao; Chien, Sheng-Yuan

    2016-01-01

    Structured argumentation support environments have been built and used in scientific discourse in the literature. However, to the best our knowledge, there is no research work in the literature examining whether student's knowledge has grown during learning activities with asynchronous argumentation. In this work, an intelligent computer-supported…

  9. Design of the smart scenic spot service platform

    NASA Astrophysics Data System (ADS)

    Yin, Min; Wang, Shi-tai

    2015-12-01

    With the deepening of the smart city construction, the model "smart+" is rapidly developing. Guilin, the international tourism metropolis fast constructing need smart tourism technology support. This paper studied the smart scenic spot service object and its requirements. And then constructed the smart service platform of the scenic spot application of 3S technology (Geographic Information System (GIS), Remote Sensing (RS) and Global Navigation Satellite System (GNSS)) and the Internet of things, cloud computing. Based on Guilin Seven-star Park scenic area as an object, this paper designed the Seven-star smart scenic spot service platform framework. The application of this platform will improve the tourists' visiting experience, make the tourism management more scientifically and standardly, increase tourism enterprises operating earnings.

  10. Near Real-time Scientific Data Analysis and Visualization with the ArcGIS Platform

    NASA Astrophysics Data System (ADS)

    Shrestha, S. R.; Viswambharan, V.; Doshi, A.

    2017-12-01

    Scientific multidimensional data are generated from a variety of sources and platforms. These datasets are mostly produced by earth observation and/or modeling systems. Agencies like NASA, NOAA, USGS, and ESA produce large volumes of near real-time observation, forecast, and historical data that drives fundamental research and its applications in larger aspects of humanity from basic decision making to disaster response. A common big data challenge for organizations working with multidimensional scientific data and imagery collections is the time and resources required to manage and process such large volumes and varieties of data. The challenge of adopting data driven real-time visualization and analysis, as well as the need to share these large datasets, workflows, and information products to wider and more diverse communities, brings an opportunity to use the ArcGIS platform to handle such demand. In recent years, a significant effort has put in expanding the capabilities of ArcGIS to support multidimensional scientific data across the platform. New capabilities in ArcGIS to support scientific data management, processing, and analysis as well as creating information products from large volumes of data using the image server technology are becoming widely used in earth science and across other domains. We will discuss and share the challenges associated with big data by the geospatial science community and how we have addressed these challenges in the ArcGIS platform. We will share few use cases, such as NOAA High Resolution Refresh Radar (HRRR) data, that demonstrate how we access large collections of near real-time data (that are stored on-premise or on the cloud), disseminate them dynamically, process and analyze them on-the-fly, and serve them to a variety of geospatial applications. We will also share how on-the-fly processing using raster functions capabilities, can be extended to create persisted data and information products using raster analytics capabilities that exploit distributed computing in an enterprise environment.

  11. Towards a Scalable and Adaptive Application Support Platform for Large-Scale Distributed E-Sciences in High-Performance Network Environments

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wu, Chase Qishi; Zhu, Michelle Mengxia

    The advent of large-scale collaborative scientific applications has demonstrated the potential for broad scientific communities to pool globally distributed resources to produce unprecedented data acquisition, movement, and analysis. System resources including supercomputers, data repositories, computing facilities, network infrastructures, storage systems, and display devices have been increasingly deployed at national laboratories and academic institutes. These resources are typically shared by large communities of users over Internet or dedicated networks and hence exhibit an inherent dynamic nature in their availability, accessibility, capacity, and stability. Scientific applications using either experimental facilities or computation-based simulations with various physical, chemical, climatic, and biological models featuremore » diverse scientific workflows as simple as linear pipelines or as complex as a directed acyclic graphs, which must be executed and supported over wide-area networks with massively distributed resources. Application users oftentimes need to manually configure their computing tasks over networks in an ad hoc manner, hence significantly limiting the productivity of scientists and constraining the utilization of resources. The success of these large-scale distributed applications requires a highly adaptive and massively scalable workflow platform that provides automated and optimized computing and networking services. This project is to design and develop a generic Scientific Workflow Automation and Management Platform (SWAMP), which contains a web-based user interface specially tailored for a target application, a set of user libraries, and several easy-to-use computing and networking toolkits for application scientists to conveniently assemble, execute, monitor, and control complex computing workflows in heterogeneous high-performance network environments. SWAMP will enable the automation and management of the entire process of scientific workflows with the convenience of a few mouse clicks while hiding the implementation and technical details from end users. Particularly, we will consider two types of applications with distinct performance requirements: data-centric and service-centric applications. For data-centric applications, the main workflow task involves large-volume data generation, catalog, storage, and movement typically from supercomputers or experimental facilities to a team of geographically distributed users; while for service-centric applications, the main focus of workflow is on data archiving, preprocessing, filtering, synthesis, visualization, and other application-specific analysis. We will conduct a comprehensive comparison of existing workflow systems and choose the best suited one with open-source code, a flexible system structure, and a large user base as the starting point for our development. Based on the chosen system, we will develop and integrate new components including a black box design of computing modules, performance monitoring and prediction, and workflow optimization and reconfiguration, which are missing from existing workflow systems. A modular design for separating specification, execution, and monitoring aspects will be adopted to establish a common generic infrastructure suited for a wide spectrum of science applications. We will further design and develop efficient workflow mapping and scheduling algorithms to optimize the workflow performance in terms of minimum end-to-end delay, maximum frame rate, and highest reliability. We will develop and demonstrate the SWAMP system in a local environment, the grid network, and the 100Gpbs Advanced Network Initiative (ANI) testbed. The demonstration will target scientific applications in climate modeling and high energy physics and the functions to be demonstrated include workflow deployment, execution, steering, and reconfiguration. Throughout the project period, we will work closely with the science communities in the fields of climate modeling and high energy physics including Spallation Neutron Source (SNS) and Large Hadron Collider (LHC) projects to mature the system for production use.« less

  12. An Application-Based Performance Evaluation of NASAs Nebula Cloud Computing Platform

    NASA Technical Reports Server (NTRS)

    Saini, Subhash; Heistand, Steve; Jin, Haoqiang; Chang, Johnny; Hood, Robert T.; Mehrotra, Piyush; Biswas, Rupak

    2012-01-01

    The high performance computing (HPC) community has shown tremendous interest in exploring cloud computing as it promises high potential. In this paper, we examine the feasibility, performance, and scalability of production quality scientific and engineering applications of interest to NASA on NASA's cloud computing platform, called Nebula, hosted at Ames Research Center. This work represents the comprehensive evaluation of Nebula using NUTTCP, HPCC, NPB, I/O, and MPI function benchmarks as well as four applications representative of the NASA HPC workload. Specifically, we compare Nebula performance on some of these benchmarks and applications to that of NASA s Pleiades supercomputer, a traditional HPC system. We also investigate the impact of virtIO and jumbo frames on interconnect performance. Overall results indicate that on Nebula (i) virtIO and jumbo frames improve network bandwidth by a factor of 5x, (ii) there is a significant virtualization layer overhead of about 10% to 25%, (iii) write performance is lower by a factor of 25x, (iv) latency for short MPI messages is very high, and (v) overall performance is 15% to 48% lower than that on Pleiades for NASA HPC applications. We also comment on the usability of the cloud platform.

  13. Open Science in the Cloud: Towards a Universal Platform for Scientific and Statistical Computing

    NASA Astrophysics Data System (ADS)

    Chine, Karim

    The UK, through the e-Science program, the US through the NSF-funded cyber infrastructure and the European Union through the ICT Calls aimed to provide "the technological solution to the problem of efficiently connecting data, computers, and people with the goal of enabling derivation of novel scientific theories and knowledge".1 The Grid (Foster, 2002; Foster; Kesselman, Nick, & Tuecke, 2002), foreseen as a major accelerator of discovery, didn't meet the expectations it had excited at its beginnings and was not adopted by the broad population of research professionals. The Grid is a good tool for particle physicists and it has allowed them to tackle the tremendous computational challenges inherent to their field. However, as a technology and paradigm for delivering computing on demand, it doesn't work and it can't be fixed. On one hand, "the abstractions that Grids expose - to the end-user, to the deployers and to application developers - are inappropriate and they need to be higher level" (Jha, Merzky, & Fox), and on the other hand, academic Grids are inherently economically unsustainable. They can't compete with a service outsourced to the Industry whose quality and price would be driven by market forces. The virtualization technologies and their corollary, the Infrastructure-as-a-Service (IaaS) style cloud, hold the promise to enable what the Grid failed to deliver: a sustainable environment for computational sciences that would lower the barriers for accessing federated computational resources, software tools and data; enable collaboration and resources sharing and provide the building blocks of a ubiquitous platform for traceable and reproducible computational research.

  14. A midas plugin to enable construction of reproducible web-based image processing pipelines

    PubMed Central

    Grauer, Michael; Reynolds, Patrick; Hoogstoel, Marion; Budin, Francois; Styner, Martin A.; Oguz, Ipek

    2013-01-01

    Image processing is an important quantitative technique for neuroscience researchers, but difficult for those who lack experience in the field. In this paper we present a web-based platform that allows an expert to create a brain image processing pipeline, enabling execution of that pipeline even by those biomedical researchers with limited image processing knowledge. These tools are implemented as a plugin for Midas, an open-source toolkit for creating web based scientific data storage and processing platforms. Using this plugin, an image processing expert can construct a pipeline, create a web-based User Interface, manage jobs, and visualize intermediate results. Pipelines are executed on a grid computing platform using BatchMake and HTCondor. This represents a new capability for biomedical researchers and offers an innovative platform for scientific collaboration. Current tools work well, but can be inaccessible for those lacking image processing expertise. Using this plugin, researchers in collaboration with image processing experts can create workflows with reasonable default settings and streamlined user interfaces, and data can be processed easily from a lab environment without the need for a powerful desktop computer. This platform allows simplified troubleshooting, centralized maintenance, and easy data sharing with collaborators. These capabilities enable reproducible science by sharing datasets and processing pipelines between collaborators. In this paper, we present a description of this innovative Midas plugin, along with results obtained from building and executing several ITK based image processing workflows for diffusion weighted MRI (DW MRI) of rodent brain images, as well as recommendations for building automated image processing pipelines. Although the particular image processing pipelines developed were focused on rodent brain MRI, the presented plugin can be used to support any executable or script-based pipeline. PMID:24416016

  15. A midas plugin to enable construction of reproducible web-based image processing pipelines.

    PubMed

    Grauer, Michael; Reynolds, Patrick; Hoogstoel, Marion; Budin, Francois; Styner, Martin A; Oguz, Ipek

    2013-01-01

    Image processing is an important quantitative technique for neuroscience researchers, but difficult for those who lack experience in the field. In this paper we present a web-based platform that allows an expert to create a brain image processing pipeline, enabling execution of that pipeline even by those biomedical researchers with limited image processing knowledge. These tools are implemented as a plugin for Midas, an open-source toolkit for creating web based scientific data storage and processing platforms. Using this plugin, an image processing expert can construct a pipeline, create a web-based User Interface, manage jobs, and visualize intermediate results. Pipelines are executed on a grid computing platform using BatchMake and HTCondor. This represents a new capability for biomedical researchers and offers an innovative platform for scientific collaboration. Current tools work well, but can be inaccessible for those lacking image processing expertise. Using this plugin, researchers in collaboration with image processing experts can create workflows with reasonable default settings and streamlined user interfaces, and data can be processed easily from a lab environment without the need for a powerful desktop computer. This platform allows simplified troubleshooting, centralized maintenance, and easy data sharing with collaborators. These capabilities enable reproducible science by sharing datasets and processing pipelines between collaborators. In this paper, we present a description of this innovative Midas plugin, along with results obtained from building and executing several ITK based image processing workflows for diffusion weighted MRI (DW MRI) of rodent brain images, as well as recommendations for building automated image processing pipelines. Although the particular image processing pipelines developed were focused on rodent brain MRI, the presented plugin can be used to support any executable or script-based pipeline.

  16. The AT Odyssey Continues. Proceedings of the RESNA 2001 Annual Conference (Reno, Nevada, June 22-26, 2001). Volume 21.

    ERIC Educational Resources Information Center

    Simpson, Richard, Ed.

    These proceedings of the annual RESNA (Association for the Advancement of Rehabilitation Technology) conference include more than 200 presentations on all facets of assistive technology, including concurrent sessions, scientific platform sessions, interactive poster presentations, computer demonstrations, and the research symposia. The scientific…

  17. Framework Development Supporting the Safety Portal

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Prescott, Steven Ralph; Kvarfordt, Kellie Jean; Vang, Leng

    2015-07-01

    In a collaborating scientific research arena it is important to have an environment where analysts have access to a shared repository of information, documents, and software tools, and be able to accurately maintain and track historical changes in models. The new Safety Portal cloud-based environment will be accessible remotely from anywhere regardless of computing platforms given that the platform has available Internet access and proper browser capabilities. Information stored at this environment would be restricted based on user assigned credentials. This report discusses current development of a cloud-based web portal for PRA tools.

  18. CILogon: An Integrated Identity and Access Management Platform for Science

    NASA Astrophysics Data System (ADS)

    Basney, J.

    2016-12-01

    When scientists work together, they use web sites and other software to share their ideas and data. To ensure the integrity of their work, these systems require the scientists to log in and verify that they are part of the team working on a particular science problem. Too often, the identity and access verification process is a stumbling block for the scientists. Scientific research projects are forced to invest time and effort into developing and supporting Identity and Access Management (IAM) services, distracting them from the core goals of their research collaboration. CILogon provides an IAM platform that enables scientists to work together to meet their IAM needs more effectively so they can allocate more time and effort to their core mission of scientific research. The CILogon platform enables federated identity management and collaborative organization management. Federated identity management enables researchers to use their home organization identities to access cyberinfrastructure, rather than requiring yet another username and password to log on. Collaborative organization management enables research projects to define user groups for authorization to collaboration platforms (e.g., wikis, mailing lists, and domain applications). CILogon's IAM platform serves the unique needs of research collaborations, namely the need to dynamically form collaboration groups across organizations and countries, sharing access to data, instruments, compute clusters, and other resources to enable scientific discovery. CILogon provides a software-as-a-service platform to ease integration with cyberinfrastructure, while making all software components publicly available under open source licenses to enable re-use. Figure 1 illustrates the components and interfaces of this platform. CILogon has been operational since 2010 and has been used by over 7,000 researchers from more than 170 identity providers to access cyberinfrastructure including Globus, LIGO, Open Science Grid, SeedMe, and XSEDE. The "CILogon 2.0" platform, launched in 2016, adds support for virtual organization (VO) membership management, identity linking, international collaborations, and standard integration protocols, through integration with the Internet2 COmanage collaboration software.

  19. Educational process in modern climatology within the web-GIS platform "Climate"

    NASA Astrophysics Data System (ADS)

    Gordova, Yulia; Gorbatenko, Valentina; Gordov, Evgeny; Martynova, Yulia; Okladnikov, Igor; Titov, Alexander; Shulgina, Tamara

    2013-04-01

    These days, common to all scientific fields the problem of training of scientists in the environmental sciences is exacerbated by the need to develop new computational and information technology skills in distributed multi-disciplinary teams. To address this and other pressing problems of Earth system sciences, software infrastructure for information support of integrated research in the geosciences was created based on modern information and computational technologies and a software and hardware platform "Climate» (http://climate.scert.ru/) was developed. In addition to the direct analysis of geophysical data archives, the platform is aimed at teaching the basics of the study of changes in regional climate. The educational component of the platform includes a series of lectures on climate, environmental and meteorological modeling and laboratory work cycles on the basics of analysis of current and potential future regional climate change using Siberia territory as an example. The educational process within the Platform is implemented using the distance learning system Moodle (www.moodle.org). This work is partially supported by the Ministry of education and science of the Russian Federation (contract #8345), SB RAS project VIII.80.2.1, RFBR grant #11-05-01190a, and integrated project SB RAS #131.

  20. Resilient and Robust High Performance Computing Platforms for Scientific Computing Integrity

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jin, Yier

    As technology advances, computer systems are subject to increasingly sophisticated cyber-attacks that compromise both their security and integrity. High performance computing platforms used in commercial and scientific applications involving sensitive, or even classified data, are frequently targeted by powerful adversaries. This situation is made worse by a lack of fundamental security solutions that both perform efficiently and are effective at preventing threats. Current security solutions fail to address the threat landscape and ensure the integrity of sensitive data. As challenges rise, both private and public sectors will require robust technologies to protect its computing infrastructure. The research outcomes from thismore » project try to address all these challenges. For example, we present LAZARUS, a novel technique to harden kernel Address Space Layout Randomization (KASLR) against paging-based side-channel attacks. In particular, our scheme allows for fine-grained protection of the virtual memory mappings that implement the randomization. We demonstrate the effectiveness of our approach by hardening a recent Linux kernel with LAZARUS, mitigating all of the previously presented side-channel attacks on KASLR. Our extensive evaluation shows that LAZARUS incurs only 0.943% overhead for standard benchmarks, and is therefore highly practical. We also introduced HA2lloc, a hardware-assisted allocator that is capable of leveraging an extended memory management unit to detect memory errors in the heap. We also perform testing using HA2lloc in a simulation environment and find that the approach is capable of preventing common memory vulnerabilities.« less

  1. Final Scientific/Technical Report for "Enabling Exascale Hardware and Software Design through Scalable System Virtualization"

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dinda, Peter August

    2015-03-17

    This report describes the activities, findings, and products of the Northwestern University component of the "Enabling Exascale Hardware and Software Design through Scalable System Virtualization" project. The purpose of this project has been to extend the state of the art of systems software for high-end computing (HEC) platforms, and to use systems software to better enable the evaluation of potential future HEC platforms, for example exascale platforms. Such platforms, and their systems software, have the goal of providing scientific computation at new scales, thus enabling new research in the physical sciences and engineering. Over time, the innovations in systems softwaremore » for such platforms also become applicable to more widely used computing clusters, data centers, and clouds. This was a five-institution project, centered on the Palacios virtual machine monitor (VMM) systems software, a project begun at Northwestern, and originally developed in a previous collaboration between Northwestern University and the University of New Mexico. In this project, Northwestern (including via our subcontract to the University of Pittsburgh) contributed to the continued development of Palacios, along with other team members. We took the leadership role in (1) continued extension of support for emerging Intel and AMD hardware, (2) integration and performance enhancement of overlay networking, (3) connectivity with architectural simulation, (4) binary translation, and (5) support for modern Non-Uniform Memory Access (NUMA) hosts and guests. We also took a supporting role in support for specialized hardware for I/O virtualization, profiling, configurability, and integration with configuration tools. The efforts we led (1-5) were largely successful and executed as expected, with code and papers resulting from them. The project demonstrated the feasibility of a virtualization layer for HEC computing, similar to such layers for cloud or datacenter computing. For effort (3), although a prototype connecting Palacios with the GEM5 architectural simulator was demonstrated, our conclusion was that such a platform was less useful for design space exploration than anticipated due to inherent complexity of the connection between the instruction set architecture level and the microarchitectural level. For effort (4), we found that a code injection approach proved to be more fruitful. The results of our efforts are publicly available in the open source Palacios codebase and published papers, all of which are available from the project web site, v3vee.org. Palacios is currently one of the two codebases (the other being Sandia’s Kitten lightweight kernel) that underlies the node operating system for the DOE Hobbes Project, one of two projects tasked with building a systems software prototype for the national exascale computing effort.« less

  2. Heterogeneous scalable framework for multiphase flows

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Morris, Karla Vanessa

    2013-09-01

    Two categories of challenges confront the developer of computational spray models: those related to the computation and those related to the physics. Regarding the computation, the trend towards heterogeneous, multi- and many-core platforms will require considerable re-engineering of codes written for the current supercomputing platforms. Regarding the physics, accurate methods for transferring mass, momentum and energy from the dispersed phase onto the carrier fluid grid have so far eluded modelers. Significant challenges also lie at the intersection between these two categories. To be competitive, any physics model must be expressible in a parallel algorithm that performs well on evolving computermore » platforms. This work created an application based on a software architecture where the physics and software concerns are separated in a way that adds flexibility to both. The develop spray-tracking package includes an application programming interface (API) that abstracts away the platform-dependent parallelization concerns, enabling the scientific programmer to write serial code that the API resolves into parallel processes and threads of execution. The project also developed the infrastructure required to provide similar APIs to other application. The API allow object-oriented Fortran applications direct interaction with Trilinos to support memory management of distributed objects in central processing units (CPU) and graphic processing units (GPU) nodes for applications using C++.« less

  3. Digital imaging of root traits (DIRT): a high-throughput computing and collaboration platform for field-based root phenomics.

    PubMed

    Das, Abhiram; Schneider, Hannah; Burridge, James; Ascanio, Ana Karine Martinez; Wojciechowski, Tobias; Topp, Christopher N; Lynch, Jonathan P; Weitz, Joshua S; Bucksch, Alexander

    2015-01-01

    Plant root systems are key drivers of plant function and yield. They are also under-explored targets to meet global food and energy demands. Many new technologies have been developed to characterize crop root system architecture (CRSA). These technologies have the potential to accelerate the progress in understanding the genetic control and environmental response of CRSA. Putting this potential into practice requires new methods and algorithms to analyze CRSA in digital images. Most prior approaches have solely focused on the estimation of root traits from images, yet no integrated platform exists that allows easy and intuitive access to trait extraction and analysis methods from images combined with storage solutions linked to metadata. Automated high-throughput phenotyping methods are increasingly used in laboratory-based efforts to link plant genotype with phenotype, whereas similar field-based studies remain predominantly manual low-throughput. Here, we present an open-source phenomics platform "DIRT", as a means to integrate scalable supercomputing architectures into field experiments and analysis pipelines. DIRT is an online platform that enables researchers to store images of plant roots, measure dicot and monocot root traits under field conditions, and share data and results within collaborative teams and the broader community. The DIRT platform seamlessly connects end-users with large-scale compute "commons" enabling the estimation and analysis of root phenotypes from field experiments of unprecedented size. DIRT is an automated high-throughput computing and collaboration platform for field based crop root phenomics. The platform is accessible at http://www.dirt.iplantcollaborative.org/ and hosted on the iPlant cyber-infrastructure using high-throughput grid computing resources of the Texas Advanced Computing Center (TACC). DIRT is a high volume central depository and high-throughput RSA trait computation platform for plant scientists working on crop roots. It enables scientists to store, manage and share crop root images with metadata and compute RSA traits from thousands of images in parallel. It makes high-throughput RSA trait computation available to the community with just a few button clicks. As such it enables plant scientists to spend more time on science rather than on technology. All stored and computed data is easily accessible to the public and broader scientific community. We hope that easy data accessibility will attract new tool developers and spur creative data usage that may even be applied to other fields of science.

  4. The Benefits and Complexities of Operating Geographic Information Systems (GIS) in a High Performance Computing (HPC) Environment

    NASA Astrophysics Data System (ADS)

    Shute, J.; Carriere, L.; Duffy, D.; Hoy, E.; Peters, J.; Shen, Y.; Kirschbaum, D.

    2017-12-01

    The NASA Center for Climate Simulation (NCCS) at the Goddard Space Flight Center is building and maintaining an Enterprise GIS capability for its stakeholders, to include NASA scientists, industry partners, and the public. This platform is powered by three GIS subsystems operating in a highly-available, virtualized environment: 1) the Spatial Analytics Platform is the primary NCCS GIS and provides users discoverability of the vast DigitalGlobe/NGA raster assets within the NCCS environment; 2) the Disaster Mapping Platform provides mapping and analytics services to NASA's Disaster Response Group; and 3) the internal (Advanced Data Analytics Platform/ADAPT) enterprise GIS provides users with the full suite of Esri and open source GIS software applications and services. All systems benefit from NCCS's cutting edge infrastructure, to include an InfiniBand network for high speed data transfers; a mixed/heterogeneous environment featuring seamless sharing of information between Linux and Windows subsystems; and in-depth system monitoring and warning systems. Due to its co-location with the NCCS Discover High Performance Computing (HPC) environment and the Advanced Data Analytics Platform (ADAPT), the GIS platform has direct access to several large NCCS datasets including DigitalGlobe/NGA, Landsat, MERRA, and MERRA2. Additionally, the NCCS ArcGIS Desktop Windows virtual machines utilize existing NetCDF and OPeNDAP assets for visualization, modelling, and analysis - thus eliminating the need for data duplication. With the advent of this platform, Earth scientists have full access to vast data repositories and the industry-leading tools required for successful management and analysis of these multi-petabyte, global datasets. The full system architecture and integration with scientific datasets will be presented. Additionally, key applications and scientific analyses will be explained, to include the NASA Global Landslide Catalog (GLC) Reporter crowdsourcing application, the NASA GLC Viewer discovery and analysis tool, the DigitalGlobe/NGA Data Discovery Tool, the NASA Disaster Response Group Mapping Platform (https://maps.disasters.nasa.gov), and support for NASA's Arctic - Boreal Vulnerability Experiment (ABoVE).

  5. ITK: enabling reproducible research and open science

    PubMed Central

    McCormick, Matthew; Liu, Xiaoxiao; Jomier, Julien; Marion, Charles; Ibanez, Luis

    2014-01-01

    Reproducibility verification is essential to the practice of the scientific method. Researchers report their findings, which are strengthened as other independent groups in the scientific community share similar outcomes. In the many scientific fields where software has become a fundamental tool for capturing and analyzing data, this requirement of reproducibility implies that reliable and comprehensive software platforms and tools should be made available to the scientific community. The tools will empower them and the public to verify, through practice, the reproducibility of observations that are reported in the scientific literature. Medical image analysis is one of the fields in which the use of computational resources, both software and hardware, are an essential platform for performing experimental work. In this arena, the introduction of the Insight Toolkit (ITK) in 1999 has transformed the field and facilitates its progress by accelerating the rate at which algorithmic implementations are developed, tested, disseminated and improved. By building on the efficiency and quality of open source methodologies, ITK has provided the medical image community with an effective platform on which to build a daily workflow that incorporates the true scientific practices of reproducibility verification. This article describes the multiple tools, methodologies, and practices that the ITK community has adopted, refined, and followed during the past decade, in order to become one of the research communities with the most modern reproducibility verification infrastructure. For example, 207 contributors have created over 2400 unit tests that provide over 84% code line test coverage. The Insight Journal, an open publication journal associated with the toolkit, has seen over 360,000 publication downloads. The median normalized closeness centrality, a measure of knowledge flow, resulting from the distributed peer code review system was high, 0.46. PMID:24600387

  6. ITK: enabling reproducible research and open science.

    PubMed

    McCormick, Matthew; Liu, Xiaoxiao; Jomier, Julien; Marion, Charles; Ibanez, Luis

    2014-01-01

    Reproducibility verification is essential to the practice of the scientific method. Researchers report their findings, which are strengthened as other independent groups in the scientific community share similar outcomes. In the many scientific fields where software has become a fundamental tool for capturing and analyzing data, this requirement of reproducibility implies that reliable and comprehensive software platforms and tools should be made available to the scientific community. The tools will empower them and the public to verify, through practice, the reproducibility of observations that are reported in the scientific literature. Medical image analysis is one of the fields in which the use of computational resources, both software and hardware, are an essential platform for performing experimental work. In this arena, the introduction of the Insight Toolkit (ITK) in 1999 has transformed the field and facilitates its progress by accelerating the rate at which algorithmic implementations are developed, tested, disseminated and improved. By building on the efficiency and quality of open source methodologies, ITK has provided the medical image community with an effective platform on which to build a daily workflow that incorporates the true scientific practices of reproducibility verification. This article describes the multiple tools, methodologies, and practices that the ITK community has adopted, refined, and followed during the past decade, in order to become one of the research communities with the most modern reproducibility verification infrastructure. For example, 207 contributors have created over 2400 unit tests that provide over 84% code line test coverage. The Insight Journal, an open publication journal associated with the toolkit, has seen over 360,000 publication downloads. The median normalized closeness centrality, a measure of knowledge flow, resulting from the distributed peer code review system was high, 0.46.

  7. Three-dimensional dynamics of scientific balloon systems in response to sudden gust loadings. [including a computer program user manual

    NASA Technical Reports Server (NTRS)

    Dorsey, D. R., Jr.

    1975-01-01

    A mathematical model was developed of the three-dimensional dynamics of a high-altitude scientific research balloon system perturbed from its equilibrium configuration by an arbitrary gust loading. The platform is modelled as a system of four coupled pendula, and the equations of motion were developed in the Lagrangian formalism assuming a small-angle approximation. Three-dimensional pendulation, torsion, and precessional motion due to Coriolis forces are considered. Aerodynamic and viscous damping effects on the pendulatory and torsional motions are included. A general model of the gust field incident upon the balloon system was developed. The digital computer simulation program is described, and a guide to its use is given.

  8. Technology & Disability: Research, Design, Practice, and Policy. Proceedings of the RESNA International Conference (25th, Minneapolis, Minnesota, June 27-July 1, 2002).

    ERIC Educational Resources Information Center

    Simpson, Richard, Ed.

    These proceedings of the 2002 annual RESNA (Association for the Advancement of Rehabilitation Technology) conference include more than 200 presentations on all facets of assistive technology, including concurrent sessions, scientific platform sessions, interactive poster presentations, computer demonstrations, and the research symposium. The…

  9. Opportunities for Computational Discovery in Basic Energy Sciences

    NASA Astrophysics Data System (ADS)

    Pederson, Mark

    2011-03-01

    An overview of the broad-ranging support of computational physics and computational science within the Department of Energy Office of Science will be provided. Computation as the third branch of physics is supported by all six offices (Advanced Scientific Computing, Basic Energy, Biological and Environmental, Fusion Energy, High-Energy Physics, and Nuclear Physics). Support focuses on hardware, software and applications. Most opportunities within the fields of~condensed-matter physics, chemical-physics and materials sciences are supported by the Officeof Basic Energy Science (BES) or through partnerships between BES and the Office for Advanced Scientific Computing. Activities include radiation sciences, catalysis, combustion, materials in extreme environments, energy-storage materials, light-harvesting and photovoltaics, solid-state lighting and superconductivity.~ A summary of two recent reports by the computational materials and chemical communities on the role of computation during the next decade will be provided. ~In addition to materials and chemistry challenges specific to energy sciences, issues identified~include a focus on the role of the domain scientist in integrating, expanding and sustaining applications-oriented capabilities on evolving high-performance computing platforms and on the role of computation in accelerating the development of innovative technologies. ~~

  10. SCEC Earthquake System Science Using High Performance Computing

    NASA Astrophysics Data System (ADS)

    Maechling, P. J.; Jordan, T. H.; Archuleta, R.; Beroza, G.; Bielak, J.; Chen, P.; Cui, Y.; Day, S.; Deelman, E.; Graves, R. W.; Minster, J. B.; Olsen, K. B.

    2008-12-01

    The SCEC Community Modeling Environment (SCEC/CME) collaboration performs basic scientific research using high performance computing with the goal of developing a predictive understanding of earthquake processes and seismic hazards in California. SCEC/CME research areas including dynamic rupture modeling, wave propagation modeling, probabilistic seismic hazard analysis (PSHA), and full 3D tomography. SCEC/CME computational capabilities are organized around the development and application of robust, re- usable, well-validated simulation systems we call computational platforms. The SCEC earthquake system science research program includes a wide range of numerical modeling efforts and we continue to extend our numerical modeling codes to include more realistic physics and to run at higher and higher resolution. During this year, the SCEC/USGS OpenSHA PSHA computational platform was used to calculate PSHA hazard curves and hazard maps using the new UCERF2.0 ERF and new 2008 attenuation relationships. Three SCEC/CME modeling groups ran 1Hz ShakeOut simulations using different codes and computer systems and carefully compared the results. The DynaShake Platform was used to calculate several dynamic rupture-based source descriptions equivalent in magnitude and final surface slip to the ShakeOut 1.2 kinematic source description. A SCEC/CME modeler produced 10Hz synthetic seismograms for the ShakeOut 1.2 scenario rupture by combining 1Hz deterministic simulation results with 10Hz stochastic seismograms. SCEC/CME modelers ran an ensemble of seven ShakeOut-D simulations to investigate the variability of ground motions produced by dynamic rupture-based source descriptions. The CyberShake Platform was used to calculate more than 15 new probabilistic seismic hazard analysis (PSHA) hazard curves using full 3D waveform modeling and the new UCERF2.0 ERF. The SCEC/CME group has also produced significant computer science results this year. Large-scale SCEC/CME high performance codes were run on NSF TeraGrid sites including simulations that use the full PSC Big Ben supercomputer (4096 cores) and simulations that ran on more than 10K cores at TACC Ranger. The SCEC/CME group used scientific workflow tools and grid-computing to run more than 1.5 million jobs at NCSA for the CyberShake project. Visualizations produced by a SCEC/CME researcher of the 10Hz ShakeOut 1.2 scenario simulation data were used by USGS in ShakeOut publications and public outreach efforts. OpenSHA was ported onto an NSF supercomputer and was used to produce very high resolution hazard PSHA maps that contained more than 1.6 million hazard curves.

  11. Parallel Multivariate Spatio-Temporal Clustering of Large Ecological Datasets on Hybrid Supercomputers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sreepathi, Sarat; Kumar, Jitendra; Mills, Richard T.

    A proliferation of data from vast networks of remote sensing platforms (satellites, unmanned aircraft systems (UAS), airborne etc.), observational facilities (meteorological, eddy covariance etc.), state-of-the-art sensors, and simulation models offer unprecedented opportunities for scientific discovery. Unsupervised classification is a widely applied data mining approach to derive insights from such data. However, classification of very large data sets is a complex computational problem that requires efficient numerical algorithms and implementations on high performance computing (HPC) platforms. Additionally, increasing power, space, cooling and efficiency requirements has led to the deployment of hybrid supercomputing platforms with complex architectures and memory hierarchies like themore » Titan system at Oak Ridge National Laboratory. The advent of such accelerated computing architectures offers new challenges and opportunities for big data analytics in general and specifically, large scale cluster analysis in our case. Although there is an existing body of work on parallel cluster analysis, those approaches do not fully meet the needs imposed by the nature and size of our large data sets. Moreover, they had scaling limitations and were mostly limited to traditional distributed memory computing platforms. We present a parallel Multivariate Spatio-Temporal Clustering (MSTC) technique based on k-means cluster analysis that can target hybrid supercomputers like Titan. We developed a hybrid MPI, CUDA and OpenACC implementation that can utilize both CPU and GPU resources on computational nodes. We describe performance results on Titan that demonstrate the scalability and efficacy of our approach in processing large ecological data sets.« less

  12. Large Spatial Scale Ground Displacement Mapping through the P-SBAS Processing of Sentinel-1 Data on a Cloud Computing Environment

    NASA Astrophysics Data System (ADS)

    Casu, F.; Bonano, M.; de Luca, C.; Lanari, R.; Manunta, M.; Manzo, M.; Zinno, I.

    2017-12-01

    Since its launch in 2014, the Sentinel-1 (S1) constellation has played a key role on SAR data availability and dissemination all over the World. Indeed, the free and open access data policy adopted by the European Copernicus program together with the global coverage acquisition strategy, make the Sentinel constellation as a game changer in the Earth Observation scenario. Being the SAR data become ubiquitous, the technological and scientific challenge is focused on maximizing the exploitation of such huge data flow. In this direction, the use of innovative processing algorithms and distributed computing infrastructures, such as the Cloud Computing platforms, can play a crucial role. In this work we present a Cloud Computing solution for the advanced interferometric (DInSAR) processing chain based on the Parallel SBAS (P-SBAS) approach, aimed at processing S1 Interferometric Wide Swath (IWS) data for the generation of large spatial scale deformation time series in efficient, automatic and systematic way. Such a DInSAR chain ingests Sentinel 1 SLC images and carries out several processing steps, to finally compute deformation time series and mean deformation velocity maps. Different parallel strategies have been designed ad hoc for each processing step of the P-SBAS S1 chain, encompassing both multi-core and multi-node programming techniques, in order to maximize the computational efficiency achieved within a Cloud Computing environment and cut down the relevant processing times. The presented P-SBAS S1 processing chain has been implemented on the Amazon Web Services platform and a thorough analysis of the attained parallel performances has been performed to identify and overcome the major bottlenecks to the scalability. The presented approach is used to perform national-scale DInSAR analyses over Italy, involving the processing of more than 3000 S1 IWS images acquired from both ascending and descending orbits. Such an experiment confirms the big advantage of exploiting large computational and storage resources of Cloud Computing platforms for large scale DInSAR analysis. The presented Cloud Computing P-SBAS processing chain can be a precious tool in the perspective of developing operational services disposable for the EO scientific community related to hazard monitoring and risk prevention and mitigation.

  13. Homemade Buckeye-Pi: A Learning Many-Node Platform for High-Performance Parallel Computing

    NASA Astrophysics Data System (ADS)

    Amooie, M. A.; Moortgat, J.

    2017-12-01

    We report on the "Buckeye-Pi" cluster, the supercomputer developed in The Ohio State University School of Earth Sciences from 128 inexpensive Raspberry Pi (RPi) 3 Model B single-board computers. Each RPi is equipped with fast Quad Core 1.2GHz ARMv8 64bit processor, 1GB of RAM, and 32GB microSD card for local storage. Therefore, the cluster has a total RAM of 128GB that is distributed on the individual nodes and a flash capacity of 4TB with 512 processors, while it benefits from low power consumption, easy portability, and low total cost. The cluster uses the Message Passing Interface protocol to manage the communications between each node. These features render our platform the most powerful RPi supercomputer to date and suitable for educational applications in high-performance-computing (HPC) and handling of large datasets. In particular, we use the Buckeye-Pi to implement optimized parallel codes in our in-house simulator for subsurface media flows with the goal of achieving a massively-parallelized scalable code. We present benchmarking results for the computational performance across various number of RPi nodes. We believe our project could inspire scientists and students to consider the proposed unconventional cluster architecture as a mainstream and a feasible learning platform for challenging engineering and scientific problems.

  14. XML Based Scientific Data Management Facility

    NASA Technical Reports Server (NTRS)

    Mehrotra, Piyush; Zubair, M.; Ziebartt, John (Technical Monitor)

    2001-01-01

    The World Wide Web consortium has developed an Extensible Markup Language (XML) to support the building of better information management infrastructures. The scientific computing community realizing the benefits of HTML has designed markup languages for scientific data. In this paper, we propose a XML based scientific data management facility, XDMF. The project is motivated by the fact that even though a lot of scientific data is being generated, it is not being shared because of lack of standards and infrastructure support for discovering and transforming the data. The proposed data management facility can be used to discover the scientific data itself, the transformation functions, and also for applying the required transformations. We have built a prototype system of the proposed data management facility that can work on different platforms. We have implemented the system using Java, and Apache XSLT engine Xalan. To support remote data and transformation functions, we had to extend the XSLT specification and the Xalan package.

  15. XML Based Scientific Data Management Facility

    NASA Technical Reports Server (NTRS)

    Mehrotra, P.; Zubair, M.; Bushnell, Dennis M. (Technical Monitor)

    2002-01-01

    The World Wide Web consortium has developed an Extensible Markup Language (XML) to support the building of better information management infrastructures. The scientific computing community realizing the benefits of XML has designed markup languages for scientific data. In this paper, we propose a XML based scientific data management ,facility, XDMF. The project is motivated by the fact that even though a lot of scientific data is being generated, it is not being shared because of lack of standards and infrastructure support for discovering and transforming the data. The proposed data management facility can be used to discover the scientific data itself, the transformation functions, and also for applying the required transformations. We have built a prototype system of the proposed data management facility that can work on different platforms. We have implemented the system using Java, and Apache XSLT engine Xalan. To support remote data and transformation functions, we had to extend the XSLT specification and the Xalan package.

  16. File-System Workload on a Scientific Multiprocessor

    NASA Technical Reports Server (NTRS)

    Kotz, David; Nieuwejaar, Nils

    1995-01-01

    Many scientific applications have intense computational and I/O requirements. Although multiprocessors have permitted astounding increases in computational performance, the formidable I/O needs of these applications cannot be met by current multiprocessors a their I/O subsystems. To prevent I/O subsystems from forever bottlenecking multiprocessors and limiting the range of feasible applications, new I/O subsystems must be designed. The successful design of computer systems (both hardware and software) depends on a thorough understanding of their intended use. A system designer optimizes the policies and mechanisms for the cases expected to most common in the user's workload. In the case of multiprocessor file systems, however, designers have been forced to build file systems based only on speculation about how they would be used, extrapolating from file-system characterizations of general-purpose workloads on uniprocessor and distributed systems or scientific workloads on vector supercomputers (see sidebar on related work). To help these system designers, in June 1993 we began the Charisma Project, so named because the project sought to characterize 1/0 in scientific multiprocessor applications from a variety of production parallel computing platforms and sites. The Charisma project is unique in recording individual read and write requests-in live, multiprogramming, parallel workloads (rather than from selected or nonparallel applications). In this article, we present the first results from the project: a characterization of the file-system workload an iPSC/860 multiprocessor running production, parallel scientific applications at NASA's Ames Research Center.

  17. Parallel Computation of the Regional Ocean Modeling System (ROMS)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wang, P; Song, Y T; Chao, Y

    2005-04-05

    The Regional Ocean Modeling System (ROMS) is a regional ocean general circulation modeling system solving the free surface, hydrostatic, primitive equations over varying topography. It is free software distributed world-wide for studying both complex coastal ocean problems and the basin-to-global scale ocean circulation. The original ROMS code could only be run on shared-memory systems. With the increasing need to simulate larger model domains with finer resolutions and on a variety of computer platforms, there is a need in the ocean-modeling community to have a ROMS code that can be run on any parallel computer ranging from 10 to hundreds ofmore » processors. Recently, we have explored parallelization for ROMS using the MPI programming model. In this paper, an efficient parallelization strategy for such a large-scale scientific software package, based on an existing shared-memory computing model, is presented. In addition, scientific applications and data-performance issues on a couple of SGI systems, including Columbia, the world's third-fastest supercomputer, are discussed.« less

  18. The Cell Collective: Toward an open and collaborative approach to systems biology

    PubMed Central

    2012-01-01

    Background Despite decades of new discoveries in biomedical research, the overwhelming complexity of cells has been a significant barrier to a fundamental understanding of how cells work as a whole. As such, the holistic study of biochemical pathways requires computer modeling. Due to the complexity of cells, it is not feasible for one person or group to model the cell in its entirety. Results The Cell Collective is a platform that allows the world-wide scientific community to create these models collectively. Its interface enables users to build and use models without specifying any mathematical equations or computer code - addressing one of the major hurdles with computational research. In addition, this platform allows scientists to simulate and analyze the models in real-time on the web, including the ability to simulate loss/gain of function and test what-if scenarios in real time. Conclusions The Cell Collective is a web-based platform that enables laboratory scientists from across the globe to collaboratively build large-scale models of various biological processes, and simulate/analyze them in real time. In this manuscript, we show examples of its application to a large-scale model of signal transduction. PMID:22871178

  19. Dispel4py: An Open-Source Python library for Data-Intensive Seismology

    NASA Astrophysics Data System (ADS)

    Filgueira, Rosa; Krause, Amrey; Spinuso, Alessandro; Klampanos, Iraklis; Danecek, Peter; Atkinson, Malcolm

    2015-04-01

    Scientific workflows are a necessary tool for many scientific communities as they enable easy composition and execution of applications on computing resources while scientists can focus on their research without being distracted by the computation management. Nowadays, scientific communities (e.g. Seismology) have access to a large variety of computing resources and their computational problems are best addressed using parallel computing technology. However, successful use of these technologies requires a lot of additional machinery whose use is not straightforward for non-experts: different parallel frameworks (MPI, Storm, multiprocessing, etc.) must be used depending on the computing resources (local machines, grids, clouds, clusters) where applications are run. This implies that for achieving the best applications' performance, users usually have to change their codes depending on the features of the platform selected for running them. This work presents dispel4py, a new open-source Python library for describing abstract stream-based workflows for distributed data-intensive applications. Special care has been taken to provide dispel4py with the ability to map abstract workflows to different platforms dynamically at run-time. Currently dispel4py has four mappings: Apache Storm, MPI, multi-threading and sequential. The main goal of dispel4py is to provide an easy-to-use tool to develop and test workflows in local resources by using the sequential mode with a small dataset. Later, once a workflow is ready for long runs, it can be automatically executed on different parallel resources. dispel4py takes care of the underlying mappings by performing an efficient parallelisation. Processing Elements (PE) represent the basic computational activities of any dispel4Py workflow, which can be a seismologic algorithm, or a data transformation process. For creating a dispel4py workflow, users only have to write very few lines of code to describe their PEs and how they are connected by using Python, which is widely supported on many platforms and is popular in many scientific domains, such as in geosciences. Once, a dispel4py workflow is written, a user only has to select which mapping they would like to use, and everything else (parallelisation, distribution of data) is carried on by dispel4py without any cost to the user. Among all dispel4py features we would like to highlight the following: * The PEs are connected by streams and not by writing to and reading from intermediate files, avoiding many IO operations. * The PEs can be stored into a registry. Therefore, different users can recombine PEs in many different workflows. * dispel4py has been enriched with a provenance mechanism to support runtime provenance analysis. We have adopted the W3C-PROV data model, which is accessible via a prototypal browser-based user interface and a web API. It supports the users with the visualisation of graphical products and offers combined operations to access and download the data, which may be selectively stored at runtime, into dedicated data archives. dispel4py has been already used by seismologists in the VERCE project to develop different seismic workflows. One of them is the Seismic Ambient Noise Cross-Correlation workflow, which preprocesses and cross-correlates traces from several stations. First, this workflow was tested on a local machine by using a small number of stations as input data. Later, it was executed on different parallel platforms (SuperMUC cluster, and Terracorrelator machine), automatically scaling up by using MPI and multiprocessing mappings and up to 1000 stations as input data. The results show that the dispel4py achieves scalable performance in both mappings tested on different parallel platforms.

  20. The TeraShake Computational Platform for Large-Scale Earthquake Simulations

    NASA Astrophysics Data System (ADS)

    Cui, Yifeng; Olsen, Kim; Chourasia, Amit; Moore, Reagan; Maechling, Philip; Jordan, Thomas

    Geoscientific and computer science researchers with the Southern California Earthquake Center (SCEC) are conducting a large-scale, physics-based, computationally demanding earthquake system science research program with the goal of developing predictive models of earthquake processes. The computational demands of this program continue to increase rapidly as these researchers seek to perform physics-based numerical simulations of earthquake processes for larger meet the needs of this research program, a multiple-institution team coordinated by SCEC has integrated several scientific codes into a numerical modeling-based research tool we call the TeraShake computational platform (TSCP). A central component in the TSCP is a highly scalable earthquake wave propagation simulation program called the TeraShake anelastic wave propagation (TS-AWP) code. In this chapter, we describe how we extended an existing, stand-alone, wellvalidated, finite-difference, anelastic wave propagation modeling code into the highly scalable and widely used TS-AWP and then integrated this code into the TeraShake computational platform that provides end-to-end (initialization to analysis) research capabilities. We also describe the techniques used to enhance the TS-AWP parallel performance on TeraGrid supercomputers, as well as the TeraShake simulations phases including input preparation, run time, data archive management, and visualization. As a result of our efforts to improve its parallel efficiency, the TS-AWP has now shown highly efficient strong scaling on over 40K processors on IBM’s BlueGene/L Watson computer. In addition, the TSCP has developed into a computational system that is useful to many members of the SCEC community for performing large-scale earthquake simulations.

  1. Cloud-based Jupyter Notebooks for Water Data Analysis

    NASA Astrophysics Data System (ADS)

    Castronova, A. M.; Brazil, L.; Seul, M.

    2017-12-01

    The development and adoption of technologies by the water science community to improve our ability to openly collaborate and share workflows will have a transformative impact on how we address the challenges associated with collaborative and reproducible scientific research. Jupyter notebooks offer one solution by providing an open-source platform for creating metadata-rich toolchains for modeling and data analysis applications. Adoption of this technology within the water sciences, coupled with publicly available datasets from agencies such as USGS, NASA, and EPA enables researchers to easily prototype and execute data intensive toolchains. Moreover, implementing this software stack in a cloud-based environment extends its native functionality to provide researchers a mechanism to build and execute toolchains that are too large or computationally demanding for typical desktop computers. Additionally, this cloud-based solution enables scientists to disseminate data processing routines alongside journal publications in an effort to support reproducibility. For example, these data collection and analysis toolchains can be shared, archived, and published using the HydroShare platform or downloaded and executed locally to reproduce scientific analysis. This work presents the design and implementation of a cloud-based Jupyter environment and its application for collecting, aggregating, and munging various datasets in a transparent, sharable, and self-documented manner. The goals of this work are to establish a free and open source platform for domain scientists to (1) conduct data intensive and computationally intensive collaborative research, (2) utilize high performance libraries, models, and routines within a pre-configured cloud environment, and (3) enable dissemination of research products. This presentation will discuss recent efforts towards achieving these goals, and describe the architectural design of the notebook server in an effort to support collaborative and reproducible science.

  2. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Perumalla, Kalyan S.; Yoginath, Srikanth B.

    Problems such as fault tolerance and scalable synchronization can be efficiently solved using reversibility of applications. Making applications reversible by relying on computation rather than on memory is ideal for large scale parallel computing, especially for the next generation of supercomputers in which memory is expensive in terms of latency, energy, and price. In this direction, a case study is presented here in reversing a computational core, namely, Basic Linear Algebra Subprograms, which is widely used in scientific applications. A new Reversible BLAS (RBLAS) library interface has been designed, and a prototype has been implemented with two modes: (1) amore » memory-mode in which reversibility is obtained by checkpointing to memory in forward and restoring from memory in reverse, and (2) a computational-mode in which nothing is saved in the forward, but restoration is done entirely via inverse computation in reverse. The article is focused on detailed performance benchmarking to evaluate the runtime dynamics and performance effects, comparing reversible computation with checkpointing on both traditional CPU platforms and recent GPU accelerator platforms. For BLAS Level-1 subprograms, data indicates over an order of magnitude better speed of reversible computation compared to checkpointing. For BLAS Level-2 and Level-3, a more complex tradeoff is observed between reversible computation and checkpointing, depending on computational and memory complexities of the subprograms.« less

  3. Cloud Computing for Geosciences--GeoCloud for standardized geospatial service platforms (Invited)

    NASA Astrophysics Data System (ADS)

    Nebert, D. D.; Huang, Q.; Yang, C.

    2013-12-01

    The 21st century geoscience faces challenges of Big Data, spike computing requirements (e.g., when natural disaster happens), and sharing resources through cyberinfrastructure across different organizations (Yang et al., 2011). With flexibility and cost-efficiency of computing resources a primary concern, cloud computing emerges as a promising solution to provide core capabilities to address these challenges. Many governmental and federal agencies are adopting cloud technologies to cut costs and to make federal IT operations more efficient (Huang et al., 2010). However, it is still difficult for geoscientists to take advantage of the benefits of cloud computing to facilitate the scientific research and discoveries. This presentation reports using GeoCloud to illustrate the process and strategies used in building a common platform for geoscience communities to enable the sharing, integration of geospatial data, information and knowledge across different domains. GeoCloud is an annual incubator project coordinated by the Federal Geographic Data Committee (FGDC) in collaboration with the U.S. General Services Administration (GSA) and the Department of Health and Human Services. It is designed as a staging environment to test and document the deployment of a common GeoCloud community platform that can be implemented by multiple agencies. With these standardized virtual geospatial servers, a variety of government geospatial applications can be quickly migrated to the cloud. In order to achieve this objective, multiple projects are nominated each year by federal agencies as existing public-facing geospatial data services. From the initial candidate projects, a set of common operating system and software requirements was identified as the baseline for platform as a service (PaaS) packages. Based on these developed common platform packages, each project deploys and monitors its web application, develops best practices, and documents cost and performance information. This paper presents the background, architectural design, and activities of GeoCloud in support of the Geospatial Platform Initiative. System security strategies and approval processes for migrating federal geospatial data, information, and applications into cloud, and cost estimation for cloud operations are covered. Finally, some lessons learned from the GeoCloud project are discussed as reference for geoscientists to consider in the adoption of cloud computing.

  4. Finding Tropical Cyclones on a Cloud Computing Cluster: Using Parallel Virtualization for Large-Scale Climate Simulation Analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hasenkamp, Daren; Sim, Alexander; Wehner, Michael

    Extensive computing power has been used to tackle issues such as climate changes, fusion energy, and other pressing scientific challenges. These computations produce a tremendous amount of data; however, many of the data analysis programs currently only run a single processor. In this work, we explore the possibility of using the emerging cloud computing platform to parallelize such sequential data analysis tasks. As a proof of concept, we wrap a program for analyzing trends of tropical cyclones in a set of virtual machines (VMs). This approach allows the user to keep their familiar data analysis environment in the VMs, whilemore » we provide the coordination and data transfer services to ensure the necessary input and output are directed to the desired locations. This work extensively exercises the networking capability of the cloud computing systems and has revealed a number of weaknesses in the current cloud system software. In our tests, we are able to scale the parallel data analysis job to a modest number of VMs and achieve a speedup that is comparable to running the same analysis task using MPI. However, compared to MPI based parallelization, the cloud-based approach has a number of advantages. The cloud-based approach is more flexible because the VMs can capture arbitrary software dependencies without requiring the user to rewrite their programs. The cloud-based approach is also more resilient to failure; as long as a single VM is running, it can make progress while as soon as one MPI node fails the whole analysis job fails. In short, this initial work demonstrates that a cloud computing system is a viable platform for distributed scientific data analyses traditionally conducted on dedicated supercomputing systems.« less

  5. Theoretical and technological building blocks for an innovation accelerator

    NASA Astrophysics Data System (ADS)

    van Harmelen, F.; Kampis, G.; Börner, K.; van den Besselaar, P.; Schultes, E.; Goble, C.; Groth, P.; Mons, B.; Anderson, S.; Decker, S.; Hayes, C.; Buecheler, T.; Helbing, D.

    2012-11-01

    Modern science is a main driver of technological innovation. The efficiency of the scientific system is of key importance to ensure the competitiveness of a nation or region. However, the scientific system that we use today was devised centuries ago and is inadequate for our current ICT-based society: the peer review system encourages conservatism, journal publications are monolithic and slow, data is often not available to other scientists, and the independent validation of results is limited. The resulting scientific process is hence slow and sloppy. Building on the Innovation Accelerator paper by Helbing and Balietti [1], this paper takes the initial global vision and reviews the theoretical and technological building blocks that can be used for implementing an innovation (in first place: science) accelerator platform driven by re-imagining the science system. The envisioned platform would rest on four pillars: (i) Redesign the incentive scheme to reduce behavior such as conservatism, herding and hyping; (ii) Advance scientific publications by breaking up the monolithic paper unit and introducing other building blocks such as data, tools, experiment workflows, resources; (iii) Use machine readable semantics for publications, debate structures, provenance etc. in order to include the computer as a partner in the scientific process, and (iv) Build an online platform for collaboration, including a network of trust and reputation among the different types of stakeholders in the scientific system: scientists, educators, funding agencies, policy makers, students and industrial innovators among others. Any such improvements to the scientific system must support the entire scientific process (unlike current tools that chop up the scientific process into disconnected pieces), must facilitate and encourage collaboration and interdisciplinarity (again unlike current tools), must facilitate the inclusion of intelligent computing in the scientific process, must facilitate not only the core scientific process, but also accommodate other stakeholders such science policy makers, industrial innovators, and the general public. We first describe the current state of the scientific system together with up to a dozen new key initiatives, including an analysis of the role of science as an innovation accelerator. Our brief survey will show that there exist many separate ideas and concepts and diverse stand-alone demonstrator systems for different components of the ecosystem with many parts are still unexplored, and overall integration lacking. By analyzing a matrix of stakeholders vs. functionalities, we identify the required innovations. We (non-exhaustively) discuss a few of them: Publications that are meaningful to machines, innovative reviewing processes, data publication, workflow archiving and reuse, alternative impact metrics, tools for the detection of trends, community formation and emergence, as well as modular publications, citation objects and debate graphs. To summarize, the core idea behind the Innovation Accelerator is to develop new incentive models, rules, and interaction mechanisms to stimulate true innovation, revolutionizing the way in which we create knowledge and disseminate information.

  6. Grids, virtualization, and clouds at Fermilab

    DOE PAGES

    Timm, S.; Chadwick, K.; Garzoglio, G.; ...

    2014-06-11

    Fermilab supports a scientific program that includes experiments and scientists located across the globe. To better serve this community, in 2004, the (then) Computing Division undertook the strategy of placing all of the High Throughput Computing (HTC) resources in a Campus Grid known as FermiGrid, supported by common shared services. In 2007, the FermiGrid Services group deployed a service infrastructure that utilized Xen virtualization, LVS network routing and MySQL circular replication to deliver highly available services that offered significant performance, reliability and serviceability improvements. This deployment was further enhanced through the deployment of a distributed redundant network core architecture andmore » the physical distribution of the systems that host the virtual machines across multiple buildings on the Fermilab Campus. In 2010, building on the experience pioneered by FermiGrid in delivering production services in a virtual infrastructure, the Computing Sector commissioned the FermiCloud, General Physics Computing Facility and Virtual Services projects to serve as platforms for support of scientific computing (FermiCloud 6 GPCF) and core computing (Virtual Services). Lastly, this work will present the evolution of the Fermilab Campus Grid, Virtualization and Cloud Computing infrastructure together with plans for the future.« less

  7. Grids, virtualization, and clouds at Fermilab

    NASA Astrophysics Data System (ADS)

    Timm, S.; Chadwick, K.; Garzoglio, G.; Noh, S.

    2014-06-01

    Fermilab supports a scientific program that includes experiments and scientists located across the globe. To better serve this community, in 2004, the (then) Computing Division undertook the strategy of placing all of the High Throughput Computing (HTC) resources in a Campus Grid known as FermiGrid, supported by common shared services. In 2007, the FermiGrid Services group deployed a service infrastructure that utilized Xen virtualization, LVS network routing and MySQL circular replication to deliver highly available services that offered significant performance, reliability and serviceability improvements. This deployment was further enhanced through the deployment of a distributed redundant network core architecture and the physical distribution of the systems that host the virtual machines across multiple buildings on the Fermilab Campus. In 2010, building on the experience pioneered by FermiGrid in delivering production services in a virtual infrastructure, the Computing Sector commissioned the FermiCloud, General Physics Computing Facility and Virtual Services projects to serve as platforms for support of scientific computing (FermiCloud 6 GPCF) and core computing (Virtual Services). This work will present the evolution of the Fermilab Campus Grid, Virtualization and Cloud Computing infrastructure together with plans for the future.

  8. Symmetrical compression distance for arrhythmia discrimination in cloud-based big-data services.

    PubMed

    Lillo-Castellano, J M; Mora-Jiménez, I; Santiago-Mozos, R; Chavarría-Asso, F; Cano-González, A; García-Alberola, A; Rojo-Álvarez, J L

    2015-07-01

    The current development of cloud computing is completely changing the paradigm of data knowledge extraction in huge databases. An example of this technology in the cardiac arrhythmia field is the SCOOP platform, a national-level scientific cloud-based big data service for implantable cardioverter defibrillators. In this scenario, we here propose a new methodology for automatic classification of intracardiac electrograms (EGMs) in a cloud computing system, designed for minimal signal preprocessing. A new compression-based similarity measure (CSM) is created for low computational burden, so-called weighted fast compression distance, which provides better performance when compared with other CSMs in the literature. Using simple machine learning techniques, a set of 6848 EGMs extracted from SCOOP platform were classified into seven cardiac arrhythmia classes and one noise class, reaching near to 90% accuracy when previous patient arrhythmia information was available and 63% otherwise, hence overcoming in all cases the classification provided by the majority class. Results show that this methodology can be used as a high-quality service of cloud computing, providing support to physicians for improving the knowledge on patient diagnosis.

  9. Wolfram technologies as an integrated scalable platform for interactive learning

    NASA Astrophysics Data System (ADS)

    Kaurov, Vitaliy

    2012-02-01

    We rely on technology profoundly with the prospect of even greater integration in the future. Well known challenges in education are a technology-inadequate curriculum and many software platforms that are difficult to scale or interconnect. We'll review an integrated technology, much of it free, that addresses these issues for individuals and small schools as well as for universities. Topics include: Mathematica, a programming environment that offers a diverse range of functionality; natural language programming for getting started quickly and accessing data from Wolfram|Alpha; quick and easy construction of interactive courseware and scientific applications; partnering with publishers to create interactive e-textbooks; course assistant apps for mobile platforms; the computable document format (CDF); teacher-student and student-student collaboration on interactive projects and web publishing at the Wolfram Demonstrations site.

  10. Optimizing CyberShake Seismic Hazard Workflows for Large HPC Resources

    NASA Astrophysics Data System (ADS)

    Callaghan, S.; Maechling, P. J.; Juve, G.; Vahi, K.; Deelman, E.; Jordan, T. H.

    2014-12-01

    The CyberShake computational platform is a well-integrated collection of scientific software and middleware that calculates 3D simulation-based probabilistic seismic hazard curves and hazard maps for the Los Angeles region. Currently each CyberShake model comprises about 235 million synthetic seismograms from about 415,000 rupture variations computed at 286 sites. CyberShake integrates large-scale parallel and high-throughput serial seismological research codes into a processing framework in which early stages produce files used as inputs by later stages. Scientific workflow tools are used to manage the jobs, data, and metadata. The Southern California Earthquake Center (SCEC) developed the CyberShake platform using USC High Performance Computing and Communications systems and open-science NSF resources.CyberShake calculations were migrated to the NSF Track 1 system NCSA Blue Waters when it became operational in 2013, via an interdisciplinary team approach including domain scientists, computer scientists, and middleware developers. Due to the excellent performance of Blue Waters and CyberShake software optimizations, we reduced the makespan (a measure of wallclock time-to-solution) of a CyberShake study from 1467 to 342 hours. We will describe the technical enhancements behind this improvement, including judicious introduction of new GPU software, improved scientific software components, increased workflow-based automation, and Blue Waters-specific workflow optimizations.Our CyberShake performance improvements highlight the benefits of scientific workflow tools. The CyberShake workflow software stack includes the Pegasus Workflow Management System (Pegasus-WMS, which includes Condor DAGMan), HTCondor, and Globus GRAM, with Pegasus-mpi-cluster managing the high-throughput tasks on the HPC resources. The workflow tools handle data management, automatically transferring about 13 TB back to SCEC storage.We will present performance metrics from the most recent CyberShake study, executed on Blue Waters. We will compare the performance of CPU and GPU versions of our large-scale parallel wave propagation code, AWP-ODC-SGT. Finally, we will discuss how these enhancements have enabled SCEC to move forward with plans to increase the CyberShake simulation frequency to 1.0 Hz.

  11. Lawrence Livermore National Laboratories Perspective on Code Development and High Performance Computing Resources in Support of the National HED/ICF Effort

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Clouse, C. J.; Edwards, M. J.; McCoy, M. G.

    2015-07-07

    Through its Advanced Scientific Computing (ASC) and Inertial Confinement Fusion (ICF) code development efforts, Lawrence Livermore National Laboratory (LLNL) provides a world leading numerical simulation capability for the National HED/ICF program in support of the Stockpile Stewardship Program (SSP). In addition the ASC effort provides high performance computing platform capabilities upon which these codes are run. LLNL remains committed to, and will work with, the national HED/ICF program community to help insure numerical simulation needs are met and to make those capabilities available, consistent with programmatic priorities and available resources.

  12. Methods for open innovation on a genome-design platform associating scientific, commercial, and educational communities in synthetic biology.

    PubMed

    Toyoda, Tetsuro

    2011-01-01

    Synthetic biology requires both engineering efficiency and compliance with safety guidelines and ethics. Focusing on the rational construction of biological systems based on engineering principles, synthetic biology depends on a genome-design platform to explore the combinations of multiple biological components or BIO bricks for quickly producing innovative devices. This chapter explains the differences among various platform models and details a methodology for promoting open innovation within the scope of the statutory exemption of patent laws. The detailed platform adopts a centralized evaluation model (CEM), computer-aided design (CAD) bricks, and a freemium model. It is also important for the platform to support the legal aspects of copyrights as well as patent and safety guidelines because intellectual work including DNA sequences designed rationally by human intelligence is basically copyrightable. An informational platform with high traceability, transparency, auditability, and security is required for copyright proof, safety compliance, and incentive management for open innovation in synthetic biology. GenoCon, which we have organized and explained here, is a competition-styled, open-innovation method involving worldwide participants from scientific, commercial, and educational communities that aims to improve the designs of genomic sequences that confer a desired function on an organism. Using only a Web browser, a participating contributor proposes a design expressed with CAD bricks that generate a relevant DNA sequence, which is then experimentally and intensively evaluated by the GenoCon organizers. The CAD bricks that comprise programs and databases as a Semantic Web are developed, executed, shared, reused, and well stocked on the secure Semantic Web platform called the Scientists' Networking System or SciNetS/SciNeS, based on which a CEM research center for synthetic biology and open innovation should be established. Copyright © 2011 Elsevier Inc. All rights reserved.

  13. On-demand Simulation of Atmospheric Transport Processes on the AlpEnDAC Cloud

    NASA Astrophysics Data System (ADS)

    Hachinger, S.; Harsch, C.; Meyer-Arnek, J.; Frank, A.; Heller, H.; Giemsa, E.

    2016-12-01

    The "Alpine Environmental Data Analysis Centre" (AlpEnDAC) develops a data-analysis platform for high-altitude research facilities within the "Virtual Alpine Observatory" project (VAO). This platform, with its web portal, will support use cases going much beyond data management: On user request, the data are augmented with "on-demand" simulation results, such as air-parcel trajectories for tracing down the source of pollutants when they appear in high concentration. The respective back-end mechanism uses the Compute Cloud of the Leibniz Supercomputing Centre (LRZ) to transparently calculate results requested by the user, as far as they have not yet been stored in AlpEnDAC. The queuing-system operation model common in supercomputing is replaced by a model in which Virtual Machines (VMs) on the cloud are automatically created/destroyed, providing the necessary computing power immediately on demand. From a security point of view, this allows to perform simulations in a sandbox defined by the VM configuration, without direct access to a computing cluster. Within few minutes, the user receives conveniently visualized results. The AlpEnDAC infrastructure is distributed among two participating institutes [front-end at German Aerospace Centre (DLR), simulation back-end at LRZ], requiring an efficient mechanism for synchronization of measured and augmented data. We discuss our iRODS-based solution for these data-management tasks as well as the general AlpEnDAC framework. Our cloud-based offerings aim at making scientific computing for our users much more convenient and flexible than it has been, and to allow scientists without a broad background in scientific computing to benefit from complex numerical simulations.

  14. Accuracy of the lattice-Boltzmann method using the Cell processor

    NASA Astrophysics Data System (ADS)

    Harvey, M. J.; de Fabritiis, G.; Giupponi, G.

    2008-11-01

    Accelerator processors like the new Cell processor are extending the traditional platforms for scientific computation, allowing orders of magnitude more floating-point operations per second (flops) compared to standard central processing units. However, they currently lack double-precision support and support for some IEEE 754 capabilities. In this work, we develop a lattice-Boltzmann (LB) code to run on the Cell processor and test the accuracy of this lattice method on this platform. We run tests for different flow topologies, boundary conditions, and Reynolds numbers in the range Re=6 350 . In one case, simulation results show a reduced mass and momentum conservation compared to an equivalent double-precision LB implementation. All other cases demonstrate the utility of the Cell processor for fluid dynamics simulations. Benchmarks on two Cell-based platforms are performed, the Sony Playstation3 and the QS20/QS21 IBM blade, obtaining a speed-up factor of 7 and 21, respectively, compared to the original PC version of the code, and a conservative sustained performance of 28 gigaflops per single Cell processor. Our results suggest that choice of IEEE 754 rounding mode is possibly as important as double-precision support for this specific scientific application.

  15. Nephele: a cloud platform for simplified, standardized and reproducible microbiome data analysis.

    PubMed

    Weber, Nick; Liou, David; Dommer, Jennifer; MacMenamin, Philip; Quiñones, Mariam; Misner, Ian; Oler, Andrew J; Wan, Joe; Kim, Lewis; Coakley McCarthy, Meghan; Ezeji, Samuel; Noble, Karlynn; Hurt, Darrell E

    2018-04-15

    Widespread interest in the study of the microbiome has resulted in data proliferation and the development of powerful computational tools. However, many scientific researchers lack the time, training, or infrastructure to work with large datasets or to install and use command line tools. The National Institute of Allergy and Infectious Diseases (NIAID) has created Nephele, a cloud-based microbiome data analysis platform with standardized pipelines and a simple web interface for transforming raw data into biological insights. Nephele integrates common microbiome analysis tools as well as valuable reference datasets like the healthy human subjects cohort of the Human Microbiome Project (HMP). Nephele is built on the Amazon Web Services cloud, which provides centralized and automated storage and compute capacity, thereby reducing the burden on researchers and their institutions. https://nephele.niaid.nih.gov and https://github.com/niaid/Nephele. darrell.hurt@nih.gov.

  16. Nephele: a cloud platform for simplified, standardized and reproducible microbiome data analysis

    PubMed Central

    Weber, Nick; Liou, David; Dommer, Jennifer; MacMenamin, Philip; Quiñones, Mariam; Misner, Ian; Oler, Andrew J; Wan, Joe; Kim, Lewis; Coakley McCarthy, Meghan; Ezeji, Samuel; Noble, Karlynn; Hurt, Darrell E

    2018-01-01

    Abstract Motivation Widespread interest in the study of the microbiome has resulted in data proliferation and the development of powerful computational tools. However, many scientific researchers lack the time, training, or infrastructure to work with large datasets or to install and use command line tools. Results The National Institute of Allergy and Infectious Diseases (NIAID) has created Nephele, a cloud-based microbiome data analysis platform with standardized pipelines and a simple web interface for transforming raw data into biological insights. Nephele integrates common microbiome analysis tools as well as valuable reference datasets like the healthy human subjects cohort of the Human Microbiome Project (HMP). Nephele is built on the Amazon Web Services cloud, which provides centralized and automated storage and compute capacity, thereby reducing the burden on researchers and their institutions. Availability and implementation https://nephele.niaid.nih.gov and https://github.com/niaid/Nephele Contact darrell.hurt@nih.gov PMID:29028892

  17. The Materials Commons: A Collaboration Platform and Information Repository for the Global Materials Community

    NASA Astrophysics Data System (ADS)

    Puchala, Brian; Tarcea, Glenn; Marquis, Emmanuelle. A.; Hedstrom, Margaret; Jagadish, H. V.; Allison, John E.

    2016-08-01

    Accelerating the pace of materials discovery and development requires new approaches and means of collaborating and sharing information. To address this need, we are developing the Materials Commons, a collaboration platform and information repository for use by the structural materials community. The Materials Commons has been designed to be a continuous, seamless part of the scientific workflow process. Researchers upload the results of experiments and computations as they are performed, automatically where possible, along with the provenance information describing the experimental and computational processes. The Materials Commons website provides an easy-to-use interface for uploading and downloading data and data provenance, as well as for searching and sharing data. This paper provides an overview of the Materials Commons. Concepts are also outlined for integrating the Materials Commons with the broader Materials Information Infrastructure that is evolving to support the Materials Genome Initiative.

  18. Los Alamos radiation transport code system on desktop computing platforms

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Briesmeister, J.F.; Brinkley, F.W.; Clark, B.A.

    The Los Alamos Radiation Transport Code System (LARTCS) consists of state-of-the-art Monte Carlo and discrete ordinates transport codes and data libraries. These codes were originally developed many years ago and have undergone continual improvement. With a large initial effort and continued vigilance, the codes are easily portable from one type of hardware to another. The performance of scientific work-stations (SWS) has evolved to the point that such platforms can be used routinely to perform sophisticated radiation transport calculations. As the personal computer (PC) performance approaches that of the SWS, the hardware options for desk-top radiation transport calculations expands considerably. Themore » current status of the radiation transport codes within the LARTCS is described: MCNP, SABRINA, LAHET, ONEDANT, TWODANT, TWOHEX, and ONELD. Specifically, the authors discuss hardware systems on which the codes run and present code performance comparisons for various machines.« less

  19. Virtual Labs (Science Gateways) as platforms for Free and Open Source Science

    NASA Astrophysics Data System (ADS)

    Lescinsky, David; Car, Nicholas; Fraser, Ryan; Friedrich, Carsten; Kemp, Carina; Squire, Geoffrey

    2016-04-01

    The Free and Open Source Software (FOSS) movement promotes community engagement in software development, as well as provides access to a range of sophisticated technologies that would be prohibitively expensive if obtained commercially. However, as geoinformatics and eResearch tools and services become more dispersed, it becomes more complicated to identify and interface between the many required components. Virtual Laboratories (VLs, also known as Science Gateways) simplify the management and coordination of these components by providing a platform linking many, if not all, of the steps in particular scientific processes. These enable scientists to focus on their science, rather than the underlying supporting technologies. We describe a modular, open source, VL infrastructure that can be reconfigured to create VLs for a wide range of disciplines. Development of this infrastructure has been led by CSIRO in collaboration with Geoscience Australia and the National Computational Infrastructure (NCI) with support from the National eResearch Collaboration Tools and Resources (NeCTAR) and the Australian National Data Service (ANDS). Initially, the infrastructure was developed to support the Virtual Geophysical Laboratory (VGL), and has subsequently been repurposed to create the Virtual Hazards Impact and Risk Laboratory (VHIRL) and the reconfigured Australian National Virtual Geophysics Laboratory (ANVGL). During each step of development, new capabilities and services have been added and/or enhanced. We plan on continuing to follow this model using a shared, community code base. The VL platform facilitates transparent and reproducible science by providing access to both the data and methodologies used during scientific investigations. This is further enhanced by the ability to set up and run investigations using computational resources accessed through the VL. Data is accessed using registries pointing to catalogues within public data repositories (notably including the NCI National Environmental Research Data Interoperability Platform), or by uploading data directly from user supplied addresses or files. Similarly, scientific software is accessed through registries pointing to software repositories (e.g., GitHub). Runs are configured by using or modifying default templates designed by subject matter experts. After the appropriate computational resources are identified by the user, Virtual Machines (VMs) are spun up and jobs are submitted to service providers (currently the NeCTAR public cloud or Amazon Web Services). Following completion of the jobs the results can be reviewed and downloaded if desired. By providing a unified platform for science, the VL infrastructure enables sophisticated provenance capture and management. The source of input data (including both collection and queries), user information, software information (version and configuration details) and output information are all captured and managed as a VL resource which can be linked to output data sets. This provenance resource provides a mechanism for publication and citation for Free and Open Source Science.

  20. Virtual Exploitation Environment Demonstration for Atmospheric Missions

    NASA Astrophysics Data System (ADS)

    Natali, Stefano; Mantovani, Simone; Hirtl, Marcus; Santillan, Daniel; Triebnig, Gerhard; Fehr, Thorsten; Lopes, Cristiano

    2017-04-01

    The scientific and industrial communities are being confronted with a strong increase of Earth Observation (EO) satellite missions and related data. This is in particular the case for the Atmospheric Sciences communities, with the upcoming Copernicus Sentinel-5 Precursor, Sentinel-4, -5 and -3, and ESA's Earth Explorers scientific satellites ADM-Aeolus and EarthCARE. The challenge is not only to manage the large volume of data generated by each mission / sensor, but to process and analyze the data streams. Creating synergies among the different datasets will be key to exploit the full potential of the available information. As a preparation activity supporting scientific data exploitation for Earth Explorer and Sentinel atmospheric missions, ESA funded the "Technology and Atmospheric Mission Platform" (TAMP) [1] [2] project; a scientific and technological forum (STF) has been set-up involving relevant European entities from different scientific and operational fields to define the platforḿs requirements. Data access, visualization, processing and download services have been developed to satisfy useŕs needs; use cases defined with the STF, such as study of the SO2 emissions for the Holuhraun eruption (2014) by means of two numerical models, two satellite platforms and ground measurements, global Aerosol analyses from long time series of satellite data, and local Aerosol analysis using satellite and LIDAR, have been implemented to ensure acceptance of TAMP by the atmospheric sciences community. The platform pursues the "virtual workspace" concept: all resources (data, processing, visualization, collaboration tools) are provided as "remote services", accessible through a standard web browser, to avoid the download of big data volumes and for allowing utilization of provided infrastructure for computation, analysis and sharing of results. Data access and processing are achieved through standardized protocols (WCS, WPS). As evolution toward a pre-operational environment, the "Virtual Exploitation Environment Demonstration for Atmospheric Missions" (VEEDAM) aims at maintaining, running and evolving the platform, demonstrating e.g. the possibility to perform massive processing over heterogeneous data sources. This work presents the VEEDAM concepts, provides pre-operational examples, stressing on the interoperability achievable exposing standardized data access and processing services (e.g. making accessible data and processing resources from different VREs). [1] TAMP platform landing page http://vtpip.zamg.ac.at/ [2] TAMP introductory video https://www.youtube.com/watch?v=xWiy8h1oXQY

  1. Towards reversible basic linear algebra subprograms: A performance study

    DOE PAGES

    Perumalla, Kalyan S.; Yoginath, Srikanth B.

    2014-12-06

    Problems such as fault tolerance and scalable synchronization can be efficiently solved using reversibility of applications. Making applications reversible by relying on computation rather than on memory is ideal for large scale parallel computing, especially for the next generation of supercomputers in which memory is expensive in terms of latency, energy, and price. In this direction, a case study is presented here in reversing a computational core, namely, Basic Linear Algebra Subprograms, which is widely used in scientific applications. A new Reversible BLAS (RBLAS) library interface has been designed, and a prototype has been implemented with two modes: (1) amore » memory-mode in which reversibility is obtained by checkpointing to memory in forward and restoring from memory in reverse, and (2) a computational-mode in which nothing is saved in the forward, but restoration is done entirely via inverse computation in reverse. The article is focused on detailed performance benchmarking to evaluate the runtime dynamics and performance effects, comparing reversible computation with checkpointing on both traditional CPU platforms and recent GPU accelerator platforms. For BLAS Level-1 subprograms, data indicates over an order of magnitude better speed of reversible computation compared to checkpointing. For BLAS Level-2 and Level-3, a more complex tradeoff is observed between reversible computation and checkpointing, depending on computational and memory complexities of the subprograms.« less

  2. Visualization and Interaction in Research, Teaching, and Scientific Communication

    NASA Astrophysics Data System (ADS)

    Ammon, C. J.

    2017-12-01

    Modern computing provides many tools for exploring observations, numerical calculations, and theoretical relationships. The number of options is, in fact, almost overwhelming. But the choices provide those with modest programming skills opportunities to create unique views of scientific information and to develop deeper insights into their data, their computations, and the underlying theoretical data-model relationships. I present simple examples of using animation and human-computer interaction to explore scientific data and scientific-analysis approaches. I illustrate how valuable a little programming ability can free scientists from the constraints of existing tools and can facilitate the development of deeper appreciation data and models. I present examples from a suite of programming languages ranging from C to JavaScript including the Wolfram Language. JavaScript is valuable for sharing tools and insight (hopefully) with others because it is integrated into one of the most powerful communication tools in human history, the web browser. Although too much of that power is often spent on distracting advertisements, the underlying computation and graphics engines are efficient, flexible, and almost universally available in desktop and mobile computing platforms. Many are working to fulfill the browser's potential to become the most effective tool for interactive study. Open-source frameworks for visualizing everything from algorithms to data are available, but advance rapidly. One strategy for dealing with swiftly changing tools is to adopt common, open data formats that are easily adapted (often by framework or tool developers). I illustrate the use of animation and interaction in research and teaching with examples from earthquake seismology.

  3. Immersive Virtual Reality Technologies as a New Platform for Science, Scholarship, and Education

    NASA Astrophysics Data System (ADS)

    Djorgovski, Stanislav G.; Hut, P.; McMillan, S.; Knop, R.; Vesperini, E.; Graham, M.; Portegies Zwart, S.; Farr, W.; Mahabal, A.; Donalek, C.; Longo, G.

    2010-01-01

    Immersive virtual reality (VR) and virtual worlds (VWs) are an emerging set of technologies which likely represent the next evolutionary step in the ways we use information technology to interact with the world of information and with other people, the roles now generally fulfilled by the Web and other common Internet applications. Currently, these technologies are mainly accessed through various VWs, e.g., the Second Life (SL), which are general platforms for a broad range of user activities. As an experiment in the utilization of these technologies for science, scholarship, education, and public outreach, we have formed the Meta-Institute for Computational Astrophysics (MICA; http://mica-vw.org), the first professional scientific organization based exclusively in VWs. The goals of MICA are: (1) Exploration, development and promotion of VWs and VR technologies for professional research in astronomy and related fields. (2) Providing and developing novel social networking venues and mechanisms for scientific collaboration and communications, including professional meetings, effective telepresence, etc. (3) Use of VWs and VR technologies for education and public outreach. (4) Exchange of ideas and joint efforts with other scientific disciplines in promoting these goals for science and scholarship in general. To this effect, we have a regular schedule of professional and public outreach events in SL, including technical seminars, workshops, journal club, collaboration meetings, public lectures, etc. We find that these technologies are already remarkably effective as a telepresence platform for scientific and scholarly discussions, meetings, etc. They can offer substantial savings of time and resources, and eliminate a lot of unnecessary travel. They are equally effective as a public outreach platform, reaching a world-wide audience. On the pure research front, we are currently exploring the use of these technologies as a venue for numerical simulations and their visualization, as well as the immersive and interactive visualization of highly-dimensional data sets.

  4. A Secure Web Application Providing Public Access to High-Performance Data Intensive Scientific Resources - ScalaBLAST Web Application

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Curtis, Darren S.; Peterson, Elena S.; Oehmen, Chris S.

    2008-05-04

    This work presents the ScalaBLAST Web Application (SWA), a web based application implemented using the PHP script language, MySQL DBMS, and Apache web server under a GNU/Linux platform. SWA is an application built as part of the Data Intensive Computer for Complex Biological Systems (DICCBS) project at the Pacific Northwest National Laboratory (PNNL). SWA delivers accelerated throughput of bioinformatics analysis via high-performance computing through a convenient, easy-to-use web interface. This approach greatly enhances emerging fields of study in biology such as ontology-based homology, and multiple whole genome comparisons which, in the absence of a tool like SWA, require a heroicmore » effort to overcome the computational bottleneck associated with genome analysis. The current version of SWA includes a user account management system, a web based user interface, and a backend process that generates the files necessary for the Internet scientific community to submit a ScalaBLAST parallel processing job on a dedicated cluster.« less

  5. Software engineering and automatic continuous verification of scientific software

    NASA Astrophysics Data System (ADS)

    Piggott, M. D.; Hill, J.; Farrell, P. E.; Kramer, S. C.; Wilson, C. R.; Ham, D.; Gorman, G. J.; Bond, T.

    2011-12-01

    Software engineering of scientific code is challenging for a number of reasons including pressure to publish and a lack of awareness of the pitfalls of software engineering by scientists. The Applied Modelling and Computation Group at Imperial College is a diverse group of researchers that employ best practice software engineering methods whilst developing open source scientific software. Our main code is Fluidity - a multi-purpose computational fluid dynamics (CFD) code that can be used for a wide range of scientific applications from earth-scale mantle convection, through basin-scale ocean dynamics, to laboratory-scale classic CFD problems, and is coupled to a number of other codes including nuclear radiation and solid modelling. Our software development infrastructure consists of a number of free tools that could be employed by any group that develops scientific code and has been developed over a number of years with many lessons learnt. A single code base is developed by over 30 people for which we use bazaar for revision control, making good use of the strong branching and merging capabilities. Using features of Canonical's Launchpad platform, such as code review, blueprints for designing features and bug reporting gives the group, partners and other Fluidity uers an easy-to-use platform to collaborate and allows the induction of new members of the group into an environment where software development forms a central part of their work. The code repositoriy are coupled to an automated test and verification system which performs over 20,000 tests, including unit tests, short regression tests, code verification and large parallel tests. Included in these tests are build tests on HPC systems, including local and UK National HPC services. The testing of code in this manner leads to a continuous verification process; not a discrete event performed once development has ceased. Much of the code verification is done via the "gold standard" of comparisons to analytical solutions via the method of manufactured solutions. By developing and verifying code in tandem we avoid a number of pitfalls in scientific software development and advocate similar procedures for other scientific code applications.

  6. Open-Source 3-D Platform for Low-Cost Scientific Instrument Ecosystem.

    PubMed

    Zhang, C; Wijnen, B; Pearce, J M

    2016-08-01

    The combination of open-source software and hardware provides technically feasible methods to create low-cost, highly customized scientific research equipment. Open-source 3-D printers have proven useful for fabricating scientific tools. Here the capabilities of an open-source 3-D printer are expanded to become a highly flexible scientific platform. An automated low-cost 3-D motion control platform is presented that has the capacity to perform scientific applications, including (1) 3-D printing of scientific hardware; (2) laboratory auto-stirring, measuring, and probing; (3) automated fluid handling; and (4) shaking and mixing. The open-source 3-D platform not only facilities routine research while radically reducing the cost, but also inspires the creation of a diverse array of custom instruments that can be shared and replicated digitally throughout the world to drive down the cost of research and education further. © 2016 Society for Laboratory Automation and Screening.

  7. Reproducible Computing: a new Technology for Statistics Education and Educational Research

    NASA Astrophysics Data System (ADS)

    Wessa, Patrick

    2009-05-01

    This paper explains how the R Framework (http://www.wessa.net) and a newly developed Compendium Platform (http://www.freestatistics.org) allow us to create, use, and maintain documents that contain empirical research results which can be recomputed and reused in derived work. It is illustrated that this technological innovation can be used to create educational applications that can be shown to support effective learning of statistics and associated analytical skills. It is explained how a Compendium can be created by anyone, without the need to understand the technicalities of scientific word processing (L style="font-variant: small-caps">ATEX) or statistical computing (R code). The proposed Reproducible Computing system allows educational researchers to objectively measure key aspects of the actual learning process based on individual and constructivist activities such as: peer review, collaboration in research, computational experimentation, etc. The system was implemented and tested in three statistics courses in which the use of Compendia was used to create an interactive e-learning environment that simulated the real-world process of empirical scientific research.

  8. : A Scalable and Transparent System for Simulating MPI Programs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Perumalla, Kalyan S

    2010-01-01

    is a scalable, transparent system for experimenting with the execution of parallel programs on simulated computing platforms. The level of simulated detail can be varied for application behavior as well as for machine characteristics. Unique features of are repeatability of execution, scalability to millions of simulated (virtual) MPI ranks, scalability to hundreds of thousands of host (real) MPI ranks, portability of the system to a variety of host supercomputing platforms, and the ability to experiment with scientific applications whose source-code is available. The set of source-code interfaces supported by is being expanded to support a wider set of applications, andmore » MPI-based scientific computing benchmarks are being ported. In proof-of-concept experiments, has been successfully exercised to spawn and sustain very large-scale executions of an MPI test program given in source code form. Low slowdowns are observed, due to its use of purely discrete event style of execution, and due to the scalability and efficiency of the underlying parallel discrete event simulation engine, sik. In the largest runs, has been executed on up to 216,000 cores of a Cray XT5 supercomputer, successfully simulating over 27 million virtual MPI ranks, each virtual rank containing its own thread context, and all ranks fully synchronized by virtual time.« less

  9. Web-GIS platform for monitoring and forecasting of regional climate and ecological changes

    NASA Astrophysics Data System (ADS)

    Gordov, E. P.; Krupchatnikov, V. N.; Lykosov, V. N.; Okladnikov, I.; Titov, A. G.; Shulgina, T. M.

    2012-12-01

    Growing volume of environmental data from sensors and model outputs makes development of based on modern information-telecommunication technologies software infrastructure for information support of integrated scientific researches in the field of Earth sciences urgent and important task (Gordov et al, 2012, van der Wel, 2005). It should be considered that original heterogeneity of datasets obtained from different sources and institutions not only hampers interchange of data and analysis results but also complicates their intercomparison leading to a decrease in reliability of analysis results. However, modern geophysical data processing techniques allow combining of different technological solutions for organizing such information resources. Nowadays it becomes a generally accepted opinion that information-computational infrastructure should rely on a potential of combined usage of web- and GIS-technologies for creating applied information-computational web-systems (Titov et al, 2009, Gordov et al. 2010, Gordov, Okladnikov and Titov, 2011). Using these approaches for development of internet-accessible thematic information-computational systems, and arranging of data and knowledge interchange between them is a very promising way of creation of distributed information-computation environment for supporting of multidiscipline regional and global research in the field of Earth sciences including analysis of climate changes and their impact on spatial-temporal vegetation distribution and state. Experimental software and hardware platform providing operation of a web-oriented production and research center for regional climate change investigations which combines modern web 2.0 approach, GIS-functionality and capabilities of running climate and meteorological models, large geophysical datasets processing, visualization, joint software development by distributed research groups, scientific analysis and organization of students and post-graduate students education is presented. Platform software developed (Shulgina et al, 2012, Okladnikov et al, 2012) includes dedicated modules for numerical processing of regional and global modeling results for consequent analysis and visualization. Also data preprocessing, run and visualization of modeling results of models WRF and «Planet Simulator» integrated into the platform is provided. All functions of the center are accessible by a user through a web-portal using common graphical web-browser in the form of an interactive graphical user interface which provides, particularly, capabilities of visualization of processing results, selection of geographical region of interest (pan and zoom) and data layers manipulation (order, enable/disable, features extraction). Platform developed provides users with capabilities of heterogeneous geophysical data analysis, including high-resolution data, and discovering of tendencies in climatic and ecosystem changes in the framework of different multidisciplinary researches (Shulgina et al, 2011). Using it even unskilled user without specific knowledge can perform computational processing and visualization of large meteorological, climatological and satellite monitoring datasets through unified graphical web-interface.

  10. Globus: Service and Platform for Research Data Lifecycle Management

    NASA Astrophysics Data System (ADS)

    Ananthakrishnan, R.; Foster, I.

    2017-12-01

    Globus offers a range of data management capabilities to the community as hosted services, encompassing data transfer and sharing, user identity and authorization, and data publication. Globus capabilities are accessible via both a web browser and REST APIs. Web access allows researchers to use Globus capabilities through a software-as-a-service model; and the REST APIs address the needs of developers of research services, who can now use Globus as a platform, outsourcing complex user and data management tasks to Globus services. In this presentation, we review Globus capabilities and outline how it is being applied as a platform for scientific services, and highlight work done to link computational analysis flows to the underlying data through an interactive Jupyter notebook environment to promote immediate data usability, reusability of these flows by other researchers, and future analysis extensibility.

  11. Advances in Data Management in Remote Sensing and Climate Modeling

    NASA Astrophysics Data System (ADS)

    Brown, P. G.

    2014-12-01

    Recent commercial interest in "Big Data" information systems has yielded little more than a sense of deja vu among scientists whose work has always required getting their arms around extremely large databases, and writing programs to explore and analyze it. On the flip side, there are some commercial DBMS startups building "Big Data" platform using techniques taken from earth science, astronomy, high energy physics and high performance computing. In this talk, we will introduce one such platform; Paradigm4's SciDB, the first DBMS designed from the ground up to combine the kinds of quality-of-service guarantees made by SQL DBMS platforms—high level data model, query languages, extensibility, transactions—with the kinds of functionality familiar to scientific users—arrays as structural building blocks, integrated linear algebra, and client language interfaces that minimize the learning curve. We will review how SciDB is used to manage and analyze earth science data by several teams of scientific users.

  12. Tri-Laboratory Linux Capacity Cluster 2007 SOW

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Seager, M

    2007-03-22

    The Advanced Simulation and Computing (ASC) Program (formerly know as Accelerated Strategic Computing Initiative, ASCI) has led the world in capability computing for the last ten years. Capability computing is defined as a world-class platform (in the Top10 of the Top500.org list) with scientific simulations running at scale on the platform. Example systems are ASCI Red, Blue-Pacific, Blue-Mountain, White, Q, RedStorm, and Purple. ASC applications have scaled to multiple thousands of CPUs and accomplished a long list of mission milestones on these ASC capability platforms. However, the computing demands of the ASC and Stockpile Stewardship programs also include a vastmore » number of smaller scale runs for day-to-day simulations. Indeed, every 'hero' capability run requires many hundreds to thousands of much smaller runs in preparation and post processing activities. In addition, there are many aspects of the Stockpile Stewardship Program (SSP) that can be directly accomplished with these so-called 'capacity' calculations. The need for capacity is now so great within the program that it is increasingly difficult to allocate the computer resources required by the larger capability runs. To rectify the current 'capacity' computing resource shortfall, the ASC program has allocated a large portion of the overall ASC platforms budget to 'capacity' systems. In addition, within the next five to ten years the Life Extension Programs (LEPs) for major nuclear weapons systems must be accomplished. These LEPs and other SSP programmatic elements will further drive the need for capacity calculations and hence 'capacity' systems as well as future ASC capability calculations on 'capability' systems. To respond to this new workload analysis, the ASC program will be making a large sustained strategic investment in these capacity systems over the next ten years, starting with the United States Government Fiscal Year 2007 (GFY07). However, given the growing need for 'capability' systems as well, the budget demands are extreme and new, more cost effective ways of fielding these systems must be developed. This Tri-Laboratory Linux Capacity Cluster (TLCC) procurement represents the ASC first investment vehicle in these capacity systems. It also represents a new strategy for quickly building, fielding and integrating many Linux clusters of various sizes into classified and unclassified production service through a concept of Scalable Units (SU). The programmatic objective is to dramatically reduce the overall Total Cost of Ownership (TCO) of these 'capacity' systems relative to the best practices in Linux Cluster deployments today. This objective only makes sense in the context of these systems quickly becoming very robust and useful production clusters under the crushing load that will be inflicted on them by the ASC and SSP scientific simulation capacity workload.« less

  13. Publishing Platform for Scientific Software - Lessons Learned

    NASA Astrophysics Data System (ADS)

    Hammitzsch, Martin; Fritzsch, Bernadette; Reusser, Dominik; Brembs, Björn; Deinzer, Gernot; Loewe, Peter; Fenner, Martin; van Edig, Xenia; Bertelmann, Roland; Pampel, Heinz; Klump, Jens; Wächter, Joachim

    2015-04-01

    Scientific software has become an indispensable commodity for the production, processing and analysis of empirical data but also for modelling and simulation of complex processes. Software has a significant influence on the quality of research results. For strengthening the recognition of the academic performance of scientific software development, for increasing its visibility and for promoting the reproducibility of research results, concepts for the publication of scientific software have to be developed, tested, evaluated, and then transferred into operations. For this, the publication and citability of scientific software have to fulfil scientific criteria by means of defined processes and the use of persistent identifiers, similar to data publications. The SciForge project is addressing these challenges. Based on interviews a blueprint for a scientific software publishing platform and a systematic implementation plan has been designed. In addition, the potential of journals, software repositories and persistent identifiers have been evaluated to improve the publication and dissemination of reusable software solutions. It is important that procedures for publishing software as well as methods and tools for software engineering are reflected in the architecture of the platform, in order to improve the quality of the software and the results of research. In addition, it is necessary to work continuously on improving specific conditions that promote the adoption and sustainable utilization of scientific software publications. Among others, this would include policies for the development and publication of scientific software in the institutions but also policies for establishing the necessary competencies and skills of scientists and IT personnel. To implement the concepts developed in SciForge a combined bottom-up / top-down approach is considered that will be implemented in parallel in different scientific domains, e.g. in earth sciences, climate research and the life sciences. Based on the developed blueprints a scientific software publishing platform will be iteratively implemented, tested, and evaluated. Thus the platform should be developed continuously on the basis of gained experiences and results. The platform services will be extended one by one corresponding to the requirements of the communities. Thus the implemented platform for the publication of scientific software can be improved and stabilized incrementally as a tool with software, science, publishing, and user oriented features.

  14. Google Earth Engine: a new cloud-computing platform for global-scale earth observation data and analysis

    NASA Astrophysics Data System (ADS)

    Moore, R. T.; Hansen, M. C.

    2011-12-01

    Google Earth Engine is a new technology platform that enables monitoring and measurement of changes in the earth's environment, at planetary scale, on a large catalog of earth observation data. The platform offers intrinsically-parallel computational access to thousands of computers in Google's data centers. Initial efforts have focused primarily on global forest monitoring and measurement, in support of REDD+ activities in the developing world. The intent is to put this platform into the hands of scientists and developing world nations, in order to advance the broader operational deployment of existing scientific methods, and strengthen the ability for public institutions and civil society to better understand, manage and report on the state of their natural resources. Earth Engine currently hosts online nearly the complete historical Landsat archive of L5 and L7 data collected over more than twenty-five years. Newly-collected Landsat imagery is downloaded from USGS EROS Center into Earth Engine on a daily basis. Earth Engine also includes a set of historical and current MODIS data products. The platform supports generation, on-demand, of spatial and temporal mosaics, "best-pixel" composites (for example to remove clouds and gaps in satellite imagery), as well as a variety of spectral indices. Supervised learning methods are available over the Landsat data catalog. The platform also includes a new application programming framework, or "API", that allows scientists access to these computational and data resources, to scale their current algorithms or develop new ones. Under the covers of the Google Earth Engine API is an intrinsically-parallel image-processing system. Several forest monitoring applications powered by this API are currently in development and expected to be operational in 2011. Combining science with massive data and technology resources in a cloud-computing framework can offer advantages of computational speed, ease-of-use and collaboration, as well as transparency in data and methods. Methods developed for global processing of MODIS data to map land cover are being adopted for use with Landsat data. Specifically, the MODIS Vegetation Continuous Field product methodology has been applied for mapping forest extent and change at national scales using Landsat time-series data sets. Scaling this method to continental and global scales is enabled by Google Earth Engine computing capabilities. By combining the supervised learning VCF approach with the Landsat archive and cloud computing, unprecedented monitoring of land cover dynamics is enabled.

  15. A whole-process progressive training mode to foster optoelectronic students' innovative practical ability

    NASA Astrophysics Data System (ADS)

    Zhong, Hairong; Xu, Wei; Hu, Haojun; Duan, Chengfang

    2017-08-01

    This article analyzes the features of fostering optoelectronic students' innovative practical ability based on the knowledge structure of optoelectronic disciplines, which not only reveals the common law of cultivating students' innovative practical ability, but also considers the characteristics of the major: (1) The basic theory is difficult, and the close combination of science and technology is obvious; (2)With the integration of optics, mechanics, electronics and computer, the system technology is comprehensive; (3) It has both leading-edge theory and practical applications, so the benefit of cultivating optoelectronic students is high ; (4) The equipment is precise and the practice is costly. Considering the concept and structural characteristics of innovative and practical ability, and adhering to the idea of running practice through the whole process, we put forward the construction of three-dimensional innovation and practice platform which consists of "Synthetically Teaching Laboratory + Innovation Practice Base + Scientific Research Laboratory + Major Practice Base + Joint Teaching and Training Base", and meanwhile build a whole-process progressive training mode to foster optoelectronic students' innovative practical ability, following the process of "basic experimental skills training - professional experimental skills training - system design - innovative practice - scientific research project training - expanded training - graduation project": (1) To create an in - class practical ability cultivation environment that has distinctive characteristics of the major, with the teaching laboratory as the basic platform; (2) To create an extra-curricular innovation practice activities cultivation environment that is closely linked to the practical application, with the innovation practice base as a platform for improvement; (3) To create an innovation practice training cultivation environment that leads the development of cutting-edge, with the scientific research laboratory as a platform to explore; (4) To create an out-campus expanded training environment of optoelectronic major practice and optoelectronic system teaching and training, with the major practice base as an expansion of the platform; (5) To break students' "pre-job training barriers" between school and work, with graduation design as the comprehensive training and testing link.

  16. Analysis and design of energy monitoring platform for smart city

    NASA Astrophysics Data System (ADS)

    Wang, Hong-xia

    2016-09-01

    The development and utilization of energy has greatly promoted the development and progress of human society. It is the basic material foundation for human survival. City running is bound to consume energy inevitably, but it also brings a lot of waste discharge. In order to speed up the process of smart city, improve the efficiency of energy saving and emission reduction work, maintain the green and livable environment, a comprehensive management platform of energy monitoring for government departments is constructed based on cloud computing technology and 3-tier architecture in this paper. It is assumed that the system will provide scientific guidance for the environment management and decision making in smart city.

  17. DIVE: A Graph-based Visual Analytics Framework for Big Data

    PubMed Central

    Rysavy, Steven J.; Bromley, Dennis; Daggett, Valerie

    2014-01-01

    The need for data-centric scientific tools is growing; domains like biology, chemistry, and physics are increasingly adopting computational approaches. As a result, scientists must now deal with the challenges of big data. To address these challenges, we built a visual analytics platform named DIVE: Data Intensive Visualization Engine. DIVE is a data-agnostic, ontologically-expressive software framework capable of streaming large datasets at interactive speeds. Here we present the technical details of the DIVE platform, multiple usage examples, and a case study from the Dynameomics molecular dynamics project. We specifically highlight our novel contributions to structured data model manipulation and high-throughput streaming of large, structured datasets. PMID:24808197

  18. Fourier transform spectrometer controller for partitioned architectures

    NASA Astrophysics Data System (ADS)

    Tamas-Selicean, D.; Keymeulen, D.; Berisford, D.; Carlson, R.; Hand, K.; Pop, P.; Wadsworth, W.; Levy, R.

    The current trend in spacecraft computing is to integrate applications of different criticality levels on the same platform using no separation. This approach increases the complexity of the development, verification and integration processes, with an impact on the whole system life cycle. Researchers at ESA and NASA advocated for the use of partitioned architecture to reduce this complexity. Partitioned architectures rely on platform mechanisms to provide robust temporal and spatial separation between applications. Such architectures have been successfully implemented in several industries, such as avionics and automotive. In this paper we investigate the challenges of developing and the benefits of integrating a scientific instrument, namely a Fourier Transform Spectrometer, in such a partitioned architecture.

  19. Delivering Insight The History of the Accelerated Strategic Computing Initiative

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Larzelere II, A R

    2007-01-03

    The history of the Accelerated Strategic Computing Initiative (ASCI) tells of the development of computational simulation into a third fundamental piece of the scientific method, on a par with theory and experiment. ASCI did not invent the idea, nor was it alone in bringing it to fruition. But ASCI provided the wherewithal - hardware, software, environment, funding, and, most of all, the urgency - that made it happen. On October 1, 2005, the Initiative completed its tenth year of funding. The advances made by ASCI over its first decade are truly incredible. Lawrence Livermore, Los Alamos, and Sandia National Laboratories,more » along with leadership provided by the Department of Energy's Defense Programs Headquarters, fundamentally changed computational simulation and how it is used to enable scientific insight. To do this, astounding advances were made in simulation applications, computing platforms, and user environments. ASCI dramatically changed existing - and forged new - relationships, both among the Laboratories and with outside partners. By its tenth anniversary, despite daunting challenges, ASCI had accomplished all of the major goals set at its beginning. The history of ASCI is about the vision, leadership, endurance, and partnerships that made these advances possible.« less

  20. From the desktop to the grid: scalable bioinformatics via workflow conversion.

    PubMed

    de la Garza, Luis; Veit, Johannes; Szolek, Andras; Röttig, Marc; Aiche, Stephan; Gesing, Sandra; Reinert, Knut; Kohlbacher, Oliver

    2016-03-12

    Reproducibility is one of the tenets of the scientific method. Scientific experiments often comprise complex data flows, selection of adequate parameters, and analysis and visualization of intermediate and end results. Breaking down the complexity of such experiments into the joint collaboration of small, repeatable, well defined tasks, each with well defined inputs, parameters, and outputs, offers the immediate benefit of identifying bottlenecks, pinpoint sections which could benefit from parallelization, among others. Workflows rest upon the notion of splitting complex work into the joint effort of several manageable tasks. There are several engines that give users the ability to design and execute workflows. Each engine was created to address certain problems of a specific community, therefore each one has its advantages and shortcomings. Furthermore, not all features of all workflow engines are royalty-free -an aspect that could potentially drive away members of the scientific community. We have developed a set of tools that enables the scientific community to benefit from workflow interoperability. We developed a platform-free structured representation of parameters, inputs, outputs of command-line tools in so-called Common Tool Descriptor documents. We have also overcome the shortcomings and combined the features of two royalty-free workflow engines with a substantial user community: the Konstanz Information Miner, an engine which we see as a formidable workflow editor, and the Grid and User Support Environment, a web-based framework able to interact with several high-performance computing resources. We have thus created a free and highly accessible way to design workflows on a desktop computer and execute them on high-performance computing resources. Our work will not only reduce time spent on designing scientific workflows, but also make executing workflows on remote high-performance computing resources more accessible to technically inexperienced users. We strongly believe that our efforts not only decrease the turnaround time to obtain scientific results but also have a positive impact on reproducibility, thus elevating the quality of obtained scientific results.

  1. Solving optimization problems on computational grids.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wright, S. J.; Mathematics and Computer Science

    2001-05-01

    Multiprocessor computing platforms, which have become more and more widely available since the mid-1980s, are now heavily used by organizations that need to solve very demanding computational problems. Parallel computing is now central to the culture of many research communities. Novel parallel approaches were developed for global optimization, network optimization, and direct-search methods for nonlinear optimization. Activity was particularly widespread in parallel branch-and-bound approaches for various problems in combinatorial and network optimization. As the cost of personal computers and low-end workstations has continued to fall, while the speed and capacity of processors and networks have increased dramatically, 'cluster' platforms havemore » become popular in many settings. A somewhat different type of parallel computing platform know as a computational grid (alternatively, metacomputer) has arisen in comparatively recent times. Broadly speaking, this term refers not to a multiprocessor with identical processing nodes but rather to a heterogeneous collection of devices that are widely distributed, possibly around the globe. The advantage of such platforms is obvious: they have the potential to deliver enormous computing power. Just as obviously, however, the complexity of grids makes them very difficult to use. The Condor team, headed by Miron Livny at the University of Wisconsin, were among the pioneers in providing infrastructure for grid computations. More recently, the Globus project has developed technologies to support computations on geographically distributed platforms consisting of high-end computers, storage and visualization devices, and other scientific instruments. In 1997, we started the metaneos project as a collaborative effort between optimization specialists and the Condor and Globus groups. Our aim was to address complex, difficult optimization problems in several areas, designing and implementing the algorithms and the software infrastructure need to solve these problems on computational grids. This article describes some of the results we have obtained during the first three years of the metaneos project. Our efforts have led to development of the runtime support library MW for implementing algorithms with master-worker control structure on Condor platforms. This work is discussed here, along with work on algorithms and codes for integer linear programming, the quadratic assignment problem, and stochastic linear programmming. Our experiences in the metaneos project have shown that cheap, powerful computational grids can be used to tackle large optimization problems of various types. In an industrial or commercial setting, the results demonstrate that one may not have to buy powerful computational servers to solve many of the large problems arising in areas such as scheduling, portfolio optimization, or logistics; the idle time on employee workstations (or, at worst, an investment in a modest cluster of PCs) may do the job. For the optimization research community, our results motivate further work on parallel, grid-enabled algorithms for solving very large problems of other types. The fact that very large problems can be solved cheaply allows researchers to better understand issues of 'practical' complexity and of the role of heuristics.« less

  2. Using the Eclipse Parallel Tools Platform to Assist Earth Science Model Development and Optimization on High Performance Computers

    NASA Astrophysics Data System (ADS)

    Alameda, J. C.

    2011-12-01

    Development and optimization of computational science models, particularly on high performance computers, and with the advent of ubiquitous multicore processor systems, practically on every system, has been accomplished with basic software tools, typically, command-line based compilers, debuggers, performance tools that have not changed substantially from the days of serial and early vector computers. However, model complexity, including the complexity added by modern message passing libraries such as MPI, and the need for hybrid code models (such as openMP and MPI) to be able to take full advantage of high performance computers with an increasing core count per shared memory node, has made development and optimization of such codes an increasingly arduous task. Additional architectural developments, such as many-core processors, only complicate the situation further. In this paper, we describe how our NSF-funded project, "SI2-SSI: A Productive and Accessible Development Workbench for HPC Applications Using the Eclipse Parallel Tools Platform" (WHPC) seeks to improve the Eclipse Parallel Tools Platform, an environment designed to support scientific code development targeted at a diverse set of high performance computing systems. Our WHPC project to improve Eclipse PTP takes an application-centric view to improve PTP. We are using a set of scientific applications, each with a variety of challenges, and using PTP to drive further improvements to both the scientific application, as well as to understand shortcomings in Eclipse PTP from an application developer perspective, to drive our list of improvements we seek to make. We are also partnering with performance tool providers, to drive higher quality performance tool integration. We have partnered with the Cactus group at Louisiana State University to improve Eclipse's ability to work with computational frameworks and extremely complex build systems, as well as to develop educational materials to incorporate into computational science and engineering codes. Finally, we are partnering with the lead PTP developers at IBM, to ensure we are as effective as possible within the Eclipse community development. We are also conducting training and outreach to our user community, including conference BOF sessions, monthly user calls, and an annual user meeting, so that we can best inform the improvements we make to Eclipse PTP. With these activities we endeavor to encourage use of modern software engineering practices, as enabled through the Eclipse IDE, with computational science and engineering applications. These practices include proper use of source code repositories, tracking and rectifying issues, measuring and monitoring code performance changes against both optimizations as well as ever-changing software stacks and configurations on HPC systems, as well as ultimately encouraging development and maintenance of testing suites -- things that have become commonplace in many software endeavors, but have lagged in the development of science applications. We view that the challenge with the increased complexity of both HPC systems and science applications demands the use of better software engineering methods, preferably enabled by modern tools such as Eclipse PTP, to help the computational science community thrive as we evolve the HPC landscape.

  3. Sequence Data for Clostridium autoethanogenum using Three Generations of Sequencing Technologies

    DOE PAGES

    Utturkar, Sagar M.; Klingeman, Dawn Marie; Bruno-Barcena, José M.; ...

    2015-04-14

    During the past decade, DNA sequencing output has been mostly dominated by the second generation sequencing platforms which are characterized by low cost, high throughput and shorter read lengths for example, Illumina. The emergence and development of so called third generation sequencing platforms such as PacBio has permitted exceptionally long reads (over 20 kb) to be generated. Due to read length increases, algorithm improvements and hybrid assembly approaches, the concept of one chromosome, one contig and automated finishing of microbial genomes is now a realistic and achievable task for many microbial laboratories. In this paper, we describe high quality sequencemore » datasets which span three generations of sequencing technologies, containing six types of data from four NGS platforms and originating from a single microorganism, Clostridium autoethanogenum. The dataset reported here will be useful for the scientific community to evaluate upcoming NGS platforms, enabling comparison of existing and novel bioinformatics approaches and will encourage interest in the development of innovative experimental and computational methods for NGS data.« less

  4. Designing next-generation platforms for evaluating scientific output: what scientists can learn from the social web

    PubMed Central

    Yarkoni, Tal

    2012-01-01

    Traditional pre-publication peer review of scientific output is a slow, inefficient, and unreliable process. Efforts to replace or supplement traditional evaluation models with open evaluation platforms that leverage advances in information technology are slowly gaining traction, but remain in the early stages of design and implementation. Here I discuss a number of considerations relevant to the development of such platforms. I focus particular attention on three core elements that next-generation evaluation platforms should strive to emphasize, including (1) open and transparent access to accumulated evaluation data, (2) personalized and highly customizable performance metrics, and (3) appropriate short-term incentivization of the userbase. Because all of these elements have already been successfully implemented on a large scale in hundreds of existing social web applications, I argue that development of new scientific evaluation platforms should proceed largely by adapting existing techniques rather than engineering entirely new evaluation mechanisms. Successful implementation of open evaluation platforms has the potential to substantially advance both the pace and the quality of scientific publication and evaluation, and the scientific community has a vested interest in shifting toward such models as soon as possible. PMID:23060783

  5. The VISPA internet platform for outreach, education and scientific research in various experiments

    NASA Astrophysics Data System (ADS)

    van Asseldonk, D.; Erdmann, M.; Fischer, B.; Fischer, R.; Glaser, C.; Heidemann, F.; Müller, G.; Quast, T.; Rieger, M.; Urban, M.; Welling, C.

    2015-12-01

    VISPA provides a graphical front-end to computing infrastructures giving its users all functionality needed for working conditions comparable to a personal computer. It is a framework that can be extended with custom applications to support individual needs, e.g. graphical interfaces for experiment-specific software. By design, VISPA serves as a multipurpose platform for many disciplines and experiments as demonstrated in the following different use-cases. A GUI to the analysis framework OFFLINE of the Pierre Auger collaboration, submission and monitoring of computing jobs, university teaching of hundreds of students, and outreach activity, especially in CERN's open data initiative. Serving heterogeneous user groups and applications gave us lots of experience. This helps us in maturing the system, i.e. improving the robustness and responsiveness, and the interplay of the components. Among the lessons learned are the choice of a file system, the implementation of websockets, efficient load balancing, and the fine-tuning of existing technologies like the RPC over SSH. We present in detail the improved server setup and report on the performance, the user acceptance and the realized applications of the system.

  6. Use of cloud computing in biomedicine.

    PubMed

    Sobeslav, Vladimir; Maresova, Petra; Krejcar, Ondrej; Franca, Tanos C C; Kuca, Kamil

    2016-12-01

    Nowadays, biomedicine is characterised by a growing need for processing of large amounts of data in real time. This leads to new requirements for information and communication technologies (ICT). Cloud computing offers a solution to these requirements and provides many advantages, such as cost savings, elasticity and scalability of using ICT. The aim of this paper is to explore the concept of cloud computing and the related use of this concept in the area of biomedicine. Authors offer a comprehensive analysis of the implementation of the cloud computing approach in biomedical research, decomposed into infrastructure, platform and service layer, and a recommendation for processing large amounts of data in biomedicine. Firstly, the paper describes the appropriate forms and technological solutions of cloud computing. Secondly, the high-end computing paradigm of cloud computing aspects is analysed. Finally, the potential and current use of applications in scientific research of this technology in biomedicine is discussed.

  7. Multi-threaded ATLAS simulation on Intel Knights Landing processors

    NASA Astrophysics Data System (ADS)

    Farrell, Steven; Calafiura, Paolo; Leggett, Charles; Tsulaia, Vakhtang; Dotti, Andrea; ATLAS Collaboration

    2017-10-01

    The Knights Landing (KNL) release of the Intel Many Integrated Core (MIC) Xeon Phi line of processors is a potential game changer for HEP computing. With 72 cores and deep vector registers, the KNL cards promise significant performance benefits for highly-parallel, compute-heavy applications. Cori, the newest supercomputer at the National Energy Research Scientific Computing Center (NERSC), was delivered to its users in two phases with the first phase online at the end of 2015 and the second phase now online at the end of 2016. Cori Phase 2 is based on the KNL architecture and contains over 9000 compute nodes with 96GB DDR4 memory. ATLAS simulation with the multithreaded Athena Framework (AthenaMT) is a good potential use-case for the KNL architecture and supercomputers like Cori. ATLAS simulation jobs have a high ratio of CPU computation to disk I/O and have been shown to scale well in multi-threading and across many nodes. In this paper we will give an overview of the ATLAS simulation application with details on its multi-threaded design. Then, we will present a performance analysis of the application on KNL devices and compare it to a traditional x86 platform to demonstrate the capabilities of the architecture and evaluate the benefits of utilizing KNL platforms like Cori for ATLAS production.

  8. The StratusLab cloud distribution: Use-cases and support for scientific applications

    NASA Astrophysics Data System (ADS)

    Floros, E.

    2012-04-01

    The StratusLab project is integrating an open cloud software distribution that enables organizations to setup and provide their own private or public IaaS (Infrastructure as a Service) computing clouds. StratusLab distribution capitalizes on popular infrastructure virtualization solutions like KVM, the OpenNebula virtual machine manager, Claudia service manager and SlipStream deployment platform, which are further enhanced and expanded with additional components developed within the project. The StratusLab distribution covers the core aspects of a cloud IaaS architecture, namely Computing (life-cycle management of virtual machines), Storage, Appliance management and Networking. The resulting software stack provides a packaged turn-key solution for deploying cloud computing services. The cloud computing infrastructures deployed using StratusLab can support a wide range of scientific and business use cases. Grid computing has been the primary use case pursued by the project and for this reason the initial priority has been the support for the deployment and operation of fully virtualized production-level grid sites; a goal that has already been achieved by operating such a site as part of EGI's (European Grid Initiative) pan-european grid infrastructure. In this area the project is currently working to provide non-trivial capabilities like elastic and autonomic management of grid site resources. Although grid computing has been the motivating paradigm, StratusLab's cloud distribution can support a wider range of use cases. Towards this direction, we have developed and currently provide support for setting up general purpose computing solutions like Hadoop, MPI and Torque clusters. For what concerns scientific applications the project is collaborating closely with the Bioinformatics community in order to prepare VM appliances and deploy optimized services for bioinformatics applications. In a similar manner additional scientific disciplines like Earth Science can take advantage of StratusLab cloud solutions. Interested users are welcomed to join StratusLab's user community by getting access to the reference cloud services deployed by the project and offered to the public.

  9. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Duro, Francisco Rodrigo; Blas, Javier Garcia; Isaila, Florin

    The increasing volume of scientific data and the limited scalability and performance of storage systems are currently presenting a significant limitation for the productivity of the scientific workflows running on both high-performance computing (HPC) and cloud platforms. Clearly needed is better integration of storage systems and workflow engines to address this problem. This paper presents and evaluates a novel solution that leverages codesign principles for integrating Hercules—an in-memory data store—with a workflow management system. We consider four main aspects: workflow representation, task scheduling, task placement, and task termination. As a result, the experimental evaluation on both cloud and HPC systemsmore » demonstrates significant performance and scalability improvements over existing state-of-the-art approaches.« less

  10. Collaborative workbench for cyberinfrastructure to accelerate science algorithm development

    NASA Astrophysics Data System (ADS)

    Ramachandran, R.; Maskey, M.; Kuo, K.; Lynnes, C.

    2013-12-01

    There are significant untapped resources for information and knowledge creation within the Earth Science community in the form of data, algorithms, services, analysis workflows or scripts, and the related knowledge about these resources. Despite the huge growth in social networking and collaboration platforms, these resources often reside on an investigator's workstation or laboratory and are rarely shared. A major reason for this is that there are very few scientific collaboration platforms, and those that exist typically require the use of a new set of analysis tools and paradigms to leverage the shared infrastructure. As a result, adoption of these collaborative platforms for science research is inhibited by the high cost to an individual scientist of switching from his or her own familiar environment and set of tools to a new environment and tool set. This presentation will describe an ongoing project developing an Earth Science Collaborative Workbench (CWB). The CWB approach will eliminate this barrier by augmenting a scientist's current research environment and tool set to allow him or her to easily share diverse data and algorithms. The CWB will leverage evolving technologies such as commodity computing and social networking to design an architecture for scalable collaboration that will support the emerging vision of an Earth Science Collaboratory. The CWB is being implemented on the robust and open source Eclipse framework and will be compatible with widely used scientific analysis tools such as IDL. The myScience Catalog built into CWB will capture and track metadata and provenance about data and algorithms for the researchers in a non-intrusive manner with minimal overhead. Seamless interfaces to multiple Cloud services will support sharing algorithms, data, and analysis results, as well as access to storage and computer resources. A Community Catalog will track the use of shared science artifacts and manage collaborations among researchers.

  11. Facilitating NASA Earth Science Data Processing Using Nebula Cloud Computing

    NASA Astrophysics Data System (ADS)

    Chen, A.; Pham, L.; Kempler, S.; Theobald, M.; Esfandiari, A.; Campino, J.; Vollmer, B.; Lynnes, C.

    2011-12-01

    Cloud Computing technology has been used to offer high-performance and low-cost computing and storage resources for both scientific problems and business services. Several cloud computing services have been implemented in the commercial arena, e.g. Amazon's EC2 & S3, Microsoft's Azure, and Google App Engine. There are also some research and application programs being launched in academia and governments to utilize Cloud Computing. NASA launched the Nebula Cloud Computing platform in 2008, which is an Infrastructure as a Service (IaaS) to deliver on-demand distributed virtual computers. Nebula users can receive required computing resources as a fully outsourced service. NASA Goddard Earth Science Data and Information Service Center (GES DISC) migrated several GES DISC's applications to the Nebula as a proof of concept, including: a) The Simple, Scalable, Script-based Science Processor for Measurements (S4PM) for processing scientific data; b) the Atmospheric Infrared Sounder (AIRS) data process workflow for processing AIRS raw data; and c) the GES-DISC Interactive Online Visualization ANd aNalysis Infrastructure (GIOVANNI) for online access to, analysis, and visualization of Earth science data. This work aims to evaluate the practicability and adaptability of the Nebula. The initial work focused on the AIRS data process workflow to evaluate the Nebula. The AIRS data process workflow consists of a series of algorithms being used to process raw AIRS level 0 data and output AIRS level 2 geophysical retrievals. Migrating the entire workflow to the Nebula platform is challenging, but practicable. After installing several supporting libraries and the processing code itself, the workflow is able to process AIRS data in a similar fashion to its current (non-cloud) configuration. We compared the performance of processing 2 days of AIRS level 0 data through level 2 using a Nebula virtual computer and a local Linux computer. The result shows that Nebula has significantly better performance than the local machine. Much of the difference was due to newer equipment in the Nebula than the legacy computer, which is suggestive of a potential economic advantage beyond elastic power, i.e., access to up-to-date hardware vs. legacy hardware that must be maintained past its prime to amortize the cost. In addition to a trade study of advantages and challenges of porting complex processing to the cloud, a tutorial was developed to enable further progress in utilizing the Nebula for Earth Science applications and understanding better the potential for Cloud Computing in further data- and computing-intensive Earth Science research. In particular, highly bursty computing such as that experienced in the user-demand-driven Giovanni system may become more tractable in a Cloud environment. Our future work will continue to focus on migrating more GES DISC's applications/instances, e.g. Giovanni instances, to the Nebula platform and making matured migrated applications to be in operation on the Nebula.

  12. The InSAR Scientific Computing Environment

    NASA Technical Reports Server (NTRS)

    Rosen, Paul A.; Gurrola, Eric; Sacco, Gian Franco; Zebker, Howard

    2012-01-01

    We have developed a flexible and extensible Interferometric SAR (InSAR) Scientific Computing Environment (ISCE) for geodetic image processing. ISCE was designed from the ground up as a geophysics community tool for generating stacks of interferograms that lend themselves to various forms of time-series analysis, with attention paid to accuracy, extensibility, and modularity. The framework is python-based, with code elements rigorously componentized by separating input/output operations from the processing engines. This allows greater flexibility and extensibility in the data models, and creates algorithmic code that is less susceptible to unnecessary modification when new data types and sensors are available. In addition, the components support provenance and checkpointing to facilitate reprocessing and algorithm exploration. The algorithms, based on legacy processing codes, have been adapted to assume a common reference track approach for all images acquired from nearby orbits, simplifying and systematizing the geometry for time-series analysis. The framework is designed to easily allow user contributions, and is distributed for free use by researchers. ISCE can process data from the ALOS, ERS, EnviSAT, Cosmo-SkyMed, RadarSAT-1, RadarSAT-2, and TerraSAR-X platforms, starting from Level-0 or Level 1 as provided from the data source, and going as far as Level 3 geocoded deformation products. With its flexible design, it can be extended with raw/meta data parsers to enable it to work with radar data from other platforms

  13. ABA-Cloud: support for collaborative breath research

    PubMed Central

    Elsayed, Ibrahim; Ludescher, Thomas; King, Julian; Ager, Clemens; Trosin, Michael; Senocak, Uygar; Brezany, Peter; Feilhauer, Thomas; Amann, Anton

    2016-01-01

    This paper introduces the advanced breath analysis (ABA) platform, an innovative scientific research platform for the entire breath research domain. Within the ABA project, we are investigating novel data management concepts and semantic web technologies to document breath analysis studies for the long run as well as to enable their full automatic reproducibility. We propose several concept taxonomies (a hierarchical order of terms from a glossary of terms), which can be seen as a first step toward the definition of conceptualized terms commonly used by the international community of breath researchers. They build the basis for the development of an ontology (a concept from computer science used for communication between machines and/or humans and representation and reuse of knowledge) dedicated to breath research. PMID:23619467

  14. ABA-Cloud: support for collaborative breath research.

    PubMed

    Elsayed, Ibrahim; Ludescher, Thomas; King, Julian; Ager, Clemens; Trosin, Michael; Senocak, Uygar; Brezany, Peter; Feilhauer, Thomas; Amann, Anton

    2013-06-01

    This paper introduces the advanced breath analysis (ABA) platform, an innovative scientific research platform for the entire breath research domain. Within the ABA project, we are investigating novel data management concepts and semantic web technologies to document breath analysis studies for the long run as well as to enable their full automatic reproducibility. We propose several concept taxonomies (a hierarchical order of terms from a glossary of terms), which can be seen as a first step toward the definition of conceptualized terms commonly used by the international community of breath researchers. They build the basis for the development of an ontology (a concept from computer science used for communication between machines and/or humans and representation and reuse of knowledge) dedicated to breath research.

  15. wft4galaxy: a workflow testing tool for galaxy.

    PubMed

    Piras, Marco Enrico; Pireddu, Luca; Zanetti, Gianluigi

    2017-12-01

    Workflow managers for scientific analysis provide a high-level programming platform facilitating standardization, automation, collaboration and access to sophisticated computing resources. The Galaxy workflow manager provides a prime example of this type of platform. As compositions of simpler tools, workflows effectively comprise specialized computer programs implementing often very complex analysis procedures. To date, no simple way to automatically test Galaxy workflows and ensure their correctness has appeared in the literature. With wft4galaxy we offer a tool to bring automated testing to Galaxy workflows, making it feasible to bring continuous integration to their development and ensuring that defects are detected promptly. wft4galaxy can be easily installed as a regular Python program or launched directly as a Docker container-the latter reducing installation effort to a minimum. Available at https://github.com/phnmnl/wft4galaxy under the Academic Free License v3.0. marcoenrico.piras@crs4.it. © The Author 2017. Published by Oxford University Press.

  16. Parameters that affect parallel processing for computational electromagnetic simulation codes on high performance computing clusters

    NASA Astrophysics Data System (ADS)

    Moon, Hongsik

    What is the impact of multicore and associated advanced technologies on computational software for science? Most researchers and students have multicore laptops or desktops for their research and they need computing power to run computational software packages. Computing power was initially derived from Central Processing Unit (CPU) clock speed. That changed when increases in clock speed became constrained by power requirements. Chip manufacturers turned to multicore CPU architectures and associated technological advancements to create the CPUs for the future. Most software applications benefited by the increased computing power the same way that increases in clock speed helped applications run faster. However, for Computational ElectroMagnetics (CEM) software developers, this change was not an obvious benefit - it appeared to be a detriment. Developers were challenged to find a way to correctly utilize the advancements in hardware so that their codes could benefit. The solution was parallelization and this dissertation details the investigation to address these challenges. Prior to multicore CPUs, advanced computer technologies were compared with the performance using benchmark software and the metric was FLoting-point Operations Per Seconds (FLOPS) which indicates system performance for scientific applications that make heavy use of floating-point calculations. Is FLOPS an effective metric for parallelized CEM simulation tools on new multicore system? Parallel CEM software needs to be benchmarked not only by FLOPS but also by the performance of other parameters related to type and utilization of the hardware, such as CPU, Random Access Memory (RAM), hard disk, network, etc. The codes need to be optimized for more than just FLOPs and new parameters must be included in benchmarking. In this dissertation, the parallel CEM software named High Order Basis Based Integral Equation Solver (HOBBIES) is introduced. This code was developed to address the needs of the changing computer hardware platforms in order to provide fast, accurate and efficient solutions to large, complex electromagnetic problems. The research in this dissertation proves that the performance of parallel code is intimately related to the configuration of the computer hardware and can be maximized for different hardware platforms. To benchmark and optimize the performance of parallel CEM software, a variety of large, complex projects are created and executed on a variety of computer platforms. The computer platforms used in this research are detailed in this dissertation. The projects run as benchmarks are also described in detail and results are presented. The parameters that affect parallel CEM software on High Performance Computing Clusters (HPCC) are investigated. This research demonstrates methods to maximize the performance of parallel CEM software code.

  17. Cloud-based opportunities in scientific computing: insights from processing Suomi National Polar-Orbiting Partnership (S-NPP) Direct Broadcast data

    NASA Astrophysics Data System (ADS)

    Evans, J. D.; Hao, W.; Chettri, S.

    2013-12-01

    The cloud is proving to be a uniquely promising platform for scientific computing. Our experience with processing satellite data using Amazon Web Services highlights several opportunities for enhanced performance, flexibility, and cost effectiveness in the cloud relative to traditional computing -- for example: - Direct readout from a polar-orbiting satellite such as the Suomi National Polar-Orbiting Partnership (S-NPP) requires bursts of processing a few times a day, separated by quiet periods when the satellite is out of receiving range. In the cloud, by starting and stopping virtual machines in minutes, we can marshal significant computing resources quickly when needed, but not pay for them when not needed. To take advantage of this capability, we are automating a data-driven approach to the management of cloud computing resources, in which new data availability triggers the creation of new virtual machines (of variable size and processing power) which last only until the processing workflow is complete. - 'Spot instances' are virtual machines that run as long as one's asking price is higher than the provider's variable spot price. Spot instances can greatly reduce the cost of computing -- for software systems that are engineered to withstand unpredictable interruptions in service (as occurs when a spot price exceeds the asking price). We are implementing an approach to workflow management that allows data processing workflows to resume with minimal delays after temporary spot price spikes. This will allow systems to take full advantage of variably-priced 'utility computing.' - Thanks to virtual machine images, we can easily launch multiple, identical machines differentiated only by 'user data' containing individualized instructions (e.g., to fetch particular datasets or to perform certain workflows or algorithms) This is particularly useful when (as is the case with S-NPP data) we need to launch many very similar machines to process an unpredictable number of data files concurrently. Our experience shows the viability and flexibility of this approach to workflow management for scientific data processing. - Finally, cloud computing is a promising platform for distributed volunteer ('interstitial') computing, via mechanisms such as the Berkeley Open Infrastructure for Network Computing (BOINC) popularized with the SETI@Home project and others such as ClimatePrediction.net and NASA's Climate@Home. Interstitial computing faces significant challenges as commodity computing shifts from (always on) desktop computers towards smartphones and tablets (untethered and running on scarce battery power); but cloud computing offers significant slack capacity. This capacity includes virtual machines with unused RAM or underused CPUs; virtual storage volumes allocated (& paid for) but not full; and virtual machines that are paid up for the current hour but whose work is complete. We are devising ways to facilitate the reuse of these resources (i.e., cloud-based interstitial computing) for satellite data processing and related analyses. We will present our findings and research directions on these and related topics.

  18. the APL Balloonborne High Altitude Research Platform (HARP)

    NASA Astrophysics Data System (ADS)

    Adams, D.; Arnold, S.; Bernasconi, P.

    2015-09-01

    The Johns Hopkins University Applied Physics Laboratory (APL) has developed and demonstrated a multi-purpose stratospheric balloonborne gondola known as the High Altitude Research Platform (HARP). HARP provides the power, mechanical supports, thermal control, and data transmission for multiple forms of high-altitude scientific research equipment. The platform has been used for astronomy, cosmology and heliophysics experiments but can also be applied to atmospheric studies, space weather and other forms of high altitude research. HARP has executed five missions. The first was Flare Genesis from Antarctica in 1993 and the most recent was the Balloon Observation Platform for Planetary Science (BOPPS) from New Mexico in 2014. HARP will next be used to perform again the Stratospheric Terahertz Observatory mission, a mission that it first performed in 2009. The structure, composed of an aluminum framework is designed for easy transport and field assembly while providing ready access to the payload and supporting avionics. A light-weighted structure, capable of supporting Ultra-Long Duration Balloon (ULDB) flights that can last more than 100 days is available. Scientific research payloads as heavy as 600 kg (1322 pounds) and requiring up to 800 Watts electrical power can be supported. The platform comprises all subsystems required to support and operate the science payload, including both line-of-sight (LOS) and over-the-horizon (0TH) telecommunications, the latter provided by Iridium Pilot. Electrical power is produced by solar panels for multi-day missions and batteries for single-day missions. The avionics design is primarily single-string; however, use of ruggedized industrial components provides high reliability. The avionics features a Command and Control (C&C) computer and a Pointing Control System (PCS) computer housed within a common unpressurized unit. The avionics operates from ground pressure to 2 Torr and over a temperature range from —30 C to +85 C. Science data is stored on-board and also flows through the C&C computer where it is packetized for real-time downlink. The telecommunications system is capable of LOS downlink up to 3000 kbps and 0TH downlink up to 120 kbps. The pointing control system (PCS) provides three-axis attitude stability to 1 arcsec and can be used to aim at a fixed point for science observations, to perform science scans, and to track an object ephemeris. This paper provides a description of HARP, summarizes its performance on prior flights, describes its use on upcoming missions and outlines the characteristics that can be customized to meet the needs of the high altitude research community to support future missions.

  19. Bringing your tools to CyVerse Discovery Environment using Docker

    PubMed Central

    Devisetty, Upendra Kumar; Kennedy, Kathleen; Sarando, Paul; Merchant, Nirav; Lyons, Eric

    2016-01-01

    Docker has become a very popular container-based virtualization platform for software distribution that has revolutionized the way in which scientific software and software dependencies (software stacks) can be packaged, distributed, and deployed. Docker makes the complex and time-consuming installation procedures needed for scientific software a one-time process. Because it enables platform-independent installation, versioning of software environments, and easy redeployment and reproducibility, Docker is an ideal candidate for the deployment of identical software stacks on different compute environments such as XSEDE and Amazon AWS. CyVerse’s Discovery Environment also uses Docker for integrating its powerful, community-recommended software tools into CyVerse’s production environment for public use. This paper will help users bring their tools into CyVerse Discovery Environment (DE) which will not only allows users to integrate their tools with relative ease compared to the earlier method of tool deployment in DE but will also help users to share their apps with collaborators and release them for public use. PMID:27803802

  20. Bringing your tools to CyVerse Discovery Environment using Docker.

    PubMed

    Devisetty, Upendra Kumar; Kennedy, Kathleen; Sarando, Paul; Merchant, Nirav; Lyons, Eric

    2016-01-01

    Docker has become a very popular container-based virtualization platform for software distribution that has revolutionized the way in which scientific software and software dependencies (software stacks) can be packaged, distributed, and deployed. Docker makes the complex and time-consuming installation procedures needed for scientific software a one-time process. Because it enables platform-independent installation, versioning of software environments, and easy redeployment and reproducibility, Docker is an ideal candidate for the deployment of identical software stacks on different compute environments such as XSEDE and Amazon AWS. CyVerse's Discovery Environment also uses Docker for integrating its powerful, community-recommended software tools into CyVerse's production environment for public use. This paper will help users bring their tools into CyVerse Discovery Environment (DE) which will not only allows users to integrate their tools with relative ease compared to the earlier method of tool deployment in DE but will also help users to share their apps with collaborators and release them for public use.

  1. Development of a Computer-Assisted Instrumentation Curriculum for Physics Students: Using LabVIEW and Arduino Platform

    NASA Astrophysics Data System (ADS)

    Kuan, Wen-Hsuan; Tseng, Chi-Hung; Chen, Sufen; Wong, Ching-Chang

    2016-06-01

    We propose an integrated curriculum to establish essential abilities of computer programming for the freshmen of a physics department. The implementation of the graphical-based interfaces from Scratch to LabVIEW then to LabVIEW for Arduino in the curriculum `Computer-Assisted Instrumentation in the Design of Physics Laboratories' brings rigorous algorithm and syntax protocols together with imagination, communication, scientific applications and experimental innovation. The effectiveness of the curriculum was evaluated via statistical analysis of questionnaires, interview responses, the increase in student numbers majoring in physics, and performance in a competition. The results provide quantitative support that the curriculum remove huge barriers to programming which occur in text-based environments, helped students gain knowledge of programming and instrumentation, and increased the students' confidence and motivation to learn physics and computer languages.

  2. OnSight: Multi-platform Visualization of the Surface of Mars

    NASA Astrophysics Data System (ADS)

    Abercrombie, S. P.; Menzies, A.; Winter, A.; Clausen, M.; Duran, B.; Jorritsma, M.; Goddard, C.; Lidawer, A.

    2017-12-01

    A key challenge of planetary geology is to develop an understanding of an environment that humans cannot (yet) visit. Instead, scientists rely on visualizations created from images sent back by robotic explorers, such as the Curiosity Mars rover. OnSight is a multi-platform visualization tool that helps scientists and engineers to visualize the surface of Mars. Terrain visualization allows scientists to understand the scale and geometric relationships of the environment around the Curiosity rover, both for scientific understanding and for tactical consideration in safely operating the rover. OnSight includes a web-based 2D/3D visualization tool, as well as an immersive mixed reality visualization. In addition, OnSight offers a novel feature for communication among the science team. Using the multiuser feature of OnSight, scientists can meet virtually on Mars, to discuss geology in a shared spatial context. Combining web-based visualization with immersive visualization allows OnSight to leverage strengths of both platforms. This project demonstrates how 3D visualization can be adapted to either an immersive environment or a computer screen, and will discuss advantages and disadvantages of both platforms.

  3. AstroGrid-D: Grid technology for astronomical science

    NASA Astrophysics Data System (ADS)

    Enke, Harry; Steinmetz, Matthias; Adorf, Hans-Martin; Beck-Ratzka, Alexander; Breitling, Frank; Brüsemeister, Thomas; Carlson, Arthur; Ensslin, Torsten; Högqvist, Mikael; Nickelt, Iliya; Radke, Thomas; Reinefeld, Alexander; Reiser, Angelika; Scholl, Tobias; Spurzem, Rainer; Steinacker, Jürgen; Voges, Wolfgang; Wambsganß, Joachim; White, Steve

    2011-02-01

    We present status and results of AstroGrid-D, a joint effort of astrophysicists and computer scientists to employ grid technology for scientific applications. AstroGrid-D provides access to a network of distributed machines with a set of commands as well as software interfaces. It allows simple use of computer and storage facilities and to schedule or monitor compute tasks and data management. It is based on the Globus Toolkit middleware (GT4). Chapter 1 describes the context which led to the demand for advanced software solutions in Astrophysics, and we state the goals of the project. We then present characteristic astrophysical applications that have been implemented on AstroGrid-D in chapter 2. We describe simulations of different complexity, compute-intensive calculations running on multiple sites (Section 2.1), and advanced applications for specific scientific purposes (Section 2.2), such as a connection to robotic telescopes (Section 2.2.3). We can show from these examples how grid execution improves e.g. the scientific workflow. Chapter 3 explains the software tools and services that we adapted or newly developed. Section 3.1 is focused on the administrative aspects of the infrastructure, to manage users and monitor activity. Section 3.2 characterises the central components of our architecture: The AstroGrid-D information service to collect and store metadata, a file management system, the data management system, and a job manager for automatic submission of compute tasks. We summarise the successfully established infrastructure in chapter 4, concluding with our future plans to establish AstroGrid-D as a platform of modern e-Astronomy.

  4. Component-based integration of chemistry and optimization software.

    PubMed

    Kenny, Joseph P; Benson, Steven J; Alexeev, Yuri; Sarich, Jason; Janssen, Curtis L; McInnes, Lois Curfman; Krishnan, Manojkumar; Nieplocha, Jarek; Jurrus, Elizabeth; Fahlstrom, Carl; Windus, Theresa L

    2004-11-15

    Typical scientific software designs make rigid assumptions regarding programming language and data structures, frustrating software interoperability and scientific collaboration. Component-based software engineering is an emerging approach to managing the increasing complexity of scientific software. Component technology facilitates code interoperability and reuse. Through the adoption of methodology and tools developed by the Common Component Architecture Forum, we have developed a component architecture for molecular structure optimization. Using the NWChem and Massively Parallel Quantum Chemistry packages, we have produced chemistry components that provide capacity for energy and energy derivative evaluation. We have constructed geometry optimization applications by integrating the Toolkit for Advanced Optimization, Portable Extensible Toolkit for Scientific Computation, and Global Arrays packages, which provide optimization and linear algebra capabilities. We present a brief overview of the component development process and a description of abstract interfaces for chemical optimizations. The components conforming to these abstract interfaces allow the construction of applications using different chemistry and mathematics packages interchangeably. Initial numerical results for the component software demonstrate good performance, and highlight potential research enabled by this platform.

  5. Current trends for customized biomedical software tools.

    PubMed

    Khan, Haseeb Ahmad

    2017-01-01

    In the past, biomedical scientists were solely dependent on expensive commercial software packages for various applications. However, the advent of user-friendly programming languages and open source platforms has revolutionized the development of simple and efficient customized software tools for solving specific biomedical problems. Many of these tools are designed and developed by biomedical scientists independently or with the support of computer experts and often made freely available for the benefit of scientific community. The current trends for customized biomedical software tools are highlighted in this short review.

  6. Strain engineering of electronic and magnetic properties of Ga2S2 nanoribbons

    NASA Astrophysics Data System (ADS)

    Wang, Bao-Ji; Li, Xiao-Hua; Zhang, Li-Wei; Wang, Guo-Dong; Ke, San-Huang

    2017-05-01

    Not Available Project supported by the National Natural Science Foundation of China (Grant Nos. 11174220 and 11374226), the Key Scientific Research Project of the Henan Institutions of Higher Learning, China (Grant No. 16A140009), the Program for Innovative Research Team of Henan Polytechnic University, China (Grant Nos. T2015-3 and T2016-2), the Doctoral Foundation of Henan Polytechnic University, China (Grant No. B2015-46), and the High-performance Grid Computing Platform of Henan Polytechnic University, China.

  7. Web-based hydrodynamics computing

    NASA Astrophysics Data System (ADS)

    Shimoide, Alan; Lin, Luping; Hong, Tracie-Lynne; Yoon, Ilmi; Aragon, Sergio R.

    2005-01-01

    Proteins are long chains of amino acids that have a definite 3-d conformation and the shape of each protein is vital to its function. Since proteins are normally in solution, hydrodynamics (describes the movement of solvent around a protein as a function of shape and size of the molecule) can be used to probe the size and shape of proteins compared to those derived from X-ray crystallography. The computation chain needed for these hydrodynamics calculations consists of several separate programs by different authors on various platforms and often requires 3D visualizations of intermediate results. Due to the complexity, tools developed by a particular research group are not readily available for use by other groups, nor even by the non-experts within the same research group. To alleviate this situation, and to foment the easy and wide distribution of computational tools worldwide, we developed a web based interactive computational environment (WICE) including interactive 3D visualization that can be used with any web browser. Java based technologies were used to provide a platform neutral, user-friendly solution. Java Server Pages (JSP), Java Servlets, Java Beans, JOGL (Java bindings for OpenGL), and Java Web Start were used to create a solution that simplifies the computing chain for the user allowing the user to focus on their scientific research. WICE hides complexity from the user and provides robust and sophisticated visualization through a web browser.

  8. Web-based hydrodynamics computing

    NASA Astrophysics Data System (ADS)

    Shimoide, Alan; Lin, Luping; Hong, Tracie-Lynne; Yoon, Ilmi; Aragon, Sergio R.

    2004-12-01

    Proteins are long chains of amino acids that have a definite 3-d conformation and the shape of each protein is vital to its function. Since proteins are normally in solution, hydrodynamics (describes the movement of solvent around a protein as a function of shape and size of the molecule) can be used to probe the size and shape of proteins compared to those derived from X-ray crystallography. The computation chain needed for these hydrodynamics calculations consists of several separate programs by different authors on various platforms and often requires 3D visualizations of intermediate results. Due to the complexity, tools developed by a particular research group are not readily available for use by other groups, nor even by the non-experts within the same research group. To alleviate this situation, and to foment the easy and wide distribution of computational tools worldwide, we developed a web based interactive computational environment (WICE) including interactive 3D visualization that can be used with any web browser. Java based technologies were used to provide a platform neutral, user-friendly solution. Java Server Pages (JSP), Java Servlets, Java Beans, JOGL (Java bindings for OpenGL), and Java Web Start were used to create a solution that simplifies the computing chain for the user allowing the user to focus on their scientific research. WICE hides complexity from the user and provides robust and sophisticated visualization through a web browser.

  9. Peer-to-peer architectures for exascale computing : LDRD final report.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vorobeychik, Yevgeniy; Mayo, Jackson R.; Minnich, Ronald G.

    2010-09-01

    The goal of this research was to investigate the potential for employing dynamic, decentralized software architectures to achieve reliability in future high-performance computing platforms. These architectures, inspired by peer-to-peer networks such as botnets that already scale to millions of unreliable nodes, hold promise for enabling scientific applications to run usefully on next-generation exascale platforms ({approx} 10{sup 18} operations per second). Traditional parallel programming techniques suffer rapid deterioration of performance scaling with growing platform size, as the work of coping with increasingly frequent failures dominates over useful computation. Our studies suggest that new architectures, in which failures are treated as ubiquitousmore » and their effects are considered as simply another controllable source of error in a scientific computation, can remove such obstacles to exascale computing for certain applications. We have developed a simulation framework, as well as a preliminary implementation in a large-scale emulation environment, for exploration of these 'fault-oblivious computing' approaches. High-performance computing (HPC) faces a fundamental problem of increasing total component failure rates due to increasing system sizes, which threaten to degrade system reliability to an unusable level by the time the exascale range is reached ({approx} 10{sup 18} operations per second, requiring of order millions of processors). As computer scientists seek a way to scale system software for next-generation exascale machines, it is worth considering peer-to-peer (P2P) architectures that are already capable of supporting 10{sup 6}-10{sup 7} unreliable nodes. Exascale platforms will require a different way of looking at systems and software because the machine will likely not be available in its entirety for a meaningful execution time. Realistic estimates of failure rates range from a few times per day to more than once per hour for these platforms. P2P architectures give us a starting point for crafting applications and system software for exascale. In the context of the Internet, P2P applications (e.g., file sharing, botnets) have already solved this problem for 10{sup 6}-10{sup 7} nodes. Usually based on a fractal distributed hash table structure, these systems have proven robust in practice to constant and unpredictable outages, failures, and even subversion. For example, a recent estimate of botnet turnover (i.e., the number of machines leaving and joining) is about 11% per week. Nonetheless, P2P networks remain effective despite these failures: The Conficker botnet has grown to {approx} 5 x 10{sup 6} peers. Unlike today's system software and applications, those for next-generation exascale machines cannot assume a static structure and, to be scalable over millions of nodes, must be decentralized. P2P architectures achieve both, and provide a promising model for 'fault-oblivious computing'. This project aimed to study the dynamics of P2P networks in the context of a design for exascale systems and applications. Having no single point of failure, the most successful P2P architectures are adaptive and self-organizing. While there has been some previous work applying P2P to message passing, little attention has been previously paid to the tightly coupled exascale domain. Typically, the per-node footprint of P2P systems is small, making them ideal for HPC use. The implementation on each peer node cooperates en masse to 'heal' disruptions rather than relying on a controlling 'master' node. Understanding this cooperative behavior from a complex systems viewpoint is essential to predicting useful environments for the inextricably unreliable exascale platforms of the future. We sought to obtain theoretical insight into the stability and large-scale behavior of candidate architectures, and to work toward leveraging Sandia's Emulytics platform to test promising candidates in a realistic (ultimately {ge} 10{sup 7} nodes) setting. Our primary example applications are drawn from linear algebra: a Jacobi relaxation solver for the heat equation, and the closely related technique of value iteration in optimization. We aimed to apply P2P concepts in designing implementations capable of surviving an unreliable machine of 10{sup 6} nodes.« less

  10. Publication of science data on CD-ROM: A guide and example

    NASA Technical Reports Server (NTRS)

    Angelici, Gary; Skiles, J. W.

    1993-01-01

    CD-ROM (Compact Disk-Read Only Memory) is becoming the standard media not only in audio recording, but also in the publication of data and information accessible on many computer platforms. Little has been written about the complicated process involved in creating easy-to-use, high quality, and useful CD-ROM's containing scientific data. This document is a manual designed to aid those who are responsible for the publication of scientific data on CD-ROM. All aspects and steps of the procedure are covered, from feasibility assessment through disk design, data preparation, disc mastering, and CD-ROM distribution. General advice and actual examples are based on lessons learned from the publication of scientific data for an interdisciplinary field experiment. Appendices include actual files from a CD-ROM, a purchase request for CD-ROM mastering services, and the disk art for the first disk published for the project.

  11. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Garzoglio, Gabriele

    The Fermilab Grid and Cloud Computing Department and the KISTI Global Science experimental Data hub Center are working on a multi-year Collaborative Research and Development Agreement.With the knowledge developed in the first year on how to provision and manage a federation of virtual machines through Cloud management systems. In this second year, we expanded the work on provisioning and federation, increasing both scale and diversity of solutions, and we started to build on-demand services on the established fabric, introducing the paradigm of Platform as a Service to assist with the execution of scientific workflows. We have enabled scientific workflows ofmore » stakeholders to run on multiple cloud resources at the scale of 1,000 concurrent machines. The demonstrations have been in the areas of (a) Virtual Infrastructure Automation and Provisioning, (b) Interoperability and Federation of Cloud Resources, and (c) On-demand Services for ScientificWorkflows.« less

  12. The Science DMZ: A Network Design Pattern for Data-Intensive Science

    DOE PAGES

    Dart, Eli; Rotman, Lauren; Tierney, Brian; ...

    2014-01-01

    The ever-increasing scale of scientific data has become a significant challenge for researchers that rely on networks to interact with remote computing systems and transfer results to collaborators worldwide. Despite the availability of high-capacity connections, scientists struggle with inadequate cyberinfrastructure that cripples data transfer performance, and impedes scientific progress. The Science DMZ paradigm comprises a proven set of network design patterns that collectively address these problems for scientists. We explain the Science DMZ model, including network architecture, system configuration, cybersecurity, and performance tools, that creates an optimized network environment for science. We describe use cases from universities, supercomputing centers andmore » research laboratories, highlighting the effectiveness of the Science DMZ model in diverse operational settings. In all, the Science DMZ model is a solid platform that supports any science workflow, and flexibly accommodates emerging network technologies. As a result, the Science DMZ vastly improves collaboration, accelerating scientific discovery.« less

  13. Scientific Programming Using Java: A Remote Sensing Example

    NASA Technical Reports Server (NTRS)

    Prados, Don; Mohamed, Mohamed A.; Johnson, Michael; Cao, Changyong; Gasser, Jerry

    1999-01-01

    This paper presents results of a project to port remote sensing code from the C programming language to Java. The advantages and disadvantages of using Java versus C as a scientific programming language in remote sensing applications are discussed. Remote sensing applications deal with voluminous data that require effective memory management, such as buffering operations, when processed. Some of these applications also implement complex computational algorithms, such as Fast Fourier Transformation analysis, that are very performance intensive. Factors considered include performance, precision, complexity, rapidity of development, ease of code reuse, ease of maintenance, memory management, and platform independence. Performance of radiometric calibration code written in Java for the graphical user interface and of using C for the domain model are also presented.

  14. Leveraging e-Science infrastructure for electrochemical research.

    PubMed

    Peachey, Tom; Mashkina, Elena; Lee, Chong-Yong; Enticott, Colin; Abramson, David; Bond, Alan M; Elton, Darrell; Gavaghan, David J; Stevenson, Gareth P; Kennedy, Gareth F

    2011-08-28

    As in many scientific disciplines, modern chemistry involves a mix of experimentation and computer-supported theory. Historically, these skills have been provided by different groups, and range from traditional 'wet' laboratory science to advanced numerical simulation. Increasingly, progress is made by global collaborations, in which new theory may be developed in one part of the world and applied and tested in the laboratory elsewhere. e-Science, or cyber-infrastructure, underpins such collaborations by providing a unified platform for accessing scientific instruments, computers and data archives, and collaboration tools. In this paper we discuss the application of advanced e-Science software tools to electrochemistry research performed in three different laboratories--two at Monash University in Australia and one at the University of Oxford in the UK. We show that software tools that were originally developed for a range of application domains can be applied to electrochemical problems, in particular Fourier voltammetry. Moreover, we show that, by replacing ad-hoc manual processes with e-Science tools, we obtain more accurate solutions automatically.

  15. Climate Modeling with a Million CPUs

    NASA Astrophysics Data System (ADS)

    Tobis, M.; Jackson, C. S.

    2010-12-01

    Michael Tobis, Ph.D. Research Scientist Associate University of Texas Institute for Geophysics Charles S. Jackson Research Scientist University of Texas Institute for Geophysics Meteorological, oceanographic, and climatological applications have been at the forefront of scientific computing since its inception. The trend toward ever larger and more capable computing installations is unabated. However, much of the increase in capacity is accompanied by an increase in parallelism and a concomitant increase in complexity. An increase of at least four additional orders of magnitude in the computational power of scientific platforms is anticipated. It is unclear how individual climate simulations can continue to make effective use of the largest platforms. Conversion of existing community codes to higher resolution, or to more complex phenomenology, or both, presents daunting design and validation challenges. Our alternative approach is to use the expected resources to run very large ensembles of simulations of modest size, rather than to await the emergence of very large simulations. We are already doing this in exploring the parameter space of existing models using the Multiple Very Fast Simulated Annealing algorithm, which was developed for seismic imaging. Our experiments have the dual intentions of tuning the model and identifying ranges of parameter uncertainty. Our approach is less strongly constrained by the dimensionality of the parameter space than are competing methods. Nevertheless, scaling up remains costly. Much could be achieved by increasing the dimensionality of the search and adding complexity to the search algorithms. Such ensemble approaches scale naturally to very large platforms. Extensions of the approach are anticipated. For example, structurally different models can be tuned to comparable effectiveness. This can provide an objective test for which there is no realistic precedent with smaller computations. We find ourselves inventing new code to manage our ensembles. Component computations involve tens to hundreds of CPUs and tens to hundreds of hours. The results of these moderately large parallel jobs influence the scheduling of subsequent jobs, and complex algorithms may be easily contemplated for this. The operating system concept of a "thread" re-emerges at a very coarse level, where each thread manages atomic computations of thousands of CPU-hours. That is, rather than multiple threads operating on a processor, at this level, multiple processors operate within a single thread. In collaboration with the Texas Advanced Computing Center, we are developing a software library at the system level, which should facilitate the development of computations involving complex strategies which invoke large numbers of moderately large multi-processor jobs. While this may have applications in other sciences, our key intent is to better characterize the coupled behavior of a very large set of climate model configurations.

  16. Low-cost Citizen Science Balloon Platform for Measuring Air Pollutants to Improve Satellite Retrieval Algorithms

    NASA Astrophysics Data System (ADS)

    Potosnak, M. J.; Beck-Winchatz, B.; Ritter, P.

    2016-12-01

    High-altitude balloons (HABs) are an engaging platform for citizen science and formal and informal STEM education. However, the logistics of launching, chasing and recovering a payload on a 1200 g or 1500 g balloon can be daunting for many novice school groups and citizen scientists, and the cost can be prohibitive. In addition, there are many interesting scientific applications that do not require reaching the stratosphere, including measuring atmospheric pollutants in the planetary boundary layer. With a large number of citizen scientist flights, these data can be used to constrain satellite retrieval algorithms. In this poster presentation, we discuss a novel approach based on small (30 g) balloons that are cheap and easy to handle, and low-cost tracking devices (SPOT trackers for hikers) that do not require a radio license. Our scientific goal is to measure air quality in the lower troposphere. For example, particulate matter (PM) is an air pollutant that varies on small spatial scales and has sources in rural areas like biomass burning and farming practices such as tilling. Our HAB platform test flight incorporates an optical PM sensor, an integrated single board computer that records the PM sensor signal in addition to flight parameters (pressure, location and altitude), and a low-cost tracking system. Our goal is for the entire platform to cost less than $500. While the datasets generated by these flights are typically small, integrating a network of flight data from citizen scientists into a form usable for comparison to satellite data will require big data techniques.

  17. Arkheia: Data Management and Communication for Open Computational Neuroscience

    PubMed Central

    Antolík, Ján; Davison, Andrew P.

    2018-01-01

    Two trends have been unfolding in computational neuroscience during the last decade. First, a shift of focus to increasingly complex and heterogeneous neural network models, with a concomitant increase in the level of collaboration within the field (whether direct or in the form of building on top of existing tools and results). Second, a general trend in science toward more open communication, both internally, with other potential scientific collaborators, and externally, with the wider public. This multi-faceted development toward more integrative approaches and more intense communication within and outside of the field poses major new challenges for modelers, as currently there is a severe lack of tools to help with automatic communication and sharing of all aspects of a simulation workflow to the rest of the community. To address this important gap in the current computational modeling software infrastructure, here we introduce Arkheia. Arkheia is a web-based open science platform for computational models in systems neuroscience. It provides an automatic, interactive, graphical presentation of simulation results, experimental protocols, and interactive exploration of parameter searches, in a web browser-based application. Arkheia is focused on automatic presentation of these resources with minimal manual input from users. Arkheia is written in a modular fashion with a focus on future development of the platform. The platform is designed in an open manner, with a clearly defined and separated API for database access, so that any project can write its own backend translating its data into the Arkheia database format. Arkheia is not a centralized platform, but allows any user (or group of users) to set up their own repository, either for public access by the general population, or locally for internal use. Overall, Arkheia provides users with an automatic means to communicate information about not only their models but also individual simulation results and the entire experimental context in an approachable graphical manner, thus facilitating the user's ability to collaborate in the field and outreach to a wider audience. PMID:29556187

  18. Arkheia: Data Management and Communication for Open Computational Neuroscience.

    PubMed

    Antolík, Ján; Davison, Andrew P

    2018-01-01

    Two trends have been unfolding in computational neuroscience during the last decade. First, a shift of focus to increasingly complex and heterogeneous neural network models, with a concomitant increase in the level of collaboration within the field (whether direct or in the form of building on top of existing tools and results). Second, a general trend in science toward more open communication, both internally, with other potential scientific collaborators, and externally, with the wider public. This multi-faceted development toward more integrative approaches and more intense communication within and outside of the field poses major new challenges for modelers, as currently there is a severe lack of tools to help with automatic communication and sharing of all aspects of a simulation workflow to the rest of the community. To address this important gap in the current computational modeling software infrastructure, here we introduce Arkheia. Arkheia is a web-based open science platform for computational models in systems neuroscience. It provides an automatic, interactive, graphical presentation of simulation results, experimental protocols, and interactive exploration of parameter searches, in a web browser-based application. Arkheia is focused on automatic presentation of these resources with minimal manual input from users. Arkheia is written in a modular fashion with a focus on future development of the platform. The platform is designed in an open manner, with a clearly defined and separated API for database access, so that any project can write its own backend translating its data into the Arkheia database format. Arkheia is not a centralized platform, but allows any user (or group of users) to set up their own repository, either for public access by the general population, or locally for internal use. Overall, Arkheia provides users with an automatic means to communicate information about not only their models but also individual simulation results and the entire experimental context in an approachable graphical manner, thus facilitating the user's ability to collaborate in the field and outreach to a wider audience.

  19. OWLS as platform technology in OPTOS satellite

    NASA Astrophysics Data System (ADS)

    Rivas Abalo, J.; Martínez Oter, J.; Arruego Rodríguez, I.; Martín-Ortega Rico, A.; de Mingo Martín, J. R.; Jiménez Martín, J. J.; Martín Vodopivec, B.; Rodríguez Bustabad, S.; Guerrero Padrón, H.

    2017-12-01

    The aim of this work is to show the Optical Wireless Link to intraSpacecraft Communications (OWLS) technology as a platform technology for space missions, and more specifically its use within the On-Board Communication system of OPTOS satellite. OWLS technology was proposed by Instituto Nacional de Técnica Aeroespacial (INTA) at the end of the 1990s and developed along 10 years through a number of ground demonstrations, technological developments and in-orbit experiments. Its main benefits are: mass reduction, flexibility, and simplification of the Assembly, Integration and Tests phases. The final step was to go from an experimental technology to a platform one. This step was carried out in the OPTOS satellite, which makes use of optical wireless links in a distributed network based on an OLWS implementation of the CAN bus. OPTOS is the first fully wireless satellite. It is based on the triple configuration (3U) of the popular Cubesat standard, and was completely built at INTA. It was conceived to procure a fast development, low cost, and yet reliable platform to the Spanish scientific community, acting as a test bed for space born science and technology. OPTOS presents a distributed OBDH architecture in which all satellite's subsystems and payloads incorporate a small Distributed On-Board Computer (OBC) Terminal (DOT). All DOTs (7 in total) communicate between them by means of the OWLS-CAN that enables full data sharing capabilities. This collaboration allows them to perform all tasks that would normally be carried out by a centralized On-Board Computer.

  20. Collective Framework and Performance Optimizations to Open MPI for Cray XT Platforms

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ladd, Joshua S; Gorentla Venkata, Manjunath; Shamis, Pavel

    2011-01-01

    The performance and scalability of collective operations plays a key role in the performance and scalability of many scientific applications. Within the Open MPI code base we have developed a general purpose hierarchical collective operations framework called Cheetah, and applied it at large scale on the Oak Ridge Leadership Computing Facility's Jaguar (OLCF) platform, obtaining better performance and scalability than the native MPI implementation. This paper discuss Cheetah's design and implementation, and optimizations to the framework for Cray XT 5 platforms. Our results show that the Cheetah's Broadcast and Barrier perform better than the native MPI implementation. For medium data,more » the Cheetah's Broadcast outperforms the native MPI implementation by 93% for 49,152 processes problem size. For small and large data, it out performs the native MPI implementation by 10% and 9%, respectively, at 24,576 processes problem size. The Cheetah's Barrier performs 10% better than the native MPI implementation for 12,288 processes problem size.« less

  1. SenSyF Experience on Integration of EO Services in a Generic, Cloud-Based EO Exploitation Platform

    NASA Astrophysics Data System (ADS)

    Almeida, Nuno; Catarino, Nuno; Gutierrez, Antonio; Grosso, Nuno; Andrade, Joao; Caumont, Herve; Goncalves, Pedro; Villa, Guillermo; Mangin, Antoine; Serra, Romain; Johnsen, Harald; Grydeland, Tom; Emsley, Stephen; Jauch, Eduardo; Moreno, Jose; Ruiz, Antonio

    2016-08-01

    SenSyF is a cloud-based data processing framework for EO- based services. It has been pioneer in addressing Big Data issues from the Earth Observation point of view, and is a precursor of several of the technologies and methodologies that will be deployed in ESA's Thematic Exploitation Platforms and other related systems.The SenSyF system focuses on developing fully automated data management, together with access to a processing and exploitation framework, including Earth Observation specific tools. SenSyF is both a development and validation platform for data intensive applications using Earth Observation data. With SenSyF, scientific, institutional or commercial institutions developing EO- based applications and services can take advantage of distributed computational and storage resources, tailored for applications dependent on big Earth Observation data, and without resorting to deep infrastructure and technological investments.This paper describes the integration process and the experience gathered from different EO Service providers during the project.

  2. agriGO v2.0: a GO analysis toolkit for the agricultural community, 2017 update

    PubMed Central

    Tian, Tian; Liu, Yue; Yan, Hengyu; You, Qi; Yi, Xin; Du, Zhou

    2017-01-01

    Abstract The agriGO platform, which has been serving the scientific community for >10 years, specifically focuses on gene ontology (GO) enrichment analyses of plant and agricultural species. We continuously maintain and update the databases and accommodate the various requests of our global users. Here, we present our updated agriGO that has a largely expanded number of supporting species (394) and datatypes (865). In addition, a larger number of species have been classified into groups covering crops, vegetables, fish, birds and insects closely related to the agricultural community. We further improved the computational efficiency, including the batch analysis and P-value distribution (PVD), and the user-friendliness of the web pages. More visualization features were added to the platform, including SEACOMPARE (cross comparison of singular enrichment analysis), direct acyclic graph (DAG) and Scatter Plots, which can be merged by choosing any significant GO term. The updated platform agriGO v2.0 is now publicly accessible at http://systemsbiology.cau.edu.cn/agriGOv2/. PMID:28472432

  3. UAH/NASA Workshop on Space Science Platform

    NASA Technical Reports Server (NTRS)

    Wu, S. T. (Editor); Morgan, S. (Editor)

    1978-01-01

    The scientific user requirements for a space science platform were defined. The potential user benefits, technological implications and cost of space platforms were examined. Cost effectiveness of the platforms' capabilities were also examined.

  4. The SBAS Sentinel-1 Surveillance service for automatic and systematic generation of Earth surface displacement within the GEP platform.

    NASA Astrophysics Data System (ADS)

    Casu, Francesco; De Luca, Claudio; Lanari, Riccardo; Manunta, Michele; Zinno, Ivana

    2017-04-01

    The Geohazards Exploitation Platform (GEP) is an ESA activity of the Earth Observation (EO) ground segment to demonstrate the benefit of new technologies for large scale processing of EO data. GEP aims at providing both on-demand processing services for scientific users of the geohazards community and an integration platform for new EO data analysis processors dedicated to scientists and other expert users. In the Remote Sensing scenario, a crucial role is played by the recently launched Sentinel-1 (S1) constellation that, with its global acquisition policy, has literally flooded the scientific community with a huge amount of data acquired over large part of the Earth on a regular basis (down to 6-days with both Sentinel-1A and 1B passes). Moreover, the S1 data, as part of the European Copernicus program, are openly and freely accessible, thus fostering their use for the development of tools for Earth surface monitoring. In particular, due to their specific SAR Interferometry (InSAR) design, Sentinel-1 satellites can be exploited to build up operational services for the generation of advanced interferometric products that can be very useful within risk management and natural hazard monitoring scenarios. Accordingly, in this work we present the activities carried out for the development, integration, and deployment of the SBAS Sentinel-1 Surveillance service of CNR-IREA within the GEP platform. This service is based on a parallel implementation of the SBAS approach, referred to as P-SBAS, able to effectively run in large distributed computing infrastructures (grid and cloud) and to allow for an efficient computation of large SAR data sequences with advanced DInSAR approaches. In particular, the Surveillance service developed on GEP platform consists on the systematic and automatic processing of Sentinel-1 data on selected Areas of Interest (AoI) to generate updated surface displacement time series via the SBAS-InSAR algorithm. We built up a system that is automatically triggered by every new S1 acquisition over the AoI, once it is available on the S1 catalogue. Then, tacking benefit from the SBAS results generated by previous runs of the service, the system processes the new acquisitions only, thus saving storage space and computing time and finally generating an updated SBAS time series. The same P-SBAS processor underlying the Surveillance service is also available through the GEP as a standard on-demand DInSAR service, thus allowing the scientific community to generate S1 SBAS time series on areas not covered by the Surveillance service itself. It is worth noting that the SBAS Sentinel-1 Surveillance service on GEP represents the core of the EPOSAR service, which will deliver S1 displacement time series of Earth surface on a regular basis for the European Plate Observing System (EPOS) Research Infrastructure community. In particular, the main goal of EPOSAR is to contribute with advanced technique and methods, which have already well demonstrated their effectiveness and relevance, in investigating the physical processes controlling earthquakes, volcanic eruptions and unrest episodes as well as those driving tectonics and Earth surface dynamics.

  5. Dynamic analysis, transformation, dissemination and applications of scientific multidimensional data in ArcGIS Platform

    NASA Astrophysics Data System (ADS)

    Shrestha, S. R.; Collow, T. W.; Rose, B.

    2016-12-01

    Scientific datasets are generated from various sources and platforms but they are typically produced either by earth observation systems or by modelling systems. These are widely used for monitoring, simulating, or analyzing measurements that are associated with physical, chemical, and biological phenomena over the ocean, atmosphere, or land. A significant subset of scientific datasets stores values directly as rasters or in a form that can be rasterized. This is where a value exists at every cell in a regular grid spanning the spatial extent of the dataset. Government agencies like NOAA, NASA, EPA, USGS produces large volumes of near real-time, forecast, and historical data that drives climatological and meteorological studies, and underpins operations ranging from weather prediction to sea ice loss. Modern science is computationally intensive because of the availability of an enormous amount of scientific data, the adoption of data-driven analysis, and the need to share these dataset and research results with the public. ArcGIS as a platform is sophisticated and capable of handling such complex domain. We'll discuss constructs and capabilities applicable to multidimensional gridded data that can be conceptualized as a multivariate space-time cube. Building on the concept of a two-dimensional raster, a typical multidimensional raster dataset could contain several "slices" within the same spatial extent. We will share a case from the NOAA Climate Forecast Systems Reanalysis (CFSR) multidimensional data as an example of how large collections of rasters can be efficiently organized and managed through a data model within a geodatabase called "Mosaic dataset" and dynamically transformed and analyzed using raster functions. A raster function is a lightweight, raster-valued transformation defined over a mixed set of raster and scalar input. That means, just like any tool, you can provide a raster function with input parameters. It enables dynamic processing of only the data that's being displayed on the screen or requested by an application. We will present the dynamic processing and analysis of CFSR data using the chains of raster function and share it as dynamic multidimensional image service. This workflow and capabilities can be easily applied to any scientific data formats that are supported in mosaic dataset.

  6. Java Performance for Scientific Applications on LLNL Computer Systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kapfer, C; Wissink, A

    2002-05-10

    Languages in use for high performance computing at the laboratory--Fortran (f77 and f90), C, and C++--have many years of development behind them and are generally considered the fastest available. However, Fortran and C do not readily extend to object-oriented programming models, limiting their capability for very complex simulation software. C++ facilitates object-oriented programming but is a very complex and error-prone language. Java offers a number of capabilities that these other languages do not. For instance it implements cleaner (i.e., easier to use and less prone to errors) object-oriented models than C++. It also offers networking and security as part ofmore » the language standard, and cross-platform executables that make it architecture neutral, to name a few. These features have made Java very popular for industrial computing applications. The aim of this paper is to explain the trade-offs in using Java for large-scale scientific applications at LLNL. Despite its advantages, the computational science community has been reluctant to write large-scale computationally intensive applications in Java due to concerns over its poor performance. However, considerable progress has been made over the last several years. The Java Grande Forum [1] has been promoting the use of Java for large-scale computing. Members have introduced efficient array libraries, developed fast just-in-time (JIT) compilers, and built links to existing packages used in high performance parallel computing.« less

  7. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Allada, Veerendra, Benjegerdes, Troy; Bode, Brett

    Commodity clusters augmented with application accelerators are evolving as competitive high performance computing systems. The Graphical Processing Unit (GPU) with a very high arithmetic density and performance per price ratio is a good platform for the scientific application acceleration. In addition to the interconnect bottlenecks among the cluster compute nodes, the cost of memory copies between the host and the GPU device have to be carefully amortized to improve the overall efficiency of the application. Scientific applications also rely on efficient implementation of the BAsic Linear Algebra Subroutines (BLAS), among which the General Matrix Multiply (GEMM) is considered as themore » workhorse subroutine. In this paper, they study the performance of the memory copies and GEMM subroutines that are critical to port the computational chemistry algorithms to the GPU clusters. To that end, a benchmark based on the NetPIPE framework is developed to evaluate the latency and bandwidth of the memory copies between the host and the GPU device. The performance of the single and double precision GEMM subroutines from the NVIDIA CUBLAS 2.0 library are studied. The results have been compared with that of the BLAS routines from the Intel Math Kernel Library (MKL) to understand the computational trade-offs. The test bed is a Intel Xeon cluster equipped with NVIDIA Tesla GPUs.« less

  8. 78 FR 44136 - Submission for OMB review; 30-day Comment Request: National Cancer Institute (NCI) Cancer...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-07-23

    ...; 30-day Comment Request: National Cancer Institute (NCI) Cancer Nanotechnology Platform Partnership...: Dorothy Farrell, Center for Strategic Scientific Initiatives, Office of Cancer Nanotechnology Research... Cancer Institute (NCI) Alliance for Nanotechnology in Cancer Platform Partnership Scientific Progress...

  9. A sustainability model based on cloud infrastructures for core and downstream Copernicus services

    NASA Astrophysics Data System (ADS)

    Manunta, Michele; Calò, Fabiana; De Luca, Claudio; Elefante, Stefano; Farres, Jordi; Guzzetti, Fausto; Imperatore, Pasquale; Lanari, Riccardo; Lengert, Wolfgang; Zinno, Ivana; Casu, Francesco

    2014-05-01

    The incoming Sentinel missions have been designed to be the first remote sensing satellite system devoted to operational services. In particular, the Synthetic Aperture Radar (SAR) Sentinel-1 sensor, dedicated to globally acquire over land in the interferometric mode, guarantees an unprecedented capability to investigate and monitor the Earth surface deformations related to natural and man-made hazards. Thanks to the global coverage strategy and 12-day revisit time, jointly with the free and open access data policy, such a system will allow an extensive application of Differential Interferometric SAR (DInSAR) techniques. In such a framework, European Commission has been funding several projects through the GMES and Copernicus programs, aimed at preparing the user community to the operational and extensive use of Sentinel-1 products for risk mitigation and management purposes. Among them, the FP7-DORIS, an advanced GMES downstream service coordinated by Italian National Council of Research (CNR), is based on the fully exploitation of advanced DInSAR products in landslides and subsidence contexts. In particular, the DORIS project (www.doris-project.eu) has developed innovative scientific techniques and methodologies to support Civil Protection Authorities (CPA) during the pre-event, event, and post-event phases of the risk management cycle. Nonetheless, the huge data stream expected from the Sentinel-1 satellite may jeopardize the effective use of such data in emergency response and security scenarios. This potential bottleneck can be properly overcome through the development of modern infrastructures, able to efficiently provide computing resources as well as advanced services for big data management, processing and dissemination. In this framework, CNR and ESA have tightened up a cooperation to foster the use of GRID and cloud computing platforms for remote sensing data processing, and to make available to a large audience advanced and innovative tools for DInSAR products generation and exploitation. In particular, CNR is porting the multi-temporal DInSAR technique referred to as Small Baseline Subset (SBAS) into the ESA G-POD (Grid Processing On Demand) and CIOP (Cloud Computing Operational Pilot) platforms (Elefante et al., 2013) within the SuperSites Exploitation Platform (SSEP) project, which aim is contributing to the development of an ecosystem for big geo-data processing and dissemination. This work focuses on presenting the main results that have been achieved by the DORIS project concerning the use of advanced DInSAR products for supporting CPA during the risk management cycle. Furthermore, based on the DORIS experience, a sustainability model for Core and Downstream Copernicus services based on the effective exploitation of cloud platforms is proposed. In this framework, remote sensing community, both service providers and users, can significantly benefit from the Helix Nebula-The Science Cloud initiative, created by European scientific institutions, agencies, SMEs and enterprises to pave the way for the development and exploitation of a cloud computing infrastructure for science. REFERENCES Elefante, S., Imperatore, P. , Zinno, I., M. Manunta, E. Mathot, F. Brito, J. Farres, W. Lengert, R. Lanari, F. Casu, 2013, "SBAS-DINSAR Time series generation on cloud computing platforms". IEEE IGARSS Conference, Melbourne (AU), July 2013.

  10. A Lightweight I/O Scheme to Facilitate Spatial and Temporal Queries of Scientific Data Analytics

    NASA Technical Reports Server (NTRS)

    Tian, Yuan; Liu, Zhuo; Klasky, Scott; Wang, Bin; Abbasi, Hasan; Zhou, Shujia; Podhorszki, Norbert; Clune, Tom; Logan, Jeremy; Yu, Weikuan

    2013-01-01

    In the era of petascale computing, more scientific applications are being deployed on leadership scale computing platforms to enhance the scientific productivity. Many I/O techniques have been designed to address the growing I/O bottleneck on large-scale systems by handling massive scientific data in a holistic manner. While such techniques have been leveraged in a wide range of applications, they have not been shown as adequate for many mission critical applications, particularly in data post-processing stage. One of the examples is that some scientific applications generate datasets composed of a vast amount of small data elements that are organized along many spatial and temporal dimensions but require sophisticated data analytics on one or more dimensions. Including such dimensional knowledge into data organization can be beneficial to the efficiency of data post-processing, which is often missing from exiting I/O techniques. In this study, we propose a novel I/O scheme named STAR (Spatial and Temporal AggRegation) to enable high performance data queries for scientific analytics. STAR is able to dive into the massive data, identify the spatial and temporal relationships among data variables, and accordingly organize them into an optimized multi-dimensional data structure before storing to the storage. This technique not only facilitates the common access patterns of data analytics, but also further reduces the application turnaround time. In particular, STAR is able to enable efficient data queries along the time dimension, a practice common in scientific analytics but not yet supported by existing I/O techniques. In our case study with a critical climate modeling application GEOS-5, the experimental results on Jaguar supercomputer demonstrate an improvement up to 73 times for the read performance compared to the original I/O method.

  11. Simple re-instantiation of small databases using cloud computing.

    PubMed

    Tan, Tin Wee; Xie, Chao; De Silva, Mark; Lim, Kuan Siong; Patro, C Pawan K; Lim, Shen Jean; Govindarajan, Kunde Ramamoorthy; Tong, Joo Chuan; Choo, Khar Heng; Ranganathan, Shoba; Khan, Asif M

    2013-01-01

    Small bioinformatics databases, unlike institutionally funded large databases, are vulnerable to discontinuation and many reported in publications are no longer accessible. This leads to irreproducible scientific work and redundant effort, impeding the pace of scientific progress. We describe a Web-accessible system, available online at http://biodb100.apbionet.org, for archival and future on demand re-instantiation of small databases within minutes. Depositors can rebuild their databases by downloading a Linux live operating system (http://www.bioslax.com), preinstalled with bioinformatics and UNIX tools. The database and its dependencies can be compressed into an ".lzm" file for deposition. End-users can search for archived databases and activate them on dynamically re-instantiated BioSlax instances, run as virtual machines over the two popular full virtualization standard cloud-computing platforms, Xen Hypervisor or vSphere. The system is adaptable to increasing demand for disk storage or computational load and allows database developers to use the re-instantiated databases for integration and development of new databases. Herein, we demonstrate that a relatively inexpensive solution can be implemented for archival of bioinformatics databases and their rapid re-instantiation should the live databases disappear.

  12. Compiler-based code generation and autotuning for geometric multigrid on GPU-accelerated supercomputers

    DOE PAGES

    Basu, Protonu; Williams, Samuel; Van Straalen, Brian; ...

    2017-04-05

    GPUs, with their high bandwidths and computational capabilities are an increasingly popular target for scientific computing. Unfortunately, to date, harnessing the power of the GPU has required use of a GPU-specific programming model like CUDA, OpenCL, or OpenACC. Thus, in order to deliver portability across CPU-based and GPU-accelerated supercomputers, programmers are forced to write and maintain two versions of their applications or frameworks. In this paper, we explore the use of a compiler-based autotuning framework based on CUDA-CHiLL to deliver not only portability, but also performance portability across CPU- and GPU-accelerated platforms for the geometric multigrid linear solvers found inmore » many scientific applications. We also show that with autotuning we can attain near Roofline (a performance bound for a computation and target architecture) performance across the key operations in the miniGMG benchmark for both CPU- and GPU-based architectures as well as for a multiple stencil discretizations and smoothers. We show that our technology is readily interoperable with MPI resulting in performance at scale equal to that obtained via hand-optimized MPI+CUDA implementation.« less

  13. Simple re-instantiation of small databases using cloud computing

    PubMed Central

    2013-01-01

    Background Small bioinformatics databases, unlike institutionally funded large databases, are vulnerable to discontinuation and many reported in publications are no longer accessible. This leads to irreproducible scientific work and redundant effort, impeding the pace of scientific progress. Results We describe a Web-accessible system, available online at http://biodb100.apbionet.org, for archival and future on demand re-instantiation of small databases within minutes. Depositors can rebuild their databases by downloading a Linux live operating system (http://www.bioslax.com), preinstalled with bioinformatics and UNIX tools. The database and its dependencies can be compressed into an ".lzm" file for deposition. End-users can search for archived databases and activate them on dynamically re-instantiated BioSlax instances, run as virtual machines over the two popular full virtualization standard cloud-computing platforms, Xen Hypervisor or vSphere. The system is adaptable to increasing demand for disk storage or computational load and allows database developers to use the re-instantiated databases for integration and development of new databases. Conclusions Herein, we demonstrate that a relatively inexpensive solution can be implemented for archival of bioinformatics databases and their rapid re-instantiation should the live databases disappear. PMID:24564380

  14. Compiler-based code generation and autotuning for geometric multigrid on GPU-accelerated supercomputers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Basu, Protonu; Williams, Samuel; Van Straalen, Brian

    GPUs, with their high bandwidths and computational capabilities are an increasingly popular target for scientific computing. Unfortunately, to date, harnessing the power of the GPU has required use of a GPU-specific programming model like CUDA, OpenCL, or OpenACC. Thus, in order to deliver portability across CPU-based and GPU-accelerated supercomputers, programmers are forced to write and maintain two versions of their applications or frameworks. In this paper, we explore the use of a compiler-based autotuning framework based on CUDA-CHiLL to deliver not only portability, but also performance portability across CPU- and GPU-accelerated platforms for the geometric multigrid linear solvers found inmore » many scientific applications. We also show that with autotuning we can attain near Roofline (a performance bound for a computation and target architecture) performance across the key operations in the miniGMG benchmark for both CPU- and GPU-based architectures as well as for a multiple stencil discretizations and smoothers. We show that our technology is readily interoperable with MPI resulting in performance at scale equal to that obtained via hand-optimized MPI+CUDA implementation.« less

  15. S-Genius, a universal software platform with versatile inverse problem resolution for scatterometry

    NASA Astrophysics Data System (ADS)

    Fuard, David; Troscompt, Nicolas; El Kalyoubi, Ismael; Soulan, Sébastien; Besacier, Maxime

    2013-05-01

    S-Genius is a new universal scatterometry platform, which gathers all the LTM-CNRS know-how regarding the rigorous electromagnetic computation and several inverse problem solver solutions. This software platform is built to be a userfriendly, light, swift, accurate, user-oriented scatterometry tool, compatible with any ellipsometric measurements to fit and any types of pattern. It aims to combine a set of inverse problem solver capabilities — via adapted Levenberg- Marquard optimization, Kriging, Neural Network solutions — that greatly improve the reliability and the velocity of the solution determination. Furthermore, as the model solution is mainly vulnerable to materials optical properties, S-Genius may be coupled with an innovative material refractive indices determination. This paper will a little bit more focuses on the modified Levenberg-Marquardt optimization, one of the indirect method solver built up in parallel with the total SGenius software coding by yours truly. This modified Levenberg-Marquardt optimization corresponds to a Newton algorithm with an adapted damping parameter regarding the definition domains of the optimized parameters. Currently, S-Genius is technically ready for scientific collaboration, python-powered, multi-platform (windows/linux/macOS), multi-core, ready for 2D- (infinite features along the direction perpendicular to the incident plane), conical, and 3D-features computation, compatible with all kinds of input data from any possible ellipsometers (angle or wavelength resolved) or reflectometers, and widely used in our laboratory for resist trimming studies, etching features characterization (such as complex stack) or nano-imprint lithography measurements for instance. The work about kriging solver, neural network solver and material refractive indices determination is done (or about to) by other LTM members and about to be integrated on S-Genius platform.

  16. The ImageJ ecosystem: an open platform for biomedical image analysis

    PubMed Central

    Schindelin, Johannes; Rueden, Curtis T.; Hiner, Mark C.; Eliceiri, Kevin W.

    2015-01-01

    Technology in microscopy advances rapidly, enabling increasingly affordable, faster, and more precise quantitative biomedical imaging, which necessitates correspondingly more-advanced image processing and analysis techniques. A wide range of software is available – from commercial to academic, special-purpose to Swiss army knife, small to large–but a key characteristic of software that is suitable for scientific inquiry is its accessibility. Open-source software is ideal for scientific endeavors because it can be freely inspected, modified, and redistributed; in particular, the open-software platform ImageJ has had a huge impact on life sciences, and continues to do so. From its inception, ImageJ has grown significantly due largely to being freely available and its vibrant and helpful user community. Scientists as diverse as interested hobbyists, technical assistants, students, scientific staff, and advanced biology researchers use ImageJ on a daily basis, and exchange knowledge via its dedicated mailing list. Uses of ImageJ range from data visualization and teaching to advanced image processing and statistical analysis. The software's extensibility continues to attract biologists at all career stages as well as computer scientists who wish to effectively implement specific image-processing algorithms. In this review, we use the ImageJ project as a case study of how open-source software fosters its suites of software tools, making multitudes of image-analysis technology easily accessible to the scientific community. We specifically explore what makes ImageJ so popular, how it impacts life science, how it inspires other projects, and how it is self-influenced by coevolving projects within the ImageJ ecosystem. PMID:26153368

  17. The ImageJ ecosystem: An open platform for biomedical image analysis.

    PubMed

    Schindelin, Johannes; Rueden, Curtis T; Hiner, Mark C; Eliceiri, Kevin W

    2015-01-01

    Technology in microscopy advances rapidly, enabling increasingly affordable, faster, and more precise quantitative biomedical imaging, which necessitates correspondingly more-advanced image processing and analysis techniques. A wide range of software is available-from commercial to academic, special-purpose to Swiss army knife, small to large-but a key characteristic of software that is suitable for scientific inquiry is its accessibility. Open-source software is ideal for scientific endeavors because it can be freely inspected, modified, and redistributed; in particular, the open-software platform ImageJ has had a huge impact on the life sciences, and continues to do so. From its inception, ImageJ has grown significantly due largely to being freely available and its vibrant and helpful user community. Scientists as diverse as interested hobbyists, technical assistants, students, scientific staff, and advanced biology researchers use ImageJ on a daily basis, and exchange knowledge via its dedicated mailing list. Uses of ImageJ range from data visualization and teaching to advanced image processing and statistical analysis. The software's extensibility continues to attract biologists at all career stages as well as computer scientists who wish to effectively implement specific image-processing algorithms. In this review, we use the ImageJ project as a case study of how open-source software fosters its suites of software tools, making multitudes of image-analysis technology easily accessible to the scientific community. We specifically explore what makes ImageJ so popular, how it impacts the life sciences, how it inspires other projects, and how it is self-influenced by coevolving projects within the ImageJ ecosystem. © 2015 Wiley Periodicals, Inc.

  18. Experimental evaluation of a flexible I/O architecture for accelerating workflow engines in ultrascale environments

    DOE PAGES

    Duro, Francisco Rodrigo; Blas, Javier Garcia; Isaila, Florin; ...

    2016-10-06

    The increasing volume of scientific data and the limited scalability and performance of storage systems are currently presenting a significant limitation for the productivity of the scientific workflows running on both high-performance computing (HPC) and cloud platforms. Clearly needed is better integration of storage systems and workflow engines to address this problem. This paper presents and evaluates a novel solution that leverages codesign principles for integrating Hercules—an in-memory data store—with a workflow management system. We consider four main aspects: workflow representation, task scheduling, task placement, and task termination. As a result, the experimental evaluation on both cloud and HPC systemsmore » demonstrates significant performance and scalability improvements over existing state-of-the-art approaches.« less

  19. NCI's Transdisciplinary High Performance Scientific Data Platform

    NASA Astrophysics Data System (ADS)

    Evans, Ben; Antony, Joseph; Bastrakova, Irina; Car, Nicholas; Cox, Simon; Druken, Kelsey; Evans, Bradley; Fraser, Ryan; Ip, Alex; Kemp, Carina; King, Edward; Minchin, Stuart; Larraondo, Pablo; Pugh, Tim; Richards, Clare; Santana, Fabiana; Smillie, Jon; Trenham, Claire; Wang, Jingbo; Wyborn, Lesley

    2016-04-01

    The Australian National Computational Infrastructure (NCI) manages Earth Systems data collections sourced from several domains and organisations onto a single High Performance Data (HPD) Node to further Australia's national priority research and innovation agenda. The NCI HPD Node has rapidly established its value, currently managing over 10 PBytes of datasets from collections that span a wide range of disciplines including climate, weather, environment, geoscience, geophysics, water resources and social sciences. Importantly, in order to facilitate broad user uptake, maximise reuse and enable transdisciplinary access through software and standardised interfaces, the datasets, associated information systems and processes have been incorporated into the design and operation of a unified platform that NCI has called, the National Environmental Research Data Interoperability Platform (NERDIP). The key goal of the NERDIP is to regularise data access so that it is easily discoverable, interoperable for different domains and enabled for high performance methods. It adopts and implements international standards and data conventions, and promotes scientific integrity within a high performance computing and data analysis environment. NCI has established a rich and flexible computing environment to access to this data, through the NCI supercomputer; a private cloud that supports both domain focused virtual laboratories and in-common interactive analysis interfaces; as well as remotely through scalable data services. Data collections of this importance must be managed with careful consideration of both their current use and the needs of the end-communities, as well as its future potential use, such as transitioning to more advanced software and improved methods. It is therefore critical that the data platform is both well-managed and trusted for stable production use (including transparency and reproducibility), agile enough to incorporate new technological advances and additional communities practices, and a foundation for new exploratory developments. To that end, NCI is already participating in numerous current and emerging collaborations internationally including the Earth System Grid Federation (ESGF); Climate and Weather Data from International agencies such as NASA, NOAA, and UK Met Office; Remotely Sensed Satellite Earth Imaging through collaborations through GEOS and CEOS; EU-led Ocean Data Interoperability Platform (ODIP) and Horizon2020 Earth Server2 project; as well as broader data infrastructure community activities such as Research Data Alliance (RDA). Each research community is heavily engaged in international standards such as ISO, OGC and W3C, adopting community-led conventions for data, supporting improved data organisation such as controlled vocabularies, and creating workflows that use mature APIs and data services. NCI is engaging with these communities on NERDIP to ensure that such standards are applied uniformly and tested in practice by working with the variety of data and technologies. This includes benchmarking exemplar cases from individual communities, documenting their use of standards, and evaluating their practical use of the different technologies. Such a process fully establishes the functionality and performance, and is required to safely transition when improvements or rationalisation is required. Work is now underway to extend the NERDIP platform for better utilisation in the subsurface geophysical community, including maximizing national uptake, as well as better integration with international science platforms.

  20. Image2000: A Free, Innovative, Java Based Imaging Package

    NASA Technical Reports Server (NTRS)

    Pell, Nicholas; Wheeler, Phil; Cornwell, Carl; Matusow, David; Obenschain, Arthur F. (Technical Monitor)

    2001-01-01

    The National Aeronautics and Space Administration (NASA) Goddard Space Flight Center's (GSFC) Scientific and Educational Endeavors (SEE) and the Center for Image Processing in Education (CIPE) use satellite image processing as part of their science lessons developed for students and educators. The image processing products that they use, as part of these lessons, no longer fulfill the needs of SEE and CIPE because these products are either dependent on a particular computing platform, hard to customize and extend, or do not have enough functionality. SEE and CIPE began looking for what they considered the "perfect" image processing tool that was platform independent, rich in functionality and could easily be extended and customized for their purposes. At the request of SEE, NASA's GSFC, code 588 the Advanced Architectures and Automation Branch developed a powerful new Java based image processing endeavors.

  1. Automating NEURON Simulation Deployment in Cloud Resources.

    PubMed

    Stockton, David B; Santamaria, Fidel

    2017-01-01

    Simulations in neuroscience are performed on local servers or High Performance Computing (HPC) facilities. Recently, cloud computing has emerged as a potential computational platform for neuroscience simulation. In this paper we compare and contrast HPC and cloud resources for scientific computation, then report how we deployed NEURON, a widely used simulator of neuronal activity, in three clouds: Chameleon Cloud, a hybrid private academic cloud for cloud technology research based on the OpenStack software; Rackspace, a public commercial cloud, also based on OpenStack; and Amazon Elastic Cloud Computing, based on Amazon's proprietary software. We describe the manual procedures and how to automate cloud operations. We describe extending our simulation automation software called NeuroManager (Stockton and Santamaria, Frontiers in Neuroinformatics, 2015), so that the user is capable of recruiting private cloud, public cloud, HPC, and local servers simultaneously with a simple common interface. We conclude by performing several studies in which we examine speedup, efficiency, total session time, and cost for sets of simulations of a published NEURON model.

  2. Automating NEURON Simulation Deployment in Cloud Resources

    PubMed Central

    Santamaria, Fidel

    2016-01-01

    Simulations in neuroscience are performed on local servers or High Performance Computing (HPC) facilities. Recently, cloud computing has emerged as a potential computational platform for neuroscience simulation. In this paper we compare and contrast HPC and cloud resources for scientific computation, then report how we deployed NEURON, a widely used simulator of neuronal activity, in three clouds: Chameleon Cloud, a hybrid private academic cloud for cloud technology research based on the Open-Stack software; Rackspace, a public commercial cloud, also based on OpenStack; and Amazon Elastic Cloud Computing, based on Amazon’s proprietary software. We describe the manual procedures and how to automate cloud operations. We describe extending our simulation automation software called NeuroManager (Stockton and Santamaria, Frontiers in Neuroinformatics, 2015), so that the user is capable of recruiting private cloud, public cloud, HPC, and local servers simultaneously with a simple common interface. We conclude by performing several studies in which we examine speedup, efficiency, total session time, and cost for sets of simulations of a published NEURON model. PMID:27655341

  3. A survey of GPU-based medical image computing techniques

    PubMed Central

    Shi, Lin; Liu, Wen; Zhang, Heye; Xie, Yongming

    2012-01-01

    Medical imaging currently plays a crucial role throughout the entire clinical applications from medical scientific research to diagnostics and treatment planning. However, medical imaging procedures are often computationally demanding due to the large three-dimensional (3D) medical datasets to process in practical clinical applications. With the rapidly enhancing performances of graphics processors, improved programming support, and excellent price-to-performance ratio, the graphics processing unit (GPU) has emerged as a competitive parallel computing platform for computationally expensive and demanding tasks in a wide range of medical image applications. The major purpose of this survey is to provide a comprehensive reference source for the starters or researchers involved in GPU-based medical image processing. Within this survey, the continuous advancement of GPU computing is reviewed and the existing traditional applications in three areas of medical image processing, namely, segmentation, registration and visualization, are surveyed. The potential advantages and associated challenges of current GPU-based medical imaging are also discussed to inspire future applications in medicine. PMID:23256080

  4. An MPI-based MoSST core dynamics model

    NASA Astrophysics Data System (ADS)

    Jiang, Weiyuan; Kuang, Weijia

    2008-09-01

    Distributed systems are among the main cost-effective and expandable platforms for high-end scientific computing. Therefore scalable numerical models are important for effective use of such systems. In this paper, we present an MPI-based numerical core dynamics model for simulation of geodynamo and planetary dynamos, and for simulation of core-mantle interactions. The model is developed based on MPI libraries. Two algorithms are used for node-node communication: a "master-slave" architecture and a "divide-and-conquer" architecture. The former is easy to implement but not scalable in communication. The latter is scalable in both computation and communication. The model scalability is tested on Linux PC clusters with up to 128 nodes. This model is also benchmarked with a published numerical dynamo model solution.

  5. OSCAR: A Compact, Powerful and Versatile On Board Computer Based on LEON3 Core

    NASA Astrophysics Data System (ADS)

    Poupat, Jean-Luc; Lefevre, Aurelien; Koebel, Franck

    2011-08-01

    Satellites are controlled via a platform On Board Computer (OBC) that manages different parameters (attitude, orbit, modes, temperatures ...) with respect to its payload mission (telecommunication, earth observation, scientific mission). The platform OBC is connected to the satellite and the ground control via digital links, and executes on board software.The main functions of a platform OBC are to provide the satellite flight segment with the following features: o Processing resources for the flight mission software o TM/TC services and interfaces with the RF communication chaino General communication services with the Avionicsand payload equipments through an on-board communication bus based on the MIL-1553B standard or CANo Time synchronization and distributiono Failure tolerant architecture based on the use of redounded reconfiguration units and redundancyimplementationFrom a hardware point of view, it groups a lot of digital functions usually dispatched on numerous chips (processor, co-processor, digital links IP ...) together. In order to reach an ultimate level of integration, Astrium has designed an ASIC gathering on a single chip all the required digital functions: the SCOC3 ASIC.Astrium has developed an OBC based on this SCOC3 ASIC: the OSCAR (Optimized Spacecraft Computer Architecture with Reconfiguration). It is now available off-the-shelf as the new OBC product family of Astrium.This paper presents the major innovations introduced by Astrium for SCOC3 and OSCAR with the objective to save cost and mass through a solution compatible with any class quality project, using a unique software development environment for user.

  6. Science in support of the Deepwater Horizon response

    USGS Publications Warehouse

    Lubchenco, Jane; McNutt, Marcia K.; Dreyfus, Gabrielle; Murawski, Steven A.; Kennedy, David M.; Anastas, Paul T.; Chu, Steven; Hunter, Tom

    2012-01-01

    This introduction to the Special Feature presents the context for science during the Deepwater Horizon oil spill response, summarizes how scientific knowledge was integrated across disciplines and statutory responsibilities, identifies areas where scientific information was accurate and where it was not, and considers lessons learned and recommendations for future research and response. Scientific information was integrated within and across federal and state agencies, with input from nongovernmental scientists, across a diverse portfolio of needs—stopping the flow of oil, estimating the amount of oil, capturing and recovering the oil, tracking and forecasting surface oil, protecting coastal and oceanic wildlife and habitat, managing fisheries, and protecting the safety of seafood. Disciplines involved included atmospheric, oceanographic, biogeochemical, ecological, health, biological, and chemical sciences, physics, geology, and mechanical and chemical engineering. Platforms ranged from satellites and planes to ships, buoys, gliders, and remotely operated vehicles to laboratories and computer simulations. The unprecedented response effort depended directly on intense and extensive scientific and engineering data, information, and advice. Many valuable lessons were learned that should be applied to future events.

  7. Science in support of the Deepwater Horizon response

    PubMed Central

    Lubchenco, Jane; McNutt, Marcia K.; Dreyfus, Gabrielle; Murawski, Steven A.; Kennedy, David M.; Anastas, Paul T.; Chu, Steven; Hunter, Tom

    2012-01-01

    This introduction to the Special Feature presents the context for science during the Deepwater Horizon oil spill response, summarizes how scientific knowledge was integrated across disciplines and statutory responsibilities, identifies areas where scientific information was accurate and where it was not, and considers lessons learned and recommendations for future research and response. Scientific information was integrated within and across federal and state agencies, with input from nongovernmental scientists, across a diverse portfolio of needs—stopping the flow of oil, estimating the amount of oil, capturing and recovering the oil, tracking and forecasting surface oil, protecting coastal and oceanic wildlife and habitat, managing fisheries, and protecting the safety of seafood. Disciplines involved included atmospheric, oceanographic, biogeochemical, ecological, health, biological, and chemical sciences, physics, geology, and mechanical and chemical engineering. Platforms ranged from satellites and planes to ships, buoys, gliders, and remotely operated vehicles to laboratories and computer simulations. The unprecedented response effort depended directly on intense and extensive scientific and engineering data, information, and advice. Many valuable lessons were learned that should be applied to future events. PMID:23213250

  8. THE VIRTUAL INSTRUMENT: SUPPORT FOR GRID-ENABLED MCELL SIMULATIONS

    PubMed Central

    Casanova, Henri; Berman, Francine; Bartol, Thomas; Gokcay, Erhan; Sejnowski, Terry; Birnbaum, Adam; Dongarra, Jack; Miller, Michelle; Ellisman, Mark; Faerman, Marcio; Obertelli, Graziano; Wolski, Rich; Pomerantz, Stuart; Stiles, Joel

    2010-01-01

    Ensembles of widely distributed, heterogeneous resources, or Grids, have emerged as popular platforms for large-scale scientific applications. In this paper we present the Virtual Instrument project, which provides an integrated application execution environment that enables end-users to run and interact with running scientific simulations on Grids. This work is performed in the specific context of MCell, a computational biology application. While MCell provides the basis for running simulations, its capabilities are currently limited in terms of scale, ease-of-use, and interactivity. These limitations preclude usage scenarios that are critical for scientific advances. Our goal is to create a scientific “Virtual Instrument” from MCell by allowing its users to transparently access Grid resources while being able to steer running simulations. In this paper, we motivate the Virtual Instrument project and discuss a number of relevant issues and accomplishments in the area of Grid software development and application scheduling. We then describe our software design and report on the current implementation. We verify and evaluate our design via experiments with MCell on a real-world Grid testbed. PMID:20689618

  9. Macedonian journal of chemistry and chemical engineering: open journal systems--editor's perspective.

    PubMed

    Zdravkovski, Zoran

    2014-01-01

    The development and availability of personal computers and software as well as printing techniques in the last twenty years have made a profound change in the publication of scientific journals. Additionally, the Internet in the last decade has revolutionized the publication process to the point of changing the basic paradigm of printed journals. The Macedonian Journal of Chemistry and Chemical Engineering in its 40-year history has adopted and adapted to all these transformations. In order to keep up with the inevitable changes, as editor-in-chief I felt my responsibility was to introduce an electronic editorial managing of the journal. The choice was between commercial and open source platforms, and because of the limited funding of the journal we chose the latter. We decided on Open Journal Systems, which provided online submission and management of all content, had flexible configuration--requirements, sections, review process, etc., had options for comprehensive indexing, offered various reading tools, had email notification and commenting ability for readers, had an option for thesis abstracts and was installed locally. However, since there is limited support it requires a moderate computer knowledge/skills and effort in order to set up. Overall, it is an excellent editorial platform and a convenient solution for journals with a low budget or journals that do not want to spend their resources on commercial platforms or simply support the idea of open source software.

  10. Exploring quantum computing application to satellite data assimilation

    NASA Astrophysics Data System (ADS)

    Cheung, S.; Zhang, S. Q.

    2015-12-01

    This is an exploring work on potential application of quantum computing to a scientific data optimization problem. On classical computational platforms, the physical domain of a satellite data assimilation problem is represented by a discrete variable transform, and classical minimization algorithms are employed to find optimal solution of the analysis cost function. The computation becomes intensive and time-consuming when the problem involves large number of variables and data. The new quantum computer opens a very different approach both in conceptual programming and in hardware architecture for solving optimization problem. In order to explore if we can utilize the quantum computing machine architecture, we formulate a satellite data assimilation experimental case in the form of quadratic programming optimization problem. We find a transformation of the problem to map it into Quadratic Unconstrained Binary Optimization (QUBO) framework. Binary Wavelet Transform (BWT) will be applied to the data assimilation variables for its invertible decomposition and all calculations in BWT are performed by Boolean operations. The transformed problem will be experimented as to solve for a solution of QUBO instances defined on Chimera graphs of the quantum computer.

  11. Field: a new meta-authoring platform for data-intensive scientific visualization

    NASA Astrophysics Data System (ADS)

    Downie, M.; Ameres, E.; Fox, P. A.; Goebel, J.; Graves, A.; Hendler, J.

    2012-12-01

    This presentation will demonstrate a new platform for data-intensive scientific visualization, called Field, that rethinks the problem of visual data exploration. Several new opportunities for scientific visualization present themselves here at this moment in time. We believe that when taken together they may catalyze a transformation of the practice of science and to begin to seed a technical culture within science that fuses data analysis, programming and myriad visual strategies. It is at integrative levels that the principle challenges exist, for many fundamental technical components of our field are now well understood and widely available. File formats from CSV through HDF all have broad library support; low-level high-performance graphics APIs (OpenGL) are in a period of stable growth; and a dizzying ecosystem of analysis and machine learning libraries abound. The hardware of computer graphics offers unprecedented computing power within commodity components; programming languages and platforms are coalescing around a core set of umbrella runtimes. Each of these trends are each set to continue — computer graphics hardware is developing at a super-Moore-law rate, and trends in publication and dissemination point only towards an increasing amount of access to code and data. The critical opportunity here for scientific visualization is, we maintain, not a in developing a new statistical library, nor a new tool centered on a particular technique, but rather new visual, "live" programming environment that is promiscuous in its scope. We can identify the necessarily methodological practice and traditions required here not in science or engineering but in the "live-coding" practices prevalent in the fields of digital art and design. We can define this practice as an approach to programming that is live, iterative, integrative, speculative and exploratory. "Live" because it is exclusively practiced in real-time (often during performance); "iterative", because intermediate programs and this visual results are constantly being made and remade en route; "speculative", because these programs and images result out of mode of inquiry into image-making not unlike that of hypothesis formation and testing; "integrative" because this style draws deeply upon the libraries of algorithms and materials available online today; and "exploratory" because the results of these speculations are inherently open to the data and unforeseen out the outset. To this end our development environment — Field — comprises a minimal core and a powerful plug-in system that can be extended from within the environment itself. By providing a hybrid text editor that can incorporate text-based programming at the same time with graphical user-interface elements, its flexible and extensible interface provides space as necessary for notation, visualization, interface construction, and introspection. In addition, it provides an advanced GPU-accelerated graphics system ideal for large-scale data visualization. Since Field was created in the context of widely divergent interdisciplinary projects, its aim is to give its users not only the ability to work rapidly, but to shape their Field environment extensively and flexibly for their own demands.

  12. Metabolizing Data in the Cloud.

    PubMed

    Warth, Benedikt; Levin, Nadine; Rinehart, Duane; Teijaro, John; Benton, H Paul; Siuzdak, Gary

    2017-06-01

    Cloud-based bioinformatic platforms address the fundamental demands of creating a flexible scientific environment, facilitating data processing and general accessibility independent of a countries' affluence. These platforms have a multitude of advantages as demonstrated by omics technologies, helping to support both government and scientific mandates of a more open environment. Copyright © 2016 Elsevier Ltd. All rights reserved.

  13. Scientific workflows as productivity tools for drug discovery.

    PubMed

    Shon, John; Ohkawa, Hitomi; Hammer, Juergen

    2008-05-01

    Large pharmaceutical companies annually invest tens to hundreds of millions of US dollars in research informatics to support their early drug discovery processes. Traditionally, most of these investments are designed to increase the efficiency of drug discovery. The introduction of do-it-yourself scientific workflow platforms has enabled research informatics organizations to shift their efforts toward scientific innovation, ultimately resulting in a possible increase in return on their investments. Unlike the handling of most scientific data and application integration approaches, researchers apply scientific workflows to in silico experimentation and exploration, leading to scientific discoveries that lie beyond automation and integration. This review highlights some key requirements for scientific workflow environments in the pharmaceutical industry that are necessary for increasing research productivity. Examples of the application of scientific workflows in research and a summary of recent platform advances are also provided.

  14. Emerging Nanophotonic Applications Explored with Advanced Scientific Parallel Computing

    NASA Astrophysics Data System (ADS)

    Meng, Xiang

    The domain of nanoscale optical science and technology is a combination of the classical world of electromagnetics and the quantum mechanical regime of atoms and molecules. Recent advancements in fabrication technology allows the optical structures to be scaled down to nanoscale size or even to the atomic level, which are far smaller than the wavelength they are designed for. These nanostructures can have unique, controllable, and tunable optical properties and their interactions with quantum materials can have important near-field and far-field optical response. Undoubtedly, these optical properties can have many important applications, ranging from the efficient and tunable light sources, detectors, filters, modulators, high-speed all-optical switches; to the next-generation classical and quantum computation, and biophotonic medical sensors. This emerging research of nanoscience, known as nanophotonics, is a highly interdisciplinary field requiring expertise in materials science, physics, electrical engineering, and scientific computing, modeling and simulation. It has also become an important research field for investigating the science and engineering of light-matter interactions that take place on wavelength and subwavelength scales where the nature of the nanostructured matter controls the interactions. In addition, the fast advancements in the computing capabilities, such as parallel computing, also become as a critical element for investigating advanced nanophotonic devices. This role has taken on even greater urgency with the scale-down of device dimensions, and the design for these devices require extensive memory and extremely long core hours. Thus distributed computing platforms associated with parallel computing are required for faster designs processes. Scientific parallel computing constructs mathematical models and quantitative analysis techniques, and uses the computing machines to analyze and solve otherwise intractable scientific challenges. In particular, parallel computing are forms of computation operating on the principle that large problems can often be divided into smaller ones, which are then solved concurrently. In this dissertation, we report a series of new nanophotonic developments using the advanced parallel computing techniques. The applications include the structure optimizations at the nanoscale to control both the electromagnetic response of materials, and to manipulate nanoscale structures for enhanced field concentration, which enable breakthroughs in imaging, sensing systems (chapter 3 and 4) and improve the spatial-temporal resolutions of spectroscopies (chapter 5). We also report the investigations on the confinement study of optical-matter interactions at the quantum mechanical regime, where the size-dependent novel properties enhanced a wide range of technologies from the tunable and efficient light sources, detectors, to other nanophotonic elements with enhanced functionality (chapter 6 and 7).

  15. A bottom-up, scientist-based initiative for the communication of climate sciences with the general public

    NASA Astrophysics Data System (ADS)

    Bourqui, Michel; Bolduc, Cassandra; Paul, Charbonneau; Marie, Charrière; Daniel, Hill; Angelica, Lopez; Enrique, Loubet; Philippe, Roy; Barbara, Winter

    2015-04-01

    This talk introduces a scientists-initiated, new online platform whose aim is to contribute to making climate sciences become public knowledge. It takes a unique bottom-up approach, strictly founded on individual-based participation, high scientific standards and independence The main purpose is to build an open-access, multilingual and peer-reviewed journal publishing short climate articles in non-scientific language. The targeted public includes journalists, teachers, students, local politicians, economists, members of the agriculture sector, and any other citizens from around the world with an interest in climate sciences. This journal is meant to offer a simple and direct channel for scientists wishing to disseminate their research to the general public. A high standard of climate articles is ensured through: a) requiring that the main author is an active climate scientist, and b) an innovative peer-review process involving scientific and non-scientific referees with distinct roles. The platform fosters the direct participation of non-scientists through co-authoring, peer-reviewing, language translation. It furthermore engages the general public in the scientific inquiry by allowing non-scientists to invite manuscripts to be written on topics of their concern. The platform is currently being developed by a community of scientists and non-scientists. In this talk, I will present the basic ideas behind this new online platform, its current state and the plans for the next future. The beta version of the platform is available at: http://www.climateonline.bourquiconsulting.ch

  16. Future-saving audiovisual content for Data Science: Preservation of geoinformatics video heritage with the TIB|AV-Portal

    NASA Astrophysics Data System (ADS)

    Löwe, Peter; Plank, Margret; Ziedorn, Frauke

    2015-04-01

    In data driven research, the access to citation and preservation of the full triad consisting of journal article, research data and -software has started to become good scientific practice. To foster the adoption of this practice the significance of software tools has to be acknowledged, which enable scientists to harness auxiliary audiovisual content in their research work. The advent of ubiquitous computer-based audiovisual recording and corresponding Web 2.0 hosting platforms like Youtube, Slideshare and GitHub has created new ecosystems for contextual information related to scientific software and data, which continues to grow both in size and variety of content. The current Web 2.0 platforms lack capabilities for long term archiving and scientific citation, such as persistent identifiers allowing to reference specific intervals of the overall content. The audiovisual content currently shared by scientists ranges from commented howto-demonstrations on software handling, installation and data-processing, to aggregated visual analytics of the evolution of software projects over time. Such content are crucial additions to the scientific message, as they ensure that software-based data-processing workflows can be assessed, understood and reused in the future. In the context of data driven research, such content needs to be accessible by effective search capabilities, enabling the content to be retrieved and ensuring that the content producers receive credit for their efforts within the scientific community. Improved multimedia archiving and retrieval services for scientific audiovisual content which meet these requirements are currently implemented by the scientific library community. This paper exemplifies the existing challenges, requirements, benefits and the potential of the preservation, accessibility and citability of such audiovisual content for the Open Source communities based on the new audiovisual web service TIB|AV Portal of the German National Library of Science and Technology. The web-based portal allows for extended search capabilities based on enhanced metadata derived by automated video analysis. By combining state-of-the-art multimedia retrieval techniques such as speech-, text-, and image recognition with semantic analysis, content-based access to videos at the segment level is provided. Further, by using the open standard Media Fragment Identifier (MFID), a citable Digital Object Identifier is displayed for each video segment. In addition to the continuously growing footprint of contemporary content, the importance of vintage audiovisual information needs to be considered: This paper showcases the successful application of the TIB|AV-Portal in the preservation and provision of a newly discovered version of a GRASS GIS promotional video produced by US Army -Corps of Enginers Laboratory (US-CERL) in 1987. The video is provides insight into the constraints of the very early days of the GRASS GIS project, which is the oldest active Free and Open Source Software (FOSS) GIS project which has been active for over thirty years. GRASS itself has turned into a collaborative scientific platform and a repository of scientific peer-reviewed code and algorithm/knowledge hub for future generation of scientists [1]. This is a reference case for future preservation activities regarding semantic-enhanced Web 2.0 content from geospatial software projects within Academia and beyond. References: [1] Chemin, Y., Petras V., Petrasova, A., Landa, M., Gebbert, S., Zambelli, P., Neteler, M., Löwe, P.: GRASS GIS: a peer-reviewed scientific platform and future research Repository, Geophysical Research Abstracts, Vol. 17, EGU2015-8314-1, 2015 (submitted)

  17. Interactive Visualization of Large-Scale Hydrological Data using Emerging Technologies in Web Systems and Parallel Programming

    NASA Astrophysics Data System (ADS)

    Demir, I.; Krajewski, W. F.

    2013-12-01

    As geoscientists are confronted with increasingly massive datasets from environmental observations to simulations, one of the biggest challenges is having the right tools to gain scientific insight from the data and communicate the understanding to stakeholders. Recent developments in web technologies make it easy to manage, visualize and share large data sets with general public. Novel visualization techniques and dynamic user interfaces allow users to interact with data, and modify the parameters to create custom views of the data to gain insight from simulations and environmental observations. This requires developing new data models and intelligent knowledge discovery techniques to explore and extract information from complex computational simulations or large data repositories. Scientific visualization will be an increasingly important component to build comprehensive environmental information platforms. This presentation provides an overview of the trends and challenges in the field of scientific visualization, and demonstrates information visualization and communication tools developed within the light of these challenges.

  18. Computing Platforms for Big Biological Data Analytics: Perspectives and Challenges.

    PubMed

    Yin, Zekun; Lan, Haidong; Tan, Guangming; Lu, Mian; Vasilakos, Athanasios V; Liu, Weiguo

    2017-01-01

    The last decade has witnessed an explosion in the amount of available biological sequence data, due to the rapid progress of high-throughput sequencing projects. However, the biological data amount is becoming so great that traditional data analysis platforms and methods can no longer meet the need to rapidly perform data analysis tasks in life sciences. As a result, both biologists and computer scientists are facing the challenge of gaining a profound insight into the deepest biological functions from big biological data. This in turn requires massive computational resources. Therefore, high performance computing (HPC) platforms are highly needed as well as efficient and scalable algorithms that can take advantage of these platforms. In this paper, we survey the state-of-the-art HPC platforms for big biological data analytics. We first list the characteristics of big biological data and popular computing platforms. Then we provide a taxonomy of different biological data analysis applications and a survey of the way they have been mapped onto various computing platforms. After that, we present a case study to compare the efficiency of different computing platforms for handling the classical biological sequence alignment problem. At last we discuss the open issues in big biological data analytics.

  19. CernVM WebAPI - Controlling Virtual Machines from the Web

    NASA Astrophysics Data System (ADS)

    Charalampidis, I.; Berzano, D.; Blomer, J.; Buncic, P.; Ganis, G.; Meusel, R.; Segal, B.

    2015-12-01

    Lately, there is a trend in scientific projects to look for computing resources in the volunteering community. In addition, to reduce the development effort required to port the scientific software stack to all the known platforms, the use of Virtual Machines (VMs)u is becoming increasingly popular. Unfortunately their use further complicates the software installation and operation, restricting the volunteer audience to sufficiently expert people. CernVM WebAPI is a software solution addressing this specific case in a way that opens wide new application opportunities. It offers a very simple API for setting-up, controlling and interfacing with a VM instance in the users computer, while in the same time offloading the user from all the burden of downloading, installing and configuring the hypervisor. WebAPI comes with a lightweight javascript library that guides the user through the application installation process. Malicious usage is prohibited by offering a per-domain PKI validation mechanism. In this contribution we will overview this new technology, discuss its security features and examine some test cases where it is already in use.

  20. On Establishing Big Data Wave Breakwaters with Analytics (Invited)

    NASA Astrophysics Data System (ADS)

    Riedel, M.

    2013-12-01

    The Research Data Alliance Big Data Analytics (RDA-BDA) Interest Group seeks to develop community based recommendations on feasible data analytics approaches to address scientific community needs of utilizing large quantities of data. RDA-BDA seeks to analyze different scientific domain applications and their potential use of various big data analytics techniques. A systematic classification of feasible combinations of analysis algorithms, analytical tools, data and resource characteristics and scientific queries will be covered in these recommendations. These combinations are complex since a wide variety of different data analysis algorithms exist (e.g. specific algorithms using GPUs of analyzing brain images) that need to work together with multiple analytical tools reaching from simple (iterative) map-reduce methods (e.g. with Apache Hadoop or Twister) to sophisticated higher level frameworks that leverage machine learning algorithms (e.g. Apache Mahout). These computational analysis techniques are often augmented with visual analytics techniques (e.g. computational steering on large-scale high performance computing platforms) to put the human judgement into the analysis loop or new approaches with databases that are designed to support new forms of unstructured or semi-structured data as opposed to the rather tradtional structural databases (e.g. relational databases). More recently, data analysis and underpinned analytics frameworks also have to consider energy footprints of underlying resources. To sum up, the aim of this talk is to provide pieces of information to understand big data analytics in the context of science and engineering using the aforementioned classification as the lighthouse and as the frame of reference for a systematic approach. This talk will provide insights about big data analytics methods in context of science within varios communities and offers different views of how approaches of correlation and causality offer complementary methods to advance in science and engineering today. The RDA Big Data Analytics Group seeks to understand what approaches are not only technically feasible, but also scientifically feasible. The lighthouse Goal of the RDA Big Data Analytics Group is a classification of clever combinations of various Technologies and scientific applications in order to provide clear recommendations to the scientific community what approaches are technicalla and scientifically feasible.

  1. FDA's Activities Supporting Regulatory Application of "Next Gen" Sequencing Technologies.

    PubMed

    Wilson, Carolyn A; Simonyan, Vahan

    2014-01-01

    Applications of next-generation sequencing (NGS) technologies require availability and access to an information technology (IT) infrastructure and bioinformatics tools for large amounts of data storage and analyses. The U.S. Food and Drug Administration (FDA) anticipates that the use of NGS data to support regulatory submissions will continue to increase as the scientific and clinical communities become more familiar with the technologies and identify more ways to apply these advanced methods to support development and evaluation of new biomedical products. FDA laboratories are conducting research on different NGS platforms and developing the IT infrastructure and bioinformatics tools needed to enable regulatory evaluation of the technologies and the data sponsors will submit. A High-performance Integrated Virtual Environment, or HIVE, has been launched, and development and refinement continues as a collaborative effort between the FDA and George Washington University to provide the tools to support these needs. The use of a highly parallelized environment facilitated by use of distributed cloud storage and computation has resulted in a platform that is both rapid and responsive to changing scientific needs. The FDA plans to further develop in-house capacity in this area, while also supporting engagement by the external community, by sponsoring an open, public workshop to discuss NGS technologies and data formats standardization, and to promote the adoption of interoperability protocols in September 2014. Next-generation sequencing (NGS) technologies are enabling breakthroughs in how the biomedical community is developing and evaluating medical products. One example is the potential application of this method to the detection and identification of microbial contaminants in biologic products. In order for the U.S. Food and Drug Administration (FDA) to be able to evaluate the utility of this technology, we need to have the information technology infrastructure and bioinformatics tools to be able to store and analyze large amounts of data. To address this need, we have developed the High-performance Integrated Virtual Environment, or HIVE. HIVE uses a combination of distributed cloud storage and distributed cloud computations to provide a platform that is both rapid and responsive to support the growing and increasingly diverse scientific and regulatory needs of FDA scientists in their evaluation of NGS in research and ultimately for evaluation of NGS data in regulatory submissions. © PDA, Inc. 2014.

  2. iRODS-Based Climate Data Services and Virtualization-as-a-Service in the NASA Center for Climate Simulation

    NASA Astrophysics Data System (ADS)

    Schnase, J. L.; Duffy, D. Q.; Tamkin, G. S.; Strong, S.; Ripley, D.; Gill, R.; Sinno, S. S.; Shen, Y.; Carriere, L. E.; Brieger, L.; Moore, R.; Rajasekar, A.; Schroeder, W.; Wan, M.

    2011-12-01

    Scientific data services are becoming an important part of the NASA Center for Climate Simulation's mission. Our technological response to this expanding role is built around the concept of specialized virtual climate data servers, repetitive cloud provisioning, image-based deployment and distribution, and virtualization-as-a-service. A virtual climate data server is an OAIS-compliant, iRODS-based data server designed to support a particular type of scientific data collection. iRODS is data grid middleware that provides policy-based control over collection-building, managing, querying, accessing, and preserving large scientific data sets. We have developed prototype vCDSs to manage NetCDF, HDF, and GeoTIF data products. We use RPM scripts to build vCDS images in our local computing environment, our local Virtual Machine Environment, NASA's Nebula Cloud Services, and Amazon's Elastic Compute Cloud. Once provisioned into these virtualized resources, multiple vCDSs can use iRODS's federation and realized object capabilities to create an integrated ecosystem of data servers that can scale and adapt to changing requirements. This approach enables platform- or software-as-a-service deployment of the vCDSs and allows the NCCS to offer virtualization-as-a-service, a capacity to respond in an agile way to new customer requests for data services, and a path for migrating existing services into the cloud. We have registered MODIS Atmosphere data products in a vCDS that contains 54 million registered files, 630TB of data, and over 300 million metadata values. We are now assembling IPCC AR5 data into a production vCDS that will provide the platform upon which NCCS's Earth System Grid (ESG) node publishes to the extended science community. In this talk, we describe our approach, experiences, lessons learned, and plans for the future.

  3. SHIWA Services for Workflow Creation and Sharing in Hydrometeorolog

    NASA Astrophysics Data System (ADS)

    Terstyanszky, Gabor; Kiss, Tamas; Kacsuk, Peter; Sipos, Gergely

    2014-05-01

    Researchers want to run scientific experiments on Distributed Computing Infrastructures (DCI) to access large pools of resources and services. To run these experiments requires specific expertise that they may not have. Workflows can hide resources and services as a virtualisation layer providing a user interface that researchers can use. There are many scientific workflow systems but they are not interoperable. To learn a workflow system and create workflows may require significant efforts. Considering these efforts it is not reasonable to expect that researchers will learn new workflow systems if they want to run workflows developed in other workflow systems. To overcome it requires creating workflow interoperability solutions to allow workflow sharing. The FP7 'Sharing Interoperable Workflow for Large-Scale Scientific Simulation on Available DCIs' (SHIWA) project developed the Coarse-Grained Interoperability concept (CGI). It enables recycling and sharing workflows of different workflow systems and executing them on different DCIs. SHIWA developed the SHIWA Simulation Platform (SSP) to implement the CGI concept integrating three major components: the SHIWA Science Gateway, the workflow engines supported by the CGI concept and DCI resources where workflows are executed. The science gateway contains a portal, a submission service, a workflow repository and a proxy server to support the whole workflow life-cycle. The SHIWA Portal allows workflow creation, configuration, execution and monitoring through a Graphical User Interface using the WS-PGRADE workflow system as the host workflow system. The SHIWA Repository stores the formal description of workflows and workflow engines plus executables and data needed to execute them. It offers a wide-range of browse and search operations. To support non-native workflow execution the SHIWA Submission Service imports the workflow and workflow engine from the SHIWA Repository. This service either invokes locally or remotely pre-deployed workflow engines or submits workflow engines with the workflow to local or remote resources to execute workflows. The SHIWA Proxy Server manages certificates needed to execute the workflows on different DCIs. Currently SSP supports sharing of ASKALON, Galaxy, GWES, Kepler, LONI Pipeline, MOTEUR, Pegasus, P-GRADE, ProActive, Triana, Taverna and WS-PGRADE workflows. Further workflow systems can be added to the simulation platform as required by research communities. The FP7 'Building a European Research Community through Interoperable Workflows and Data' (ER-flow) project disseminates the achievements of the SHIWA project to build workflow user communities across Europe. ER-flow provides application supports to research communities within (Astrophysics, Computational Chemistry, Heliophysics and Life Sciences) and beyond (Hydrometeorology and Seismology) to develop, share and run workflows through the simulation platform. The simulation platform supports four usage scenarios: creating and publishing workflows in the repository, searching and selecting workflows in the repository, executing non-native workflows and creating and running meta-workflows. The presentation will outline the CGI concept, the SHIWA Simulation Platform, the ER-flow usage scenarios and how the Hydrometeorology research community runs simulations on SSP.

  4. Qualitative Epistemology: A Scientific Platform for the Study of Subjectivity from a Cultural-Historical Approach

    ERIC Educational Resources Information Center

    Patiño, José Fernando; Goulart, Daniel Magalhães

    2016-01-01

    This article contributes to the platform of thought proposed by González Rey in the development of qualitative epistemology and the theory of subjectivity. We discuss three core aspects: firstly, the general epistemological problems of modern science, with its non-critical, non-theoretical scientific ideals, and low reflexivity; secondly, we…

  5. SCOC3: A Brand New Heart for Space Mission

    NASA Astrophysics Data System (ADS)

    Poupat, Jean-Luc; Lefevre, Aurelien

    2012-08-01

    Satellites are controlled via a platform On Board Computer (OBC) that manages different parameters (attitude, orbit, modes, temperatures ...) with respect to its payload mission (telecommunication, earth observation, scientific mission). The platform OBC is connected to the satellite and the ground control via digital links, and executes on board software.The main functions of a platform OBC are to provide the satellite flight segment with the following features: o Processing resources for the flight mission softwareo TM/TC services and interfaces with the RF communication chaino General communication services with the Avionics and payload equipments through on- board communication buso Time synchronization and distributiono Failure tolerant architecture based on the use of redounded reconfiguration units and redundancy implementationIn order to reach an ultimate level of integration, Astrium has designed an ASIC gathering on a single chip all these required digital functions: the SCOC3 ASIC.This paper presents in a first part the major innovations introduced by Astrium for SCOC3, in a second part the development tools associated to SCOC3, and in a third part the status concerning its commercialization.

  6. agriGO v2.0: a GO analysis toolkit for the agricultural community, 2017 update.

    PubMed

    Tian, Tian; Liu, Yue; Yan, Hengyu; You, Qi; Yi, Xin; Du, Zhou; Xu, Wenying; Su, Zhen

    2017-07-03

    The agriGO platform, which has been serving the scientific community for >10 years, specifically focuses on gene ontology (GO) enrichment analyses of plant and agricultural species. We continuously maintain and update the databases and accommodate the various requests of our global users. Here, we present our updated agriGO that has a largely expanded number of supporting species (394) and datatypes (865). In addition, a larger number of species have been classified into groups covering crops, vegetables, fish, birds and insects closely related to the agricultural community. We further improved the computational efficiency, including the batch analysis and P-value distribution (PVD), and the user-friendliness of the web pages. More visualization features were added to the platform, including SEACOMPARE (cross comparison of singular enrichment analysis), direct acyclic graph (DAG) and Scatter Plots, which can be merged by choosing any significant GO term. The updated platform agriGO v2.0 is now publicly accessible at http://systemsbiology.cau.edu.cn/agriGOv2/. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  7. Long and short-term atmospheric radiation analyses based on coupled measurements at high altitude remote stations and extensive air shower modeling

    NASA Astrophysics Data System (ADS)

    Hubert, G.; Federico, C. A.; Pazianotto, M. T.; Gonzales, O. L.

    2016-02-01

    In this paper are described the ACROPOL and OPD high-altitude stations devoted to characterize the atmospheric radiation fields. The ACROPOL platform, located at the summit of the Pic du Midi in the French Pyrenees at 2885 m above sea level, exploits since May 2011 some scientific equipment, including a BSS neutron spectrometer, detectors based on semiconductor and scintillators. In the framework of a IEAv and ONERA collaboration, a second neutron spectrometer was simultaneously exploited since February 2015 at the summit of the Pico dos Dias in Brazil at 1864 m above the sea level. The both high station platforms allow for investigating the long period dynamics to analyze the spectral variation of cosmic-ray- induced neutron and effects of local and seasonal changes, but also the short term dynamics during solar flare events. This paper presents long and short-term analyses, including measurement and modeling investigations considering the both high altitude stations data. The modeling approach, based on ATMORAD computational platform, was used to link the both station measurements.

  8. GeoChronos: An On-line Collaborative Platform for Earth Observation Scientists

    NASA Astrophysics Data System (ADS)

    Gamon, J. A.; Kiddle, C.; Curry, R.; Markatchev, N.; Zonta-Pastorello, G., Jr.; Rivard, B.; Sanchez-Azofeifa, G. A.; Simmonds, R.; Tan, T.

    2009-12-01

    Recent advances in cyberinfrastructure are offering new solutions to the growing challenges of managing and sharing large data volumes. Web 2.0 and social networking technologies, provide the means for scientists to collaborate and share information more effectively. Cloud computing technologies can provide scientists with transparent and on-demand access to applications served over the Internet in a dynamic and scalable manner. Semantic Web technologies allow for data to be linked together in a manner understandable by machines, enabling greater automation. Combining all of these technologies together can enable the creation of very powerful platforms. GeoChronos (http://geochronos.org/), part of a CANARIE Network Enabled Platforms project, is an online collaborative platform that incorporates these technologies to enable members of the earth observation science community to share data and scientific applications and to collaborate more effectively. The GeoChronos portal is built on an open source social networking platform called Elgg. Elgg provides a full set of social networking functionalities similar to Facebook including blogs, tags, media/document sharing, wikis, friends/contacts, groups, discussions, message boards, calendars, status, activity feeds and more. An underlying cloud computing infrastructure enables scientists to access dynamically provisioned applications via the portal for visualizing and analyzing data. Users are able to access and run the applications from any computer that has a Web browser and Internet connectivity and do not need to manage and maintain the applications themselves. Semantic Web Technologies, such as the Resource Description Framework (RDF) are being employed for relating and linking together spectral, satellite, meteorological and other data. Social networking functionality plays an integral part in facilitating the sharing of data and applications. Examples of recent GeoChronos users during the early testing phase have included the IAI International Wireless Sensor Networking Summer School at the University of Alberta, and the IAI Tropi-Dry community. Current GeoChronos activities include the development of a web-based spectral library and related analytical and visualization tools, in collaboration with members of the SpecNet community. The GeoChronos portal will be open to all members of the earth observation science community when the project nears completion at the end of 2010.

  9. Leaf-GP: an open and automated software application for measuring growth phenotypes for arabidopsis and wheat.

    PubMed

    Zhou, Ji; Applegate, Christopher; Alonso, Albor Dobon; Reynolds, Daniel; Orford, Simon; Mackiewicz, Michal; Griffiths, Simon; Penfield, Steven; Pullen, Nick

    2017-01-01

    Plants demonstrate dynamic growth phenotypes that are determined by genetic and environmental factors. Phenotypic analysis of growth features over time is a key approach to understand how plants interact with environmental change as well as respond to different treatments. Although the importance of measuring dynamic growth traits is widely recognised, available open software tools are limited in terms of batch image processing, multiple traits analyses, software usability and cross-referencing results between experiments, making automated phenotypic analysis problematic. Here, we present Leaf-GP (Growth Phenotypes), an easy-to-use and open software application that can be executed on different computing platforms. To facilitate diverse scientific communities, we provide three software versions, including a graphic user interface (GUI) for personal computer (PC) users, a command-line interface for high-performance computer (HPC) users, and a well-commented interactive Jupyter Notebook (also known as the iPython Notebook) for computational biologists and computer scientists. The software is capable of extracting multiple growth traits automatically from large image datasets. We have utilised it in Arabidopsis thaliana and wheat ( Triticum aestivum ) growth studies at the Norwich Research Park (NRP, UK). By quantifying a number of growth phenotypes over time, we have identified diverse plant growth patterns between different genotypes under several experimental conditions. As Leaf-GP has been evaluated with noisy image series acquired by different imaging devices (e.g. smartphones and digital cameras) and still produced reliable biological outputs, we therefore believe that our automated analysis workflow and customised computer vision based feature extraction software implementation can facilitate a broader plant research community for their growth and development studies. Furthermore, because we implemented Leaf-GP based on open Python-based computer vision, image analysis and machine learning libraries, we believe that our software not only can contribute to biological research, but also demonstrates how to utilise existing open numeric and scientific libraries (e.g. Scikit-image, OpenCV, SciPy and Scikit-learn) to build sound plant phenomics analytic solutions, in a efficient and effective way. Leaf-GP is a sophisticated software application that provides three approaches to quantify growth phenotypes from large image series. We demonstrate its usefulness and high accuracy based on two biological applications: (1) the quantification of growth traits for Arabidopsis genotypes under two temperature conditions; and (2) measuring wheat growth in the glasshouse over time. The software is easy-to-use and cross-platform, which can be executed on Mac OS, Windows and HPC, with open Python-based scientific libraries preinstalled. Our work presents the advancement of how to integrate computer vision, image analysis, machine learning and software engineering in plant phenomics software implementation. To serve the plant research community, our modulated source code, detailed comments, executables (.exe for Windows; .app for Mac), and experimental results are freely available at https://github.com/Crop-Phenomics-Group/Leaf-GP/releases.

  10. Storage quality-of-service in cloud-based scientific environments: a standardization approach

    NASA Astrophysics Data System (ADS)

    Millar, Paul; Fuhrmann, Patrick; Hardt, Marcus; Ertl, Benjamin; Brzezniak, Maciej

    2017-10-01

    When preparing the Data Management Plan for larger scientific endeavors, PIs have to balance between the most appropriate qualities of storage space along the line of the planned data life-cycle, its price and the available funding. Storage properties can be the media type, implicitly determining access latency and durability of stored data, the number and locality of replicas, as well as available access protocols or authentication mechanisms. Negotiations between the scientific community and the responsible infrastructures generally happen upfront, where the amount of storage space, media types, like: disk, tape and SSD and the foreseeable data life-cycles are negotiated. With the introduction of cloud management platforms, both in computing and storage, resources can be brokered to achieve the best price per unit of a given quality. However, in order to allow the platform orchestrator to programmatically negotiate the most appropriate resources, a standard vocabulary for different properties of resources and a commonly agreed protocol to communicate those, has to be available. In order to agree on a basic vocabulary for storage space properties, the storage infrastructure group in INDIGO-DataCloud together with INDIGO-associated and external scientific groups, created a working group under the umbrella of the Research Data Alliance (RDA). As communication protocol, to query and negotiate storage qualities, the Cloud Data Management Interface (CDMI) has been selected. Necessary extensions to CDMI are defined in regular meetings between INDIGO and the Storage Network Industry Association (SNIA). Furthermore, INDIGO is contributing to the SNIA CDMI reference implementation as the basis for interfacing the various storage systems in INDIGO to the agreed protocol and to provide an official Open-Source skeleton for systems not being maintained by INDIGO partners.

  11. Matrix Factorizations at Scale: a Comparison of Scientific Data Analytics in Spark and C+MPI Using Three Case Studies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gittens, Alex; Devarakonda, Aditya; Racah, Evan

    We explore the trade-offs of performing linear algebra using Apache Spark, compared to traditional C and MPI implementations on HPC platforms. Spark is designed for data analytics on cluster computing platforms with access to local disks and is optimized for data-parallel tasks. We examine three widely-used and important matrix factorizations: NMF (for physical plausibility), PCA (for its ubiquity) and CX (for data interpretability). We apply these methods to 1.6TB particle physics, 2.2TB and 16TB climate modeling and 1.1TB bioimaging data. The data matrices are tall-and-skinny which enable the algorithms to map conveniently into Spark’s data parallel model. We perform scalingmore » experiments on up to 1600 Cray XC40 nodes, describe the sources of slowdowns, and provide tuning guidance to obtain high performance.« less

  12. The representations of novel neurotechnologies in social media: five case studies.

    PubMed

    Purcell-Davis, Allyson

    2013-01-01

    The research contained within this article was commissioned by the Nuffield Council on Bioethics, as part of the development of an ethical framework to guide the practice of those involved in novel neurotechnologies. The findings of this study are included in chapter 9 of the report Novel Neurotechnologies: Intervening in the Brain. The purpose of this research was to provide a 'snapshot' of the content found within postings on social media platforms, concerning the technologies of Deep Brain Stimulation, Brain Computer Interface and Neural Stem Cell Therapy. The methodology included an analysis of the postings found on Delicious, Twitter, Facebook, YouTube and blogs and found evidence that social media provided a platform for a variety of voices, including patients, medical personnel and neuroscientists. However, it additionally found evidence of the advertisement and promotion of neurotechnologies as potential medical interventions, the hype of scientific breakthroughs and the hope of cures for neurodegenerative diseases.

  13. Code Modernization of VPIC

    NASA Astrophysics Data System (ADS)

    Bird, Robert; Nystrom, David; Albright, Brian

    2017-10-01

    The ability of scientific simulations to effectively deliver performant computation is increasingly being challenged by successive generations of high-performance computing architectures. Code development to support efficient computation on these modern architectures is both expensive, and highly complex; if it is approached without due care, it may also not be directly transferable between subsequent hardware generations. Previous works have discussed techniques to support the process of adapting a legacy code for modern hardware generations, but despite the breakthroughs in the areas of mini-app development, portable-performance, and cache oblivious algorithms the problem still remains largely unsolved. In this work we demonstrate how a focus on platform agnostic modern code-development can be applied to Particle-in-Cell (PIC) simulations to facilitate effective scientific delivery. This work builds directly on our previous work optimizing VPIC, in which we replaced intrinsic based vectorisation with compile generated auto-vectorization to improve the performance and portability of VPIC. In this work we present the use of a specialized SIMD queue for processing some particle operations, and also preview a GPU capable OpenMP variant of VPIC. Finally we include a lessons learnt. Work performed under the auspices of the U.S. Dept. of Energy by the Los Alamos National Security, LLC Los Alamos National Laboratory under contract DE-AC52-06NA25396 and supported by the LANL LDRD program.

  14. The multi-modal Australian ScienceS Imaging and Visualization Environment (MASSIVE) high performance computing infrastructure: applications in neuroscience and neuroinformatics research

    PubMed Central

    Goscinski, Wojtek J.; McIntosh, Paul; Felzmann, Ulrich; Maksimenko, Anton; Hall, Christopher J.; Gureyev, Timur; Thompson, Darren; Janke, Andrew; Galloway, Graham; Killeen, Neil E. B.; Raniga, Parnesh; Kaluza, Owen; Ng, Amanda; Poudel, Govinda; Barnes, David G.; Nguyen, Toan; Bonnington, Paul; Egan, Gary F.

    2014-01-01

    The Multi-modal Australian ScienceS Imaging and Visualization Environment (MASSIVE) is a national imaging and visualization facility established by Monash University, the Australian Synchrotron, the Commonwealth Scientific Industrial Research Organization (CSIRO), and the Victorian Partnership for Advanced Computing (VPAC), with funding from the National Computational Infrastructure and the Victorian Government. The MASSIVE facility provides hardware, software, and expertise to drive research in the biomedical sciences, particularly advanced brain imaging research using synchrotron x-ray and infrared imaging, functional and structural magnetic resonance imaging (MRI), x-ray computer tomography (CT), electron microscopy and optical microscopy. The development of MASSIVE has been based on best practice in system integration methodologies, frameworks, and architectures. The facility has: (i) integrated multiple different neuroimaging analysis software components, (ii) enabled cross-platform and cross-modality integration of neuroinformatics tools, and (iii) brought together neuroimaging databases and analysis workflows. MASSIVE is now operational as a nationally distributed and integrated facility for neuroinfomatics and brain imaging research. PMID:24734019

  15. Models and Simulations as a Service: Exploring the Use of Galaxy for Delivering Computational Models

    PubMed Central

    Walker, Mark A.; Madduri, Ravi; Rodriguez, Alex; Greenstein, Joseph L.; Winslow, Raimond L.

    2016-01-01

    We describe the ways in which Galaxy, a web-based reproducible research platform, can be used for web-based sharing of complex computational models. Galaxy allows users to seamlessly customize and run simulations on cloud computing resources, a concept we refer to as Models and Simulations as a Service (MaSS). To illustrate this application of Galaxy, we have developed a tool suite for simulating a high spatial-resolution model of the cardiac Ca2+ spark that requires supercomputing resources for execution. We also present tools for simulating models encoded in the SBML and CellML model description languages, thus demonstrating how Galaxy’s reproducible research features can be leveraged by existing technologies. Finally, we demonstrate how the Galaxy workflow editor can be used to compose integrative models from constituent submodules. This work represents an important novel approach, to our knowledge, to making computational simulations more accessible to the broader scientific community. PMID:26958881

  16. Geospatial Service Platform for Education and Research

    NASA Astrophysics Data System (ADS)

    Gong, J.; Wu, H.; Jiang, W.; Guo, W.; Zhai, X.; Yue, P.

    2014-04-01

    We propose to advance the scientific understanding through applications of geospatial service platforms, which can help students and researchers investigate various scientific problems in a Web-based environment with online tools and services. The platform also offers capabilities for sharing data, algorithm, and problem-solving knowledge. To fulfil this goal, the paper introduces a new course, named "Geospatial Service Platform for Education and Research", to be held in the ISPRS summer school in May 2014 at Wuhan University, China. The course will share cutting-edge achievements of a geospatial service platform with students from different countries, and train them with online tools from the platform for geospatial data processing and scientific research. The content of the course includes the basic concepts of geospatial Web services, service-oriented architecture, geoprocessing modelling and chaining, and problem-solving using geospatial services. In particular, the course will offer a geospatial service platform for handson practice. There will be three kinds of exercises in the course: geoprocessing algorithm sharing through service development, geoprocessing modelling through service chaining, and online geospatial analysis using geospatial services. Students can choose one of them, depending on their interests and background. Existing geoprocessing services from OpenRS and GeoPW will be introduced. The summer course offers two service chaining tools, GeoChaining and GeoJModelBuilder, as instances to explain specifically the method for building service chains in view of different demands. After this course, students can learn how to use online service platforms for geospatial resource sharing and problem-solving.

  17. Opportunities and choice in a new vector era

    NASA Astrophysics Data System (ADS)

    Nowak, A.

    2014-06-01

    This work discusses the significant changes in computing landscape related to the progression of Moore's Law, and the implications on scientific computing. Particular attention is devoted to the High Energy Physics domain (HEP), which has always made good use of threading, but levels of parallelism closer to the hardware were often left underutilized. Findings of the CERN openlab Platform Competence Center are reported in the context of expanding "performance dimensions", and especially the resurgence of vectors. These suggest that data oriented designs are feasible in HEP and have considerable potential for performance improvements on multiple levels, but will rarely trump algorithmic enhancements. Finally, an analysis of upcoming hardware and software technologies identifies heterogeneity as a major challenge for software, which will require more emphasis on scalable, efficient design.

  18. Astrophysical Computation in Research, the Classroom and Beyond

    NASA Astrophysics Data System (ADS)

    Frank, Adam

    2009-03-01

    In this talk I review progress in the use of simulations as a tool for astronomical research, for education and public outreach. The talk will include the basic elements of numerical simulations as well as advances in algorithms which have led to recent dramatic progress such as the use of Adaptive Mesh Refinement methods. The scientific focus of the talk will be star formation jets and outflows while the educational emphasis will be on the use of advanced platforms for simulation based learning in lecture and integrated homework. Learning modules for science outreach websites such as DISCOVER magazine will also be highlighted.

  19. Advances in Spatial Data Infrastructure, Acquisition, Analysis, Archiving and Dissemination

    NASA Technical Reports Server (NTRS)

    Ramapriyan, Hampapuran K.; Rochon, Gilbert L.; Duerr, Ruth; Rank, Robert; Nativi, Stefano; Stocker, Erich Franz

    2010-01-01

    The authors review recent contributions to the state-of-thescience and benign proliferation of satellite remote sensing, spatial data infrastructure, near-real-time data acquisition, analysis on high performance computing platforms, sapient archiving, multi-modal dissemination and utilization for a wide array of scientific applications. The authors also address advances in Geoinformatics and its growing ubiquity, as evidenced by its inclusion as a focus area within the American Geophysical Union (AGU), European Geosciences Union (EGU), as well as by the evolution of the IEEE Geoscience and Remote Sensing Society's (GRSS) Data Archiving and Distribution Technical Committee (DAD TC).

  20. Metrics in academic profiles: a new addictive game for researchers?

    PubMed

    Orduna-Malea, Enrique; Martín-Martín, Alberto Martín-Martín; Delgado López-Cózar, Emilio

    2016-09-22

    This study aims to promote reflection and bring attention to the potential adverse effects of academic social networks on science. These academic social networks, where authors can display their publications, have become new scientific communication channels, accelerating the dissemination of research results, facilitating data sharing, and strongly promoting scientific collaboration, all at no cost to the user.One of the features that make them extremely attractive to researchers is the possibility to browse through a wide variety of bibliometric indicators. Going beyond publication and citation counts, they also measure usage, participation in the platform, social connectivity, and scientific, academic and professional impact. Using these indicators they effectively create a digital image of researchers and their reputations.However, although academic social platforms are useful applications that can help improve scientific communication, they also hide a less positive side: they are highly addictive tools that might be abused. By gamifying scientific impact using techniques originally developed for videogames, these platforms may get users hooked on them, like addicted academics, transforming what should only be a means into an end in itself.

  1. High Energy Physics Exascale Requirements Review. An Office of Science review sponsored jointly by Advanced Scientific Computing Research and High Energy Physics, June 10-12, 2015, Bethesda, Maryland

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Habib, Salman; Roser, Robert; Gerber, Richard

    The U.S. Department of Energy (DOE) Office of Science (SC) Offices of High Energy Physics (HEP) and Advanced Scientific Computing Research (ASCR) convened a programmatic Exascale Requirements Review on June 10–12, 2015, in Bethesda, Maryland. This report summarizes the findings, results, and recommendations derived from that meeting. The high-level findings and observations are as follows. Larger, more capable computing and data facilities are needed to support HEP science goals in all three frontiers: Energy, Intensity, and Cosmic. The expected scale of the demand at the 2025 timescale is at least two orders of magnitude — and in some cases greatermore » — than that available currently. The growth rate of data produced by simulations is overwhelming the current ability of both facilities and researchers to store and analyze it. Additional resources and new techniques for data analysis are urgently needed. Data rates and volumes from experimental facilities are also straining the current HEP infrastructure in its ability to store and analyze large and complex data volumes. Appropriately configured leadership-class facilities can play a transformational role in enabling scientific discovery from these datasets. A close integration of high-performance computing (HPC) simulation and data analysis will greatly aid in interpreting the results of HEP experiments. Such an integration will minimize data movement and facilitate interdependent workflows. Long-range planning between HEP and ASCR will be required to meet HEP’s research needs. To best use ASCR HPC resources, the experimental HEP program needs (1) an established, long-term plan for access to ASCR computational and data resources, (2) the ability to map workflows to HPC resources, (3) the ability for ASCR facilities to accommodate workflows run by collaborations potentially comprising thousands of individual members, (4) to transition codes to the next-generation HPC platforms that will be available at ASCR facilities, (5) to build up and train a workforce capable of developing and using simulations and analysis to support HEP scientific research on next-generation systems.« less

  2. Computational imaging of sperm locomotion.

    PubMed

    Daloglu, Mustafa Ugur; Ozcan, Aydogan

    2017-08-01

    Not only essential for scientific research, but also in the analysis of male fertility and for animal husbandry, sperm tracking and characterization techniques have been greatly benefiting from computational imaging. Digital image sensors, in combination with optical microscopy tools and powerful computers, have enabled the use of advanced detection and tracking algorithms that automatically map sperm trajectories and calculate various motility parameters across large data sets. Computational techniques are driving the field even further, facilitating the development of unconventional sperm imaging and tracking methods that do not rely on standard optical microscopes and objective lenses, which limit the field of view and volume of the semen sample that can be imaged. As an example, a holographic on-chip sperm imaging platform, only composed of a light-emitting diode and an opto-electronic image sensor, has emerged as a high-throughput, low-cost and portable alternative to lens-based traditional sperm imaging and tracking methods. In this approach, the sample is placed very close to the image sensor chip, which captures lensfree holograms generated by the interference of the background illumination with the light scattered from sperm cells. These holographic patterns are then digitally processed to extract both the amplitude and phase information of the spermatozoa, effectively replacing the microscope objective lens with computation. This platform has further enabled high-throughput 3D imaging of spermatozoa with submicron 3D positioning accuracy in large sample volumes, revealing various rare locomotion patterns. We believe that computational chip-scale sperm imaging and 3D tracking techniques will find numerous opportunities in both sperm related research and commercial applications. © The Authors 2017. Published by Oxford University Press on behalf of Society for the Study of Reproduction. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  3. A Collaborative Extensible User Environment for Simulation and Knowledge Management

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Freedman, Vicky L.; Lansing, Carina S.; Porter, Ellen A.

    2015-06-01

    In scientific simulation, scientists use measured data to create numerical models, execute simulations and analyze results from advanced simulators executing on high performance computing platforms. This process usually requires a team of scientists collaborating on data collection, model creation and analysis, and on authorship of publications and data. This paper shows that scientific teams can benefit from a user environment called Akuna that permits subsurface scientists in disparate locations to collaborate on numerical modeling and analysis projects. The Akuna user environment is built on the Velo framework that provides both a rich client environment for conducting and analyzing simulations andmore » a Web environment for data sharing and annotation. Akuna is an extensible toolset that integrates with Velo, and is designed to support any type of simulator. This is achieved through data-driven user interface generation, use of a customizable knowledge management platform, and an extensible framework for simulation execution, monitoring and analysis. This paper describes how the customized Velo content management system and the Akuna toolset are used to integrate and enhance an effective collaborative research and application environment. The extensible architecture of Akuna is also described and demonstrates its usage for creation and execution of a 3D subsurface simulation.« less

  4. An Online Prediction Platform to Support the Environmental ...

    EPA Pesticide Factsheets

    Historical QSAR models are currently utilized across a broad range of applications within the U.S. Environmental Protection Agency (EPA). These models predict basic physicochemical properties (e.g., logP, aqueous solubility, vapor pressure), which are then incorporated into exposure, fate and transport models. Whereas the classical manner of publishing results in peer-reviewed journals remains appropriate, there are substantial benefits to be gained by providing enhanced, open access to the training data sets and resulting models. Benefits include improved transparency, more flexibility to expand training sets and improve model algorithms, and greater ability to independently characterize model performance both globally and in local areas of chemistry. We have developed a web-based prediction platform that uses open-source descriptors and modeling algorithms, employs modern cheminformatics technologies, and is tailored for ease of use by the toxicology and environmental regulatory community. This tool also provides web-services to meet both EPA’s projects and the modeling community at-large. The platform hosts models developed within EPA’s National Center for Computational Toxicology, as well as those developed by other EPA scientists and the outside scientific community. Recognizing that there are other on-line QSAR model platforms currently available which have additional capabilities, we connect to such services, where possible, to produce an integrated

  5. Simulation Platform: a cloud-based online simulation environment.

    PubMed

    Yamazaki, Tadashi; Ikeno, Hidetoshi; Okumura, Yoshihiro; Satoh, Shunji; Kamiyama, Yoshimi; Hirata, Yutaka; Inagaki, Keiichiro; Ishihara, Akito; Kannon, Takayuki; Usui, Shiro

    2011-09-01

    For multi-scale and multi-modal neural modeling, it is needed to handle multiple neural models described at different levels seamlessly. Database technology will become more important for these studies, specifically for downloading and handling the neural models seamlessly and effortlessly. To date, conventional neuroinformatics databases have solely been designed to archive model files, but the databases should provide a chance for users to validate the models before downloading them. In this paper, we report our on-going project to develop a cloud-based web service for online simulation called "Simulation Platform". Simulation Platform is a cloud of virtual machines running GNU/Linux. On a virtual machine, various software including developer tools such as compilers and libraries, popular neural simulators such as GENESIS, NEURON and NEST, and scientific software such as Gnuplot, R and Octave, are pre-installed. When a user posts a request, a virtual machine is assigned to the user, and the simulation starts on that machine. The user remotely accesses to the machine through a web browser and carries out the simulation, without the need to install any software but a web browser on the user's own computer. Therefore, Simulation Platform is expected to eliminate impediments to handle multiple neural models that require multiple software. Copyright © 2011 Elsevier Ltd. All rights reserved.

  6. Reprint of: Simulation Platform: a cloud-based online simulation environment.

    PubMed

    Yamazaki, Tadashi; Ikeno, Hidetoshi; Okumura, Yoshihiro; Satoh, Shunji; Kamiyama, Yoshimi; Hirata, Yutaka; Inagaki, Keiichiro; Ishihara, Akito; Kannon, Takayuki; Usui, Shiro

    2011-11-01

    For multi-scale and multi-modal neural modeling, it is needed to handle multiple neural models described at different levels seamlessly. Database technology will become more important for these studies, specifically for downloading and handling the neural models seamlessly and effortlessly. To date, conventional neuroinformatics databases have solely been designed to archive model files, but the databases should provide a chance for users to validate the models before downloading them. In this paper, we report our on-going project to develop a cloud-based web service for online simulation called "Simulation Platform". Simulation Platform is a cloud of virtual machines running GNU/Linux. On a virtual machine, various software including developer tools such as compilers and libraries, popular neural simulators such as GENESIS, NEURON and NEST, and scientific software such as Gnuplot, R and Octave, are pre-installed. When a user posts a request, a virtual machine is assigned to the user, and the simulation starts on that machine. The user remotely accesses to the machine through a web browser and carries out the simulation, without the need to install any software but a web browser on the user's own computer. Therefore, Simulation Platform is expected to eliminate impediments to handle multiple neural models that require multiple software. Copyright © 2011 Elsevier Ltd. All rights reserved.

  7. Assessing present and future climate changes in Siberia and their regional socioeconomic consequences using a web-based big data geoprocessing platform

    NASA Astrophysics Data System (ADS)

    Alexeev, V. A.; Gordov, E. P.

    2016-12-01

    Recently initiated collaborative research project is presented. Its main objective is to develop high spatial and temporal resolution datasets for studying the ongoing and future climate changes in Siberia, caused by global and regional processes in the atmosphere and the ocean. This goal will be achieved by using a set of regional and global climate models for the analysis of the mechanisms of climate change and quantitative assessment of changes in key climate variables, including analysis of extreme weather and climate events and their dynamics, evaluation of the frequency, amplitude and the risks caused by the extreme events in the region. The main practical application of the project is to provide experts, stakeholders and the public with quantitative information about the future climate change in Siberia obtained on the base of a computational web- geoinformation platform. The thematic platform will be developed in order to facilitate processing and analysis of high resolution georeferenced datasets that will be delivered and made available to scientific community, policymakes and other end users as a result of the project. Software packages will be developed to implement calculation of various climatological indicators in order to characterize and diagnose climate change and its dynamics, as well as to archive results in digital form of electronic maps (GIS layers). By achieving these goals the project will provide science based tools necessary for developing mitigation measures for adapting to climate change and reducing negative impact on the population and infrastructure of the region. Financial support of the computational web- geoinformation platform prototype development by the RF Ministry of Education and Science under Agreement 14.613.21.0037 (RFMEFI61315X0037) is acknowledged.

  8. Avionics and Power Management for Low-Cost High-Altitude Balloon Science Platforms

    NASA Technical Reports Server (NTRS)

    Chin, Jeffrey; Roberts, Anthony; McNatt, Jeremiah

    2016-01-01

    High-altitude balloons (HABs) have become popular as educational and scientific platforms for planetary research. This document outlines key components for missions where low cost and rapid development are desired. As an alternative to ground-based vacuum and thermal testing, these systems can be flight tested at comparable costs. Communication, solar, space, and atmospheric sensing experiments often require environments where ground level testing can be challenging or impossible in certain cases. When performing HAB research the ability to monitor the status of the platform and gather data is key for both scientific and recoverability aspects of the mission. A few turnkey platform solutions are outlined that leverage rapidly evolving open-source engineering ecosystems. Rather than building custom components from scratch, these recommendations attempt to maximize simplicity and cost of HAB platforms to make launches more accessible to everyone.

  9. Giant Vehicles

    NASA Technical Reports Server (NTRS)

    Said, Magdi A; Schur, Willi W.; Gupta, Amit; Mock, Gary N.; Seyam, Abdelfattah M.; Theyson, Thomas

    2004-01-01

    Science and technology development from balloon-borne telescopes and experiments is a rich return on a relatively modest involvement of NASA resources. For the past three decades, the development of increasingly competitive and complex science payloads and observational programs from high altitude balloon-borne platforms has yielded significant scientific discoveries. The success and capabilities of scientific balloons are closely related to advancements in the textile and plastic industries. This paper will present an overview of scientific balloons as a viable and economical platform for transporting large telescopes and scientific instruments to the upper atmosphere to conduct scientific missions. Additionally, the paper sheds the light on the problems associated with UV degradation of high performance textile components that are used to support the payload of the balloon and proposes future research to reduce/eliminate Ultra Violet (UV) degradation in order to conduct long-term scientific missions.

  10. Bringing Web 2.0 to bioinformatics.

    PubMed

    Zhang, Zhang; Cheung, Kei-Hoi; Townsend, Jeffrey P

    2009-01-01

    Enabling deft data integration from numerous, voluminous and heterogeneous data sources is a major bioinformatic challenge. Several approaches have been proposed to address this challenge, including data warehousing and federated databasing. Yet despite the rise of these approaches, integration of data from multiple sources remains problematic and toilsome. These two approaches follow a user-to-computer communication model for data exchange, and do not facilitate a broader concept of data sharing or collaboration among users. In this report, we discuss the potential of Web 2.0 technologies to transcend this model and enhance bioinformatics research. We propose a Web 2.0-based Scientific Social Community (SSC) model for the implementation of these technologies. By establishing a social, collective and collaborative platform for data creation, sharing and integration, we promote a web services-based pipeline featuring web services for computer-to-computer data exchange as users add value. This pipeline aims to simplify data integration and creation, to realize automatic analysis, and to facilitate reuse and sharing of data. SSC can foster collaboration and harness collective intelligence to create and discover new knowledge. In addition to its research potential, we also describe its potential role as an e-learning platform in education. We discuss lessons from information technology, predict the next generation of Web (Web 3.0), and describe its potential impact on the future of bioinformatics studies.

  11. A feasibility study on porting the community land model onto accelerators using OpenACC

    DOE PAGES

    Wang, Dali; Wu, Wei; Winkler, Frank; ...

    2014-01-01

    As environmental models (such as Accelerated Climate Model for Energy (ACME), Parallel Reactive Flow and Transport Model (PFLOTRAN), Arctic Terrestrial Simulator (ATS), etc.) became more and more complicated, we are facing enormous challenges regarding to porting those applications onto hybrid computing architecture. OpenACC appears as a very promising technology, therefore, we have conducted a feasibility analysis on porting the Community Land Model (CLM), a terrestrial ecosystem model within the Community Earth System Models (CESM)). Specifically, we used automatic function testing platform to extract a small computing kernel out of CLM, then we apply this kernel into the actually CLM dataflowmore » procedure, and investigate the strategy of data parallelization and the benefit of data movement provided by current implementation of OpenACC. Even it is a non-intensive kernel, on a single 16-core computing node, the performance (based on the actual computation time using one GPU) of OpenACC implementation is 2.3 time faster than that of OpenMP implementation using single OpenMP thread, but it is 2.8 times slower than the performance of OpenMP implementation using 16 threads. On multiple nodes, MPI_OpenACC implementation demonstrated very good scalability on up to 128 GPUs on 128 computing nodes. This study also provides useful information for us to look into the potential benefits of “deep copy” capability and “routine” feature of OpenACC standards. In conclusion, we believe that our experience on the environmental model, CLM, can be beneficial to many other scientific research programs who are interested to porting their large scale scientific code using OpenACC onto high-end computers, empowered by hybrid computing architecture.« less

  12. Smart article: application of intelligent platforms in next generation biomedical publications.

    PubMed

    Mohammadi, Babak; Saeedi, Marjan; Haghpanah, Vahid

    2017-01-01

    Production of scientific data has been accelerated exponentially though ease of access to the required knowledge is still challenging. Hence, the emergence of new frameworks to allow more efficient storage of information would be beneficial. Attaining intelligent platforms enable the smart article to serve as a forum for exchanging idea among experts of academic disciplines for a rapid and efficient scientific discourse.

  13. Cloud BioLinux: pre-configured and on-demand bioinformatics computing for the genomics community.

    PubMed

    Krampis, Konstantinos; Booth, Tim; Chapman, Brad; Tiwari, Bela; Bicak, Mesude; Field, Dawn; Nelson, Karen E

    2012-03-19

    A steep drop in the cost of next-generation sequencing during recent years has made the technology affordable to the majority of researchers, but downstream bioinformatic analysis still poses a resource bottleneck for smaller laboratories and institutes that do not have access to substantial computational resources. Sequencing instruments are typically bundled with only the minimal processing and storage capacity required for data capture during sequencing runs. Given the scale of sequence datasets, scientific value cannot be obtained from acquiring a sequencer unless it is accompanied by an equal investment in informatics infrastructure. Cloud BioLinux is a publicly accessible Virtual Machine (VM) that enables scientists to quickly provision on-demand infrastructures for high-performance bioinformatics computing using cloud platforms. Users have instant access to a range of pre-configured command line and graphical software applications, including a full-featured desktop interface, documentation and over 135 bioinformatics packages for applications including sequence alignment, clustering, assembly, display, editing, and phylogeny. Each tool's functionality is fully described in the documentation directly accessible from the graphical interface of the VM. Besides the Amazon EC2 cloud, we have started instances of Cloud BioLinux on a private Eucalyptus cloud installed at the J. Craig Venter Institute, and demonstrated access to the bioinformatic tools interface through a remote connection to EC2 instances from a local desktop computer. Documentation for using Cloud BioLinux on EC2 is available from our project website, while a Eucalyptus cloud image and VirtualBox Appliance is also publicly available for download and use by researchers with access to private clouds. Cloud BioLinux provides a platform for developing bioinformatics infrastructures on the cloud. An automated and configurable process builds Virtual Machines, allowing the development of highly customized versions from a shared code base. This shared community toolkit enables application specific analysis platforms on the cloud by minimizing the effort required to prepare and maintain them.

  14. Cloud BioLinux: pre-configured and on-demand bioinformatics computing for the genomics community

    PubMed Central

    2012-01-01

    Background A steep drop in the cost of next-generation sequencing during recent years has made the technology affordable to the majority of researchers, but downstream bioinformatic analysis still poses a resource bottleneck for smaller laboratories and institutes that do not have access to substantial computational resources. Sequencing instruments are typically bundled with only the minimal processing and storage capacity required for data capture during sequencing runs. Given the scale of sequence datasets, scientific value cannot be obtained from acquiring a sequencer unless it is accompanied by an equal investment in informatics infrastructure. Results Cloud BioLinux is a publicly accessible Virtual Machine (VM) that enables scientists to quickly provision on-demand infrastructures for high-performance bioinformatics computing using cloud platforms. Users have instant access to a range of pre-configured command line and graphical software applications, including a full-featured desktop interface, documentation and over 135 bioinformatics packages for applications including sequence alignment, clustering, assembly, display, editing, and phylogeny. Each tool's functionality is fully described in the documentation directly accessible from the graphical interface of the VM. Besides the Amazon EC2 cloud, we have started instances of Cloud BioLinux on a private Eucalyptus cloud installed at the J. Craig Venter Institute, and demonstrated access to the bioinformatic tools interface through a remote connection to EC2 instances from a local desktop computer. Documentation for using Cloud BioLinux on EC2 is available from our project website, while a Eucalyptus cloud image and VirtualBox Appliance is also publicly available for download and use by researchers with access to private clouds. Conclusions Cloud BioLinux provides a platform for developing bioinformatics infrastructures on the cloud. An automated and configurable process builds Virtual Machines, allowing the development of highly customized versions from a shared code base. This shared community toolkit enables application specific analysis platforms on the cloud by minimizing the effort required to prepare and maintain them. PMID:22429538

  15. sbv IMPROVER: Modern Approach to Systems Biology.

    PubMed

    Guryanova, Svetlana; Guryanova, Anna

    2017-01-01

    The increasing amount and variety of data in biosciences call for innovative methods of visualization, scientific verification, and pathway analysis. Novel approaches to biological networks and research quality control are important because of their role in development of new products, improvement, and acceleration of existing health policies and research for novel ways of solving scientific challenges. One such approach is sbv IMPROVER. It is a platform that uses crowdsourcing and verification to create biological networks with easy public access. It contains 120 networks built in Biological Expression Language (BEL) to interpret data from PubMed articles with high-quality verification available for free on the CBN database. Computable, human-readable biological networks with a structured syntax are a powerful way of representing biological information generated from high-density data. This article presents sbv IMPROVER, a crowd-verification approach for the visualization and expansion of biological networks.

  16. Integrating advanced visualization technology into the planetary Geoscience workflow

    NASA Astrophysics Data System (ADS)

    Huffman, John; Forsberg, Andrew; Loomis, Andrew; Head, James; Dickson, James; Fassett, Caleb

    2011-09-01

    Recent advances in computer visualization have allowed us to develop new tools for analyzing the data gathered during planetary missions, which is important, since these data sets have grown exponentially in recent years to tens of terabytes in size. As part of the Advanced Visualization in Solar System Exploration and Research (ADVISER) project, we utilize several advanced visualization techniques created specifically with planetary image data in mind. The Geoviewer application allows real-time active stereo display of images, which in aggregate have billions of pixels. The ADVISER desktop application platform allows fast three-dimensional visualization of planetary images overlain on digital terrain models. Both applications include tools for easy data ingest and real-time analysis in a programmatic manner. Incorporation of these tools into our everyday scientific workflow has proved important for scientific analysis, discussion, and publication, and enabled effective and exciting educational activities for students from high school through graduate school.

  17. FORCEnet Net Centric Architecture - A Standards View

    DTIC Science & Technology

    2006-06-01

    SHARED SERVICES NETWORKING/COMMUNICATIONS STORAGE COMPUTING PLATFORM DATA INTERCHANGE/INTEGRATION DATA MANAGEMENT APPLICATION...R V I C E P L A T F O R M S E R V I C E F R A M E W O R K USER-FACING SERVICES SHARED SERVICES NETWORKING/COMMUNICATIONS STORAGE COMPUTING PLATFORM...E F R A M E W O R K USER-FACING SERVICES SHARED SERVICES NETWORKING/COMMUNICATIONS STORAGE COMPUTING PLATFORM DATA INTERCHANGE/INTEGRATION

  18. The International Symposium on Grids and Clouds

    NASA Astrophysics Data System (ADS)

    The International Symposium on Grids and Clouds (ISGC) 2012 will be held at Academia Sinica in Taipei from 26 February to 2 March 2012, with co-located events and workshops. The conference is hosted by the Academia Sinica Grid Computing Centre (ASGC). 2012 is the decennium anniversary of the ISGC which over the last decade has tracked the convergence, collaboration and innovation of individual researchers across the Asia Pacific region to a coherent community. With the continuous support and dedication from the delegates, ISGC has provided the primary international distributed computing platform where distinguished researchers and collaboration partners from around the world share their knowledge and experiences. The last decade has seen the wide-scale emergence of e-Infrastructure as a critical asset for the modern e-Scientist. The emergence of large-scale research infrastructures and instruments that has produced a torrent of electronic data is forcing a generational change in the scientific process and the mechanisms used to analyse the resulting data deluge. No longer can the processing of these vast amounts of data and production of relevant scientific results be undertaken by a single scientist. Virtual Research Communities that span organisations around the world, through an integrated digital infrastructure that connects the trust and administrative domains of multiple resource providers, have become critical in supporting these analyses. Topics covered in ISGC 2012 include: High Energy Physics, Biomedicine & Life Sciences, Earth Science, Environmental Changes and Natural Disaster Mitigation, Humanities & Social Sciences, Operations & Management, Middleware & Interoperability, Security and Networking, Infrastructure Clouds & Virtualisation, Business Models & Sustainability, Data Management, Distributed Volunteer & Desktop Grid Computing, High Throughput Computing, and High Performance, Manycore & GPU Computing.

  19. Using e-Learning Platforms for Mastery Learning in Developmental Mathematics Courses

    ERIC Educational Resources Information Center

    Boggs, Stacey; Shore, Mark; Shore, JoAnna

    2004-01-01

    Many colleges and universities have adopted e-learning platforms to utilize computers as an instructional tool in developmental (i.e., beginning and intermediate algebra) mathematics courses. An e-learning platform is a computer program used to enhance course instruction via computers and the Internet. Allegany College of Maryland is currently…

  20. Enhancing reproducibility in scientific computing: Metrics and registry for Singularity containers.

    PubMed

    Sochat, Vanessa V; Prybol, Cameron J; Kurtzer, Gregory M

    2017-01-01

    Here we present Singularity Hub, a framework to build and deploy Singularity containers for mobility of compute, and the singularity-python software with novel metrics for assessing reproducibility of such containers. Singularity containers make it possible for scientists and developers to package reproducible software, and Singularity Hub adds automation to this workflow by building, capturing metadata for, visualizing, and serving containers programmatically. Our novel metrics, based on custom filters of content hashes of container contents, allow for comparison of an entire container, including operating system, custom software, and metadata. First we will review Singularity Hub's primary use cases and how the infrastructure has been designed to support modern, common workflows. Next, we conduct three analyses to demonstrate build consistency, reproducibility metric and performance and interpretability, and potential for discovery. This is the first effort to demonstrate a rigorous assessment of measurable similarity between containers and operating systems. We provide these capabilities within Singularity Hub, as well as the source software singularity-python that provides the underlying functionality. Singularity Hub is available at https://singularity-hub.org, and we are excited to provide it as an openly available platform for building, and deploying scientific containers.

  1. Enhancing reproducibility in scientific computing: Metrics and registry for Singularity containers

    PubMed Central

    Prybol, Cameron J.; Kurtzer, Gregory M.

    2017-01-01

    Here we present Singularity Hub, a framework to build and deploy Singularity containers for mobility of compute, and the singularity-python software with novel metrics for assessing reproducibility of such containers. Singularity containers make it possible for scientists and developers to package reproducible software, and Singularity Hub adds automation to this workflow by building, capturing metadata for, visualizing, and serving containers programmatically. Our novel metrics, based on custom filters of content hashes of container contents, allow for comparison of an entire container, including operating system, custom software, and metadata. First we will review Singularity Hub’s primary use cases and how the infrastructure has been designed to support modern, common workflows. Next, we conduct three analyses to demonstrate build consistency, reproducibility metric and performance and interpretability, and potential for discovery. This is the first effort to demonstrate a rigorous assessment of measurable similarity between containers and operating systems. We provide these capabilities within Singularity Hub, as well as the source software singularity-python that provides the underlying functionality. Singularity Hub is available at https://singularity-hub.org, and we are excited to provide it as an openly available platform for building, and deploying scientific containers. PMID:29186161

  2. Evaluating Multi-core Architectures through Accelerating the Three-Dimensional Lax–Wendroff Correction

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    You, Yang; Fu, Haohuan; Song, Shuaiwen

    2014-07-18

    Wave propagation forward modeling is a widely used computational method in oil and gas exploration. The iterative stencil loops in such problems have broad applications in scientific computing. However, executing such loops can be highly time time-consuming, which greatly limits application’s performance and power efficiency. In this paper, we accelerate the forward modeling technique on the latest multi-core and many-core architectures such as Intel Sandy Bridge CPUs, NVIDIA Fermi C2070 GPU, NVIDIA Kepler K20x GPU, and the Intel Xeon Phi Co-processor. For the GPU platforms, we propose two parallel strategies to explore the performance optimization opportunities for our stencil kernels.more » For Sandy Bridge CPUs and MIC, we also employ various optimization techniques in order to achieve the best.« less

  3. Boutiques: a flexible framework to integrate command-line applications in computing platforms.

    PubMed

    Glatard, Tristan; Kiar, Gregory; Aumentado-Armstrong, Tristan; Beck, Natacha; Bellec, Pierre; Bernard, Rémi; Bonnet, Axel; Brown, Shawn T; Camarasu-Pop, Sorina; Cervenansky, Frédéric; Das, Samir; Ferreira da Silva, Rafael; Flandin, Guillaume; Girard, Pascal; Gorgolewski, Krzysztof J; Guttmann, Charles R G; Hayot-Sasson, Valérie; Quirion, Pierre-Olivier; Rioux, Pierre; Rousseau, Marc-Étienne; Evans, Alan C

    2018-05-01

    We present Boutiques, a system to automatically publish, integrate, and execute command-line applications across computational platforms. Boutiques applications are installed through software containers described in a rich and flexible JSON language. A set of core tools facilitates the construction, validation, import, execution, and publishing of applications. Boutiques is currently supported by several distinct virtual research platforms, and it has been used to describe dozens of applications in the neuroinformatics domain. We expect Boutiques to improve the quality of application integration in computational platforms, to reduce redundancy of effort, to contribute to computational reproducibility, and to foster Open Science.

  4. geoKepler Workflow Module for Computationally Scalable and Reproducible Geoprocessing and Modeling

    NASA Astrophysics Data System (ADS)

    Cowart, C.; Block, J.; Crawl, D.; Graham, J.; Gupta, A.; Nguyen, M.; de Callafon, R.; Smarr, L.; Altintas, I.

    2015-12-01

    The NSF-funded WIFIRE project has developed an open-source, online geospatial workflow platform for unifying geoprocessing tools and models for for fire and other geospatially dependent modeling applications. It is a product of WIFIRE's objective to build an end-to-end cyberinfrastructure for real-time and data-driven simulation, prediction and visualization of wildfire behavior. geoKepler includes a set of reusable GIS components, or actors, for the Kepler Scientific Workflow System (https://kepler-project.org). Actors exist for reading and writing GIS data in formats such as Shapefile, GeoJSON, KML, and using OGC web services such as WFS. The actors also allow for calling geoprocessing tools in other packages such as GDAL and GRASS. Kepler integrates functions from multiple platforms and file formats into one framework, thus enabling optimal GIS interoperability, model coupling, and scalability. Products of the GIS actors can be fed directly to models such as FARSITE and WRF. Kepler's ability to schedule and scale processes using Hadoop and Spark also makes geoprocessing ultimately extensible and computationally scalable. The reusable workflows in geoKepler can be made to run automatically when alerted by real-time environmental conditions. Here, we show breakthroughs in the speed of creating complex data for hazard assessments with this platform. We also demonstrate geoKepler workflows that use Data Assimilation to ingest real-time weather data into wildfire simulations, and for data mining techniques to gain insight into environmental conditions affecting fire behavior. Existing machine learning tools and libraries such as R and MLlib are being leveraged for this purpose in Kepler, as well as Kepler's Distributed Data Parallel (DDP) capability to provide a framework for scalable processing. geoKepler workflows can be executed via an iPython notebook as a part of a Jupyter hub at UC San Diego for sharing and reporting of the scientific analysis and results from various runs of geoKepler workflows. The communication between iPython and Kepler workflow executions is established through an iPython magic function for Kepler that we have implemented. In summary, geoKepler is an ecosystem that makes geospatial processing and analysis of any kind programmable, reusable, scalable and sharable.

  5. GREEN SUPERCOMPUTING IN A DESKTOP BOX

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    HSU, CHUNG-HSING; FENG, WU-CHUN; CHING, AVERY

    2007-01-17

    The computer workstation, introduced by Sun Microsystems in 1982, was the tool of choice for scientists and engineers as an interactive computing environment for the development of scientific codes. However, by the mid-1990s, the performance of workstations began to lag behind high-end commodity PCs. This, coupled with the disappearance of BSD-based operating systems in workstations and the emergence of Linux as an open-source operating system for PCs, arguably led to the demise of the workstation as we knew it. Around the same time, computational scientists started to leverage PCs running Linux to create a commodity-based (Beowulf) cluster that provided dedicatedmore » computer cycles, i.e., supercomputing for the rest of us, as a cost-effective alternative to large supercomputers, i.e., supercomputing for the few. However, as the cluster movement has matured, with respect to cluster hardware and open-source software, these clusters have become much more like their large-scale supercomputing brethren - a shared (and power-hungry) datacenter resource that must reside in a machine-cooled room in order to operate properly. Consequently, the above observations, when coupled with the ever-increasing performance gap between the PC and cluster supercomputer, provide the motivation for a 'green' desktop supercomputer - a turnkey solution that provides an interactive and parallel computing environment with the approximate form factor of a Sun SPARCstation 1 'pizza box' workstation. In this paper, they present the hardware and software architecture of such a solution as well as its prowess as a developmental platform for parallel codes. In short, imagine a 12-node personal desktop supercomputer that achieves 14 Gflops on Linpack but sips only 185 watts of power at load, resulting in a performance-power ratio that is over 300% better than their reference SMP platform.« less

  6. Interactive 3D visualization for theoretical virtual observatories

    NASA Astrophysics Data System (ADS)

    Dykes, T.; Hassan, A.; Gheller, C.; Croton, D.; Krokos, M.

    2018-06-01

    Virtual observatories (VOs) are online hubs of scientific knowledge. They encompass a collection of platforms dedicated to the storage and dissemination of astronomical data, from simple data archives to e-research platforms offering advanced tools for data exploration and analysis. Whilst the more mature platforms within VOs primarily serve the observational community, there are also services fulfilling a similar role for theoretical data. Scientific visualization can be an effective tool for analysis and exploration of data sets made accessible through web platforms for theoretical data, which often contain spatial dimensions and properties inherently suitable for visualization via e.g. mock imaging in 2D or volume rendering in 3D. We analyse the current state of 3D visualization for big theoretical astronomical data sets through scientific web portals and virtual observatory services. We discuss some of the challenges for interactive 3D visualization and how it can augment the workflow of users in a virtual observatory context. Finally we showcase a lightweight client-server visualization tool for particle-based data sets, allowing quantitative visualization via data filtering, highlighting two example use cases within the Theoretical Astrophysical Observatory.

  7. Better the Martian you know? Trust in the crowd vs. trust in the machine when using a Martian Citizen Science platform

    NASA Astrophysics Data System (ADS)

    Sprinks, James Christopher; Wardlaw, Jessica; Houghton, Robert; Bamford, Steven; Marsh, Stuart

    2016-10-01

    Citizen science platforms allow untrained volunteers to take part in scientific research across a range of disciplines, and often involve the analysis of remotely sensed imagery. The data collected by increasingly advanced and automated instruments has made planetary science a prime candidate for, and user of, citizen science online platforms. In order to process this large volume of information, such systems are increasingly performed in conjunction with data-mining analysis software, with varying configurations of computer and volunteer contribution. Despite citizen science being a relatively new approach, there has been a growing field of research considering the practice in its own right beyond the scientific problems they address, with studies involving interface HCI, platform functionality, and motivation particularly adding to a growing body of citizen science scholarship.Through iterations of the FP7 iMars project's 'Mars in Motion' platform, the work presented studied the effect that guidance information had on volunteers' accuracy and trust. Whilst analysing imagery for change, volunteers were told whether automated change detection software or the consensus of other citizen scientists had found change, with this information varying in terms of accuracy. Results showed that volunteers' ability to both identify change and the type of feature undergoing change was improved when both the software result and crowd opinion guidance information provided had a greater accuracy. However, when guidance information was less accurate volunteers' level of trust fell at a sharper rate when it came from the crowd than when it came from the algorithm, and participants reported more frustration - a counter-intuitive result compared to existing research. Citizen science practitioners need to consider the information they provide to volunteers and how they present it; the results of software analysis or the consensus of a crowd need to be conclusive and above all accurate in order to improve both the performance and engagement of their volunteer community.The research leading to these results has received funding from the European Union's Seventh Framework Programme (FP7/2007-2013) under iMars grant agreement 607379.

  8. OCCAM: a flexible, multi-purpose and extendable HPC cluster

    NASA Astrophysics Data System (ADS)

    Aldinucci, M.; Bagnasco, S.; Lusso, S.; Pasteris, P.; Rabellino, S.; Vallero, S.

    2017-10-01

    The Open Computing Cluster for Advanced data Manipulation (OCCAM) is a multipurpose flexible HPC cluster designed and operated by a collaboration between the University of Torino and the Sezione di Torino of the Istituto Nazionale di Fisica Nucleare. It is aimed at providing a flexible, reconfigurable and extendable infrastructure to cater to a wide range of different scientific computing use cases, including ones from solid-state chemistry, high-energy physics, computer science, big data analytics, computational biology, genomics and many others. Furthermore, it will serve as a platform for R&D activities on computational technologies themselves, with topics ranging from GPU acceleration to Cloud Computing technologies. A heterogeneous and reconfigurable system like this poses a number of challenges related to the frequency at which heterogeneous hardware resources might change their availability and shareability status, which in turn affect methods and means to allocate, manage, optimize, bill, monitor VMs, containers, virtual farms, jobs, interactive bare-metal sessions, etc. This work describes some of the use cases that prompted the design and construction of the HPC cluster, its architecture and resource provisioning model, along with a first characterization of its performance by some synthetic benchmark tools and a few realistic use-case tests.

  9. Biological and Environmental Research Exascale Requirements Review. An Office of Science review sponsored jointly by Advanced Scientific Computing Research and Biological and Environmental Research, March 28-31, 2016, Rockville, Maryland

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Arkin, Adam; Bader, David C.; Coffey, Richard

    Understanding the fundamentals of genomic systems or the processes governing impactful weather patterns are examples of the types of simulation and modeling performed on the most advanced computing resources in America. High-performance computing and computational science together provide a necessary platform for the mission science conducted by the Biological and Environmental Research (BER) office at the U.S. Department of Energy (DOE). This report reviews BER’s computing needs and their importance for solving some of the toughest problems in BER’s portfolio. BER’s impact on science has been transformative. Mapping the human genome, including the U.S.-supported international Human Genome Project that DOEmore » began in 1987, initiated the era of modern biotechnology and genomics-based systems biology. And since the 1950s, BER has been a core contributor to atmospheric, environmental, and climate science research, beginning with atmospheric circulation studies that were the forerunners of modern Earth system models (ESMs) and by pioneering the implementation of climate codes onto high-performance computers. See http://exascaleage.org/ber/ for more information.« less

  10. Research and Deployment a Hospital Open Software Platform for e-Health on the Grid System at VAST/IAMI

    NASA Astrophysics Data System (ADS)

    van Tuyet, Dao; Tuan, Ngo Anh; van Lang, Tran

    Grid computing has been an increasing topic in recent years. It attracts the attention of many scientists from many fields. As a result, many Grid systems have been built for serving people's demands. At present, many tools for developing the Grid systems such as Globus, gLite, Unicore still developed incessantly. Especially, gLite - the Grid Middleware - was developed by the Europe Community scientific in recent years. Constant growth of Grid technology opened the way for new opportunities in term of information and data exchange in a secure and collaborative context. These new opportunities can be exploited to offer physicians new telemedicine services in order to improve their collaborative capacities. Our platform gives physicians an easy method to use telemedicine environment to manage and share patient's information (such as electronic medical record, images formatted DICOM) between remote locations. This paper presents the Grid Infrastructure based on gLite; some main components of gLite; the challenge scenario in which new applications can be developed to improve collaborative work between scientists; the process of deploying Hospital Open software Platform for E-health (HOPE) on the Grid.

  11. Optimizing Irregular Applications for Energy and Performance on the Tilera Many-core Architecture

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chavarría-Miranda, Daniel; Panyala, Ajay R.; Halappanavar, Mahantesh

    Optimizing applications simultaneously for energy and performance is a complex problem. High performance, parallel, irregular applications are notoriously hard to optimize due to their data-dependent memory accesses, lack of structured locality and complex data structures and code patterns. Irregular kernels are growing in importance in applications such as machine learning, graph analytics and combinatorial scientific computing. Performance- and energy-efficient implementation of these kernels on modern, energy efficient, multicore and many-core platforms is therefore an important and challenging problem. We present results from optimizing two irregular applications { the Louvain method for community detection (Grappolo), and high-performance conjugate gradient (HPCCG) {more » on the Tilera many-core system. We have significantly extended MIT's OpenTuner auto-tuning framework to conduct a detailed study of platform-independent and platform-specific optimizations to improve performance as well as reduce total energy consumption. We explore the optimization design space along three dimensions: memory layout schemes, compiler-based code transformations, and optimization of parallel loop schedules. Using auto-tuning, we demonstrate whole node energy savings of up to 41% relative to a baseline instantiation, and up to 31% relative to manually optimized variants.« less

  12. A review of lighter-than-air progress in the United States and its technological significance

    NASA Technical Reports Server (NTRS)

    Mayer, N. J.; Krida, R. H.

    1977-01-01

    Lighter-than-air craft for transportation and communications systems are discussed, with attention given to tethered balloons used to provide stable platforms for airborne surveillance equipment, freight-carrying balloons, manned scientific research balloons such as Atmosat, high-altitude superpressure aerostats employed in satellite communications systems, airport feeder airships, and naval surveillance airships. In addition, technical problems associated with the development of advanced aerostats, including the aerodynamics of hybrid combinations of large rotor systems and aerostat hulls, the application of composites to balloon shells, computer analyses of the complex geometrical structures of aerostats and propulsion systems for airships, are considered.

  13. The EGS Data Collaboration Platform: Enabling Scientific Discovery

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Weers, Jonathan D; Johnston, Henry; Huggins, Jay V

    Collaboration in the digital age has been stifled in recent years. Reasonable responses to legitimate security concerns have created a virtual landscape of silos and fortified castles incapable of sharing information efficiently. This trend is unfortunately opposed to the geothermal scientific community's migration toward larger, more collaborative projects. To facilitate efficient sharing of information between team members from multiple national labs, universities, and private organizations, the 'EGS Collab' team has developed a universally accessible, secure data collaboration platform and has fully integrated it with the U.S. Department of Energy's (DOE) Geothermal Data Repository (GDR) and the National Geothermal Data Systemmore » (NGDS). This paper will explore some of the challenges of collaboration in the modern digital age, highlight strategies for active data management, and discuss the integration of the EGS Collab data management platform with the GDR to enable scientific discovery through the timely dissemination of information.« less

  14. Instrumentino: An Open-Source Software for Scientific Instruments.

    PubMed

    Koenka, Israel Joel; Sáiz, Jorge; Hauser, Peter C

    2015-01-01

    Scientists often need to build dedicated computer-controlled experimental systems. For this purpose, it is becoming common to employ open-source microcontroller platforms, such as the Arduino. These boards and associated integrated software development environments provide affordable yet powerful solutions for the implementation of hardware control of transducers and acquisition of signals from detectors and sensors. It is, however, a challenge to write programs that allow interactive use of such arrangements from a personal computer. This task is particularly complex if some of the included hardware components are connected directly to the computer and not via the microcontroller. A graphical user interface framework, Instrumentino, was therefore developed to allow the creation of control programs for complex systems with minimal programming effort. By writing a single code file, a powerful custom user interface is generated, which enables the automatic running of elaborate operation sequences and observation of acquired experimental data in real time. The framework, which is written in Python, allows extension by users, and is made available as an open source project.

  15. Workload Characterization of a Leadership Class Storage Cluster

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kim, Youngjae; Gunasekaran, Raghul; Shipman, Galen M

    2010-01-01

    Understanding workload characteristics is critical for optimizing and improving the performance of current systems and software, and architecting new storage systems based on observed workload patterns. In this paper, we characterize the scientific workloads of the world s fastest HPC (High Performance Computing) storage cluster, Spider, at the Oak Ridge Leadership Computing Facility (OLCF). Spider provides an aggregate bandwidth of over 240 GB/s with over 10 petabytes of RAID 6 formatted capacity. OLCFs flagship petascale simulation platform, Jaguar, and other large HPC clusters, in total over 250 thousands compute cores, depend on Spider for their I/O needs. We characterize themore » system utilization, the demands of reads and writes, idle time, and the distribution of read requests to write requests for the storage system observed over a period of 6 months. From this study we develop synthesized workloads and we show that the read and write I/O bandwidth usage as well as the inter-arrival time of requests can be modeled as a Pareto distribution.« less

  16. Computer-assisted learning and simulation systems in dentistry--a challenge to society.

    PubMed

    Welk, A; Splieth, Ch; Wierinck, E; Gilpatrick, R O; Meyer, G

    2006-07-01

    Computer technology is increasingly used in practical training at universities. However, in spite of their potential, computer-assisted learning (CAL) and computer-assisted simulation (CAS) systems still appear to be underutilized in dental education. Advantages, challenges, problems, and solutions of computer-assisted learning and simulation in dentistry are discussed by means of MEDLINE, open Internet platform searches, and key results of a study among German dental schools. The advantages of computer-assisted learning are seen for example in self-paced and self-directed learning and increased motivation. It is useful for both objective theoretical and practical tests and for training students to handle complex cases. CAL can lead to more structured learning and can support training in evidence-based decision-making. The reasons for the still relatively rare implementation of CAL/CAS systems in dental education include an inability to finance, lack of studies of CAL/CAS, and too much effort required to integrate CAL/CAS systems into the curriculum. To overcome the reasons for the relative low degree of computer technology use, we should strive for multicenter research and development projects monitored by the appropriate national and international scientific societies, so that the potential of computer technology can be fully realized in graduate, postgraduate, and continuing dental education.

  17. Boutiques: a flexible framework to integrate command-line applications in computing platforms

    PubMed Central

    Glatard, Tristan; Kiar, Gregory; Aumentado-Armstrong, Tristan; Beck, Natacha; Bellec, Pierre; Bernard, Rémi; Bonnet, Axel; Brown, Shawn T; Camarasu-Pop, Sorina; Cervenansky, Frédéric; Das, Samir; Ferreira da Silva, Rafael; Flandin, Guillaume; Girard, Pascal; Gorgolewski, Krzysztof J; Guttmann, Charles R G; Hayot-Sasson, Valérie; Quirion, Pierre-Olivier; Rioux, Pierre; Rousseau, Marc-Étienne; Evans, Alan C

    2018-01-01

    Abstract We present Boutiques, a system to automatically publish, integrate, and execute command-line applications across computational platforms. Boutiques applications are installed through software containers described in a rich and flexible JSON language. A set of core tools facilitates the construction, validation, import, execution, and publishing of applications. Boutiques is currently supported by several distinct virtual research platforms, and it has been used to describe dozens of applications in the neuroinformatics domain. We expect Boutiques to improve the quality of application integration in computational platforms, to reduce redundancy of effort, to contribute to computational reproducibility, and to foster Open Science. PMID:29718199

  18. The contribution of the Geohazards Exploitation Platform for the GEO Supersites community

    NASA Astrophysics Data System (ADS)

    Manunta, Michele; Caumont, Hervé; Mora, Oscar; Casu, Francesco; Zinno, Ivana; De Luca, Claudio; Pepe, Susi; Pepe, Antonio; Brito, Fabrice; Romero, Laia; Stumpf, Andre; Malet, Jean-Philippe; Brcic, Ramon; Rodriguez Gonzalez, Fernando; Musacchio, Massimo; Buongiorno, Fabrizia; Briole, Pierre

    2016-04-01

    The European Space Agency (ESA) initiative for the creation of an ecosystem of Thematic Exploitation Platforms (TEP) focuses on the capitalization of Ground Segment capabilities and ICT technologies to maximize the exploitation of EO data from past and future missions. A TEP refers to a computing platform that complies to a set of exploitation scenarios involving scientific users, data providers and ICT providers, aggregated around an Earth Science thematic area. The Exploitation Platforms are targeted to cover different capacities and they define, implement and validate a platform for effective data exploitation of EO data sources in a given thematic area. In this framework, the Geohazards Thematic Exploitation Platform or Geohazards TEP (GEP) aims at providing on-demand processing services for specific user needs as well as systematic processing services to address the need of the geohazards community for common information layers and, finally, to integrate newly developed processors for scientists and other expert users. The GEP has now on-boarded over 40 European early adopters and will transition during 2016 to pre-operations by developing six new Pilot applications that will significantly augment the Platform's capabilities for systematic production and community building. Each project on the Platform is concerned with either integrating an application, running on demand processing using an application available in the platform or systematically generating a new product collection. The platform will also expand its user base in 2016, to gradually reach a total of about 65 individual users. Under a Consortium lead by Terradue Srl, six new pilot projects have been taken on board: photogrammetric processing using optical EO data with University of Strasbourg (FR), optical based processing method for volcanic hazard monitoring with INGV (IT), systematic generation of deformation time-series with Sentinel-1 data with CNR IREA (IT), systematic processing of Sentinel-1 interferometric imagery with DLR (DE), precise terrain motion mapping with SPN Persistent Scatterers Interferometric chain of Altamira Information (ES) and a campaign to test and exploit GEP applications with the Corinth Rift Laboratory in which Greek and French experts of seismic hazards are engaged. The consortium is in charge of the resources and services management under a sustainable and fair governance model to ensure alignment of the Platform with user community needs, broad collaboration with main data and service providers in the domain, and excellence among user initiatives willing to contribute. In this work we show how the GEO Geohazards Supersites community can fully benefit from availability of an advanced IT infrastructure, where satellite and in-situ data, processing tools and web-based visualization instruments are at the disposal of users to address scientific questions. In particular, we focus on the contributions provided by GEP for the management of EO data, for the implementation of a European e-infrastructure, and for the monitoring and modelling of ground deformations and seismic activity.

  19. Preface: SciDAC 2006

    NASA Astrophysics Data System (ADS)

    Tang, William M., Dr.

    2006-01-01

    The second annual Scientific Discovery through Advanced Computing (SciDAC) Conference was held from June 25-29, 2006 at the new Hyatt Regency Hotel in Denver, Colorado. This conference showcased outstanding SciDAC-sponsored computational science results achieved during the past year across many scientific domains, with an emphasis on science at scale. Exciting computational science that has been accomplished outside of the SciDAC program both nationally and internationally was also featured to help foster communication between SciDAC computational scientists and those funded by other agencies. This was illustrated by many compelling examples of how domain scientists collaborated productively with applied mathematicians and computer scientists to effectively take advantage of terascale computers (capable of performing trillions of calculations per second) not only to accelerate progress in scientific discovery in a variety of fields but also to show great promise for being able to utilize the exciting petascale capabilities in the near future. The SciDAC program was originally conceived as an interdisciplinary computational science program based on the guiding principle that strong collaborative alliances between domain scientists, applied mathematicians, and computer scientists are vital to accelerated progress and associated discovery on the world's most challenging scientific problems. Associated verification and validation are essential in this successful program, which was funded by the US Department of Energy Office of Science (DOE OS) five years ago. As is made clear in many of the papers in these proceedings, SciDAC has fundamentally changed the way that computational science is now carried out in response to the exciting challenge of making the best use of the rapid progress in the emergence of more and more powerful computational platforms. In this regard, Dr. Raymond Orbach, Energy Undersecretary for Science at the DOE and Director of the OS has stated: `SciDAC has strengthened the role of high-end computing in furthering science. It is defining whole new fields for discovery.' (SciDAC Review, Spring 2006, p8). Application domains within the SciDAC 2006 conference agenda encompassed a broad range of science including: (i) the DOE core mission of energy research involving combustion studies relevant to fuel efficiency and pollution issues faced today and magnetic fusion investigations impacting prospects for future energy sources; (ii) fundamental explorations into the building blocks of matter, ranging from quantum chromodynamics - the basic theory that describes how quarks make up the protons and neutrons of all matter - to the design of modern high-energy accelerators; (iii) the formidable challenges of predicting and controlling the behavior of molecules in quantum chemistry and the complex biomolecules determining the evolution of biological systems; (iv) studies of exploding stars for insights into the nature of the universe; and (v) integrated climate modeling to enable realistic analysis of earth's changing climate. Associated research has made it quite clear that advanced computation is often the only means by which timely progress is feasible when dealing with these complex, multi-component physical, chemical, and biological systems operating over huge ranges of temporal and spatial scales. Working with the domain scientists, applied mathematicians and computer scientists have continued to develop the discretizations of the underlying equations and the complementary algorithms to enable improvements in solutions on modern parallel computing platforms as they evolve from the terascale toward the petascale regime. Moreover, the associated tremendous growth of data generated from the terabyte to the petabyte range demands not only the advanced data analysis and visualization methods to harvest the scientific information but also the development of efficient workflow strategies which can deal with the data input/output, management, movement, and storage challenges. If scientific discovery is expected to keep apace with the continuing progression from tera- to petascale platforms, the vital alliance between domain scientists, applied mathematicians, and computer scientists will be even more crucial. During the SciDAC 2006 Conference, some of the future challenges and opportunities in interdisciplinary computational science were emphasized in the Advanced Architectures Panel and by Dr. Victor Reis, Senior Advisor to the Secretary of Energy, who gave a featured presentation on `Simulation, Computation, and the Global Nuclear Energy Partnership.' Overall, the conference provided an excellent opportunity to highlight the rising importance of computational science in the scientific enterprise and to motivate future investment in this area. As Michael Strayer, SciDAC Program Director, has noted: `While SciDAC may have started out as a specific program, Scientific Discovery through Advanced Computing has become a powerful concept for addressing some of the biggest challenges facing our nation and our world.' Looking forward to next year, the SciDAC 2007 Conference will be held from June 24-28 at the Westin Copley Plaza in Boston, Massachusetts. Chairman: David Keyes, Columbia University. The Organizing Committee for the SciDAC 2006 Conference would like to acknowledge the individuals whose talents and efforts were essential to the success of the meeting. Special thanks go to Betsy Riley for her leadership in building the infrastructure support for the conference, for identifying and then obtaining contributions from our corporate sponsors, for coordinating all media communications, and for her efforts in organizing and preparing the conference proceedings for publication; to Tim Jones for handling the hotel scouting, subcontracts, and exhibits and stage production; to Angela Harris for handling supplies, shipping, and tracking, poster sessions set-up, and for her efforts in coordinating and scheduling the promotional activities that took place during the conference; to John Bui and John Smith for their superb wireless networking and A/V set-up and support; to Cindy Latham for Web site design, graphic design, and quality control of proceedings submissions; and to Pamelia Nixon-Hartje of Ambassador for budget and quality control of catering. We are grateful for the highly professional dedicated efforts of all of these individuals, who were the cornerstones of the SciDAC 2006 Conference. Thanks also go to Angela Beach of the ORNL Conference Center for her efforts in executing the contracts with the hotel, Carolyn James of Colorado State for on-site registration supervision, Lora Wolfe and Brittany Hagen for administrative support at ORNL, and Dami Rich and Andrew Sproles for graphic design and production. We are also most grateful to the Oak Ridge National Laboratory, especially Jeff Nichols, and to our corporate sponsors, Data Direct Networks, Cray, IBM, SGI, and Institute of Physics Publishing for their support. We especially express our gratitude to the featured speakers, invited oral speakers, invited poster presenters, session chairs, and advanced architecture panelists and chair for their excellent contributions on behalf of SciDAC 2006. We would like to express our deep appreciation to Lali Chatterjee, Graham Douglas, Margaret Smith, and the production team of Institute of Physics Publishing, who worked tirelessly to publish the final conference proceedings in a timely manner. Finally, heartfelt thanks are extended to Michael Strayer, Associate Director for OASCR and SciDAC Director, and to the DOE program managers associated with SciDAC for their continuing enthusiasm and strong support for the annual SciDAC Conferences as a special venue to showcase the exciting scientific discovery achievements enabled by the interdisciplinary collaborations championed by the SciDAC program.

  20. Analysis of lipid experiments (ALEX): a software framework for analysis of high-resolution shotgun lipidomics data.

    PubMed

    Husen, Peter; Tarasov, Kirill; Katafiasz, Maciej; Sokol, Elena; Vogt, Johannes; Baumgart, Jan; Nitsch, Robert; Ekroos, Kim; Ejsing, Christer S

    2013-01-01

    Global lipidomics analysis across large sample sizes produces high-content datasets that require dedicated software tools supporting lipid identification and quantification, efficient data management and lipidome visualization. Here we present a novel software-based platform for streamlined data processing, management and visualization of shotgun lipidomics data acquired using high-resolution Orbitrap mass spectrometry. The platform features the ALEX framework designed for automated identification and export of lipid species intensity directly from proprietary mass spectral data files, and an auxiliary workflow using database exploration tools for integration of sample information, computation of lipid abundance and lipidome visualization. A key feature of the platform is the organization of lipidomics data in "database table format" which provides the user with an unsurpassed flexibility for rapid lipidome navigation using selected features within the dataset. To demonstrate the efficacy of the platform, we present a comparative neurolipidomics study of cerebellum, hippocampus and somatosensory barrel cortex (S1BF) from wild-type and knockout mice devoid of the putative lipid phosphate phosphatase PRG-1 (plasticity related gene-1). The presented framework is generic, extendable to processing and integration of other lipidomic data structures, can be interfaced with post-processing protocols supporting statistical testing and multivariate analysis, and can serve as an avenue for disseminating lipidomics data within the scientific community. The ALEX software is available at www.msLipidomics.info.

  1. Design of a Data Catalogue for Perdigão-2017 Field Experiment: Establishing the Relevant Parameters, Post-Processing Techniques and Users Access

    NASA Astrophysics Data System (ADS)

    Palma, J. L.; Belo-Pereira, M.; Leo, L. S.; Fernando, J.; Wildmann, N.; Gerz, T.; Rodrigues, C. V.; Lopes, A. S.; Lopes, J. C.

    2017-12-01

    Perdigão is the largest of a series of wind-mapping studies embedded in the on-going NEWA (New European Wind Atlas) Project. The intensive observational period of the Perdigão field experiment resulted in an unprecedented volume of data, covering several wind conditions through 46 consecutive days between May and June 2017. For researchers looking into specific events, it is time consuming to scrutinise the datasets looking for appropriate conditions. Such task becomes harder if the parameters of interest were not measured directly, instead requiring their computation from the raw datasets. This work will present the e-Science platform developed by University of Porto for the Perdigao dataset. The platform will assist scientists of Perdigao and the larger scientific community in extrapolating the datasets associated to specific flow regimes of interest as well as automatically performing post-processing/filtering operations internally in the platform. We will illustrate the flow regime categories identified in Perdigao based on several parameters such as weather type classification, cloud characteristics, as well as stability regime indicators (Brunt-Väisälä frequency, Scorer parameter, potential temperature inversion heights, dimensionless Richardson and Froude numbers) and wind regime indicators. Examples of some of the post-processing techniques available in the e-Science platform, such as the Savitzky-Golay low-pass filtering technique, will be also presented.

  2. Satellite Cloud and Radiative Property Processing and Distribution System on the NASA Langley ASDC OpenStack and OpenShift Cloud Platform

    NASA Astrophysics Data System (ADS)

    Nguyen, L.; Chee, T.; Palikonda, R.; Smith, W. L., Jr.; Bedka, K. M.; Spangenberg, D.; Vakhnin, A.; Lutz, N. E.; Walter, J.; Kusterer, J.

    2017-12-01

    Cloud Computing offers new opportunities for large-scale scientific data producers to utilize Infrastructure-as-a-Service (IaaS) and Platform-as-a-Service (PaaS) IT resources to process and deliver data products in an operational environment where timely delivery, reliability, and availability are critical. The NASA Langley Research Center Atmospheric Science Data Center (ASDC) is building and testing a private and public facing cloud for users in the Science Directorate to utilize as an everyday production environment. The NASA SatCORPS (Satellite ClOud and Radiation Property Retrieval System) team processes and derives near real-time (NRT) global cloud products from operational geostationary (GEO) satellite imager datasets. To deliver these products, we will utilize the public facing cloud and OpenShift to deploy a load-balanced webserver for data storage, access, and dissemination. The OpenStack private cloud will host data ingest and computational capabilities for SatCORPS processing. This paper will discuss the SatCORPS migration towards, and usage of, the ASDC Cloud Services in an operational environment. Detailed lessons learned from use of prior cloud providers, specifically the Amazon Web Services (AWS) GovCloud and the Government Cloud administered by the Langley Managed Cloud Environment (LMCE) will also be discussed.

  3. The BioExtract Server: a web-based bioinformatic workflow platform

    PubMed Central

    Lushbough, Carol M.; Jennewein, Douglas M.; Brendel, Volker P.

    2011-01-01

    The BioExtract Server (bioextract.org) is an open, web-based system designed to aid researchers in the analysis of genomic data by providing a platform for the creation of bioinformatic workflows. Scientific workflows are created within the system by recording tasks performed by the user. These tasks may include querying multiple, distributed data sources, saving query results as searchable data extracts, and executing local and web-accessible analytic tools. The series of recorded tasks can then be saved as a reproducible, sharable workflow available for subsequent execution with the original or modified inputs and parameter settings. Integrated data resources include interfaces to the National Center for Biotechnology Information (NCBI) nucleotide and protein databases, the European Molecular Biology Laboratory (EMBL-Bank) non-redundant nucleotide database, the Universal Protein Resource (UniProt), and the UniProt Reference Clusters (UniRef) database. The system offers access to numerous preinstalled, curated analytic tools and also provides researchers with the option of selecting computational tools from a large list of web services including the European Molecular Biology Open Software Suite (EMBOSS), BioMoby, and the Kyoto Encyclopedia of Genes and Genomes (KEGG). The system further allows users to integrate local command line tools residing on their own computers through a client-side Java applet. PMID:21546552

  4. The Image Data Resource: A Bioimage Data Integration and Publication Platform.

    PubMed

    Williams, Eleanor; Moore, Josh; Li, Simon W; Rustici, Gabriella; Tarkowska, Aleksandra; Chessel, Anatole; Leo, Simone; Antal, Bálint; Ferguson, Richard K; Sarkans, Ugis; Brazma, Alvis; Salas, Rafael E Carazo; Swedlow, Jason R

    2017-08-01

    Access to primary research data is vital for the advancement of science. To extend the data types supported by community repositories, we built a prototype Image Data Resource (IDR) that collects and integrates imaging data acquired across many different imaging modalities. IDR links data from several imaging modalities, including high-content screening, super-resolution and time-lapse microscopy, digital pathology, public genetic or chemical databases, and cell and tissue phenotypes expressed using controlled ontologies. Using this integration, IDR facilitates the analysis of gene networks and reveals functional interactions that are inaccessible to individual studies. To enable re-analysis, we also established a computational resource based on Jupyter notebooks that allows remote access to the entire IDR. IDR is also an open source platform that others can use to publish their own image data. Thus IDR provides both a novel on-line resource and a software infrastructure that promotes and extends publication and re-analysis of scientific image data.

  5. 76 FR 41234 - Advanced Scientific Computing Advisory Committee Charter Renewal

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-07-13

    ... Secretariat, General Services Administration, notice is hereby given that the Advanced Scientific Computing... advice and recommendations concerning the Advanced Scientific Computing program in response only to... Advanced Scientific Computing Research program and recommendations based thereon; --Advice on the computing...

  6. Moving Virtual Research Environments from high maintenance Stovepipes to Multi-purpose Sustainable Service-oriented Science Platforms

    NASA Astrophysics Data System (ADS)

    Klump, Jens; Fraser, Ryan; Wyborn, Lesley; Friedrich, Carsten; Squire, Geoffrey; Barker, Michelle; Moloney, Glenn

    2017-04-01

    The researcher of today is likely to be part of a team distributed over multiple sites that will access data from an external repository and then process the data on a public or private cloud or even on a large centralised supercomputer. They are increasingly likely to use a mixture of their own code, third party software and libraries, or even access global community codes. These components will be connected into a Virtual Research Environments (VREs) that will enable members of the research team who are not co-located to actively work together at various scales to share data, models, tools, software, workflows, best practices, infrastructures, etc. Many VRE's are built in isolation: designed to meet a specific research program with components tightly coupled and not capable of being repurposed for other use cases - they are becoming 'stovepipes'. The limited number of users of some VREs also means that the cost of maintenance per researcher can be unacceptably high. The alternative is to develop service-oriented Science Platforms that enable multiple communities to develop specialised solutions for specific research programs. The platforms can offer access to data, software tools and processing infrastructures (cloud, supercomputers) through globally distributed, interconnected modules. In Australia, the Virtual Geophysics Laboratory (VGL) was initially built to enable a specific set of researchers in government agencies access to specific data sets and a limited number of tools, that is now rapidly evolving into a multi-purpose Earth science platform with access to an increased variety of data, a broader range of tools, users from more sectors and a diversity of computational infrastructures. The expansion has been relatively easy, because of the architecture whereby data, tools and compute resources are loosely coupled via interfaces that are built on international standards and accessed as services wherever possible. In recent years, investments in discoverability and accessibility of data via online services in Australia mean that data resources can be easily added to the virtual environments as and when required. Another key to increasing to reusability and uptake of the VRE is the capability to capturing workflows so that they can be reused and repurposed both within and beyond the community that that defined the original use case. Unfortunately, Software-as-a-Service in the research sector is not yet mature. In response, we developed a Scientific Software solutions Center (SSSC) that enables researchers to discover, deploy and then share computational codes, code snippets or processes both in a human and machine-readable manner. Growth has come not only from within the Earth science community but from the Australian Virtual Laboratory community which is building VREs for a diversity of communities such as astronomy, genomics, environment, humanities, climate etc. Components such as access control, provenance, visualisation, accounting etc. are common to all scientific domains and sharing of these across multiple domains reduces costs, but more importantly increases the ability to undertake interdisciplinary science. These efforts are transitioning VREs to more sustainable Service-oriented Science Platforms that can be delivered in an agile, adaptable manner for broader community interests.

  7. Formal design and verification of a reliable computing platform for real-time control. Phase 1: Results

    NASA Technical Reports Server (NTRS)

    Divito, Ben L.; Butler, Ricky W.; Caldwell, James L.

    1990-01-01

    A high-level design is presented for a reliable computing platform for real-time control applications. Design tradeoffs and analyses related to the development of the fault-tolerant computing platform are discussed. The architecture is formalized and shown to satisfy a key correctness property. The reliable computing platform uses replicated processors and majority voting to achieve fault tolerance. Under the assumption of a majority of processors working in each frame, it is shown that the replicated system computes the same results as a single processor system not subject to failures. Sufficient conditions are obtained to establish that the replicated system recovers from transient faults within a bounded amount of time. Three different voting schemes are examined and proved to satisfy the bounded recovery time conditions.

  8. Quantum game theory and open access publishing

    NASA Astrophysics Data System (ADS)

    Hanauske, Matthias; Bernius, Steffen; Dugall, Berndt

    2007-08-01

    The digital revolution of the information age and in particular the sweeping changes of scientific communication brought about by computing and novel communication technology, potentiate global, high grade scientific information for free. The arXiv, for example, is the leading scientific communication platform, mainly for mathematics and physics, where everyone in the world has free access on. While in some scientific disciplines the open access way is successfully realized, other disciplines (e.g. humanities and social sciences) dwell on the traditional path, even though many scientists belonging to these communities approve the open access principle. In this paper we try to explain these different publication patterns by using a game theoretical approach. Based on the assumption, that the main goal of scientists is the maximization of their reputation, we model different possible game settings, namely a zero sum game, the prisoners’ dilemma case and a version of the stag hunt game, that show the dilemma of scientists belonging to “non-open access communities”. From an individual perspective, they have no incentive to deviate from the Nash equilibrium of traditional publishing. By extending the model using the quantum game theory approach it can be shown, that if the strength of entanglement exceeds a certain value, the scientists will overcome the dilemma and terminate to publish only traditionally in all three settings.

  9. User definition and mission requirements for unmanned airborne platforms, revised

    NASA Technical Reports Server (NTRS)

    Kuhner, M. B.; Mcdowell, J. R.

    1979-01-01

    The airborne measurement requirements of the scientific and applications experiment user community were assessed with respect to the suitability of proposed strawman airborne platforms. These platforms provide a spectrum of measurement capabilities supporting associated mission tradeoffs such as payload weight, operating altitude, range, duration, flight profile control, deployment flexibility, quick response, and recoverability. The results of the survey are used to examine whether the development of platforms is warranted and to determine platform system requirements as well as research and technology needs.

  10. The Swiss Data Science Center on a mission to empower reproducible, traceable and reusable science

    NASA Astrophysics Data System (ADS)

    Schymanski, Stanislaus; Bouillet, Eric; Verscheure, Olivier

    2017-04-01

    Our abilities to collect, store and analyse scientific data have sky-rocketed in the past decades, but at the same time, a disconnect between data scientists, domain experts and data providers has begun to emerge. Data scientists are developing more and more powerful algorithms for data mining and analysis, while data providers are making more and more data publicly available, and yet many, if not most, discoveries are based on specific data and/or algorithms that "are available from the authors upon request". In the strong belief that scientific progress would be much faster if reproduction and re-use of such data and algorithms was made easier, the Swiss Data Science Center (SDSC) has committed to provide an open framework for the handling and tracking of scientific data and algorithms, from raw data and first principle equations to final data products and visualisations, modular simulation models and benchmark evaluation algorithms. Led jointly by EPFL and ETH Zurich, the SDSC is composed of a distributed multi-disciplinary team of data scientists and experts in select domains. The center aims to federate data providers, data and computer scientists, and subject-matter experts around a cutting-edge analytics platform offering user-friendly tooling and services to help with the adoption of Open Science, fostering research productivity and excellence. In this presentation, we will discuss our vision of a high-scalable open but secure community-based platform for sharing, accessing, exploring, and analyzing scientific data in easily reproducible workflows, augmented by automated provenance and impact tracking, knowledge graphs, fine-grained access right and digital right management, and a variety of domain-specific software tools. For maximum interoperability, transparency and ease of use, we plan to utilize notebook interfaces wherever possible, such as Apache Zeppelin and Jupyter. Feedback and suggestions from the audience will be gratefully considered.

  11. An Interview with Matthew P. Greving, PhD. Interview by Vicki Glaser.

    PubMed

    Greving, Matthew P

    2011-10-01

    Matthew P. Greving is Chief Scientific Officer at Nextval Inc., a company founded in early 2010 that has developed a discovery platform called MassInsight™.. He received his PhD in Biochemistry from Arizona State University, and prior to that he spent nearly 7 years working as a software engineer. This experience in solving complex computational problems fueled his interest in developing technologies and algorithms related to acquisition and analysis of high-dimensional biochemical data. To address the existing problems associated with label-based microarray readouts, he beganwork on a technique for label-free mass spectrometry (MS) microarray readout compatible with both matrix-assisted laser/desorption ionization (MALDI) and matrix-free nanostructure initiator mass spectrometry (NIMS). This is the core of Nextval’s MassInsight technology, which utilizes picoliter noncontact deposition of high-density arrays on mass-readout substrates along with computational algorithms for high-dimensional data processingand reduction.

  12. Asian consortium on computational materials science theme meeting on ;first principles analysis & experiment: Role in energy research; 22-24 september 2016, SRM University, Kattankulathur, Chennai, India (ACCMS-TM 2016)

    NASA Astrophysics Data System (ADS)

    Thapa, Ranjit; Kawazoe, Yoshiyuki

    2017-10-01

    The main objective of this meeting was to provide a platform for theoreticians and experimentalists working in the area of materials to come together and carry out cutting edge research in the field of energy by showcasing their ideas and innovations. The theme meeting was successful in attracting young researchers from both fields, sharing common research interests. Participation of more than 250 researchers in ACCMS-TM 2016 has successfully paved the way towards exchange of mutual research insights and establishment of promising research collaborations. To encourage the young participants' research efforts, three best posters, each named as ;KAWAZOE PRIZE; in theoretical category and two best posters named ;ACCMS-TM 2016 POSTER AWARD; for experimental contributions was selected. A new award named ;ACCMS MID-CAREER AWARD; for outstanding scientific contribution in the area of Computational Materials Science was constituted.

  13. ePMV embeds molecular modeling into professional animation software environments.

    PubMed

    Johnson, Graham T; Autin, Ludovic; Goodsell, David S; Sanner, Michel F; Olson, Arthur J

    2011-03-09

    Increasingly complex research has made it more difficult to prepare data for publication, education, and outreach. Many scientists must also wade through black-box code to interface computational algorithms from diverse sources to supplement their bench work. To reduce these barriers we have developed an open-source plug-in, embedded Python Molecular Viewer (ePMV), that runs molecular modeling software directly inside of professional 3D animation applications (hosts) to provide simultaneous access to the capabilities of these newly connected systems. Uniting host and scientific algorithms into a single interface allows users from varied backgrounds to assemble professional quality visuals and to perform computational experiments with relative ease. By enabling easy exchange of algorithms, ePMV can facilitate interdisciplinary research, smooth communication between broadly diverse specialties, and provide a common platform to frame and visualize the increasingly detailed intersection(s) of cellular and molecular biology. Copyright © 2011 Elsevier Ltd. All rights reserved.

  14. ePMV Embeds Molecular Modeling into Professional Animation Software Environments

    PubMed Central

    Johnson, Graham T.; Autin, Ludovic; Goodsell, David S.; Sanner, Michel F.; Olson, Arthur J.

    2011-01-01

    SUMMARY Increasingly complex research has made it more difficult to prepare data for publication, education, and outreach. Many scientists must also wade through black-box code to interface computational algorithms from diverse sources to supplement their bench work. To reduce these barriers, we have developed an open-source plug-in, embedded Python Molecular Viewer (ePMV), that runs molecular modeling software directly inside of professional 3D animation applications (hosts) to provide simultaneous access to the capabilities of these newly connected systems. Uniting host and scientific algorithms into a single interface allows users from varied backgrounds to assemble professional quality visuals and to perform computational experiments with relative ease. By enabling easy exchange of algorithms, ePMV can facilitate interdisciplinary research, smooth communication between broadly diverse specialties and provide a common platform to frame and visualize the increasingly detailed intersection(s) of cellular and molecular biology. PMID:21397181

  15. Open source software and low cost sensors for teaching UAV science

    NASA Astrophysics Data System (ADS)

    Kefauver, S. C.; Sanchez-Bragado, R.; El-Haddad, G.; Araus, J. L.

    2016-12-01

    Drones, also known as UASs (unmanned aerial systems), UAVs (Unmanned Aerial Vehicles) or RPAS (Remotely piloted aircraft systems), are both useful advanced scientific platforms and recreational toys that are appealing to younger generations. As such, they can make for excellent education tools as well as low-cost scientific research project alternatives. However, the process of taking pretty pictures to remote sensing science can be daunting if one is presented with only expensive software and sensor options. There are a number of open-source tools and low cost platform and sensor options available that can provide excellent scientific research results, and, by often requiring more user-involvement than commercial software and sensors, provide even greater educational benefits. Scale-invariant feature transform (SIFT) algorithm implementations, such as the Microsoft Image Composite Editor (ICE), which can create quality 2D image mosaics with some motion and terrain adjustments and VisualSFM (Structure from Motion), which can provide full image mosaicking with movement and orthorectification capacities. RGB image quantification using alternate color space transforms, such as the BreedPix indices, can be calculated via plugins in the open-source software Fiji (http://fiji.sc/Fiji; http://github.com/george-haddad/CIMMYT). Recent analyses of aerial images from UAVs over different vegetation types and environments have shown RGB metrics can outperform more costly commercial sensors. Specifically, Hue-based pixel counts, the Triangle Greenness Index (TGI), and the Normalized Green Red Difference Index (NGRDI) consistently outperformed NDVI in estimating abiotic and biotic stress impacts on crop health. Also, simple kits are available for NDVI camera conversions. Furthermore, suggestions for multivariate analyses of the different RGB indices in the "R program for statistical computing", such as classification and regression trees can allow for a more approachable interpretation of results in the classroom.

  16. An automated and integrated framework for dust storm detection based on ogc web processing services

    NASA Astrophysics Data System (ADS)

    Xiao, F.; Shea, G. Y. K.; Wong, M. S.; Campbell, J.

    2014-11-01

    Dust storms are known to have adverse effects on public health. Atmospheric dust loading is also one of the major uncertainties in global climatic modelling as it is known to have a significant impact on the radiation budget and atmospheric stability. The complexity of building scientific dust storm models is coupled with the scientific computation advancement, ongoing computing platform development, and the development of heterogeneous Earth Observation (EO) networks. It is a challenging task to develop an integrated and automated scheme for dust storm detection that combines Geo-Processing frameworks, scientific models and EO data together to enable the dust storm detection and tracking processes in a dynamic and timely manner. This study develops an automated and integrated framework for dust storm detection and tracking based on the Web Processing Services (WPS) initiated by Open Geospatial Consortium (OGC). The presented WPS framework consists of EO data retrieval components, dust storm detecting and tracking component, and service chain orchestration engine. The EO data processing component is implemented based on OPeNDAP standard. The dust storm detecting and tracking component combines three earth scientific models, which are SBDART model (for computing aerosol optical depth (AOT) of dust particles), WRF model (for simulating meteorological parameters) and HYSPLIT model (for simulating the dust storm transport processes). The service chain orchestration engine is implemented based on Business Process Execution Language for Web Service (BPEL4WS) using open-source software. The output results, including horizontal and vertical AOT distribution of dust particles as well as their transport paths, were represented using KML/XML and displayed in Google Earth. A serious dust storm, which occurred over East Asia from 26 to 28 Apr 2012, is used to test the applicability of the proposed WPS framework. Our aim here is to solve a specific instance of a complex EO data and scientific model integration problem by using a framework and scientific workflow approach together. The experimental result shows that this newly automated and integrated framework can be used to give advance near real-time warning of dust storms, for both environmental authorities and public. The methods presented in this paper might be also generalized to other types of Earth system models, leading to improved ease of use and flexibility.

  17. 76 FR 31945 - Advanced Scientific Computing Advisory Committee

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-06-02

    ... DEPARTMENT OF ENERGY Advanced Scientific Computing Advisory Committee AGENCY: Department of Energy... teleconference meeting of the Advanced Scientific Computing Advisory Committee (ASCAC). The Federal [email protected] . FOR FURTHER INFORMATION CONTACT: Melea Baker, Office of Advanced Scientific Computing...

  18. 75 FR 9887 - Advanced Scientific Computing Advisory Committee

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-03-04

    ... DEPARTMENT OF ENERGY Advanced Scientific Computing Advisory Committee AGENCY: Department of Energy... Advanced Scientific Computing Advisory Committee (ASCAC). Federal Advisory Committee Act (Pub. L. 92-463... INFORMATION CONTACT: Melea Baker, Office of Advanced Scientific Computing Research; SC-21/Germantown Building...

  19. 76 FR 9765 - Advanced Scientific Computing Advisory Committee

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-02-22

    ... DEPARTMENT OF ENERGY Advanced Scientific Computing Advisory Committee AGENCY: Office of Science... Advanced Scientific Computing Advisory Committee (ASCAC). The Federal Advisory Committee Act (Pub. L. 92... INFORMATION CONTACT: Melea Baker, Office of Advanced Scientific Computing Research, SC-21/Germantown Building...

  20. 77 FR 45345 - DOE/Advanced Scientific Computing Advisory Committee

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-07-31

    ... Recompetition results for Scientific Discovery through Advanced Computing (SciDAC) applications Co-design Public... DEPARTMENT OF ENERGY DOE/Advanced Scientific Computing Advisory Committee AGENCY: Office of... the Advanced Scientific Computing Advisory Committee (ASCAC). The Federal Advisory Committee Act (Pub...

  1. 75 FR 64720 - DOE/Advanced Scientific Computing Advisory Committee

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-10-20

    ... DEPARTMENT OF ENERGY DOE/Advanced Scientific Computing Advisory Committee AGENCY: Department of... the Advanced Scientific Computing Advisory Committee (ASCAC). Federal Advisory Committee Act (Pub. L.... FOR FURTHER INFORMATION CONTACT: Melea Baker, Office of Advanced Scientific Computing Research; SC-21...

  2. Computing through Scientific Abstractions in SysBioPS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chin, George; Stephan, Eric G.; Gracio, Deborah K.

    2004-10-13

    Today, biologists and bioinformaticists have a tremendous amount of computational power at their disposal. With the availability of supercomputers, burgeoning scientific databases and digital libraries such as GenBank and PubMed, and pervasive computational environments such as the Grid, biologists have access to a wealth of computational capabilities and scientific data at hand. Yet, the rapid development of computational technologies has far exceeded the typical biologist’s ability to effectively apply the technology in their research. Computational sciences research and development efforts such as the Biology Workbench, BioSPICE (Biological Simulation Program for Intra-Cellular Evaluation), and BioCoRE (Biological Collaborative Research Environment) are importantmore » in connecting biologists and their scientific problems to computational infrastructures. On the Computational Cell Environment and Heuristic Entity-Relationship Building Environment projects at the Pacific Northwest National Laboratory, we are jointly developing a new breed of scientific problem solving environment called SysBioPSE that will allow biologists to access and apply computational resources in the scientific research context. In contrast to other computational science environments, SysBioPSE operates as an abstraction layer above a computational infrastructure. The goal of SysBioPSE is to allow biologists to apply computational resources in the context of the scientific problems they are addressing and the scientific perspectives from which they conduct their research. More specifically, SysBioPSE allows biologists to capture and represent scientific concepts and theories and experimental processes, and to link these views to scientific applications, data repositories, and computer systems.« less

  3. Development of a computer model to predict platform station keeping requirements in the Gulf of Mexico using remote sensing data

    NASA Technical Reports Server (NTRS)

    Barber, Bryan; Kahn, Laura; Wong, David

    1990-01-01

    Offshore operations such as oil drilling and radar monitoring require semisubmersible platforms to remain stationary at specific locations in the Gulf of Mexico. Ocean currents, wind, and waves in the Gulf of Mexico tend to move platforms away from their desired locations. A computer model was created to predict the station keeping requirements of a platform. The computer simulation uses remote sensing data from satellites and buoys as input. A background of the project, alternate approaches to the project, and the details of the simulation are presented.

  4. Traffic information computing platform for big data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Duan, Zongtao, E-mail: ztduan@chd.edu.cn; Li, Ying, E-mail: ztduan@chd.edu.cn; Zheng, Xibin, E-mail: ztduan@chd.edu.cn

    Big data environment create data conditions for improving the quality of traffic information service. The target of this article is to construct a traffic information computing platform for big data environment. Through in-depth analysis the connotation and technology characteristics of big data and traffic information service, a distributed traffic atomic information computing platform architecture is proposed. Under the big data environment, this type of traffic atomic information computing architecture helps to guarantee the traffic safety and efficient operation, more intelligent and personalized traffic information service can be used for the traffic information users.

  5. Selling science 2.0: What scientific projects receive crowdfunding online?

    PubMed

    Schäfer, Mike S; Metag, Julia; Feustle, Jessica; Herzog, Livia

    2016-09-19

    Crowdfunding has emerged as an additional source for financing research in recent years. The study at hand identifies and tests explanatory factors influencing the success of scientific crowdfunding projects by drawing on news value theory, the "reputation signaling" approach, and economic theories of online payment. A standardized content analysis of 371 projects on English- and German-language platforms reveals that each theory provides factors influencing crowdfunding success. It shows that projects presented on science-only crowdfunding platforms have a higher success rate. At the same time, projects are more likely to be successful if their presentation includes visualizations and humor, the lower their targeted funding is, the less personal data potential donors have to relinquish and the more interaction between researchers and donors is possible. This suggests that after donors decide to visit a scientific crowdfunding platform, factors unrelated to science matter more for subsequent funding decisions, raising questions about the potential and implications of crowdfunding science. © The Author(s) 2016.

  6. Back to the future: virtualization of the computing environment at the W. M. Keck Observatory

    NASA Astrophysics Data System (ADS)

    McCann, Kevin L.; Birch, Denny A.; Holt, Jennifer M.; Randolph, William B.; Ward, Josephine A.

    2014-07-01

    Over its two decades of science operations, the W.M. Keck Observatory computing environment has evolved to contain a distributed hybrid mix of hundreds of servers, desktops and laptops of multiple different hardware platforms, O/S versions and vintages. Supporting the growing computing capabilities to meet the observatory's diverse, evolving computing demands within fixed budget constraints, presents many challenges. This paper describes the significant role that virtualization is playing in addressing these challenges while improving the level and quality of service as well as realizing significant savings across many cost areas. Starting in December 2012, the observatory embarked on an ambitious plan to incrementally test and deploy a migration to virtualized platforms to address a broad range of specific opportunities. Implementation to date has been surprisingly glitch free, progressing well and yielding tangible benefits much faster than many expected. We describe here the general approach, starting with the initial identification of some low hanging fruit which also provided opportunity to gain experience and build confidence among both the implementation team and the user community. We describe the range of challenges, opportunities and cost savings potential. Very significant among these was the substantial power savings which resulted in strong broad support for moving forward. We go on to describe the phasing plan, the evolving scalable architecture, some of the specific technical choices, as well as some of the individual technical issues encountered along the way. The phased implementation spans Windows and Unix servers for scientific, engineering and business operations, virtualized desktops for typical office users as well as more the more demanding graphics intensive CAD users. Other areas discussed in this paper include staff training, load balancing, redundancy, scalability, remote access, disaster readiness and recovery.

  7. Google Earth Engine

    NASA Astrophysics Data System (ADS)

    Gorelick, Noel

    2013-04-01

    The Google Earth Engine platform is a system designed to enable petabyte-scale, scientific analysis and visualization of geospatial datasets. Earth Engine provides a consolidated environment including a massive data catalog co-located with thousands of computers for analysis. The user-friendly front-end provides a workbench environment to allow interactive data and algorithm development and exploration and provides a convenient mechanism for scientists to share data, visualizations and analytic algorithms via URLs. The Earth Engine data catalog contains a wide variety of popular, curated datasets, including the world's largest online collection of Landsat scenes (> 2.0M), numerous MODIS collections, and many vector-based data sets. The platform provides a uniform access mechanism to a variety of data types, independent of their bands, projection, bit-depth, resolution, etc..., facilitating easy multi-sensor analysis. Additionally, a user is able to add and curate their own data and collections. Using a just-in-time, distributed computation model, Earth Engine can rapidly process enormous quantities of geo-spatial data. All computation is performed lazily; nothing is computed until it's required either for output or as input to another step. This model allows real-time feedback and preview during algorithm development, supporting a rapid algorithm development, test, and improvement cycle that scales seamlessly to large-scale production data processing. Through integration with a variety of other services, Earth Engine is able to bring to bear considerable analytic and technical firepower in a transparent fashion, including: AI-based classification via integration with Google's machine learning infrastructure, publishing and distribution at Google scale through integration with the Google Maps API, Maps Engine and Google Earth, and support for in-the-field activities such as validation, ground-truthing, crowd-sourcing and citizen science though the Android Open Data Kit.

  8. Google Earth Engine

    NASA Astrophysics Data System (ADS)

    Gorelick, N.

    2012-12-01

    The Google Earth Engine platform is a system designed to enable petabyte-scale, scientific analysis and visualization of geospatial datasets. Earth Engine provides a consolidated environment including a massive data catalog co-located with thousands of computers for analysis. The user-friendly front-end provides a workbench environment to allow interactive data and algorithm development and exploration and provides a convenient mechanism for scientists to share data, visualizations and analytic algorithms via URLs. The Earth Engine data catalog contains a wide variety of popular, curated datasets, including the world's largest online collection of Landsat scenes (> 2.0M), numerous MODIS collections, and many vector-based data sets. The platform provides a uniform access mechanism to a variety of data types, independent of their bands, projection, bit-depth, resolution, etc..., facilitating easy multi-sensor analysis. Additionally, a user is able to add and curate their own data and collections. Using a just-in-time, distributed computation model, Earth Engine can rapidly process enormous quantities of geo-spatial data. All computation is performed lazily; nothing is computed until it's required either for output or as input to another step. This model allows real-time feedback and preview during algorithm development, supporting a rapid algorithm development, test, and improvement cycle that scales seamlessly to large-scale production data processing. Through integration with a variety of other services, Earth Engine is able to bring to bear considerable analytic and technical firepower in a transparent fashion, including: AI-based classification via integration with Google's machine learning infrastructure, publishing and distribution at Google scale through integration with the Google Maps API, Maps Engine and Google Earth, and support for in-the-field activities such as validation, ground-truthing, crowd-sourcing and citizen science though the Android Open Data Kit.

  9. 75 FR 43518 - Advanced Scientific Computing Advisory Committee; Meeting

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-07-26

    ... DEPARTMENT OF ENERGY Advanced Scientific Computing Advisory Committee; Meeting AGENCY: Office of... Scientific Computing Advisory Committee (ASCAC). Federal Advisory Committee Act (Pub. L. 92-463, 86 Stat. 770...: Melea Baker, Office of Advanced Scientific Computing Research; SC-21/Germantown Building; U. S...

  10. Open source acceleration of wave optics simulations on energy efficient high-performance computing platforms

    NASA Astrophysics Data System (ADS)

    Beck, Jeffrey; Bos, Jeremy P.

    2017-05-01

    We compare several modifications to the open-source wave optics package, WavePy, intended to improve execution time. Specifically, we compare the relative performance of the Intel MKL, a CPU based OpenCV distribution, and GPU-based version. Performance is compared between distributions both on the same compute platform and between a fully-featured computing workstation and the NVIDIA Jetson TX1 platform. Comparisons are drawn in terms of both execution time and power consumption. We have found that substituting the Fast Fourier Transform operation from OpenCV provides a marked improvement on all platforms. In addition, we show that embedded platforms offer some possibility for extensive improvement in terms of efficiency compared to a fully featured workstation.

  11. qPortal: A platform for data-driven biomedical research.

    PubMed

    Mohr, Christopher; Friedrich, Andreas; Wojnar, David; Kenar, Erhan; Polatkan, Aydin Can; Codrea, Marius Cosmin; Czemmel, Stefan; Kohlbacher, Oliver; Nahnsen, Sven

    2018-01-01

    Modern biomedical research aims at drawing biological conclusions from large, highly complex biological datasets. It has become common practice to make extensive use of high-throughput technologies that produce big amounts of heterogeneous data. In addition to the ever-improving accuracy, methods are getting faster and cheaper, resulting in a steadily increasing need for scalable data management and easily accessible means of analysis. We present qPortal, a platform providing users with an intuitive way to manage and analyze quantitative biological data. The backend leverages a variety of concepts and technologies, such as relational databases, data stores, data models and means of data transfer, as well as front-end solutions to give users access to data management and easy-to-use analysis options. Users are empowered to conduct their experiments from the experimental design to the visualization of their results through the platform. Here, we illustrate the feature-rich portal by simulating a biomedical study based on publically available data. We demonstrate the software's strength in supporting the entire project life cycle. The software supports the project design and registration, empowers users to do all-digital project management and finally provides means to perform analysis. We compare our approach to Galaxy, one of the most widely used scientific workflow and analysis platforms in computational biology. Application of both systems to a small case study shows the differences between a data-driven approach (qPortal) and a workflow-driven approach (Galaxy). qPortal, a one-stop-shop solution for biomedical projects offers up-to-date analysis pipelines, quality control workflows, and visualization tools. Through intensive user interactions, appropriate data models have been developed. These models build the foundation of our biological data management system and provide possibilities to annotate data, query metadata for statistics and future re-analysis on high-performance computing systems via coupling of workflow management systems. Integration of project and data management as well as workflow resources in one place present clear advantages over existing solutions.

  12. Interactive mission planning for a Space Shuttle flight experiment - A case history

    NASA Technical Reports Server (NTRS)

    Harris, H. M.

    1986-01-01

    Scientific experiments which use the Space Shuttle as a platform require the development of new operations techniques for the command and control of the instrument. Principal among these is the ability to simulate the complex maneuvers of the orbiter's path realistically. Computer generated graphics provide a window into the actual and predicted performance of the instrument and allow sophisticated control of the instrument under varying conditions. In October of 1984 the Shuttle carried a synthetic aperture radar built by JPL for the purpose of recording images of the earth surface. The mission deviated from planned operation in almost every conceivable way and provided an exacting test bed for concepts of interactive mission planning.

  13. Prototyping Instruments for Chemical Laboratory Using Inexpensive Electronic Modules.

    PubMed

    Urban, Pawel L

    2018-05-15

    Open-source electronics and programming can augment chemical and biomedical research. Currently, chemists can choose from a broad range of low-cost universal electronic modules (microcontroller boards and single-board computers) and use them to assemble working prototypes of scientific tools to address specific experimental problems and to support daily research work. The learning time can be as short as a few hours, and the required budget is often as low as 50 USD. Prototyping instruments using low-cost electronic modules gives chemists enormous flexibility to design and construct customized instrumentation, which can reduce the delays caused by limited access to high-end commercial platforms. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  14. Large-scale deep learning for robotically gathered imagery for science

    NASA Astrophysics Data System (ADS)

    Skinner, K.; Johnson-Roberson, M.; Li, J.; Iscar, E.

    2016-12-01

    With the explosion of computing power, the intelligence and capability of mobile robotics has dramatically increased over the last two decades. Today, we can deploy autonomous robots to achieve observations in a variety of environments ripe for scientific exploration. These platforms are capable of gathering a volume of data previously unimaginable. Additionally, optical cameras, driven by mobile phones and consumer photography, have rapidly improved in size, power consumption, and quality making their deployment cheaper and easier. Finally, in parallel we have seen the rise of large-scale machine learning approaches, particularly deep neural networks (DNNs), increasing the quality of the semantic understanding that can be automatically extracted from optical imagery. In concert this enables new science using a combination of machine learning and robotics. This work will discuss the application of new low-cost high-performance computing approaches and the associated software frameworks to enable scientists to rapidly extract useful science data from millions of robotically gathered images. The automated analysis of imagery on this scale opens up new avenues of inquiry unavailable using more traditional manual or semi-automated approaches. We will use a large archive of millions of benthic images gathered with an autonomous underwater vehicle to demonstrate how these tools enable new scientific questions to be posed.

  15. Causal biological network database: a comprehensive platform of causal biological network models focused on the pulmonary and vascular systems

    PubMed Central

    Boué, Stéphanie; Talikka, Marja; Westra, Jurjen Willem; Hayes, William; Di Fabio, Anselmo; Park, Jennifer; Schlage, Walter K.; Sewer, Alain; Fields, Brett; Ansari, Sam; Martin, Florian; Veljkovic, Emilija; Kenney, Renee; Peitsch, Manuel C.; Hoeng, Julia

    2015-01-01

    With the wealth of publications and data available, powerful and transparent computational approaches are required to represent measured data and scientific knowledge in a computable and searchable format. We developed a set of biological network models, scripted in the Biological Expression Language, that reflect causal signaling pathways across a wide range of biological processes, including cell fate, cell stress, cell proliferation, inflammation, tissue repair and angiogenesis in the pulmonary and cardiovascular context. This comprehensive collection of networks is now freely available to the scientific community in a centralized web-based repository, the Causal Biological Network database, which is composed of over 120 manually curated and well annotated biological network models and can be accessed at http://causalbionet.com. The website accesses a MongoDB, which stores all versions of the networks as JSON objects and allows users to search for genes, proteins, biological processes, small molecules and keywords in the network descriptions to retrieve biological networks of interest. The content of the networks can be visualized and browsed. Nodes and edges can be filtered and all supporting evidence for the edges can be browsed and is linked to the original articles in PubMed. Moreover, networks may be downloaded for further visualization and evaluation. Database URL: http://causalbionet.com PMID:25887162

  16. UMAMI: A Recipe for Generating Meaningful Metrics through Holistic I/O Performance Analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lockwood, Glenn K.; Yoo, Wucherl; Byna, Suren

    I/O efficiency is essential to productivity in scientific computing, especially as many scientific domains become more data-intensive. Many characterization tools have been used to elucidate specific aspects of parallel I/O performance, but analyzing components of complex I/O subsystems in isolation fails to provide insight into critical questions: how do the I/O components interact, what are reasonable expectations for application performance, and what are the underlying causes of I/O performance problems? To address these questions while capitalizing on existing component-level characterization tools, we propose an approach that combines on-demand, modular synthesis of I/O characterization data into a unified monitoring and metricsmore » interface (UMAMI) to provide a normalized, holistic view of I/O behavior. We evaluate the feasibility of this approach by applying it to a month-long benchmarking study on two distinct largescale computing platforms. We present three case studies that highlight the importance of analyzing application I/O performance in context with both contemporaneous and historical component metrics, and we provide new insights into the factors affecting I/O performance. By demonstrating the generality of our approach, we lay the groundwork for a production-grade framework for holistic I/O analysis.« less

  17. Synthesis meets theory: Past, present and future of rational chemistry

    NASA Astrophysics Data System (ADS)

    Fianchini, Mauro

    2017-11-01

    Chemical synthesis has its roots in the empirical approach of alchemy. Nonetheless, the birth of the scientific method, the technical and technological advances (exploiting revolutionary discoveries in physics) and the improved management and sharing of growing databases greatly contributed to the evolution of chemistry from an esoteric ground into a mature scientific discipline during these last 400 years. Furthermore, thanks to the evolution of computational resources, platforms and media in the last 40 years, theoretical chemistry has added to the puzzle the final missing tile in the process of "rationalizing" chemistry. The use of mathematical models of chemical properties, behaviors and reactivities is nowadays ubiquitous in literature. Theoretical chemistry has been successful in the difficult task of complementing and explaining synthetic results and providing rigorous insights when these are otherwise unattainable by experiment. The first part of this review walks the reader through a concise historical overview on the evolution of the "model" in chemistry. Salient milestones have been highlighted and briefly discussed. The second part focuses more on the general description of recent state-of-the-art computational techniques currently used worldwide by chemists to produce synergistic models between theory and experiment. Each section is complemented by key-examples taken from the literature that illustrate the application of the technique discussed therein.

  18. The OSG Open Facility: an on-ramp for opportunistic scientific computing

    NASA Astrophysics Data System (ADS)

    Jayatilaka, B.; Levshina, T.; Sehgal, C.; Gardner, R.; Rynge, M.; Würthwein, F.

    2017-10-01

    The Open Science Grid (OSG) is a large, robust computing grid that started primarily as a collection of sites associated with large HEP experiments such as ATLAS, CDF, CMS, and DZero, but has evolved in recent years to a much larger user and resource platform. In addition to meeting the US LHC community’s computational needs, the OSG continues to be one of the largest providers of distributed high-throughput computing (DHTC) to researchers from a wide variety of disciplines via the OSG Open Facility. The Open Facility consists of OSG resources that are available opportunistically to users other than resource owners and their collaborators. In the past two years, the Open Facility has doubled its annual throughput to over 200 million wall hours. More than half of these resources are used by over 100 individual researchers from over 60 institutions in fields such as biology, medicine, math, economics, and many others. Over 10% of these individual users utilized in excess of 1 million computational hours each in the past year. The largest source of these cycles is temporary unused capacity at institutions affiliated with US LHC computational sites. An increasing fraction, however, comes from university HPC clusters and large national infrastructure supercomputers offering unused capacity. Such expansions have allowed the OSG to provide ample computational resources to both individual researchers and small groups as well as sizable international science collaborations such as LIGO, AMS, IceCube, and sPHENIX. Opening up access to the Fermilab FabrIc for Frontier Experiments (FIFE) project has also allowed experiments such as mu2e and NOvA to make substantial use of Open Facility resources, the former with over 40 million wall hours in a year. We present how this expansion was accomplished as well as future plans for keeping the OSG Open Facility at the forefront of enabling scientific research by way of DHTC.

  19. The OSG Open Facility: An On-Ramp for Opportunistic Scientific Computing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jayatilaka, B.; Levshina, T.; Sehgal, C.

    The Open Science Grid (OSG) is a large, robust computing grid that started primarily as a collection of sites associated with large HEP experiments such as ATLAS, CDF, CMS, and DZero, but has evolved in recent years to a much larger user and resource platform. In addition to meeting the US LHC community’s computational needs, the OSG continues to be one of the largest providers of distributed high-throughput computing (DHTC) to researchers from a wide variety of disciplines via the OSG Open Facility. The Open Facility consists of OSG resources that are available opportunistically to users other than resource ownersmore » and their collaborators. In the past two years, the Open Facility has doubled its annual throughput to over 200 million wall hours. More than half of these resources are used by over 100 individual researchers from over 60 institutions in fields such as biology, medicine, math, economics, and many others. Over 10% of these individual users utilized in excess of 1 million computational hours each in the past year. The largest source of these cycles is temporary unused capacity at institutions affiliated with US LHC computational sites. An increasing fraction, however, comes from university HPC clusters and large national infrastructure supercomputers offering unused capacity. Such expansions have allowed the OSG to provide ample computational resources to both individual researchers and small groups as well as sizable international science collaborations such as LIGO, AMS, IceCube, and sPHENIX. Opening up access to the Fermilab FabrIc for Frontier Experiments (FIFE) project has also allowed experiments such as mu2e and NOvA to make substantial use of Open Facility resources, the former with over 40 million wall hours in a year. We present how this expansion was accomplished as well as future plans for keeping the OSG Open Facility at the forefront of enabling scientific research by way of DHTC.« less

  20. PRoViScout: a planetary scouting rover demonstrator

    NASA Astrophysics Data System (ADS)

    Paar, Gerhard; Woods, Mark; Gimkiewicz, Christiane; Labrosse, Frédéric; Medina, Alberto; Tyler, Laurence; Barnes, David P.; Fritz, Gerald; Kapellos, Konstantinos

    2012-01-01

    Mobile systems exploring Planetary surfaces in future will require more autonomy than today. The EU FP7-SPACE Project ProViScout (2010-2012) establishes the building blocks of such autonomous exploration systems in terms of robotics vision by a decision-based combination of navigation and scientific target selection, and integrates them into a framework ready for and exposed to field demonstration. The PRoViScout on-board system consists of mission management components such as an Executive, a Mars Mission On-Board Planner and Scheduler, a Science Assessment Module, and Navigation & Vision Processing modules. The platform hardware consists of the rover with the sensors and pointing devices. We report on the major building blocks and their functions & interfaces, emphasizing on the computer vision parts such as image acquisition (using a novel zoomed 3D-Time-of-Flight & RGB camera), mapping from 3D-TOF data, panoramic image & stereo reconstruction, hazard and slope maps, visual odometry and the recognition of potential scientifically interesting targets.

  1. Interactive Computer-Assisted Instruction in Acid-Base Physiology for Mobile Computer Platforms

    ERIC Educational Resources Information Center

    Longmuir, Kenneth J.

    2014-01-01

    In this project, the traditional lecture hall presentation of acid-base physiology in the first-year medical school curriculum was replaced by interactive, computer-assisted instruction designed primarily for the iPad and other mobile computer platforms. Three learning modules were developed, each with ~20 screens of information, on the subjects…

  2. Issues in undergraduate education in computational science and high performance computing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Marchioro, T.L. II; Martin, D.

    1994-12-31

    The ever increasing need for mathematical and computational literacy within their society and among members of the work force has generated enormous pressure to revise and improve the teaching of related subjects throughout the curriculum, particularly at the undergraduate level. The Calculus Reform movement is perhaps the best known example of an organized initiative in this regard. The UCES (Undergraduate Computational Engineering and Science) project, an effort funded by the Department of Energy and administered through the Ames Laboratory, is sponsoring an informal and open discussion of the salient issues confronting efforts to improve and expand the teaching of computationalmore » science as a problem oriented, interdisciplinary approach to scientific investigation. Although the format is open, the authors hope to consider pertinent questions such as: (1) How can faculty and research scientists obtain the recognition necessary to further excellence in teaching the mathematical and computational sciences? (2) What sort of educational resources--both hardware and software--are needed to teach computational science at the undergraduate level? Are traditional procedural languages sufficient? Are PCs enough? Are massively parallel platforms needed? (3) How can electronic educational materials be distributed in an efficient way? Can they be made interactive in nature? How should such materials be tied to the World Wide Web and the growing ``Information Superhighway``?« less

  3. Helios: Understanding Solar Evolution Through Text Analytics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Randazzese, Lucien

    This proof-of-concept project focused on developing, testing, and validating a range of bibliometric, text analytic, and machine-learning based methods to explore the evolution of three photovoltaic (PV) technologies: Cadmium Telluride (CdTe), Dye-Sensitized solar cells (DSSC), and Multi-junction solar cells. The analytical approach to the work was inspired by previous work by the same team to measure and predict the scientific prominence of terms and entities within specific research domains. The goal was to create tools that could assist domain-knowledgeable analysts in investigating the history and path of technological developments in general, with a focus on analyzing step-function changes in performance,more » or “breakthroughs,” in particular. The text-analytics platform developed during this project was dubbed Helios. The project relied on computational methods for analyzing large corpora of technical documents. For this project we ingested technical documents from the following sources into Helios: Thomson Scientific Web of Science (papers), the U.S. Patent & Trademark Office (patents), the U.S. Department of Energy (technical documents), the U.S. National Science Foundation (project funding summaries), and a hand curated set of full-text documents from Thomson Scientific and other sources.« less

  4. Research on the Construction Management and Sustainable Development of Large-Scale Scientific Facilities in China

    NASA Astrophysics Data System (ADS)

    Guiquan, Xi; Lin, Cong; Xuehui, Jin

    2018-05-01

    As an important platform for scientific and technological development, large -scale scientific facilities are the cornerstone of technological innovation and a guarantee for economic and social development. Researching management of large-scale scientific facilities can play a key role in scientific research, sociology and key national strategy. This paper reviews the characteristics of large-scale scientific facilities, and summarizes development status of China's large-scale scientific facilities. At last, the construction, management, operation and evaluation of large-scale scientific facilities is analyzed from the perspective of sustainable development.

  5. Integrated instrumentation & computation environment for GRACE

    NASA Astrophysics Data System (ADS)

    Dhekne, P. S.

    2002-03-01

    The project GRACE (Gamma Ray Astrophysics with Coordinated Experiments) aims at setting up a state of the art Gamma Ray Observatory at Mt. Abu, Rajasthan for undertaking comprehensive scientific exploration over a wide spectral window (10's keV - 100's TeV) from a single location through 4 coordinated experiments. The cumulative data collection rate of all the telescopes is expected to be about 1 GB/hr, necessitating innovations in the data management environment. As real-time data acquisition and control as well as off-line data processing, analysis and visualization environment of these systems is based on the us cutting edge and affordable technologies in the field of computers, communications and Internet. We propose to provide a single, unified environment by seamless integration of instrumentation and computations by taking advantage of the recent advancements in Web based technologies. This new environment will allow researchers better acces to facilities, improve resource utilization and enhance collaborations by having identical environments for online as well as offline usage of this facility from any location. We present here a proposed implementation strategy for a platform independent web-based system that supplements automated functions with video-guided interactive and collaborative remote viewing, remote control through virtual instrumentation console, remote acquisition of telescope data, data analysis, data visualization and active imaging system. This end-to-end web-based solution will enhance collaboration among researchers at the national and international level for undertaking scientific studies, using the telescope systems of the GRACE project.

  6. Autonomous self-organizing resource manager for multiple networked platforms

    NASA Astrophysics Data System (ADS)

    Smith, James F., III

    2002-08-01

    A fuzzy logic based expert system for resource management has been developed that automatically allocates electronic attack (EA) resources in real-time over many dissimilar autonomous naval platforms defending their group against attackers. The platforms can be very general, e.g., ships, planes, robots, land based facilities, etc. Potential foes the platforms deal with can also be general. This paper provides an overview of the resource manager including the four fuzzy decision trees that make up the resource manager; the fuzzy EA model; genetic algorithm based optimization; co-evolutionary data mining through gaming; and mathematical, computational and hardware based validation. Methods of automatically designing new multi-platform EA techniques are considered. The expert system runs on each defending platform rendering it an autonomous system requiring no human intervention. There is no commanding platform. Instead the platforms work cooperatively as a function of battlespace geometry; sensor data such as range, bearing, ID, uncertainty measures for sensor output; intelligence reports; etc. Computational experiments will show the defending networked platform's ability to self- organize. The platforms' ability to self-organize is illustrated through the output of the scenario generator, a software package that automates the underlying data mining problem and creates a computer movie of the platforms' interaction for evaluation.

  7. MPHASYS: a mouse phenotype analysis system

    PubMed Central

    Calder, R Brent; Beems, Rudolf B; van Steeg, Harry; Mian, I Saira; Lohman, Paul HM; Vijg, Jan

    2007-01-01

    Background Systematic, high-throughput studies of mouse phenotypes have been hampered by the inability to analyze individual animal data from a multitude of sources in an integrated manner. Studies generally make comparisons at the level of genotype or treatment thereby excluding associations that may be subtle or involve compound phenotypes. Additionally, the lack of integrated, standardized ontologies and methodologies for data exchange has inhibited scientific collaboration and discovery. Results Here we introduce a Mouse Phenotype Analysis System (MPHASYS), a platform for integrating data generated by studies of mouse models of human biology and disease such as aging and cancer. This computational platform is designed to provide a standardized methodology for working with animal data; a framework for data entry, analysis and sharing; and ontologies and methodologies for ensuring accurate data capture. We describe the tools that currently comprise MPHASYS, primarily ones related to mouse pathology, and outline its use in a study of individual animal-specific patterns of multiple pathology in mice harboring a specific germline mutation in the DNA repair and transcription-specific gene Xpd. Conclusion MPHASYS is a system for analyzing multiple data types from individual animals. It provides a framework for developing data analysis applications, and tools for collecting and distributing high-quality data. The software is platform independent and freely available under an open-source license [1]. PMID:17553167

  8. Cultivating engineering innovation ability based on optoelectronic experimental platform

    NASA Astrophysics Data System (ADS)

    Li, Dangjuan; Wu, Shenjiang

    2017-08-01

    As the supporting experimental platform of the Xi'an Technological University education reform experimental class, "optical technological innovation experimental platform" integrated the design and comprehensive experiments of the optical multi-class courses. On the basis of summing up the past two years teaching experience, platform pilot projects were improve. It has played a good role by making the use of an open teaching model in the cultivating engineering innovation spirit and scientific thinking of the students.

  9. Space Station

    NASA Image and Video Library

    1981-12-01

    During 1980 and the first half of 1981, the Marshall Space Flight Center conducted studies concerned with a relatively low-cost, near-term, manned space platform to satisfy current user needs, yet capable of evolutionary growth to meet future needs. The Science and Application Manned Space Platform (SAMSP) studies were to serve as a test bed for developing scientific and operational capabilities required by later, more advanced manned platforms while accomplishing early science and operations. This concept illustrates a manned space platform.

  10. Functional requirements document for the Earth Observing System Data and Information System (EOSDIS) Scientific Computing Facilities (SCF) of the NASA/MSFC Earth Science and Applications Division, 1992

    NASA Technical Reports Server (NTRS)

    Botts, Michael E.; Phillips, Ron J.; Parker, John V.; Wright, Patrick D.

    1992-01-01

    Five scientists at MSFC/ESAD have EOS SCF investigator status. Each SCF has unique tasks which require the establishment of a computing facility dedicated to accomplishing those tasks. A SCF Working Group was established at ESAD with the charter of defining the computing requirements of the individual SCFs and recommending options for meeting these requirements. The primary goal of the working group was to determine which computing needs can be satisfied using either shared resources or separate but compatible resources, and which needs require unique individual resources. The requirements investigated included CPU-intensive vector and scalar processing, visualization, data storage, connectivity, and I/O peripherals. A review of computer industry directions and a market survey of computing hardware provided information regarding important industry standards and candidate computing platforms. It was determined that the total SCF computing requirements might be most effectively met using a hierarchy consisting of shared and individual resources. This hierarchy is composed of five major system types: (1) a supercomputer class vector processor; (2) a high-end scalar multiprocessor workstation; (3) a file server; (4) a few medium- to high-end visualization workstations; and (5) several low- to medium-range personal graphics workstations. Specific recommendations for meeting the needs of each of these types are presented.

  11. Rotating Desk for Collaboration by Two Computer Programmers

    NASA Technical Reports Server (NTRS)

    Riley, John Thomas

    2005-01-01

    A special-purpose desk has been designed to facilitate collaboration by two computer programmers sharing one desktop computer or computer terminal. The impetus for the design is a trend toward what is known in the software industry as extreme programming an approach intended to ensure high quality without sacrificing the quantity of computer code produced. Programmers working in pairs is a major feature of extreme programming. The present desk design minimizes the stress of the collaborative work environment. It supports both quality and work flow by making it unnecessary for programmers to get in each other s way. The desk (see figure) includes a rotating platform that supports a computer video monitor, keyboard, and mouse. The desk enables one programmer to work on the keyboard for any amount of time and then the other programmer to take over without breaking the train of thought. The rotating platform is supported by a turntable bearing that, in turn, is supported by a weighted base. The platform contains weights to improve its balance. The base includes a stand for a computer, and is shaped and dimensioned to provide adequate foot clearance for both users. The platform includes an adjustable stand for the monitor, a surface for the keyboard and mouse, and spaces for work papers, drinks, and snacks. The heights of the monitor, keyboard, and mouse are set to minimize stress. The platform can be rotated through an angle of 40 to give either user a straight-on view of the monitor and full access to the keyboard and mouse. Magnetic latches keep the platform preferentially at either of the two extremes of rotation. To switch between users, one simply grabs the edge of the platform and pulls it around. The magnetic latch is easily released, allowing the platform to rotate freely to the position of the other user

  12. Micromagnetics on high-performance workstation and mobile computational platforms

    NASA Astrophysics Data System (ADS)

    Fu, S.; Chang, R.; Couture, S.; Menarini, M.; Escobar, M. A.; Kuteifan, M.; Lubarda, M.; Gabay, D.; Lomakin, V.

    2015-05-01

    The feasibility of using high-performance desktop and embedded mobile computational platforms is presented, including multi-core Intel central processing unit, Nvidia desktop graphics processing units, and Nvidia Jetson TK1 Platform. FastMag finite element method-based micromagnetic simulator is used as a testbed, showing high efficiency on all the platforms. Optimization aspects of improving the performance of the mobile systems are discussed. The high performance, low cost, low power consumption, and rapid performance increase of the embedded mobile systems make them a promising candidate for micromagnetic simulations. Such architectures can be used as standalone systems or can be built as low-power computing clusters.

  13. Science at a Variety of Scientific Regions at Titan Using Aerial Platforms

    NASA Astrophysics Data System (ADS)

    Pauken, M. T.; Hall, J. L.; Matthies, L.; Malaska, M.; Cutts, J. A.; Tokumaru, P.; Goldman, B.; De Jong, M.

    2017-02-01

    Titan has an abundant supply of organic species and could harbor exotic forms of life. Aerial platforms are ideal for performing reconnaissance and in situ analysis. We describe a range of vehicles in development for exploring Titan.

  14. Continuous measurement of breast tumor hormone receptor expression: a comparison of two computational pathology platforms

    PubMed Central

    Ahern, Thomas P.; Beck, Andrew H.; Rosner, Bernard A.; Glass, Ben; Frieling, Gretchen; Collins, Laura C.; Tamimi, Rulla M.

    2017-01-01

    Background Computational pathology platforms incorporate digital microscopy with sophisticated image analysis to permit rapid, continuous measurement of protein expression. We compared two computational pathology platforms on their measurement of breast tumor estrogen receptor (ER) and progesterone receptor (PR) expression. Methods Breast tumor microarrays from the Nurses’ Health Study were stained for ER (n=592) and PR (n=187). One expert pathologist scored cases as positive if ≥1% of tumor nuclei exhibited stain. ER and PR were then measured with the Definiens Tissue Studio (automated) and Aperio Digital Pathology (user-supervised) platforms. Platform-specific measurements were compared using boxplots, scatter plots, and correlation statistics. Classification of ER and PR positivity by platform-specific measurements was evaluated with areas under receiver operating characteristic curves (AUC) from univariable logistic regression models, using expert pathologist classification as the standard. Results Both platforms showed considerable overlap in continuous measurements of ER and PR between positive and negative groups classified by expert pathologist. Platform-specific measurements were strongly and positively correlated with one another (rho≥0.77). The user-supervised Aperio workflow performed slightly better than the automated Definiens workflow at classifying ER positivity (AUCAperio=0.97; AUCDefiniens=0.90; difference=0.07, 95% CI: 0.05, 0.09) and PR positivity (AUCAperio=0.94; AUCDefiniens=0.87; difference=0.07, 95% CI: 0.03, 0.12). Conclusion Paired hormone receptor expression measurements from two different computational pathology platforms agreed well with one another. The user-supervised workflow yielded better classification accuracy than the automated workflow. Appropriately validated computational pathology algorithms enrich molecular epidemiology studies with continuous protein expression data and may accelerate tumor biomarker discovery. PMID:27729430

  15. Study on the application of mobile internet cloud computing platform

    NASA Astrophysics Data System (ADS)

    Gong, Songchun; Fu, Songyin; Chen, Zheng

    2012-04-01

    The innovative development of computer technology promotes the application of the cloud computing platform, which actually is the substitution and exchange of a sort of resource service models and meets the needs of users on the utilization of different resources after changes and adjustments of multiple aspects. "Cloud computing" owns advantages in many aspects which not merely reduce the difficulties to apply the operating system and also make it easy for users to search, acquire and process the resources. In accordance with this point, the author takes the management of digital libraries as the research focus in this paper, and analyzes the key technologies of the mobile internet cloud computing platform in the operation process. The popularization and promotion of computer technology drive people to create the digital library models, and its core idea is to strengthen the optimal management of the library resource information through computers and construct an inquiry and search platform with high performance, allowing the users to access to the necessary information resources at any time. However, the cloud computing is able to promote the computations within the computers to distribute in a large number of distributed computers, and hence implement the connection service of multiple computers. The digital libraries, as a typical representative of the applications of the cloud computing, can be used to carry out an analysis on the key technologies of the cloud computing.

  16. Building Student Proficiency with Scientific Literature Using the Zotero Reference Manager Platform

    ERIC Educational Resources Information Center

    Kim, Thomas

    2011-01-01

    While mastery of the scientific literature is a strongly desirable trait for undergraduate students, the sheer volume of the current literature has complicated the challenge of teaching scientific literacy. Part of the response to this ever-increasing volume of resources includes formal instruction in the use of reference manager software while…

  17. The HARNESS Workbench: Unified and Adaptive Access to Diverse HPC Platforms

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sunderam, Vaidy S.

    2012-03-20

    The primary goal of the Harness WorkBench (HWB) project is to investigate innovative software environments that will help enhance the overall productivity of applications science on diverse HPC platforms. Two complementary frameworks were designed: one, a virtualized command toolkit for application building, deployment, and execution, that provides a common view across diverse HPC systems, in particular the DOE leadership computing platforms (Cray, IBM, SGI, and clusters); and two, a unified runtime environment that consolidates access to runtime services via an adaptive framework for execution-time and post processing activities. A prototype of the first was developed based on the concept ofmore » a 'system-call virtual machine' (SCVM), to enhance portability of the HPC application deployment process across heterogeneous high-end machines. The SCVM approach to portable builds is based on the insertion of toolkit-interpretable directives into original application build scripts. Modifications resulting from these directives preserve the semantics of the original build instruction flow. The execution of the build script is controlled by our toolkit that intercepts build script commands in a manner transparent to the end-user. We have applied this approach to a scientific production code (Gamess-US) on the Cray-XT5 machine. The second facet, termed Unibus, aims to facilitate provisioning and aggregation of multifaceted resources from resource providers and end-users perspectives. To achieve that, Unibus proposes a Capability Model and mediators (resource drivers) to virtualize access to diverse resources, and soft and successive conditioning to enable automatic and user-transparent resource provisioning. A proof of concept implementation has demonstrated the viability of this approach on high end machines, grid systems and computing clouds.« less

  18. Preface: SciDAC 2005

    NASA Astrophysics Data System (ADS)

    Mezzacappa, Anthony

    2005-01-01

    On 26-30 June 2005 at the Grand Hyatt on Union Square in San Francisco several hundred computational scientists from around the world came together for what can certainly be described as a celebration of computational science. Scientists from the SciDAC Program and scientists from other agencies and nations were joined by applied mathematicians and computer scientists to highlight the many successes in the past year where computation has led to scientific discovery in a variety of fields: lattice quantum chromodynamics, accelerator modeling, chemistry, biology, materials science, Earth and climate science, astrophysics, and combustion and fusion energy science. Also highlighted were the advances in numerical methods and computer science, and the multidisciplinary collaboration cutting across science, mathematics, and computer science that enabled these discoveries. The SciDAC Program was conceived and funded by the US Department of Energy Office of Science. It is the Office of Science's premier computational science program founded on what is arguably the perfect formula: the priority and focus is science and scientific discovery, with the understanding that the full arsenal of `enabling technologies' in applied mathematics and computer science must be brought to bear if we are to have any hope of attacking and ultimately solving today's computational Grand Challenge problems. The SciDAC Program has been in existence for four years, and many of the computational scientists funded by this program will tell you that the program has given them the hope of addressing their scientific problems in full realism for the very first time. Many of these scientists will also tell you that SciDAC has also fundamentally changed the way they do computational science. We begin this volume with one of DOE's great traditions, and core missions: energy research. As we will see, computation has been seminal to the critical advances that have been made in this arena. Of course, to understand our world, whether it is to understand its very nature or to understand it so as to control it for practical application, will require explorations on all of its scales. Computational science has been no less an important tool in this arena than it has been in the arena of energy research. From explorations of quantum chromodynamics, the fundamental theory that describes how quarks make up the protons and neutrons of which we are composed, to explorations of the complex biomolecules that are the building blocks of life, to explorations of some of the most violent phenomena in our universe and of the Universe itself, computation has provided not only significant insight, but often the only means by which we have been able to explore these complex, multicomponent systems and by which we have been able to achieve scientific discovery and understanding. While our ultimate target remains scientific discovery, it certainly can be said that at a fundamental level the world is mathematical. Equations ultimately govern the evolution of the systems of interest to us, be they physical, chemical, or biological systems. The development and choice of discretizations of these underlying equations is often a critical deciding factor in whether or not one is able to model such systems stably, faithfully, and practically, and in turn, the algorithms to solve the resultant discrete equations are the complementary, critical ingredient in the recipe to model the natural world. The use of parallel computing platforms, especially at the TeraScale, and the trend toward even larger numbers of processors, continue to present significant challenges in the development and implementation of these algorithms. Computational scientists often speak of their `workflows'. A workflow, as the name suggests, is the sum total of all complex and interlocking tasks, from simulation set up, execution, and I/O, to visualization and scientific discovery, through which the advancement in our understanding of the natural world is realized. For the computational scientist, enabling such workflows presents myriad, signiflcant challenges, and it is computer scientists that are called upon at such times to address these challenges. Simulations are currently generating data at the staggering rate of tens of TeraBytes per simulation, over the course of days. In the next few years, these data generation rates are expected to climb exponentially to hundreds of TeraBytes per simulation, performed over the course of months. The output, management, movement, analysis, and visualization of these data will be our key to unlocking the scientific discoveries buried within the data. And there is no hope of generating such data to begin with, or of scientific discovery, without stable computing platforms and a sufficiently high and sustained performance of scientific applications codes on them. Thus, scientific discovery in the realm of computational science at the TeraScale and beyond will occur at the intersection of science, applied mathematics, and computer science. The SciDAC Program was constructed to mirror this reality, and the pages that follow are a testament to the efficacy of such an approach. We would like to acknowledge the individuals on whose talents and efforts the success of SciDAC 2005 was based. Special thanks go to Betsy Riley for her work on the SciDAC 2005 Web site and meeting agenda, for lining up our corporate sponsors, for coordinating all media communications, and for her efforts in processing the proceedings contributions, to Sherry Hempfling for coordinating the overall SciDAC 2005 meeting planning, for handling a significant share of its associated communications, and for coordinating with the ORNL Conference Center and Grand Hyatt, to Angela Harris for producing many of the documents and records on which our meeting planning was based and for her efforts in coordinating with ORNL Graphics Services, to Angie Beach of the ORNL Conference Center for her efforts in procurement and setting up and executing the contracts with the hotel, and to John Bui and John Smith for their superb wireless networking and A/V set up and support. We are grateful for the relentless efforts of all of these individuals, their remarkable talents, and for the joy of working with them during this past year. They were the cornerstones of SciDAC 2005. Thanks also go to Kymba A'Hearn and Patty Boyd for on-site registration, Brittany Hagen for administrative support, Bruce Johnston for netcast support, Tim Jones for help with the proceedings and Web site, Sherry Lamb for housing and registration, Cindy Lathum for Web site design, Carolyn Peters for on-site registration, and Dami Rich for graphic design. And we would like to express our appreciation to the Oak Ridge National Laboratory, especially Jeff Nichols, the Argonne National Laboratory, the Lawrence Berkeley National Laboratory, and to our corporate sponsors, Cray, IBM, Intel, and SGI, for their support. We would like to extend special thanks also to our plenary speakers, technical speakers, poster presenters, and panelists for all of their efforts on behalf of SciDAC 2005 and for their remarkable achievements and contributions. We would like to express our deep appreciation to Lali Chatterjee, Graham Douglas and Margaret Smith of Institute of Physics Publishing, who worked tirelessly in order to provide us with this finished volume within two months, which is nothing short of miraculous. Finally, we wish to express our heartfelt thanks to Michael Strayer, SciDAC Director, whose vision it was to focus SciDAC 2005 on scientific discovery, around which all of the excitement we experienced revolved, and to our DOE SciDAC program managers, especially Fred Johnson, for their support, input, and help throughout.

  19. Thermotropic Liquid Crystal-Assisted Chemical and Biological Sensors

    PubMed Central

    Honaker, Lawrence W.; Usol’tseva, Nadezhda; Mann, Elizabeth K.

    2017-01-01

    In this review article, we analyze recent progress in the application of liquid crystal-assisted advanced functional materials for sensing biological and chemical analytes. Multiple research groups demonstrate substantial interest in liquid crystal (LC) sensing platforms, generating an increasing number of scientific articles. We review trends in implementing LC sensing techniques and identify common problems related to the stability and reliability of the sensing materials as well as to experimental set-ups. Finally, we suggest possible means of bridging scientific findings to viable and attractive LC sensor platforms. PMID:29295530

  20. Clinicians' expectations of Web 2.0 as a mechanism for knowledge transfer of stroke best practices.

    PubMed

    David, Isabelle; Poissant, Lise; Rochette, Annie

    2012-09-13

    Health professionals are increasingly encouraged to adopt an evidence-based practice to ensure greater efficiency of their services. To promote this practice, several strategies exist: distribution of educational materials, local consensus processes, educational outreach visits, local opinion leaders, and reminders. Despite these strategies, gaps continue to be observed between practice and scientific evidence. Therefore, it is important to implement innovative knowledge transfer strategies that will change health professionals' practices. Through its interactive capacities, Web 2.0 applications are worth exploring. As an example, virtual communities of practice have already begun to influence professional practice. This study was initially developed to help design a Web 2.0 platform for health professionals working with stroke patients. The aim was to gain a better understanding of professionals' perceptions of Web 2.0 before the development of the platform. A qualitative study following a phenomenological approach was chosen. We conducted individual semi-structured interviews with clinicians and managers. Interview transcripts were subjected to a content analysis. Twenty-four female clinicians and managers in Quebec, Canada, aged 28-66 participated. Most participants identified knowledge transfer as the most useful outcome of a Web 2.0 platform. Respondents also expressed their need for a user-friendly platform. Accessibility to a computer and the Internet, features of the Web 2.0 platform, user support, technology skills, and previous technological experience were found to influence perceived ease of use and usefulness. Our results show that the perceived lack of time of health professionals has an influence on perceived behavioral intention to use it despite favorable perception of the usefulness of the Web 2.0 platform. In conclusion, female health professionals in Quebec believe that Web 2.0 may be a useful mechanism for knowledge transfer. However, lack of time and lack of technological skills may limit their use of a future Web 2.0 platform. Further studies are required with other populations and in other regions to confirm these findings.

  1. [The Key Technology Study on Cloud Computing Platform for ECG Monitoring Based on Regional Internet of Things].

    PubMed

    Yang, Shu; Qiu, Yuyan; Shi, Bo

    2016-09-01

    This paper explores the methods of building the internet of things of a regional ECG monitoring, focused on the implementation of ECG monitoring center based on cloud computing platform. It analyzes implementation principles of automatic identifi cation in the types of arrhythmia. It also studies the system architecture and key techniques of cloud computing platform, including server load balancing technology, reliable storage of massive smalfi les and the implications of quick search function.

  2. Development of a Web-Based Visualization Platform for Climate Research Using Google Earth

    NASA Technical Reports Server (NTRS)

    Sun, Xiaojuan; Shen, Suhung; Leptoukh, Gregory G.; Wang, Panxing; Di, Liping; Lu, Mingyue

    2011-01-01

    Recently, it has become easier to access climate data from satellites, ground measurements, and models from various data centers, However, searching. accessing, and prc(essing heterogeneous data from different sources are very tim -consuming tasks. There is lack of a comprehensive visual platform to acquire distributed and heterogeneous scientific data and to render processed images from a single accessing point for climate studies. This paper. documents the design and implementation of a Web-based visual, interoperable, and scalable platform that is able to access climatological fields from models, satellites, and ground stations from a number of data sources using Google Earth (GE) as a common graphical interface. The development is based on the TCP/IP protocol and various data sharing open sources, such as OPeNDAP, GDS, Web Processing Service (WPS), and Web Mapping Service (WMS). The visualization capability of integrating various measurements into cE extends dramatically the awareness and visibility of scientific results. Using embedded geographic information in the GE, the designed system improves our understanding of the relationships of different elements in a four dimensional domain. The system enables easy and convenient synergistic research on a virtual platform for professionals and the general public, gr$tly advancing global data sharing and scientific research collaboration.

  3. WeBIAS: a web server for publishing bioinformatics applications.

    PubMed

    Daniluk, Paweł; Wilczyński, Bartek; Lesyng, Bogdan

    2015-11-02

    One of the requirements for a successful scientific tool is its availability. Developing a functional web service, however, is usually considered a mundane and ungratifying task, and quite often neglected. When publishing bioinformatic applications, such attitude puts additional burden on the reviewers who have to cope with poorly designed interfaces in order to assess quality of presented methods, as well as impairs actual usefulness to the scientific community at large. In this note we present WeBIAS-a simple, self-contained solution to make command-line programs accessible through web forms. It comprises a web portal capable of serving several applications and backend schedulers which carry out computations. The server handles user registration and authentication, stores queries and results, and provides a convenient administrator interface. WeBIAS is implemented in Python and available under GNU Affero General Public License. It has been developed and tested on GNU/Linux compatible platforms covering a vast majority of operational WWW servers. Since it is written in pure Python, it should be easy to deploy also on all other platforms supporting Python (e.g. Windows, Mac OS X). Documentation and source code, as well as a demonstration site are available at http://bioinfo.imdik.pan.pl/webias . WeBIAS has been designed specifically with ease of installation and deployment of services in mind. Setting up a simple application requires minimal effort, yet it is possible to create visually appealing, feature-rich interfaces for query submission and presentation of results.

  4. Measuring FLOPS Using Hardware Performance Counter Technologies on LC systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ahn, D H

    2008-09-05

    FLOPS (FLoating-point Operations Per Second) is a commonly used performance metric for scientific programs that rely heavily on floating-point (FP) calculations. The metric is based on the number of FP operations rather than instructions, thereby facilitating a fair comparison between different machines. A well-known use of this metric is the LINPACK benchmark that is used to generate the Top500 list. It measures how fast a computer solves a dense N by N system of linear equations Ax=b, which requires a known number of FP operations, and reports the result in millions of FP operations per second (MFLOPS). While running amore » benchmark with known FP workloads can provide insightful information about the efficiency of a machine's FP pipelines in relation to other machines, measuring FLOPS of an arbitrary scientific application in a platform-independent manner is nontrivial. The goal of this paper is twofold. First, we explore the FP microarchitectures of key processors that are underpinning the LC machines. Second, we present the hardware performance monitoring counter-based measurement techniques that a user can use to get the native FLOPS of his or her program, which are practical solutions readily available on LC platforms. By nature, however, these native FLOPS metrics are not directly comparable across different machines mainly because FP operations are not consistent across microarchitectures. Thus, the first goal of this paper represents the base reference by which a user can interpret the measured FLOPS more judiciously.« less

  5. The Perfect Neuroimaging-Genetics-Computation Storm: Collision of Petabytes of Data, Millions of Hardware Devices and Thousands of Software Tools

    PubMed Central

    Dinov, Ivo D.; Petrosyan, Petros; Liu, Zhizhong; Eggert, Paul; Zamanyan, Alen; Torri, Federica; Macciardi, Fabio; Hobel, Sam; Moon, Seok Woo; Sung, Young Hee; Jiang, Zhiguo; Labus, Jennifer; Kurth, Florian; Ashe-McNalley, Cody; Mayer, Emeran; Vespa, Paul M.; Van Horn, John D.; Toga, Arthur W.

    2013-01-01

    The volume, diversity and velocity of biomedical data are exponentially increasing providing petabytes of new neuroimaging and genetics data every year. At the same time, tens-of-thousands of computational algorithms are developed and reported in the literature along with thousands of software tools and services. Users demand intuitive, quick and platform-agnostic access to data, software tools, and infrastructure from millions of hardware devices. This explosion of information, scientific techniques, computational models, and technological advances leads to enormous challenges in data analysis, evidence-based biomedical inference and reproducibility of findings. The Pipeline workflow environment provides a crowd-based distributed solution for consistent management of these heterogeneous resources. The Pipeline allows multiple (local) clients and (remote) servers to connect, exchange protocols, control the execution, monitor the states of different tools or hardware, and share complete protocols as portable XML workflows. In this paper, we demonstrate several advanced computational neuroimaging and genetics case-studies, and end-to-end pipeline solutions. These are implemented as graphical workflow protocols in the context of analyzing imaging (sMRI, fMRI, DTI), phenotypic (demographic, clinical), and genetic (SNP) data. PMID:23975276

  6. Message Passing vs. Shared Address Space on a Cluster of SMPs

    NASA Technical Reports Server (NTRS)

    Shan, Hongzhang; Singh, Jaswinder Pal; Oliker, Leonid; Biswas, Rupak

    2000-01-01

    The convergence of scalable computer architectures using clusters of PCs (or PC-SMPs) with commodity networking has become an attractive platform for high end scientific computing. Currently, message-passing and shared address space (SAS) are the two leading programming paradigms for these systems. Message-passing has been standardized with MPI, and is the most common and mature programming approach. However message-passing code development can be extremely difficult, especially for irregular structured computations. SAS offers substantial ease of programming, but may suffer from performance limitations due to poor spatial locality, and high protocol overhead. In this paper, we compare the performance of and programming effort, required for six applications under both programming models on a 32 CPU PC-SMP cluster. Our application suite consists of codes that typically do not exhibit high efficiency under shared memory programming. due to their high communication to computation ratios and complex communication patterns. Results indicate that SAS can achieve about half the parallel efficiency of MPI for most of our applications: however, on certain classes of problems SAS performance is competitive with MPI. We also present new algorithms for improving the PC cluster performance of MPI collective operations.

  7. Nanoinformatics: an emerging area of information technology at the intersection of bioinformatics, computational chemistry and nanobiotechnology.

    PubMed

    González-Nilo, Fernando; Pérez-Acle, Tomás; Guínez-Molinos, Sergio; Geraldo, Daniela A; Sandoval, Claudia; Yévenes, Alejandro; Santos, Leonardo S; Laurie, V Felipe; Mendoza, Hegaly; Cachau, Raúl E

    2011-01-01

    After the progress made during the genomics era, bioinformatics was tasked with supporting the flow of information generated by nanobiotechnology efforts. This challenge requires adapting classical bioinformatic and computational chemistry tools to store, standardize, analyze, and visualize nanobiotechnological information. Thus, old and new bioinformatic and computational chemistry tools have been merged into a new sub-discipline: nanoinformatics. This review takes a second look at the development of this new and exciting area as seen from the perspective of the evolution of nanobiotechnology applied to the life sciences. The knowledge obtained at the nano-scale level implies answers to new questions and the development of new concepts in different fields. The rapid convergence of technologies around nanobiotechnologies has spun off collaborative networks and web platforms created for sharing and discussing the knowledge generated in nanobiotechnology. The implementation of new database schemes suitable for storage, processing and integrating physical, chemical, and biological properties of nanoparticles will be a key element in achieving the promises in this convergent field. In this work, we will review some applications of nanobiotechnology to life sciences in generating new requirements for diverse scientific fields, such as bioinformatics and computational chemistry.

  8. Space transportation, satellite services, and space platforms

    NASA Technical Reports Server (NTRS)

    Disher, J. H.

    1979-01-01

    The paper takes a preview of the progressive development of vehicles for space transportation, satellite services, and orbital platforms. A low-thrust upper stage of either the ion engine or chemical type will be developed to transport large spacecraft and space platforms to and from GEO. The multimission spacecraft, space telescope, and other scientific platforms will require orbital serves going beyond that provided by the Shuttle's remote manipulator system, and plans call for extravehicular activity tools, improved remote manipulators, and a remote manned work station (the cherry picker).

  9. Understanding I/O workload characteristics of a Peta-scale storage system

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kim, Youngjae; Gunasekaran, Raghul

    2015-01-01

    Understanding workload characteristics is critical for optimizing and improving the performance of current systems and software, and architecting new storage systems based on observed workload patterns. In this paper, we characterize the I/O workloads of scientific applications of one of the world s fastest high performance computing (HPC) storage cluster, Spider, at the Oak Ridge Leadership Computing Facility (OLCF). OLCF flagship petascale simulation platform, Titan, and other large HPC clusters, in total over 250 thousands compute cores, depend on Spider for their I/O needs. We characterize the system utilization, the demands of reads and writes, idle time, storage space utilization,more » and the distribution of read requests to write requests for the Peta-scale Storage Systems. From this study, we develop synthesized workloads, and we show that the read and write I/O bandwidth usage as well as the inter-arrival time of requests can be modeled as a Pareto distribution. We also study the I/O load imbalance problems using I/O performance data collected from the Spider storage system.« less

  10. Resilient workflows for computational mechanics platforms

    NASA Astrophysics Data System (ADS)

    Nguyên, Toàn; Trifan, Laurentiu; Désidéri, Jean-Antoine

    2010-06-01

    Workflow management systems have recently been the focus of much interest and many research and deployment for scientific applications worldwide [26, 27]. Their ability to abstract the applications by wrapping application codes have also stressed the usefulness of such systems for multidiscipline applications [23, 24]. When complex applications need to provide seamless interfaces hiding the technicalities of the computing infrastructures, their high-level modeling, monitoring and execution functionalities help giving production teams seamless and effective facilities [25, 31, 33]. Software integration infrastructures based on programming paradigms such as Python, Mathlab and Scilab have also provided evidence of the usefulness of such approaches for the tight coupling of multidisciplne application codes [22, 24]. Also high-performance computing based on multi-core multi-cluster infrastructures open new opportunities for more accurate, more extensive and effective robust multi-discipline simulations for the decades to come [28]. This supports the goal of full flight dynamics simulation for 3D aircraft models within the next decade, opening the way to virtual flight-tests and certification of aircraft in the future [23, 24, 29].

  11. Fragment-Based Docking: Development of the CHARMMing Web User Interface as a Platform for Computer-Aided Drug Design

    PubMed Central

    2015-01-01

    Web-based user interfaces to scientific applications are important tools that allow researchers to utilize a broad range of software packages with just an Internet connection and a browser.1 One such interface, CHARMMing (CHARMM interface and graphics), facilitates access to the powerful and widely used molecular software package CHARMM. CHARMMing incorporates tasks such as molecular structure analysis, dynamics, multiscale modeling, and other techniques commonly used by computational life scientists. We have extended CHARMMing’s capabilities to include a fragment-based docking protocol that allows users to perform molecular docking and virtual screening calculations either directly via the CHARMMing Web server or on computing resources using the self-contained job scripts generated via the Web interface. The docking protocol was evaluated by performing a series of “re-dockings” with direct comparison to top commercial docking software. Results of this evaluation showed that CHARMMing’s docking implementation is comparable to many widely used software packages and validates the use of the new CHARMM generalized force field for docking and virtual screening. PMID:25151852

  12. Fragment-based docking: development of the CHARMMing Web user interface as a platform for computer-aided drug design.

    PubMed

    Pevzner, Yuri; Frugier, Emilie; Schalk, Vinushka; Caflisch, Amedeo; Woodcock, H Lee

    2014-09-22

    Web-based user interfaces to scientific applications are important tools that allow researchers to utilize a broad range of software packages with just an Internet connection and a browser. One such interface, CHARMMing (CHARMM interface and graphics), facilitates access to the powerful and widely used molecular software package CHARMM. CHARMMing incorporates tasks such as molecular structure analysis, dynamics, multiscale modeling, and other techniques commonly used by computational life scientists. We have extended CHARMMing's capabilities to include a fragment-based docking protocol that allows users to perform molecular docking and virtual screening calculations either directly via the CHARMMing Web server or on computing resources using the self-contained job scripts generated via the Web interface. The docking protocol was evaluated by performing a series of "re-dockings" with direct comparison to top commercial docking software. Results of this evaluation showed that CHARMMing's docking implementation is comparable to many widely used software packages and validates the use of the new CHARMM generalized force field for docking and virtual screening.

  13. Role of Open Source Tools and Resources in Virtual Screening for Drug Discovery.

    PubMed

    Karthikeyan, Muthukumarasamy; Vyas, Renu

    2015-01-01

    Advancement in chemoinformatics research in parallel with availability of high performance computing platform has made handling of large scale multi-dimensional scientific data for high throughput drug discovery easier. In this study we have explored publicly available molecular databases with the help of open-source based integrated in-house molecular informatics tools for virtual screening. The virtual screening literature for past decade has been extensively investigated and thoroughly analyzed to reveal interesting patterns with respect to the drug, target, scaffold and disease space. The review also focuses on the integrated chemoinformatics tools that are capable of harvesting chemical data from textual literature information and transform them into truly computable chemical structures, identification of unique fragments and scaffolds from a class of compounds, automatic generation of focused virtual libraries, computation of molecular descriptors for structure-activity relationship studies, application of conventional filters used in lead discovery along with in-house developed exhaustive PTC (Pharmacophore, Toxicophores and Chemophores) filters and machine learning tools for the design of potential disease specific inhibitors. A case study on kinase inhibitors is provided as an example.

  14. Message Passing and Shared Address Space Parallelism on an SMP Cluster

    NASA Technical Reports Server (NTRS)

    Shan, Hongzhang; Singh, Jaswinder P.; Oliker, Leonid; Biswas, Rupak; Biegel, Bryan (Technical Monitor)

    2002-01-01

    Currently, message passing (MP) and shared address space (SAS) are the two leading parallel programming paradigms. MP has been standardized with MPI, and is the more common and mature approach; however, code development can be extremely difficult, especially for irregularly structured computations. SAS offers substantial ease of programming, but may suffer from performance limitations due to poor spatial locality and high protocol overhead. In this paper, we compare the performance of and the programming effort required for six applications under both programming models on a 32-processor PC-SMP cluster, a platform that is becoming increasingly attractive for high-end scientific computing. Our application suite consists of codes that typically do not exhibit scalable performance under shared-memory programming due to their high communication-to-computation ratios and/or complex communication patterns. Results indicate that SAS can achieve about half the parallel efficiency of MPI for most of our applications, while being competitive for the others. A hybrid MPI+SAS strategy shows only a small performance advantage over pure MPI in some cases. Finally, improved implementations of two MPI collective operations on PC-SMP clusters are presented.

  15. Design of web platform for science and engineering in the model of open market

    NASA Astrophysics Data System (ADS)

    Demichev, A. P.; Kryukov, A. P.

    2016-09-01

    This paper presents a design and operation algorithms of a web-platform for convenient, secure and effective remote interaction on the principles of the open market of users and providers of scientific application software and databases.

  16. Cloud computing for comparative genomics with windows azure platform.

    PubMed

    Kim, Insik; Jung, Jae-Yoon; Deluca, Todd F; Nelson, Tristan H; Wall, Dennis P

    2012-01-01

    Cloud computing services have emerged as a cost-effective alternative for cluster systems as the number of genomes and required computation power to analyze them increased in recent years. Here we introduce the Microsoft Azure platform with detailed execution steps and a cost comparison with Amazon Web Services.

  17. Cloud Computing for Comparative Genomics with Windows Azure Platform

    PubMed Central

    Kim, Insik; Jung, Jae-Yoon; DeLuca, Todd F.; Nelson, Tristan H.; Wall, Dennis P.

    2012-01-01

    Cloud computing services have emerged as a cost-effective alternative for cluster systems as the number of genomes and required computation power to analyze them increased in recent years. Here we introduce the Microsoft Azure platform with detailed execution steps and a cost comparison with Amazon Web Services. PMID:23032609

  18. Analysis of scalability of high-performance 3D image processing platform for virtual colonoscopy

    NASA Astrophysics Data System (ADS)

    Yoshida, Hiroyuki; Wu, Yin; Cai, Wenli

    2014-03-01

    One of the key challenges in three-dimensional (3D) medical imaging is to enable the fast turn-around time, which is often required for interactive or real-time response. This inevitably requires not only high computational power but also high memory bandwidth due to the massive amount of data that need to be processed. For this purpose, we previously developed a software platform for high-performance 3D medical image processing, called HPC 3D-MIP platform, which employs increasingly available and affordable commodity computing systems such as the multicore, cluster, and cloud computing systems. To achieve scalable high-performance computing, the platform employed size-adaptive, distributable block volumes as a core data structure for efficient parallelization of a wide range of 3D-MIP algorithms, supported task scheduling for efficient load distribution and balancing, and consisted of a layered parallel software libraries that allow image processing applications to share the common functionalities. We evaluated the performance of the HPC 3D-MIP platform by applying it to computationally intensive processes in virtual colonoscopy. Experimental results showed a 12-fold performance improvement on a workstation with 12-core CPUs over the original sequential implementation of the processes, indicating the efficiency of the platform. Analysis of performance scalability based on the Amdahl's law for symmetric multicore chips showed the potential of a high performance scalability of the HPC 3DMIP platform when a larger number of cores is available.

  19. Cloud Based Applications and Platforms (Presentation)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brodt-Giles, D.

    2014-05-15

    Presentation to the Cloud Computing East 2014 Conference, where we are highlighting our cloud computing strategy, describing the platforms on the cloud (including Smartgrid.gov), and defining our process for implementing cloud based applications.

  20. EdGCM: Research Tools for Training the Climate Change Generation

    NASA Astrophysics Data System (ADS)

    Chandler, M. A.; Sohl, L. E.; Zhou, J.; Sieber, R.

    2011-12-01

    Climate scientists employ complex computer simulations of the Earth's physical systems to prepare climate change forecasts, study the physical mechanisms of climate, and to test scientific hypotheses and computer parameterizations. The Intergovernmental Panel on Climate Change 4th Assessment Report (2007) demonstrates unequivocally that policy makers rely heavily on such Global Climate Models (GCMs) to assess the impacts of potential economic and emissions scenarios. However, true climate modeling capabilities are not disseminated to the majority of world governments or U.S. researchers - let alone to the educators who will be training the students who are about to be presented with a world full of climate change stakeholders. The goal is not entirely quixotic; in fact, by the mid-1990's prominent climate scientists were predicting with certainty that schools and politicians would "soon" be running GCMs on laptops [Randall, 1996]. For a variety of reasons this goal was never achieved (nor even really attempted). However, around the same time NASA and the National Science Foundation supported a small pilot project at Columbia University to show the potential of putting sophisticated computer climate models - not just "demos" or "toy models" - into the hands of non-specialists. The Educational Global Climate Modeling Project (EdGCM) gave users access to a real global climate model and provided them with the opportunity to experience the details of climate model setup, model operation, post-processing and scientific visualization. EdGCM was designed for use in both research and education - it is a full-blown research GCM, but the ultimate goal is to develop a capability to embed these crucial technologies across disciplines, networks, platforms, and even across academia and industry. With this capability in place we can begin training the skilled workforce that is necessary to deal with the multitude of climate impacts that will occur over the coming decades. To further increase the educational potential of climate models, the EdGCM project has also created "EZgcm". Through a joint venture of NASA, Columbia University and McGill University EZgcm moves the focus toward a greater use of Web 1.0 and Web 2.0-based technologies. It shifts the educational objectives towards a greater emphasis on teaching students how science is conducted and what role science plays in assessing climate change. That is, students learn about the steps of the scientific process as conveyed by climate modeling research: constructing a hypothesis, designing an experiment, running a computer model, using scientific visualization to support analysis, communicating the results of that analysis, and role playing the scientific peer review process. This is in stark contrast to what they learn from the political debate over climate change, which they often confuse with a scientific debate.

  1. A cloud computing based platform for sleep behavior and chronic diseases collaborative research.

    PubMed

    Kuo, Mu-Hsing; Borycki, Elizabeth; Kushniruk, Andre; Huang, Yueh-Min; Hung, Shu-Hui

    2014-01-01

    The objective of this study is to propose a Cloud Computing based platform for sleep behavior and chronic disease collaborative research. The platform consists of two main components: (1) a sensing bed sheet with textile sensors to automatically record patient's sleep behaviors and vital signs, and (2) a service-oriented cloud computing architecture (SOCCA) that provides a data repository and allows for sharing and analysis of collected data. Also, we describe our systematic approach to implementing the SOCCA. We believe that the new cloud-based platform can provide nurse and other health professional researchers located in differing geographic locations with a cost effective, flexible, secure and privacy-preserved research environment.

  2. Next Generation Workload Management System For Big Data on Heterogeneous Distributed Computing

    NASA Astrophysics Data System (ADS)

    Klimentov, A.; Buncic, P.; De, K.; Jha, S.; Maeno, T.; Mount, R.; Nilsson, P.; Oleynik, D.; Panitkin, S.; Petrosyan, A.; Porter, R. J.; Read, K. F.; Vaniachine, A.; Wells, J. C.; Wenaus, T.

    2015-05-01

    The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe, and were recently credited for the discovery of a Higgs boson. ATLAS and ALICE are the largest collaborations ever assembled in the sciences and are at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, both experiments rely on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses PanDA (Production and Data Analysis) Workload Management System (WMS) for managing the workflow for all data processing on hundreds of data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. The scale is demonstrated by the following numbers: PanDA manages O(102) sites, O(105) cores, O(108) jobs per year, O(103) users, and ATLAS data volume is O(1017) bytes. In 2013 we started an ambitious program to expand PanDA to all available computing resources, including opportunistic use of commercial and academic clouds and Leadership Computing Facilities (LCF). The project titled ‘Next Generation Workload Management and Analysis System for Big Data’ (BigPanDA) is funded by DOE ASCR and HEP. Extending PanDA to clouds and LCF presents new challenges in managing heterogeneity and supporting workflow. The BigPanDA project is underway to setup and tailor PanDA at the Oak Ridge Leadership Computing Facility (OLCF) and at the National Research Center "Kurchatov Institute" together with ALICE distributed computing and ORNL computing professionals. Our approach to integration of HPC platforms at the OLCF and elsewhere is to reuse, as much as possible, existing components of the PanDA system. We will present our current accomplishments with running the PanDA WMS at OLCF and other supercomputers and demonstrate our ability to use PanDA as a portal independent of the computing facilities infrastructure for High Energy and Nuclear Physics as well as other data-intensive science applications.

  3. Towards prediction of correlated material properties using quantum Monte Carlo methods

    NASA Astrophysics Data System (ADS)

    Wagner, Lucas

    Correlated electron systems offer a richness of physics far beyond noninteracting systems. If we would like to pursue the dream of designer correlated materials, or, even to set a more modest goal, to explain in detail the properties and effective physics of known materials, then accurate simulation methods are required. Using modern computational resources, quantum Monte Carlo (QMC) techniques offer a way to directly simulate electron correlations. I will show some recent results on a few extremely challenging materials including the metal-insulator transition of VO2, the ground state of the doped cuprates, and the pressure dependence of magnetic properties in FeSe. By using a relatively simple implementation of QMC, at least some properties of these materials can be described truly from first principles, without any adjustable parameters. Using the QMC platform, we have developed a way of systematically deriving effective lattice models from the simulation. This procedure is particularly attractive for correlated electron systems because the QMC methods treat the one-body and many-body components of the wave function and Hamiltonian on completely equal footing. I will show some examples of using this downfolding technique and the high accuracy of QMC to connect our intuitive ideas about interacting electron systems with high fidelity simulations. The work in this presentation was supported in part by NSF DMR 1206242, the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, Scientific Discovery through Advanced Computing (SciDAC) program under Award Number FG02-12ER46875, and the Center for Emergent Superconductivity, Department of Energy Frontier Research Center under Grant No. DEAC0298CH1088. Computing resources were provided by a Blue Waters Illinois grant and INCITE PhotSuper and SuperMatSim allocations.

  4. Scientific rigor through videogames.

    PubMed

    Treuille, Adrien; Das, Rhiju

    2014-11-01

    Hypothesis-driven experimentation - the scientific method - can be subverted by fraud, irreproducibility, and lack of rigorous predictive tests. A robust solution to these problems may be the 'massive open laboratory' model, recently embodied in the internet-scale videogame EteRNA. Deploying similar platforms throughout biology could enforce the scientific method more broadly. Copyright © 2014 Elsevier Ltd. All rights reserved.

  5. ACQ4: an open-source software platform for data acquisition and analysis in neurophysiology research.

    PubMed

    Campagnola, Luke; Kratz, Megan B; Manis, Paul B

    2014-01-01

    The complexity of modern neurophysiology experiments requires specialized software to coordinate multiple acquisition devices and analyze the collected data. We have developed ACQ4, an open-source software platform for performing data acquisition and analysis in experimental neurophysiology. This software integrates the tasks of acquiring, managing, and analyzing experimental data. ACQ4 has been used primarily for standard patch-clamp electrophysiology, laser scanning photostimulation, multiphoton microscopy, intrinsic imaging, and calcium imaging. The system is highly modular, which facilitates the addition of new devices and functionality. The modules included with ACQ4 provide for rapid construction of acquisition protocols, live video display, and customizable analysis tools. Position-aware data collection allows automated construction of image mosaics and registration of images with 3-dimensional anatomical atlases. ACQ4 uses free and open-source tools including Python, NumPy/SciPy for numerical computation, PyQt for the user interface, and PyQtGraph for scientific graphics. Supported hardware includes cameras, patch clamp amplifiers, scanning mirrors, lasers, shutters, Pockels cells, motorized stages, and more. ACQ4 is available for download at http://www.acq4.org.

  6. Cyber-physical geographical information service-enabled control of diverse in-situ sensors.

    PubMed

    Chen, Nengcheng; Xiao, Changjiang; Pu, Fangling; Wang, Xiaolei; Wang, Chao; Wang, Zhili; Gong, Jianya

    2015-01-23

    Realization of open online control of diverse in-situ sensors is a challenge. This paper proposes a Cyber-Physical Geographical Information Service-enabled method for control of diverse in-situ sensors, based on location-based instant sensing of sensors, which provides closed-loop feedbacks. The method adopts the concepts and technologies of newly developed cyber-physical systems (CPSs) to combine control with sensing, communication, and computation, takes advantage of geographical information service such as services provided by the Tianditu which is a basic geographic information service platform in China and Sensor Web services to establish geo-sensor applications, and builds well-designed human-machine interfaces (HMIs) to support online and open interactions between human beings and physical sensors through cyberspace. The method was tested with experiments carried out in two geographically distributed scientific experimental fields, Baoxie Sensor Web Experimental Field in Wuhan city and Yemaomian Landslide Monitoring Station in Three Gorges, with three typical sensors chosen as representatives using the prototype system Geospatial Sensor Web Common Service Platform. The results show that the proposed method is an open, online, closed-loop means of control.

  7. Cyber-Physical Geographical Information Service-Enabled Control of Diverse In-Situ Sensors

    PubMed Central

    Chen, Nengcheng; Xiao, Changjiang; Pu, Fangling; Wang, Xiaolei; Wang, Chao; Wang, Zhili; Gong, Jianya

    2015-01-01

    Realization of open online control of diverse in-situ sensors is a challenge. This paper proposes a Cyber-Physical Geographical Information Service-enabled method for control of diverse in-situ sensors, based on location-based instant sensing of sensors, which provides closed-loop feedbacks. The method adopts the concepts and technologies of newly developed cyber-physical systems (CPSs) to combine control with sensing, communication, and computation, takes advantage of geographical information service such as services provided by the Tianditu which is a basic geographic information service platform in China and Sensor Web services to establish geo-sensor applications, and builds well-designed human-machine interfaces (HMIs) to support online and open interactions between human beings and physical sensors through cyberspace. The method was tested with experiments carried out in two geographically distributed scientific experimental fields, Baoxie Sensor Web Experimental Field in Wuhan city and Yemaomian Landslide Monitoring Station in Three Gorges, with three typical sensors chosen as representatives using the prototype system Geospatial Sensor Web Common Service Platform. The results show that the proposed method is an open, online, closed-loop means of control. PMID:25625906

  8. CBRAIN: a web-based, distributed computing platform for collaborative neuroimaging research

    PubMed Central

    Sherif, Tarek; Rioux, Pierre; Rousseau, Marc-Etienne; Kassis, Nicolas; Beck, Natacha; Adalat, Reza; Das, Samir; Glatard, Tristan; Evans, Alan C.

    2014-01-01

    The Canadian Brain Imaging Research Platform (CBRAIN) is a web-based collaborative research platform developed in response to the challenges raised by data-heavy, compute-intensive neuroimaging research. CBRAIN offers transparent access to remote data sources, distributed computing sites, and an array of processing and visualization tools within a controlled, secure environment. Its web interface is accessible through any modern browser and uses graphical interface idioms to reduce the technical expertise required to perform large-scale computational analyses. CBRAIN's flexible meta-scheduling has allowed the incorporation of a wide range of heterogeneous computing sites, currently including nine national research High Performance Computing (HPC) centers in Canada, one in Korea, one in Germany, and several local research servers. CBRAIN leverages remote computing cycles and facilitates resource-interoperability in a transparent manner for the end-user. Compared with typical grid solutions available, our architecture was designed to be easily extendable and deployed on existing remote computing sites with no tool modification, administrative intervention, or special software/hardware configuration. As October 2013, CBRAIN serves over 200 users spread across 53 cities in 17 countries. The platform is built as a generic framework that can accept data and analysis tools from any discipline. However, its current focus is primarily on neuroimaging research and studies of neurological diseases such as Autism, Parkinson's and Alzheimer's diseases, Multiple Sclerosis as well as on normal brain structure and development. This technical report presents the CBRAIN Platform, its current deployment and usage and future direction. PMID:24904400

  9. CBRAIN: a web-based, distributed computing platform for collaborative neuroimaging research.

    PubMed

    Sherif, Tarek; Rioux, Pierre; Rousseau, Marc-Etienne; Kassis, Nicolas; Beck, Natacha; Adalat, Reza; Das, Samir; Glatard, Tristan; Evans, Alan C

    2014-01-01

    The Canadian Brain Imaging Research Platform (CBRAIN) is a web-based collaborative research platform developed in response to the challenges raised by data-heavy, compute-intensive neuroimaging research. CBRAIN offers transparent access to remote data sources, distributed computing sites, and an array of processing and visualization tools within a controlled, secure environment. Its web interface is accessible through any modern browser and uses graphical interface idioms to reduce the technical expertise required to perform large-scale computational analyses. CBRAIN's flexible meta-scheduling has allowed the incorporation of a wide range of heterogeneous computing sites, currently including nine national research High Performance Computing (HPC) centers in Canada, one in Korea, one in Germany, and several local research servers. CBRAIN leverages remote computing cycles and facilitates resource-interoperability in a transparent manner for the end-user. Compared with typical grid solutions available, our architecture was designed to be easily extendable and deployed on existing remote computing sites with no tool modification, administrative intervention, or special software/hardware configuration. As October 2013, CBRAIN serves over 200 users spread across 53 cities in 17 countries. The platform is built as a generic framework that can accept data and analysis tools from any discipline. However, its current focus is primarily on neuroimaging research and studies of neurological diseases such as Autism, Parkinson's and Alzheimer's diseases, Multiple Sclerosis as well as on normal brain structure and development. This technical report presents the CBRAIN Platform, its current deployment and usage and future direction.

  10. Acceleration of Cherenkov angle reconstruction with the new Intel Xeon/FPGA compute platform for the particle identification in the LHCb Upgrade

    NASA Astrophysics Data System (ADS)

    Faerber, Christian

    2017-10-01

    The LHCb experiment at the LHC will upgrade its detector by 2018/2019 to a ‘triggerless’ readout scheme, where all the readout electronics and several sub-detector parts will be replaced. The new readout electronics will be able to readout the detector at 40 MHz. This increases the data bandwidth from the detector down to the Event Filter farm to 40 TBit/s, which also has to be processed to select the interesting proton-proton collision for later storage. The architecture of such a computing farm, which can process this amount of data as efficiently as possible, is a challenging task and several compute accelerator technologies are being considered for use inside the new Event Filter farm. In the high performance computing sector more and more FPGA compute accelerators are used to improve the compute performance and reduce the power consumption (e.g. in the Microsoft Catapult project and Bing search engine). Also for the LHCb upgrade the usage of an experimental FPGA accelerated computing platform in the Event Building or in the Event Filter farm is being considered and therefore tested. This platform from Intel hosts a general CPU and a high performance FPGA linked via a high speed link which is for this platform a QPI link. On the FPGA an accelerator is implemented. The used system is a two socket platform from Intel with a Xeon CPU and an FPGA. The FPGA has cache-coherent memory access to the main memory of the server and can collaborate with the CPU. As a first step, a computing intensive algorithm to reconstruct Cherenkov angles for the LHCb RICH particle identification was successfully ported in Verilog to the Intel Xeon/FPGA platform and accelerated by a factor of 35. The same algorithm was ported to the Intel Xeon/FPGA platform with OpenCL. The implementation work and the performance will be compared. Also another FPGA accelerator the Nallatech 385 PCIe accelerator with the same Stratix V FPGA were tested for performance. The results show that the Intel Xeon/FPGA platforms, which are built in general for high performance computing, are also very interesting for the High Energy Physics community.

  11. Towards Reproducibility in Computational Hydrology

    NASA Astrophysics Data System (ADS)

    Hutton, Christopher; Wagener, Thorsten; Freer, Jim; Han, Dawei; Duffy, Chris; Arheimer, Berit

    2017-04-01

    Reproducibility is a foundational principle in scientific research. The ability to independently re-run an experiment helps to verify the legitimacy of individual findings, and evolve (or reject) hypotheses and models of how environmental systems function, and move them from specific circumstances to more general theory. Yet in computational hydrology (and in environmental science more widely) the code and data that produces published results are not regularly made available, and even if they are made available, there remains a multitude of generally unreported choices that an individual scientist may have made that impact the study result. This situation strongly inhibits the ability of our community to reproduce and verify previous findings, as all the information and boundary conditions required to set up a computational experiment simply cannot be reported in an article's text alone. In Hutton et al 2016 [1], we argue that a cultural change is required in the computational hydrological community, in order to advance and make more robust the process of knowledge creation and hypothesis testing. We need to adopt common standards and infrastructures to: (1) make code readable and re-useable; (2) create well-documented workflows that combine re-useable code together with data to enable published scientific findings to be reproduced; (3) make code and workflows available, easy to find, and easy to interpret, using code and code metadata repositories. To create change we argue for improved graduate training in these areas. In this talk we reflect on our progress in achieving reproducible, open science in computational hydrology, which are relevant to the broader computational geoscience community. In particular, we draw on our experience in the Switch-On (EU funded) virtual water science laboratory (http://www.switch-on-vwsl.eu/participate/), which is an open platform for collaboration in hydrological experiments (e.g. [2]). While we use computational hydrology as the example application area, we believe that our conclusions are of value to the wider environmental and geoscience community as far as the use of code and models for scientific advancement is concerned. References: [1] Hutton, C., T. Wagener, J. Freer, D. Han, C. Duffy, and B. Arheimer (2016), Most computational hydrology is not reproducible, so is it really science?, Water Resour. Res., 52, 7548-7555, doi:10.1002/2016WR019285. [2] Ceola, S., et al. (2015), Virtual laboratories: New opportunities for collaborative water science, Hydrol. Earth Syst. Sci. Discuss., 11(12), 13443-13478, doi:10.5194/hessd-11-13443-2014.

  12. The Cyborg Astrobiologist: testing a novelty detection algorithm on two mobile exploration systems at Rivas Vaciamadrid in Spain and at the Mars Desert Research Station in Utah

    NASA Astrophysics Data System (ADS)

    McGuire, P. C.; Gross, C.; Wendt, L.; Bonnici, A.; Souza-Egipsy, V.; Ormö, J.; Díaz-Martínez, E.; Foing, B. H.; Bose, R.; Walter, S.; Oesker, M.; Ontrup, J.; Haschke, R.; Ritter, H.

    2010-01-01

    In previous work, a platform was developed for testing computer-vision algorithms for robotic planetary exploration. This platform consisted of a digital video camera connected to a wearable computer for real-time processing of images at geological and astrobiological field sites. The real-time processing included image segmentation and the generation of interest points based upon uncommonness in the segmentation maps. Also in previous work, this platform for testing computer-vision algorithms has been ported to a more ergonomic alternative platform, consisting of a phone camera connected via the Global System for Mobile Communications (GSM) network to a remote-server computer. The wearable-computer platform has been tested at geological and astrobiological field sites in Spain (Rivas Vaciamadrid and Riba de Santiuste), and the phone camera has been tested at a geological field site in Malta. In this work, we (i) apply a Hopfield neural-network algorithm for novelty detection based upon colour, (ii) integrate a field-capable digital microscope on the wearable computer platform, (iii) test this novelty detection with the digital microscope at Rivas Vaciamadrid, (iv) develop a Bluetooth communication mode for the phone-camera platform, in order to allow access to a mobile processing computer at the field sites, and (v) test the novelty detection on the Bluetooth-enabled phone camera connected to a netbook computer at the Mars Desert Research Station in Utah. This systems engineering and field testing have together allowed us to develop a real-time computer-vision system that is capable, for example, of identifying lichens as novel within a series of images acquired in semi-arid desert environments. We acquired sequences of images of geologic outcrops in Utah and Spain consisting of various rock types and colours to test this algorithm. The algorithm robustly recognized previously observed units by their colour, while requiring only a single image or a few images to learn colours as familiar, demonstrating its fast learning capability.

  13. Executable research compendia in geoscience research infrastructures

    NASA Astrophysics Data System (ADS)

    Nüst, Daniel

    2017-04-01

    From generation through analysis and collaboration to communication, scientific research requires the right tools. Scientists create their own software using third party libraries and platforms. Cloud computing, Open Science, public data infrastructures, and Open Source enable scientists with unprecedented opportunites, nowadays often in a field "Computational X" (e.g. computational seismology) or X-informatics (e.g. geoinformatics) [0]. This increases complexity and generates more innovation, e.g. Environmental Research Infrastructures (environmental RIs [1]). Researchers in Computational X write their software relying on both source code (e.g. from https://github.com) and binary libraries (e.g. from package managers such as APT, https://wiki.debian.org/Apt, or CRAN, https://cran.r-project.org/). They download data from domain specific (cf. https://re3data.org) or generic (e.g. https://zenodo.org) data repositories, and deploy computations remotely (e.g. European Open Science Cloud). The results themselves are archived, given persistent identifiers, connected to other works (e.g. using https://orcid.org/), and listed in metadata catalogues. A single researcher, intentionally or not, interacts with all sub-systems of RIs: data acquisition, data access, data processing, data curation, and community support [3]. To preserve computational research [3] proposes the Executable Research Compendium (ERC), a container format closing the gap of dependency preservation by encapsulating the runtime environment. ERCs and RIs can be integrated for different uses: (i) Coherence: ERC services validate completeness, integrity and results (ii) Metadata: ERCs connect the different parts of a piece of research and faciliate discovery (iii) Exchange and Preservation: ERC as usable building blocks are the shared and archived entity (iv) Self-consistency: ERCs remove dependence on ephemeral sources (v) Execution: ERC services create and execute a packaged analysis but integrate with existing platforms for display and control These integrations are vital for capturing workflows in RIs and connect key stakeholders (scientists, publishers, librarians). They are demonstrated using developments by the DFG-funded project Opening Reproducible Research (http://o2r.info). Semi-automatic creation of ERCs based on research workflows is a core goal of the project. References [0] Tony Hey, Stewart Tansley, Kristin Tolle (eds), 2009. The Fourth Paradigm: Data-Intensive Scientific Discovery. Microsoft Research. [1] P. Martin et al., Open Information Linking for Environmental Research Infrastructures, 2015 IEEE 11th International Conference on e-Science, Munich, 2015, pp. 513-520. doi: 10.1109/eScience.2015.66 [2] Y. Chen et al., Analysis of Common Requirements for Environmental Science Research Infrastructures, The International Symposium on Grids and Clouds (ISGC) 2013, Taipei, 2013, http://pos.sissa.it/archive/conferences/179/032/ISGC [3] Opening Reproducible Research, Geophysical Research Abstracts Vol. 18, EGU2016-7396, 2016, http://meetingorganizer.copernicus.org/EGU2016/EGU2016-7396.pdf

  14. Scientific Workflows and the Sensor Web for Virtual Environmental Observatories

    NASA Astrophysics Data System (ADS)

    Simonis, I.; Vahed, A.

    2008-12-01

    Virtual observatories mature from their original domain and become common practice for earth observation research and policy building. The term Virtual Observatory originally came from the astronomical research community. Here, virtual observatories provide universal access to the available astronomical data archives of space and ground-based observatories. Further on, as those virtual observatories aim at integrating heterogeneous ressources provided by a number of participating organizations, the virtual observatory acts as a coordinating entity that strives for common data analysis techniques and tools based on common standards. The Sensor Web is on its way to become one of the major virtual observatories outside of the astronomical research community. Like the original observatory that consists of a number of telescopes, each observing a specific part of the wave spectrum and with a collection of astronomical instruments, the Sensor Web provides a multi-eyes perspective on the current, past, as well as future situation of our planet and its surrounding spheres. The current view of the Sensor Web is that of a single worldwide collaborative, coherent, consistent and consolidated sensor data collection, fusion and distribution system. The Sensor Web can perform as an extensive monitoring and sensing system that provides timely, comprehensive, continuous and multi-mode observations. This technology is key to monitoring and understanding our natural environment, including key areas such as climate change, biodiversity, or natural disasters on local, regional, and global scales. The Sensor Web concept has been well established with ongoing global research and deployment of Sensor Web middleware and standards and represents the foundation layer of systems like the Global Earth Observation System of Systems (GEOSS). The Sensor Web consists of a huge variety of physical and virtual sensors as well as observational data, made available on the Internet at standardized interfaces. All data sets and sensor communication follow well-defined abstract models and corresponding encodings, mostly developed by the OGC Sensor Web Enablement initiative. Scientific progress is currently accelerated by an emerging new concept called scientific workflows, which organize and manage complex distributed computations. A scientific workflow represents and records the highly complex processes that a domain scientist typically would follow in exploration, discovery and ultimately, transformation of raw data to publishable results. The challenge is now to integrate the benefits of scientific workflows with those provided by the Sensor Web in order to leverage all resources for scientific exploration, problem solving, and knowledge generation. Scientific workflows for the Sensor Web represent the next evolutionary step towards efficient, powerful, and flexible earth observation frameworks and platforms. Those platforms support the entire process from capturing data, sharing and integrating, to requesting additional observations. Multiple sites and organizations will participate on single platforms and scientists from different countries and organizations interact and contribute to large-scale research projects. Simultaneously, the data- and information overload becomes manageable, as multiple layers of abstraction will free scientists to deal with underlying data-, processing or storage peculiarities. The vision are automated investigation and discovery mechanisms that allow scientists to pose queries to the system, which in turn would identify potentially related resources, schedules processing tasks and assembles all parts in workflows that may satisfy the query.

  15. Platform-independent method for computer aided schematic drawings

    DOEpatents

    Vell, Jeffrey L [Slingerlands, NY; Siganporia, Darius M [Clifton Park, NY; Levy, Arthur J [Fort Lauderdale, FL

    2012-02-14

    A CAD/CAM method is disclosed for a computer system to capture and interchange schematic drawing and associated design information. The schematic drawing and design information are stored in an extensible, platform-independent format.

  16. A Big Data Platform for Storing, Accessing, Mining and Learning Geospatial Data

    NASA Astrophysics Data System (ADS)

    Yang, C. P.; Bambacus, M.; Duffy, D.; Little, M. M.

    2017-12-01

    Big Data is becoming a norm in geoscience domains. A platform that is capable to effiently manage, access, analyze, mine, and learn the big data for new information and knowledge is desired. This paper introduces our latest effort on developing such a platform based on our past years' experiences on cloud and high performance computing, analyzing big data, comparing big data containers, and mining big geospatial data for new information. The platform includes four layers: a) the bottom layer includes a computing infrastructure with proper network, computer, and storage systems; b) the 2nd layer is a cloud computing layer based on virtualization to provide on demand computing services for upper layers; c) the 3rd layer is big data containers that are customized for dealing with different types of data and functionalities; d) the 4th layer is a big data presentation layer that supports the effient management, access, analyses, mining and learning of big geospatial data.

  17. The Efficacy of the Internet-Based Blackboard Platform in Developmental Writing Classes

    ERIC Educational Resources Information Center

    Shudooh, Yusuf M.

    2016-01-01

    The application of computer-assisted platforms in writing classes is a relatively new paradigm in education. The adoption of computers-assisted writing classes is gaining ground in many western and non western universities. Numerous issues can be addressed when conducting computer-assisted classes (CAC). However, a few studies conducted to assess…

  18. [Investigation methodology and application on scientific and technological personnel of traditional Chinese medical resources based on data from Chinese scientific research paper].

    PubMed

    Li, Hai-yan; Li, Yuan-hai; Yang, Yang; Liu, Fang-zhou; Wang, Jing; Tian, Ye; Yang, Ce; Liu, Yang; Li, Meng; Sun Li-ying

    2015-12-01

    The aim of this study is to identify the present status of the scientific and technological personnel in the field of traditional Chinese medicine (TCM) resource science. Based on the data from Chinese scientific research paper, an investigation regarding the number of the personnel, the distribution, their output of paper, their scientific research teams, high-yield authors and high-cited authors was conducted. The study covers seven subfields of traditional Chinese medicine identification, quality standard, Chinese medicine cultivation, harvest processing of TCM, market development and resource protection and resource management, as well as 82 widely used Chinese medicine species, such as Ginseng and Radix Astragali. One hundred and fifteen domain authority experts were selected based on the data of high-yield authors and high-cited authors. The database system platform "Skilled Scientific and Technological Personnel in the field of Traditional Chinese Medicine Resource Science-Chinese papers" was established. This platform successfully provided the retrieval result of the personnel, output of paper, and their core research team by input the study field, year, and Chinese medicine species. The investigation provides basic data of scientific and technological personnel in the field of traditional Chinese medicine resource science for administrative agencies and also evidence for the selection of scientific and technological personnel and construction of scientific research teams.

  19. State estimation improves prospects for ocean research

    NASA Astrophysics Data System (ADS)

    Stammer, Detlef; Wunsch, C.; Fukumori, I.; Marshall, J.

    Rigorous global ocean state estimation methods can now be used to produce dynamically consistent time-varying model/data syntheses, the results of which are being used to study a variety of important scientific problems. Figure 1 shows a schematic of a complete ocean observing and synthesis system that includes global observations and state-of-the-art ocean general circulation models (OGCM) run on modern computer platforms. A global observing system is described in detail in Smith and Koblinsky [2001],and the present status of ocean modeling and anticipated improvements are addressed by Griffies et al. [2001]. Here, the focus is on the third component of state estimation: the synthesis of the observations and a model into a unified, dynamically consistent estimate.

  20. Aerosol Polarimetry Sensor (APS): Design Summary, Performance and Potential Modifications

    NASA Technical Reports Server (NTRS)

    Cairns, Brian

    2014-01-01

    APS is a mature design that has already been built and has a TRL of 7. Algorithmic and retrieval capabilities continue to improve and make better and more sophisticated used of the data. Adjoint solutions, both in one dimensional and three dimensional are computationally efficient and should be the preferred implementation for the calculation of Jacobians (one dimensional), or cost-function gradients (three dimensional). Adjoint solutions necessarily provide resolution of internal fields and simplify incorporation of active measurements in retrievals, which will be necessary for a future ACE mission. Its best to test these capabilities when you know the answer: OSSEs that are well constrained observationally provide the best place to test future multi-instrument platform capabilities and ensure capabilities will meet scientific needs.

  1. Power Efficient Hardware Architecture of SHA-1 Algorithm for Trusted Mobile Computing

    NASA Astrophysics Data System (ADS)

    Kim, Mooseop; Ryou, Jaecheol

    The Trusted Mobile Platform (TMP) is developed and promoted by the Trusted Computing Group (TCG), which is an industry standard body to enhance the security of the mobile computing environment. The built-in SHA-1 engine in TMP is one of the most important circuit blocks and contributes the performance of the whole platform because it is used as key primitives supporting platform integrity and command authentication. Mobile platforms have very stringent limitations with respect to available power, physical circuit area, and cost. Therefore special architecture and design methods for low power SHA-1 circuit are required. In this paper, we present a novel and efficient hardware architecture of low power SHA-1 design for TMP. Our low power SHA-1 hardware can compute 512-bit data block using less than 7,000 gates and has a power consumption about 1.1 mA on a 0.25μm CMOS process.

  2. An Evaluation of Architectural Platforms for Parallel Navier-Stokes Computations

    NASA Technical Reports Server (NTRS)

    Jayasimha, D. N.; Hayder, M. E.; Pillay, S. K.

    1996-01-01

    We study the computational, communication, and scalability characteristics of a computational fluid dynamics application, which solves the time accurate flow field of a jet using the compressible Navier-Stokes equations, on a variety of parallel architecture platforms. The platforms chosen for this study are a cluster of workstations (the LACE experimental testbed at NASA Lewis), a shared memory multiprocessor (the Cray YMP), and distributed memory multiprocessors with different topologies - the IBM SP and the Cray T3D. We investigate the impact of various networks connecting the cluster of workstations on the performance of the application and the overheads induced by popular message passing libraries used for parallelization. The work also highlights the importance of matching the memory bandwidth to the processor speed for good single processor performance. By studying the performance of an application on a variety of architectures, we are able to point out the strengths and weaknesses of each of the example computing platforms.

  3. Parallelizing Navier-Stokes Computations on a Variety of Architectural Platforms

    NASA Technical Reports Server (NTRS)

    Jayasimha, D. N.; Hayder, M. E.; Pillay, S. K.

    1997-01-01

    We study the computational, communication, and scalability characteristics of a Computational Fluid Dynamics application, which solves the time accurate flow field of a jet using the compressible Navier-Stokes equations, on a variety of parallel architectural platforms. The platforms chosen for this study are a cluster of workstations (the LACE experimental testbed at NASA Lewis), a shared memory multiprocessor (the Cray YMP), distributed memory multiprocessors with different topologies-the IBM SP and the Cray T3D. We investigate the impact of various networks, connecting the cluster of workstations, on the performance of the application and the overheads induced by popular message passing libraries used for parallelization. The work also highlights the importance of matching the memory bandwidth to the processor speed for good single processor performance. By studying the performance of an application on a variety of architectures, we are able to point out the strengths and weaknesses of each of the example computing platforms.

  4. A Web Tool for Research in Nonlinear Optics

    NASA Astrophysics Data System (ADS)

    Prikhod'ko, Nikolay V.; Abramovsky, Viktor A.; Abramovskaya, Natalia V.; Demichev, Andrey P.; Kryukov, Alexandr P.; Polyakov, Stanislav P.

    2016-02-01

    This paper presents a project of developing the web platform called WebNLO for computer modeling of nonlinear optics phenomena. We discuss a general scheme of the platform and a model for interaction between the platform modules. The platform is built as a set of interacting RESTful web services (SaaS approach). Users can interact with the platform through a web browser or command line interface. Such a resource has no analogues in the field of nonlinear optics and will be created for the first time therefore allowing researchers to access high-performance computing resources that will significantly reduce the cost of the research and development process.

  5. Continuous measurement of breast tumour hormone receptor expression: a comparison of two computational pathology platforms.

    PubMed

    Ahern, Thomas P; Beck, Andrew H; Rosner, Bernard A; Glass, Ben; Frieling, Gretchen; Collins, Laura C; Tamimi, Rulla M

    2017-05-01

    Computational pathology platforms incorporate digital microscopy with sophisticated image analysis to permit rapid, continuous measurement of protein expression. We compared two computational pathology platforms on their measurement of breast tumour oestrogen receptor (ER) and progesterone receptor (PR) expression. Breast tumour microarrays from the Nurses' Health Study were stained for ER (n=592) and PR (n=187). One expert pathologist scored cases as positive if ≥1% of tumour nuclei exhibited stain. ER and PR were then measured with the Definiens Tissue Studio (automated) and Aperio Digital Pathology (user-supervised) platforms. Platform-specific measurements were compared using boxplots, scatter plots and correlation statistics. Classification of ER and PR positivity by platform-specific measurements was evaluated with areas under receiver operating characteristic curves (AUC) from univariable logistic regression models, using expert pathologist classification as the standard. Both platforms showed considerable overlap in continuous measurements of ER and PR between positive and negative groups classified by expert pathologist. Platform-specific measurements were strongly and positively correlated with one another (r≥0.77). The user-supervised Aperio workflow performed slightly better than the automated Definiens workflow at classifying ER positivity (AUC Aperio =0.97; AUC Definiens =0.90; difference=0.07, 95% CI 0.05 to 0.09) and PR positivity (AUC Aperio =0.94; AUC Definiens =0.87; difference=0.07, 95% CI 0.03 to 0.12). Paired hormone receptor expression measurements from two different computational pathology platforms agreed well with one another. The user-supervised workflow yielded better classification accuracy than the automated workflow. Appropriately validated computational pathology algorithms enrich molecular epidemiology studies with continuous protein expression data and may accelerate tumour biomarker discovery. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/.

  6. Energize New Mexico - Integration of Diverse Energy-Related Research Data into an Interoperable Geospatial Infrastructure and National Data Repositories

    NASA Astrophysics Data System (ADS)

    Hudspeth, W. B.; Barrett, H.; Diller, S.; Valentin, G.

    2016-12-01

    Energize is New Mexico's Experimental Program to Stimulate Competitive Research (NM EPSCoR), funded by the NSF with a focus on building capacity to conduct scientific research. Energize New Mexico leverages the work of faculty and students from NM universities and colleges to provide the tools necessary to a quantitative, science-driven discussion of the state's water policy options and to realize New Mexico's potential for sustainable energy development. This presentation discusses the architectural details of NM EPSCoR's collaborative data management system, GSToRE, and how New Mexico researchers use it to share and analyze diverse research data, with the goal of attaining sustainable energy development in the state.The Earth Data Analysis Center (EDAC) at The University of New Mexico leads the development of computational interoperability capacity that allows the wide use and sharing of energy-related data among NM EPSCoR researchers. Data from a variety of research disciplines is stored and maintained in EDAC's Geographic Storage, Transformation and Retrieval Engine (GSToRE), a distributed platform for large-scale vector and raster data discovery, subsetting, and delivery via Web services that are based on Open Geospatial Consortium (OGC) and REST Web-service standards. Researchers upload and register scientific datasets using a front-end client that collects the critical metadata. In addition, researchers have the option to register their datasets with DataONE, a national, community-driven project that provides access to data across multiple member repositories. The GSToRE platform maintains a searchable, core collection of metadata elements that can be used to deliver metadata in multiple formats, including ISO 19115-2/19139 and FGDC CSDGM. Stored metadata elements also permit the platform to automate the registration of Energize datasets into DataONE, once the datasets are approved for release to the public.

  7. Warp-X: A new exascale computing platform for beam–plasma simulations

    DOE PAGES

    Vay, J. -L.; Almgren, A.; Bell, J.; ...

    2018-01-31

    Turning the current experimental plasma accelerator state-of-the-art from a promising technology into mainstream scientific tools depends critically on high-performance, high-fidelity modeling of complex processes that develop over a wide range of space and time scales. As part of the U.S. Department of Energy's Exascale Computing Project, a team from Lawrence Berkeley National Laboratory, in collaboration with teams from SLAC National Accelerator Laboratory and Lawrence Livermore National Laboratory, is developing a new plasma accelerator simulation tool that will harness the power of future exascale supercomputers for high-performance modeling of plasma accelerators. We present the various components of the codes such asmore » the new Particle-In-Cell Scalable Application Resource (PICSAR) and the redesigned adaptive mesh refinement library AMReX, which are combined with redesigned elements of the Warp code, in the new WarpX software. Lastly, the code structure, status, early examples of applications and plans are discussed.« less

  8. JINR cloud infrastructure evolution

    NASA Astrophysics Data System (ADS)

    Baranov, A. V.; Balashov, N. A.; Kutovskiy, N. A.; Semenov, R. N.

    2016-09-01

    To fulfil JINR commitments in different national and international projects related to the use of modern information technologies such as cloud and grid computing as well as to provide a modern tool for JINR users for their scientific research a cloud infrastructure was deployed at Laboratory of Information Technologies of Joint Institute for Nuclear Research. OpenNebula software was chosen as a cloud platform. Initially it was set up in simple configuration with single front-end host and a few cloud nodes. Some custom development was done to tune JINR cloud installation to fit local needs: web form in the cloud web-interface for resources request, a menu item with cloud utilization statistics, user authentication via Kerberos, custom driver for OpenVZ containers. Because of high demand in that cloud service and its resources over-utilization it was re-designed to cover increasing users' needs in capacity, availability and reliability. Recently a new cloud instance has been deployed in high-availability configuration with distributed network file system and additional computing power.

  9. Warp-X: A new exascale computing platform for beam–plasma simulations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vay, J. -L.; Almgren, A.; Bell, J.

    Turning the current experimental plasma accelerator state-of-the-art from a promising technology into mainstream scientific tools depends critically on high-performance, high-fidelity modeling of complex processes that develop over a wide range of space and time scales. As part of the U.S. Department of Energy's Exascale Computing Project, a team from Lawrence Berkeley National Laboratory, in collaboration with teams from SLAC National Accelerator Laboratory and Lawrence Livermore National Laboratory, is developing a new plasma accelerator simulation tool that will harness the power of future exascale supercomputers for high-performance modeling of plasma accelerators. We present the various components of the codes such asmore » the new Particle-In-Cell Scalable Application Resource (PICSAR) and the redesigned adaptive mesh refinement library AMReX, which are combined with redesigned elements of the Warp code, in the new WarpX software. Lastly, the code structure, status, early examples of applications and plans are discussed.« less

  10. Web-based analysis and publication of flow cytometry experiments.

    PubMed

    Kotecha, Nikesh; Krutzik, Peter O; Irish, Jonathan M

    2010-07-01

    Cytobank is a Web-based application for storage, analysis, and sharing of flow cytometry experiments. Researchers use a Web browser to log in and use a wide range of tools developed for basic and advanced flow cytometry. In addition to providing access to standard cytometry tools from any computer, Cytobank creates a platform and community for developing new analysis and publication tools. Figure layouts created on Cytobank are designed to allow transparent access to the underlying experiment annotation and data processing steps. Since all flow cytometry files and analysis data are stored on a central server, experiments and figures can be viewed or edited by anyone with the proper permission, from any computer with Internet access. Once a primary researcher has performed the initial analysis of the data, collaborators can engage in experiment analysis and make their own figure layouts using the gated, compensated experiment files. Cytobank is available to the scientific community at http://www.cytobank.org. (c) 2010 by John Wiley & Sons, Inc.

  11. Web-Based Analysis and Publication of Flow Cytometry Experiments

    PubMed Central

    Kotecha, Nikesh; Krutzik, Peter O.; Irish, Jonathan M.

    2014-01-01

    Cytobank is a web-based application for storage, analysis, and sharing of flow cytometry experiments. Researchers use a web browser to log in and use a wide range of tools developed for basic and advanced flow cytometry. In addition to providing access to standard cytometry tools from any computer, Cytobank creates a platform and community for developing new analysis and publication tools. Figure layouts created on Cytobank are designed to allow transparent access to the underlying experiment annotation and data processing steps. Since all flow cytometry files and analysis data are stored on a central server, experiments and figures can be viewed or edited by anyone with the proper permissions from any computer with Internet access. Once a primary researcher has performed the initial analysis of the data, collaborators can engage in experiment analysis and make their own figure layouts using the gated, compensated experiment files. Cytobank is available to the scientific community at www.cytobank.org PMID:20578106

  12. GO2OGS 1.0: a versatile workflow to integrate complex geological information with fault data into numerical simulation models

    NASA Astrophysics Data System (ADS)

    Fischer, T.; Naumov, D.; Sattler, S.; Kolditz, O.; Walther, M.

    2015-11-01

    We offer a versatile workflow to convert geological models built with the ParadigmTM GOCAD© (Geological Object Computer Aided Design) software into the open-source VTU (Visualization Toolkit unstructured grid) format for usage in numerical simulation models. Tackling relevant scientific questions or engineering tasks often involves multidisciplinary approaches. Conversion workflows are needed as a way of communication between the diverse tools of the various disciplines. Our approach offers an open-source, platform-independent, robust, and comprehensible method that is potentially useful for a multitude of environmental studies. With two application examples in the Thuringian Syncline, we show how a heterogeneous geological GOCAD model including multiple layers and faults can be used for numerical groundwater flow modeling, in our case employing the OpenGeoSys open-source numerical toolbox for groundwater flow simulations. The presented workflow offers the chance to incorporate increasingly detailed data, utilizing the growing availability of computational power to simulate numerical models.

  13. Solving lattice QCD systems of equations using mixed precision solvers on GPUs

    NASA Astrophysics Data System (ADS)

    Clark, M. A.; Babich, R.; Barros, K.; Brower, R. C.; Rebbi, C.

    2010-09-01

    Modern graphics hardware is designed for highly parallel numerical tasks and promises significant cost and performance benefits for many scientific applications. One such application is lattice quantum chromodynamics (lattice QCD), where the main computational challenge is to efficiently solve the discretized Dirac equation in the presence of an SU(3) gauge field. Using NVIDIA's CUDA platform we have implemented a Wilson-Dirac sparse matrix-vector product that performs at up to 40, 135 and 212 Gflops for double, single and half precision respectively on NVIDIA's GeForce GTX 280 GPU. We have developed a new mixed precision approach for Krylov solvers using reliable updates which allows for full double precision accuracy while using only single or half precision arithmetic for the bulk of the computation. The resulting BiCGstab and CG solvers run in excess of 100 Gflops and, in terms of iterations until convergence, perform better than the usual defect-correction approach for mixed precision.

  14. 78 FR 27974 - Proposed Collection; 60-Day Comment Request: National Cancer Institute (NCI) Alliance for...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-05-13

    ... Comment Request: National Cancer Institute (NCI) Alliance for Nanotechnology in Cancer Platform... Nanotechnology Research, National Cancer Institute, 31 Center Drive, Bldg. 31 A, Rm. 10A52, Bethesda, MD 20892 or... Institute (NCI) Alliance for Nanotechnology in Cancer Platform Partnership Scientific Progress Reports, 0925...

  15. QuantifyMe: An Open-Source Automated Single-Case Experimental Design Platform.

    PubMed

    Taylor, Sara; Sano, Akane; Ferguson, Craig; Mohan, Akshay; Picard, Rosalind W

    2018-04-05

    Smartphones and wearable sensors have enabled unprecedented data collection, with many products now providing feedback to users about recommended step counts or sleep durations. However, these recommendations do not provide personalized insights that have been shown to be best suited for a specific individual. A scientific way to find individualized recommendations and causal links is to conduct experiments using single-case experimental design; however, properly designed single-case experiments are not easy to conduct on oneself. We designed, developed, and evaluated a novel platform, QuantifyMe, for novice self-experimenters to conduct proper-methodology single-case self-experiments in an automated and scientific manner using their smartphones. We provide software for the platform that we used (available for free on GitHub), which provides the methodological elements to run many kinds of customized studies. In this work, we evaluate its use with four different kinds of personalized investigations, examining how variables such as sleep duration and regularity, activity, and leisure time affect personal happiness, stress, productivity, and sleep efficiency. We conducted a six-week pilot study ( N = 13) to evaluate QuantifyMe. We describe the lessons learned developing the platform and recommendations for its improvement, as well as its potential for enabling personalized insights to be scientifically evaluated in many individuals, reducing the high administrative cost for advancing human health and wellbeing.

  16. QuantifyMe: An Open-Source Automated Single-Case Experimental Design Platform †

    PubMed Central

    Sano, Akane; Ferguson, Craig; Mohan, Akshay; Picard, Rosalind W.

    2018-01-01

    Smartphones and wearable sensors have enabled unprecedented data collection, with many products now providing feedback to users about recommended step counts or sleep durations. However, these recommendations do not provide personalized insights that have been shown to be best suited for a specific individual. A scientific way to find individualized recommendations and causal links is to conduct experiments using single-case experimental design; however, properly designed single-case experiments are not easy to conduct on oneself. We designed, developed, and evaluated a novel platform, QuantifyMe, for novice self-experimenters to conduct proper-methodology single-case self-experiments in an automated and scientific manner using their smartphones. We provide software for the platform that we used (available for free on GitHub), which provides the methodological elements to run many kinds of customized studies. In this work, we evaluate its use with four different kinds of personalized investigations, examining how variables such as sleep duration and regularity, activity, and leisure time affect personal happiness, stress, productivity, and sleep efficiency. We conducted a six-week pilot study (N = 13) to evaluate QuantifyMe. We describe the lessons learned developing the platform and recommendations for its improvement, as well as its potential for enabling personalized insights to be scientifically evaluated in many individuals, reducing the high administrative cost for advancing human health and wellbeing. PMID:29621133

  17. SysFinder: A customized platform for search, comparison and assisted design of appropriate animal models based on systematic similarity.

    PubMed

    Yang, Shuang; Zhang, Guoqing; Liu, Wan; Wang, Zhen; Zhang, Jifeng; Yang, Dongshan; Chen, Y Eugene; Sun, Hong; Li, Yixue

    2017-05-20

    Animal models are increasingly gaining values by cross-comparisons of response or resistance to clinical agents used for patients. However, many disease mechanisms and drug effects generated from animal models are not transferable to human. To address these issues, we developed SysFinder (http://lifecenter.sgst.cn/SysFinder), a platform for scientists to find appropriate animal models for translational research. SysFinder offers a "topic-centered" approach for systematic comparisons of human genes, whose functions are involved in a specific scientific topic, to the corresponding homologous genes of animal models. Scientific topic can be a certain disease, drug, gene function or biological pathway. SysFinder calculates multi-level similarity indexes to evaluate the similarities between human and animal models in specified scientific topics. Meanwhile, SysFinder offers species-specific information to investigate the differences in molecular mechanisms between humans and animal models. Furthermore, SysFinder provides a user-friendly platform for determination of short guide RNAs (sgRNAs) and homology arms to design a new animal model. Case studies illustrate the ability of SysFinder in helping experimental scientists. SysFinder is a useful platform for experimental scientists to carry out their research in the human molecular mechanisms. Copyright © 2017 Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, and Genetics Society of China. Published by Elsevier Ltd. All rights reserved.

  18. Design of platform for removing screws from LCD display shields

    NASA Astrophysics Data System (ADS)

    Tu, Zimei; Qin, Qin; Dou, Jianfang; Zhu, Dongdong

    2017-11-01

    Removing the screws on the sides of a shield is a necessary process in disassembling a computer LCD display. To solve this issue, a platform has been designed for removing the screws on display shields. This platform uses virtual instrument technology with LabVIEW as the development environment to design the mechanical structure with the technologies of motion control, human-computer interaction and target recognition. This platform removes the screws from the sides of the shield of an LCD display mechanically thus to guarantee follow-up separation and recycle.

  19. Using CyberShake Workflows to Manage Big Seismic Hazard Data on Large-Scale Open-Science HPC Resources

    NASA Astrophysics Data System (ADS)

    Callaghan, S.; Maechling, P. J.; Juve, G.; Vahi, K.; Deelman, E.; Jordan, T. H.

    2015-12-01

    The CyberShake computational platform, developed by the Southern California Earthquake Center (SCEC), is an integrated collection of scientific software and middleware that performs 3D physics-based probabilistic seismic hazard analysis (PSHA) for Southern California. CyberShake integrates large-scale and high-throughput research codes to produce probabilistic seismic hazard curves for individual locations of interest and hazard maps for an entire region. A recent CyberShake calculation produced about 500,000 two-component seismograms for each of 336 locations, resulting in over 300 million synthetic seismograms in a Los Angeles-area probabilistic seismic hazard model. CyberShake calculations require a series of scientific software programs. Early computational stages produce data used as inputs by later stages, so we describe CyberShake calculations using a workflow definition language. Scientific workflow tools automate and manage the input and output data and enable remote job execution on large-scale HPC systems. To satisfy the requests of broad impact users of CyberShake data, such as seismologists, utility companies, and building code engineers, we successfully completed CyberShake Study 15.4 in April and May 2015, calculating a 1 Hz urban seismic hazard map for Los Angeles. We distributed the calculation between the NSF Track 1 system NCSA Blue Waters, the DOE Leadership-class system OLCF Titan, and USC's Center for High Performance Computing. This study ran for over 5 weeks, burning about 1.1 million node-hours and producing over half a petabyte of data. The CyberShake Study 15.4 results doubled the maximum simulated seismic frequency from 0.5 Hz to 1.0 Hz as compared to previous studies, representing a factor of 16 increase in computational complexity. We will describe how our workflow tools supported splitting the calculation across multiple systems. We will explain how we modified CyberShake software components, including GPU implementations and migrating from file-based communication to MPI messaging, to greatly reduce the I/O demands and node-hour requirements of CyberShake. We will also present performance metrics from CyberShake Study 15.4, and discuss challenges that producers of Big Data on open-science HPC resources face moving forward.

  20. MyGeoHub: A Collaborative Geospatial Research and Education Platform

    NASA Astrophysics Data System (ADS)

    Kalyanam, R.; Zhao, L.; Biehl, L. L.; Song, C. X.; Merwade, V.; Villoria, N.

    2017-12-01

    Scientific research is increasingly collaborative and globally distributed; research groups now rely on web-based scientific tools and data management systems to simplify their day-to-day collaborative workflows. However, such tools often lack seamless interfaces, requiring researchers to contend with manual data transfers, annotation and sharing. MyGeoHub is a web platform that supports out-of-the-box, seamless workflows involving data ingestion, metadata extraction, analysis, sharing and publication. MyGeoHub is built on the HUBzero cyberinfrastructure platform and adds general-purpose software building blocks (GABBs), for geospatial data management, visualization and analysis. A data management building block iData, processes geospatial files, extracting metadata for keyword and map-based search while enabling quick previews. iData is pervasive, allowing access through a web interface, scientific tools on MyGeoHub or even mobile field devices via a data service API. GABBs includes a Python map library as well as map widgets that in a few lines of code, generate complete geospatial visualization web interfaces for scientific tools. GABBs also includes powerful tools that can be used with no programming effort. The GeoBuilder tool provides an intuitive wizard for importing multi-variable, geo-located time series data (typical of sensor readings, GPS trackers) to build visualizations supporting data filtering and plotting. MyGeoHub has been used in tutorials at scientific conferences and educational activities for K-12 students. MyGeoHub is also constantly evolving; the recent addition of Jupyter and R Shiny notebook environments enable reproducible, richly interactive geospatial analyses and applications ranging from simple pre-processing to published tools. MyGeoHub is not a monolithic geospatial science gateway, instead it supports diverse needs ranging from just a feature-rich data management system, to complex scientific tools and workflows.

  1. Distributed geospatial model sharing based on open interoperability standards

    USGS Publications Warehouse

    Feng, Min; Liu, Shuguang; Euliss, Ned H.; Fang, Yin

    2009-01-01

    Numerous geospatial computational models have been developed based on sound principles and published in journals or presented in conferences. However modelers have made few advances in the development of computable modules that facilitate sharing during model development or utilization. Constraints hampering development of model sharing technology includes limitations on computing, storage, and connectivity; traditional stand-alone and closed network systems cannot fully support sharing and integrating geospatial models. To address this need, we have identified methods for sharing geospatial computational models using Service Oriented Architecture (SOA) techniques and open geospatial standards. The service-oriented model sharing service is accessible using any tools or systems compliant with open geospatial standards, making it possible to utilize vast scientific resources available from around the world to solve highly sophisticated application problems. The methods also allow model services to be empowered by diverse computational devices and technologies, such as portable devices and GRID computing infrastructures. Based on the generic and abstract operations and data structures required for Web Processing Service (WPS) standards, we developed an interactive interface for model sharing to help reduce interoperability problems for model use. Geospatial computational models are shared on model services, where the computational processes provided by models can be accessed through tools and systems compliant with WPS. We developed a platform to help modelers publish individual models in a simplified and efficient way. Finally, we illustrate our technique using wetland hydrological models we developed for the prairie pothole region of North America.

  2. Causal biological network database: a comprehensive platform of causal biological network models focused on the pulmonary and vascular systems.

    PubMed

    Boué, Stéphanie; Talikka, Marja; Westra, Jurjen Willem; Hayes, William; Di Fabio, Anselmo; Park, Jennifer; Schlage, Walter K; Sewer, Alain; Fields, Brett; Ansari, Sam; Martin, Florian; Veljkovic, Emilija; Kenney, Renee; Peitsch, Manuel C; Hoeng, Julia

    2015-01-01

    With the wealth of publications and data available, powerful and transparent computational approaches are required to represent measured data and scientific knowledge in a computable and searchable format. We developed a set of biological network models, scripted in the Biological Expression Language, that reflect causal signaling pathways across a wide range of biological processes, including cell fate, cell stress, cell proliferation, inflammation, tissue repair and angiogenesis in the pulmonary and cardiovascular context. This comprehensive collection of networks is now freely available to the scientific community in a centralized web-based repository, the Causal Biological Network database, which is composed of over 120 manually curated and well annotated biological network models and can be accessed at http://causalbionet.com. The website accesses a MongoDB, which stores all versions of the networks as JSON objects and allows users to search for genes, proteins, biological processes, small molecules and keywords in the network descriptions to retrieve biological networks of interest. The content of the networks can be visualized and browsed. Nodes and edges can be filtered and all supporting evidence for the edges can be browsed and is linked to the original articles in PubMed. Moreover, networks may be downloaded for further visualization and evaluation. Database URL: http://causalbionet.com © The Author(s) 2015. Published by Oxford University Press.

  3. On the performances of computer vision algorithms on mobile platforms

    NASA Astrophysics Data System (ADS)

    Battiato, S.; Farinella, G. M.; Messina, E.; Puglisi, G.; Ravì, D.; Capra, A.; Tomaselli, V.

    2012-01-01

    Computer Vision enables mobile devices to extract the meaning of the observed scene from the information acquired with the onboard sensor cameras. Nowadays, there is a growing interest in Computer Vision algorithms able to work on mobile platform (e.g., phone camera, point-and-shot-camera, etc.). Indeed, bringing Computer Vision capabilities on mobile devices open new opportunities in different application contexts. The implementation of vision algorithms on mobile devices is still a challenging task since these devices have poor image sensors and optics as well as limited processing power. In this paper we have considered different algorithms covering classic Computer Vision tasks: keypoint extraction, face detection, image segmentation. Several tests have been done to compare the performances of the involved mobile platforms: Nokia N900, LG Optimus One, Samsung Galaxy SII.

  4. Elastic Cloud Computing Architecture and System for Heterogeneous Spatiotemporal Computing

    NASA Astrophysics Data System (ADS)

    Shi, X.

    2017-10-01

    Spatiotemporal computation implements a variety of different algorithms. When big data are involved, desktop computer or standalone application may not be able to complete the computation task due to limited memory and computing power. Now that a variety of hardware accelerators and computing platforms are available to improve the performance of geocomputation, different algorithms may have different behavior on different computing infrastructure and platforms. Some are perfect for implementation on a cluster of graphics processing units (GPUs), while GPUs may not be useful on certain kind of spatiotemporal computation. This is the same situation in utilizing a cluster of Intel's many-integrated-core (MIC) or Xeon Phi, as well as Hadoop or Spark platforms, to handle big spatiotemporal data. Furthermore, considering the energy efficiency requirement in general computation, Field Programmable Gate Array (FPGA) may be a better solution for better energy efficiency when the performance of computation could be similar or better than GPUs and MICs. It is expected that an elastic cloud computing architecture and system that integrates all of GPUs, MICs, and FPGAs could be developed and deployed to support spatiotemporal computing over heterogeneous data types and computational problems.

  5. ZBIT Bioinformatics Toolbox: A Web-Platform for Systems Biology and Expression Data Analysis

    PubMed Central

    Römer, Michael; Eichner, Johannes; Dräger, Andreas; Wrzodek, Clemens; Wrzodek, Finja; Zell, Andreas

    2016-01-01

    Bioinformatics analysis has become an integral part of research in biology. However, installation and use of scientific software can be difficult and often requires technical expert knowledge. Reasons are dependencies on certain operating systems or required third-party libraries, missing graphical user interfaces and documentation, or nonstandard input and output formats. In order to make bioinformatics software easily accessible to researchers, we here present a web-based platform. The Center for Bioinformatics Tuebingen (ZBIT) Bioinformatics Toolbox provides web-based access to a collection of bioinformatics tools developed for systems biology, protein sequence annotation, and expression data analysis. Currently, the collection encompasses software for conversion and processing of community standards SBML and BioPAX, transcription factor analysis, and analysis of microarray data from transcriptomics and proteomics studies. All tools are hosted on a customized Galaxy instance and run on a dedicated computation cluster. Users only need a web browser and an active internet connection in order to benefit from this service. The web platform is designed to facilitate the usage of the bioinformatics tools for researchers without advanced technical background. Users can combine tools for complex analyses or use predefined, customizable workflows. All results are stored persistently and reproducible. For each tool, we provide documentation, tutorials, and example data to maximize usability. The ZBIT Bioinformatics Toolbox is freely available at https://webservices.cs.uni-tuebingen.de/. PMID:26882475

  6. Pulseq-Graphical Programming Interface: Open source visual environment for prototyping pulse sequences and integrated magnetic resonance imaging algorithm development.

    PubMed

    Ravi, Keerthi Sravan; Potdar, Sneha; Poojar, Pavan; Reddy, Ashok Kumar; Kroboth, Stefan; Nielsen, Jon-Fredrik; Zaitsev, Maxim; Venkatesan, Ramesh; Geethanath, Sairam

    2018-03-11

    To provide a single open-source platform for comprehensive MR algorithm development inclusive of simulations, pulse sequence design and deployment, reconstruction, and image analysis. We integrated the "Pulseq" platform for vendor-independent pulse programming with Graphical Programming Interface (GPI), a scientific development environment based on Python. Our integrated platform, Pulseq-GPI, permits sequences to be defined visually and exported to the Pulseq file format for execution on an MR scanner. For comparison, Pulseq files using either MATLAB only ("MATLAB-Pulseq") or Python only ("Python-Pulseq") were generated. We demonstrated three fundamental sequences on a 1.5 T scanner. Execution times of the three variants of implementation were compared on two operating systems. In vitro phantom images indicate equivalence with the vendor supplied implementations and MATLAB-Pulseq. The examples demonstrated in this work illustrate the unifying capability of Pulseq-GPI. The execution times of all the three implementations were fast (a few seconds). The software is capable of user-interface based development and/or command line programming. The tool demonstrated here, Pulseq-GPI, integrates the open-source simulation, reconstruction and analysis capabilities of GPI Lab with the pulse sequence design and deployment features of Pulseq. Current and future work includes providing an ISMRMRD interface and incorporating Specific Absorption Ratio and Peripheral Nerve Stimulation computations. Copyright © 2018 Elsevier Inc. All rights reserved.

  7. XVis: Visualization for the Extreme-Scale Scientific-Computation Ecosystem: Mid-year report FY17 Q2

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Moreland, Kenneth D.; Pugmire, David; Rogers, David

    The XVis project brings together the key elements of research to enable scientific discovery at extreme scale. Scientific computing will no longer be purely about how fast computations can be performed. Energy constraints, processor changes, and I/O limitations necessitate significant changes in both the software applications used in scientific computation and the ways in which scientists use them. Components for modeling, simulation, analysis, and visualization must work together in a computational ecosystem, rather than working independently as they have in the past. This project provides the necessary research and infrastructure for scientific discovery in this new computational ecosystem by addressingmore » four interlocking challenges: emerging processor technology, in situ integration, usability, and proxy analysis.« less

  8. XVis: Visualization for the Extreme-Scale Scientific-Computation Ecosystem: Year-end report FY17.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Moreland, Kenneth D.; Pugmire, David; Rogers, David

    The XVis project brings together the key elements of research to enable scientific discovery at extreme scale. Scientific computing will no longer be purely about how fast computations can be performed. Energy constraints, processor changes, and I/O limitations necessitate significant changes in both the software applications used in scientific computation and the ways in which scientists use them. Components for modeling, simulation, analysis, and visualization must work together in a computational ecosystem, rather than working independently as they have in the past. This project provides the necessary research and infrastructure for scientific discovery in this new computational ecosystem by addressingmore » four interlocking challenges: emerging processor technology, in situ integration, usability, and proxy analysis.« less

  9. XVis: Visualization for the Extreme-Scale Scientific-Computation Ecosystem. Mid-year report FY16 Q2

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Moreland, Kenneth D.; Sewell, Christopher; Childs, Hank

    The XVis project brings together the key elements of research to enable scientific discovery at extreme scale. Scientific computing will no longer be purely about how fast computations can be performed. Energy constraints, processor changes, and I/O limitations necessitate significant changes in both the software applications used in scientific computation and the ways in which scientists use them. Components for modeling, simulation, analysis, and visualization must work together in a computational ecosystem, rather than working independently as they have in the past. This project provides the necessary research and infrastructure for scientific discovery in this new computational ecosystem by addressingmore » four interlocking challenges: emerging processor technology, in situ integration, usability, and proxy analysis.« less

  10. XVis: Visualization for the Extreme-Scale Scientific-Computation Ecosystem: Year-end report FY15 Q4.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Moreland, Kenneth D.; Sewell, Christopher; Childs, Hank

    The XVis project brings together the key elements of research to enable scientific discovery at extreme scale. Scientific computing will no longer be purely about how fast computations can be performed. Energy constraints, processor changes, and I/O limitations necessitate significant changes in both the software applications used in scientific computation and the ways in which scientists use them. Components for modeling, simulation, analysis, and visualization must work together in a computational ecosystem, rather than working independently as they have in the past. This project provides the necessary research and infrastructure for scientific discovery in this new computational ecosystem by addressingmore » four interlocking challenges: emerging processor technology, in situ integration, usability, and proxy analysis.« less

  11. A Design Comparison of Atmospheric Flight Vehicles for the Exploration of Titan

    NASA Technical Reports Server (NTRS)

    Gasbarre, Joseph F.; Wright, Henry S.; Lewis, Mark J.

    2005-01-01

    Titan, the largest moon of Saturn, is one of the most scientifically interesting locations in the Solar System. With a very cold atmosphere that is five times as dense as Earth s, and one and a half times the surface pressure, it also provides one of the most aeronautically fascinating environments known to humankind. While this may seem the ideal place to attempt atmospheric flight, many challenges await any vehicle attempting to navigate through it. In addition to these physical challenges, any scientific exploration mission to Titan will most likely have several operational constraints. One difficult constraint is the desire for a global survey of the planet and thus, a long duration flight within the atmosphere. Since many of the scientific measurements that would be unique to a vehicle flying through the atmosphere (as opposed to an orbiting spacecraft) desire near-surface positioning of their associated instruments, the vehicle must also be able to fly within the first scale height of the atmosphere. Another difficult constraint is that interaction with the surface, whether by landing or dropped probe, is also highly desirable from a scientific perspective. Two common atmospheric flight platforms that might be used for this mission are the airplane and airship. Under the assumption of a mission architecture that would involve an orbiting relay spacecraft delivered via aerocapture and an atmospheric flight vehicle delivered via direct entry, designs were developed for both platforms that are unique to Titan. Consequently, after a viable design was achieved for each platform, their advantages and disadvantages were compared. This comparison included such factors as deployment risk, surface interaction capability, mass, and design heritage. When considering all factors, the preferred candidate platform for a global survey of Titan is an airship.

  12. Discovery and analysis of time delay sources in the USGS personal computer data collection platform (PCDCP) system

    USGS Publications Warehouse

    White, Timothy C.; Sauter, Edward A.; Stewart, Duff C.

    2014-01-01

    Intermagnet is an international oversight group which exists to establish a global network for geomagnetic observatories. This group establishes data standards and standard operating procedures for members and prospective members. Intermagnet has proposed a new One-Second Data Standard, for that emerging geomagnetic product. The standard specifies that all data collected must have a time stamp accuracy of ±10 milliseconds of the top-of-the-second Coordinated Universal Time. Therefore, the U.S. Geological Survey Geomagnetism Program has designed and executed several tests on its current data collection system, the Personal Computer Data Collection Platform. Tests are designed to measure the time shifts introduced by individual components within the data collection system, as well as to measure the time shift introduced by the entire Personal Computer Data Collection Platform. Additional testing designed for Intermagnet will be used to validate further such measurements. Current results of the measurements showed a 5.0–19.9 millisecond lag for the vertical channel (Z) of the Personal Computer Data Collection Platform and a 13.0–25.8 millisecond lag for horizontal channels (H and D) of the collection system. These measurements represent a dynamically changing delay introduced within the U.S. Geological Survey Personal Computer Data Collection Platform.

  13. SU-D-BRD-02: A Web-Based Image Processing and Plan Evaluation Platform (WIPPEP) for Future Cloud-Based Radiotherapy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chai, X; Liu, L; Xing, L

    Purpose: Visualization and processing of medical images and radiation treatment plan evaluation have traditionally been constrained to local workstations with limited computation power and ability of data sharing and software update. We present a web-based image processing and planning evaluation platform (WIPPEP) for radiotherapy applications with high efficiency, ubiquitous web access, and real-time data sharing. Methods: This software platform consists of three parts: web server, image server and computation server. Each independent server communicates with each other through HTTP requests. The web server is the key component that provides visualizations and user interface through front-end web browsers and relay informationmore » to the backend to process user requests. The image server serves as a PACS system. The computation server performs the actual image processing and dose calculation. The web server backend is developed using Java Servlets and the frontend is developed using HTML5, Javascript, and jQuery. The image server is based on open source DCME4CHEE PACS system. The computation server can be written in any programming language as long as it can send/receive HTTP requests. Our computation server was implemented in Delphi, Python and PHP, which can process data directly or via a C++ program DLL. Results: This software platform is running on a 32-core CPU server virtually hosting the web server, image server, and computation servers separately. Users can visit our internal website with Chrome browser, select a specific patient, visualize image and RT structures belonging to this patient and perform image segmentation running Delphi computation server and Monte Carlo dose calculation on Python or PHP computation server. Conclusion: We have developed a webbased image processing and plan evaluation platform prototype for radiotherapy. This system has clearly demonstrated the feasibility of performing image processing and plan evaluation platform through a web browser and exhibited potential for future cloud based radiotherapy.« less

  14. Checkpointing Shared Memory Programs at the Application-level

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bronevetsky, G; Schulz, M; Szwed, P

    2004-09-08

    Trends in high-performance computing are making it necessary for long-running applications to tolerate hardware faults. The most commonly used approach is checkpoint and restart(CPR)-the state of the computation is saved periodically on disk, and when a failure occurs, the computation is restarted from the last saved state. At present, it is the responsibility of the programmer to instrument applications for CPR. Our group is investigating the use of compiler technology to instrument codes to make them self-checkpointing and self-restarting, thereby providing an automatic solution to the problem of making long-running scientific applications resilient to hardware faults. Our previous work focusedmore » on message-passing programs. In this paper, we describe such a system for shared-memory programs running on symmetric multiprocessors. The system has two components: (i)a pre-compiler for source-to-source modification of applications, and (ii) a runtime system that implements a protocol for coordinating CPR among the threads of the parallel application. For the sake of concreteness, we focus on a non-trivial subset of OpenMP that includes barriers and locks. One of the advantages of this approach is that the ability to tolerate faults becomes embedded within the application itself, so applications become self-checkpointing and self-restarting on any platform. We demonstrate this by showing that our transformed benchmarks can checkpoint and restart on three different platforms (Windows/x86, Linux/x86, and Tru64/Alpha). Our experiments show that the overhead introduced by this approach is usually quite small; they also suggest ways in which the current implementation can be tuned to reduced overheads further.« less

  15. Scalability of Several Asynchronous Many-Task Models for In Situ Statistical Analysis.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pebay, Philippe Pierre; Bennett, Janine Camille; Kolla, Hemanth

    This report is a sequel to [PB16], in which we provided a first progress report on research and development towards a scalable, asynchronous many-task, in situ statistical analysis engine using the Legion runtime system. This earlier work included a prototype implementation of a proposed solution, using a proxy mini-application as a surrogate for a full-scale scientific simulation code. The first scalability studies were conducted with the above on modestly-sized experimental clusters. In contrast, in the current work we have integrated our in situ analysis engines with a full-size scientific application (S3D, using the Legion-SPMD model), and have conducted nu- mericalmore » tests on the largest computational platform currently available for DOE science ap- plications. We also provide details regarding the design and development of a light-weight asynchronous collectives library. We describe how this library is utilized within our SPMD- Legion S3D workflow, and compare the data aggregation technique deployed herein to the approach taken within our previous work.« less

  16. File-access characteristics of parallel scientific workloads

    NASA Technical Reports Server (NTRS)

    Nieuwejaar, Nils; Kotz, David; Purakayastha, Apratim; Best, Michael; Ellis, Carla Schlatter

    1995-01-01

    Phenomenal improvements in the computational performance of multiprocessors have not been matched by comparable gains in I/O system performance. This imbalance has resulted in I/O becoming a significant bottleneck for many scientific applications. One key to overcoming this bottleneck is improving the performance of parallel file systems. The design of a high-performance parallel file system requires a comprehensive understanding of the expected workload. Unfortunately, until recently, no general workload studies of parallel file systems have been conducted. The goal of the CHARISMA project was to remedy this problem by characterizing the behavior of several production workloads, on different machines, at the level of individual reads and writes. The first set of results from the CHARISMA project describe the workloads observed on an Intel iPSC/860 and a Thinking Machines CM-5. This paper is intended to compare and contrast these two workloads for an understanding of their essential similarities and differences, isolating common trends and platform-dependent variances. Using this comparison, we are able to gain more insight into the general principles that should guide parallel file-system design.

  17. Mechanism change in a simulation of peer review: from junk support to elitism.

    PubMed

    Paolucci, Mario; Grimaldo, Francisco

    2014-01-01

    Peer review works as the hinge of the scientific process, mediating between research and the awareness/acceptance of its results. While it might seem obvious that science would regulate itself scientifically, the consensus on peer review is eroding; a deeper understanding of its workings and potential alternatives is sorely needed. Employing a theoretical approach supported by agent-based simulation, we examined computational models of peer review, performing what we propose to call redesign , that is, the replication of simulations using different mechanisms . Here, we show that we are able to obtain the high sensitivity to rational cheating that is present in literature. In addition, we also show how this result appears to be fragile against small variations in mechanisms. Therefore, we argue that exploration of the parameter space is not enough if we want to support theoretical statements with simulation, and that exploration at the level of mechanisms is needed. These findings also support prudence in the application of simulation results based on single mechanisms, and endorse the use of complex agent platforms that encourage experimentation of diverse mechanisms.

  18. GenomicTools: a computational platform for developing high-throughput analytics in genomics.

    PubMed

    Tsirigos, Aristotelis; Haiminen, Niina; Bilal, Erhan; Utro, Filippo

    2012-01-15

    Recent advances in sequencing technology have resulted in the dramatic increase of sequencing data, which, in turn, requires efficient management of computational resources, such as computing time, memory requirements as well as prototyping of computational pipelines. We present GenomicTools, a flexible computational platform, comprising both a command-line set of tools and a C++ API, for the analysis and manipulation of high-throughput sequencing data such as DNA-seq, RNA-seq, ChIP-seq and MethylC-seq. GenomicTools implements a variety of mathematical operations between sets of genomic regions thereby enabling the prototyping of computational pipelines that can address a wide spectrum of tasks ranging from pre-processing and quality control to meta-analyses. Additionally, the GenomicTools platform is designed to analyze large datasets of any size by minimizing memory requirements. In practical applications, where comparable, GenomicTools outperforms existing tools in terms of both time and memory usage. The GenomicTools platform (version 2.0.0) was implemented in C++. The source code, documentation, user manual, example datasets and scripts are available online at http://code.google.com/p/ibm-cbc-genomic-tools.

  19. Enabling a new Paradigm to Address Big Data and Open Science Challenges

    NASA Astrophysics Data System (ADS)

    Ramamurthy, Mohan; Fisher, Ward

    2017-04-01

    Data are not only the lifeblood of the geosciences but they have become the currency of the modern world in science and society. Rapid advances in computing, communi¬cations, and observational technologies — along with concomitant advances in high-resolution modeling, ensemble and coupled-systems predictions of the Earth system — are revolutionizing nearly every aspect of our field. Modern data volumes from high-resolution ensemble prediction/projection/simulation systems and next-generation remote-sensing systems like hyper-spectral satellite sensors and phased-array radars are staggering. For example, CMIP efforts alone will generate many petabytes of climate projection data for use in assessments of climate change. And NOAA's National Climatic Data Center projects that it will archive over 350 petabytes by 2030. For researchers and educators, this deluge and the increasing complexity of data brings challenges along with the opportunities for discovery and scientific breakthroughs. The potential for big data to transform the geosciences is enormous, but realizing the next frontier depends on effectively managing, analyzing, and exploiting these heterogeneous data sources, extracting knowledge and useful information from heterogeneous data sources in ways that were previously impossible, to enable discoveries and gain new insights. At the same time, there is a growing focus on the area of "Reproducibility or Replicability in Science" that has implications for Open Science. The advent of cloud computing has opened new avenues for not only addressing both big data and Open Science challenges to accelerate scientific discoveries. However, to successfully leverage the enormous potential of cloud technologies, it will require the data providers and the scientific communities to develop new paradigms to enable next-generation workflows and transform the conduct of science. Making data readily available is a necessary but not a sufficient condition. Data providers also need to give scientists an ecosystem that includes data, tools, workflows and other services needed to perform analytics, integration, interpretation, and synthesis - all in the same environment or platform. Instead of moving data to processing systems near users, as is the tradition, the cloud permits one to bring processing, computing, analysis and visualization to data - so called data proximate workbench capabilities, also known as server-side processing. In this talk, I will present the ongoing work at Unidata to facilitate a new paradigm for doing science by offering a suite of tools, resources, and platforms to leverage cloud technologies for addressing both big data and Open Science/reproducibility challenges. That work includes the development and deployment of new protocols for data access and server-side operations and Docker container images of key applications, JupyterHub Python notebook tools, and cloud-based analysis and visualization capability via the CloudIDV tool to enable reproducible workflows and effectively use the accessed data.

  20. An adaptable XML based approach for scientific data management and integration

    NASA Astrophysics Data System (ADS)

    Wang, Fusheng; Thiel, Florian; Furrer, Daniel; Vergara-Niedermayr, Cristobal; Qin, Chen; Hackenberg, Georg; Bourgue, Pierre-Emmanuel; Kaltschmidt, David; Wang, Mo

    2008-03-01

    Increased complexity of scientific research poses new challenges to scientific data management. Meanwhile, scientific collaboration is becoming increasing important, which relies on integrating and sharing data from distributed institutions. We develop SciPort, a Web-based platform on supporting scientific data management and integration based on a central server based distributed architecture, where researchers can easily collect, publish, and share their complex scientific data across multi-institutions. SciPort provides an XML based general approach to model complex scientific data by representing them as XML documents. The documents capture not only hierarchical structured data, but also images and raw data through references. In addition, SciPort provides an XML based hierarchical organization of the overall data space to make it convenient for quick browsing. To provide generalization, schemas and hierarchies are customizable with XML-based definitions, thus it is possible to quickly adapt the system to different applications. While each institution can manage documents on a Local SciPort Server independently, selected documents can be published to a Central Server to form a global view of shared data across all sites. By storing documents in a native XML database, SciPort provides high schema extensibility and supports comprehensive queries through XQuery. By providing a unified and effective means for data modeling, data access and customization with XML, SciPort provides a flexible and powerful platform for sharing scientific data for scientific research communities, and has been successfully used in both biomedical research and clinical trials.

  1. An Adaptable XML Based Approach for Scientific Data Management and Integration.

    PubMed

    Wang, Fusheng; Thiel, Florian; Furrer, Daniel; Vergara-Niedermayr, Cristobal; Qin, Chen; Hackenberg, Georg; Bourgue, Pierre-Emmanuel; Kaltschmidt, David; Wang, Mo

    2008-02-20

    Increased complexity of scientific research poses new challenges to scientific data management. Meanwhile, scientific collaboration is becoming increasing important, which relies on integrating and sharing data from distributed institutions. We develop SciPort, a Web-based platform on supporting scientific data management and integration based on a central server based distributed architecture, where researchers can easily collect, publish, and share their complex scientific data across multi-institutions. SciPort provides an XML based general approach to model complex scientific data by representing them as XML documents. The documents capture not only hierarchical structured data, but also images and raw data through references. In addition, SciPort provides an XML based hierarchical organization of the overall data space to make it convenient for quick browsing. To provide generalization, schemas and hierarchies are customizable with XML-based definitions, thus it is possible to quickly adapt the system to different applications. While each institution can manage documents on a Local SciPort Server independently, selected documents can be published to a Central Server to form a global view of shared data across all sites. By storing documents in a native XML database, SciPort provides high schema extensibility and supports comprehensive queries through XQuery. By providing a unified and effective means for data modeling, data access and customization with XML, SciPort provides a flexible and powerful platform for sharing scientific data for scientific research communities, and has been successfully used in both biomedical research and clinical trials.

  2. Encounters with Authority: Tactics and Negotiations at the Periphery of Participatory Platforms

    ERIC Educational Resources Information Center

    Mugar, Gabriel

    2017-01-01

    Digital participatory platforms like Wikipedia are often celebrated as projects that allow anyone to contribute. Any user can sign up and start contributing immediately. Similarly, projects that engage volunteers in the production of scientific knowledge create easy points of entry to make contributions. These low barriers to entry are a hallmark…

  3. End-to-End simulations for the MICADO-MAORY SCAO mode

    NASA Astrophysics Data System (ADS)

    Vidal, Fabrice; Ferreira, Florian; Déo, Vincent; Sevin, Arnaud; Gendron, Eric; Clénet, Yann; Durand, Sébastien; Gratadour, Damien; Doucet, Nicolas; Rousset, Gérard; Davies, Richard

    2018-04-01

    MICADO is a E-ELT first light near-infrared imager. It will work at the diffraction limit of the telescope thanks to multi-conjugate adaptive optics (MCAO) and single-conjugate adaptive optics (SCAO) modes provided inside the MAORY AO module. The SCAO capability is a joint development by the MICADO and MAORY consortia, lead by MICADO, and is motivated by scientific programs for which SCAO will deliver the best AO performance (e.g. exoplanets, solar system science, bright AGNs, etc). Shack-Hartmann (SH) or Pyramid WFS were both envisioned for the wavefront measurement of the SCAO mode. In addition to WFS design considerations, numerical simulations are therefore needed to trade-off between these two WFS. COMPASS (COMputing Platform for Adaptive optics SyStems) is a GPU-based adaptive optics end-to-end simulation platform allowing us to perform numerical simulations in various modes (SCAO, LTAO, MOAO, MCAO...). COMPASS was originally bound to Yorick for its user interface and a major upgrade has been recently done to now bind to Python allowing a better long term support to the community. Thanks to the speed of computation of COMPASS we were able to span quickly a very large parameters of space at the E-ELT scale. We present the results of the study between WFS choice (SH or Pyramid), WFS parameters (detector noise, guide star magnitude, number of subapertures, number of controlled modes...), turbulence conditions and AO controls for the MICADO-MAORY SCAO mode.

  4. [Exploiture and application of an internet-based Computation Platform for Integrative Pharmacology of Traditional Chinese Medicine].

    PubMed

    Xu, Hai-Yu; Liu, Zhen-Ming; Fu, Yan; Zhang, Yan-Qiong; Yu, Jian-Jun; Guo, Fei-Fei; Tang, Shi-Huan; Lv, Chuan-Yu; Su, Jin; Cui, Ru-Yi; Yang, Hong-Jun

    2017-09-01

    Recently, integrative pharmacology(IP) has become a pivotal paradigm for the modernization of traditional Chinese medicines(TCM) and combinatorial drugs discovery, which is an interdisciplinary science for establishing the in vitro and in vivo correlation between absorption, distribution, metabolism, and excretion/pharmacokinetic(ADME/PK) profiles of TCM and the molecular networks of disease by the integration of the knowledge of multi-disciplinary and multi-stages. In the present study, an internet-based Computation Platform for IP of TCM(TCM-IP, www.tcmip.cn) is established to promote the development of the emerging discipline. Among them, a big data of TCM is an important resource for TCM-IP including Chinese Medicine Formula Database, Chinese Medical Herbs Database, Chemical Database of Chinese Medicine, Target Database for Disease and Symptoms, et al. Meanwhile, some data mining and bioinformatics approaches are critical technology for TCM-IP including the identification of the TCM constituents, ADME prediction, target prediction for the TCM constituents, network construction and analysis, et al. Furthermore, network beautification and individuation design are employed to meet the consumer's requirement. We firmly believe that TCM-IP is a very useful tool for the identification of active constituents of TCM and their involving potential molecular mechanism for therapeutics, which would wildly applied in quality evaluation, clinical repositioning, scientific discovery based on original thinking, prescription compatibility and new drug of TCM, et al. Copyright© by the Chinese Pharmaceutical Association.

  5. Integrating Authentic Earth Science Data in Online Visualization Tools and Social Media Networking to Promote Earth Science Education

    NASA Astrophysics Data System (ADS)

    Carter, B. L.; Campbell, B.; Chambers, L.; Davis, A.; Riebeek, H.; Ward, K.

    2008-12-01

    The Goddard Space Flight Center (GSFC) is one of the largest Earth Science research-based institutions in the nation. Along with the research comes a dedicated group of people who are tasked with developing Earth science research-based education and public outreach materials to reach the broadest possible range of audiences. The GSFC Earth science education community makes use of a wide variety of platforms in order to reach their goals of communicating science. These platforms include using social media networking such as Twitter and Facebook, as well as geo-spatial tools such as MY NASA DATA, NASA World Wind, NEO, and Google Earth. Using a wide variety of platforms serves the dual purposes of promoting NASA Earth Science research and making authentic data available to educational communities that otherwise might not otherwise be granted access. Making data available to education communities promotes scientific literacy through the investigation of scientific phenomena using the same data that is used by the scientific community. Data from several NASA missions will be used to demonstrate the ways in which Earth science data are made available for the education community.

  6. The performance of low-cost commercial cloud computing as an alternative in computational chemistry.

    PubMed

    Thackston, Russell; Fortenberry, Ryan C

    2015-05-05

    The growth of commercial cloud computing (CCC) as a viable means of computational infrastructure is largely unexplored for the purposes of quantum chemistry. In this work, the PSI4 suite of computational chemistry programs is installed on five different types of Amazon World Services CCC platforms. The performance for a set of electronically excited state single-point energies is compared between these CCC platforms and typical, "in-house" physical machines. Further considerations are made for the number of cores or virtual CPUs (vCPUs, for the CCC platforms), but no considerations are made for full parallelization of the program (even though parallelization of the BLAS library is implemented), complete high-performance computing cluster utilization, or steal time. Even with this most pessimistic view of the computations, CCC resources are shown to be more cost effective for significant numbers of typical quantum chemistry computations. Large numbers of large computations are still best utilized by more traditional means, but smaller-scale research may be more effectively undertaken through CCC services. © 2015 Wiley Periodicals, Inc.

  7. A Platform-Independent Plugin for Navigating Online Radiology Cases.

    PubMed

    Balkman, Jason D; Awan, Omer A

    2016-06-01

    Software methods that enable navigation of radiology cases on various digital platforms differ between handheld devices and desktop computers. This has resulted in poor compatibility of online radiology teaching files across mobile smartphones, tablets, and desktop computers. A standardized, platform-independent, or "agnostic" approach for presenting online radiology content was produced in this work by leveraging modern hypertext markup language (HTML) and JavaScript web software technology. We describe the design and evaluation of this software, demonstrate its use across multiple viewing platforms, and make it publicly available as a model for future development efforts.

  8. Unified, Cross-Platform, Open-Source Library Package for High-Performance Computing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kozacik, Stephen

    Compute power is continually increasing, but this increased performance is largely found in sophisticated computing devices and supercomputer resources that are difficult to use, resulting in under-utilization. We developed a unified set of programming tools that will allow users to take full advantage of the new technology by allowing them to work at a level abstracted away from the platform specifics, encouraging the use of modern computing systems, including government-funded supercomputer facilities.

  9. Mobile, Cloud, and Big Data Computing: Contributions, Challenges, and New Directions in Telecardiology

    PubMed Central

    Hsieh, Jui-Chien; Li, Ai-Hsien; Yang, Chung-Chi

    2013-01-01

    Many studies have indicated that computing technology can enable off-site cardiologists to read patients’ electrocardiograph (ECG), echocardiography (ECHO), and relevant images via smart phones during pre-hospital, in-hospital, and post-hospital teleconsultation, which not only identifies emergency cases in need of immediate treatment, but also prevents the unnecessary re-hospitalizations. Meanwhile, several studies have combined cloud computing and mobile computing to facilitate better storage, delivery, retrieval, and management of medical files for telecardiology. In the future, the aggregated ECG and images from hospitals worldwide will become big data, which should be used to develop an e-consultation program helping on-site practitioners deliver appropriate treatment. With information technology, real-time tele-consultation and tele-diagnosis of ECG and images can be practiced via an e-platform for clinical, research, and educational purposes. While being devoted to promote the application of information technology onto telecardiology, we need to resolve several issues: (1) data confidentiality in the cloud, (2) data interoperability among hospitals, and (3) network latency and accessibility. If these challenges are overcome, tele-consultation will be ubiquitous, easy to perform, inexpensive, and beneficial. Most importantly, these services will increase global collaboration and advance clinical practice, education, and scientific research in cardiology. PMID:24232290

  10. Evolution of the ATLAS PanDA workload management system for exascale computational science

    NASA Astrophysics Data System (ADS)

    Maeno, T.; De, K.; Klimentov, A.; Nilsson, P.; Oleynik, D.; Panitkin, S.; Petrosyan, A.; Schovancova, J.; Vaniachine, A.; Wenaus, T.; Yu, D.; Atlas Collaboration

    2014-06-01

    An important foundation underlying the impressive success of data processing and analysis in the ATLAS experiment [1] at the LHC [2] is the Production and Distributed Analysis (PanDA) workload management system [3]. PanDA was designed specifically for ATLAS and proved to be highly successful in meeting all the distributed computing needs of the experiment. However, the core design of PanDA is not experiment specific. The PanDA workload management system is capable of meeting the needs of other data intensive scientific applications. Alpha-Magnetic Spectrometer [4], an astro-particle experiment on the International Space Station, and the Compact Muon Solenoid [5], an LHC experiment, have successfully evaluated PanDA and are pursuing its adoption. In this paper, a description of the new program of work to develop a generic version of PanDA will be given, as well as the progress in extending PanDA's capabilities to support supercomputers and clouds and to leverage intelligent networking. PanDA has demonstrated at a very large scale the value of automated dynamic brokering of diverse workloads across distributed computing resources. The next generation of PanDA will allow other data-intensive sciences and a wider exascale community employing a variety of computing platforms to benefit from ATLAS' experience and proven tools.

  11. Mobile, cloud, and big data computing: contributions, challenges, and new directions in telecardiology.

    PubMed

    Hsieh, Jui-Chien; Li, Ai-Hsien; Yang, Chung-Chi

    2013-11-13

    Many studies have indicated that computing technology can enable off-site cardiologists to read patients' electrocardiograph (ECG), echocardiography (ECHO), and relevant images via smart phones during pre-hospital, in-hospital, and post-hospital teleconsultation, which not only identifies emergency cases in need of immediate treatment, but also prevents the unnecessary re-hospitalizations. Meanwhile, several studies have combined cloud computing and mobile computing to facilitate better storage, delivery, retrieval, and management of medical files for telecardiology. In the future, the aggregated ECG and images from hospitals worldwide will become big data, which should be used to develop an e-consultation program helping on-site practitioners deliver appropriate treatment. With information technology, real-time tele-consultation and tele-diagnosis of ECG and images can be practiced via an e-platform for clinical, research, and educational purposes. While being devoted to promote the application of information technology onto telecardiology, we need to resolve several issues: (1) data confidentiality in the cloud, (2) data interoperability among hospitals, and (3) network latency and accessibility. If these challenges are overcome, tele-consultation will be ubiquitous, easy to perform, inexpensive, and beneficial. Most importantly, these services will increase global collaboration and advance clinical practice, education, and scientific research in cardiology.

  12. Center for Center for Technology for Advanced Scientific Component Software (TASCS)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kostadin, Damevski

    A resounding success of the Scientific Discovery through Advanced Computing (SciDAC) program is that high-performance computational science is now universally recognized as a critical aspect of scientific discovery [71], complementing both theoretical and experimental research. As scientific communities prepare to exploit unprecedented computing capabilities of emerging leadership-class machines for multi-model simulations at the extreme scale [72], it is more important than ever to address the technical and social challenges of geographically distributed teams that combine expertise in domain science, applied mathematics, and computer science to build robust and flexible codes that can incorporate changes over time. The Center for Technologymore » for Advanced Scientific Component Software (TASCS)1 tackles these these issues by exploiting component-based software development to facilitate collaborative high-performance scientific computing.« less

  13. Computational Science: A Research Methodology for the 21st Century

    NASA Astrophysics Data System (ADS)

    Orbach, Raymond L.

    2004-03-01

    Computational simulation - a means of scientific discovery that employs computer systems to simulate a physical system according to laws derived from theory and experiment - has attained peer status with theory and experiment. Important advances in basic science are accomplished by a new "sociology" for ultrascale scientific computing capability (USSCC), a fusion of sustained advances in scientific models, mathematical algorithms, computer architecture, and scientific software engineering. Expansion of current capabilities by factors of 100 - 1000 open up new vistas for scientific discovery: long term climatic variability and change, macroscopic material design from correlated behavior at the nanoscale, design and optimization of magnetic confinement fusion reactors, strong interactions on a computational lattice through quantum chromodynamics, and stellar explosions and element production. The "virtual prototype," made possible by this expansion, can markedly reduce time-to-market for industrial applications such as jet engines and safer, more fuel efficient cleaner cars. In order to develop USSCC, the National Energy Research Scientific Computing Center (NERSC) announced the competition "Innovative and Novel Computational Impact on Theory and Experiment" (INCITE), with no requirement for current DOE sponsorship. Fifty nine proposals for grand challenge scientific problems were submitted for a small number of awards. The successful grants, and their preliminary progress, will be described.

  14. A simple centrality index for scientific social recognition

    NASA Astrophysics Data System (ADS)

    Kinouchi, Osame; Soares, Leonardo D. H.; Cardoso, George C.

    2018-02-01

    We introduce a new centrality index for bipartite networks of papers and authors that we call K-index. The K-index grows with the citation performance of the papers that cite a given researcher and can be seen as a measure of scientific social recognition. Indeed, the K-index measures the number of hubs, defined in a self-consistent way in the bipartite network, that cites a given author. We show that the K-index can be computed by simple inspection of the Web of Science platform and presents several advantages over other centrality indexes, in particular Hirsch h-index. The K-index is robust to self-citations, is not limited by the total number of papers published by a researcher as occurs for the h-index and can distinguish in a consistent way researchers that have the same h-index but very different scientific social recognition. The K-index easily detects a known case of a researcher with inflated number of papers, citations and h-index due to scientific misconduct. Finally, we show that, in a sample of twenty-eight physics Nobel laureates and twenty-eight highly cited non-Nobel-laureate physicists, the K-index correlates better to the achievement of the prize than the number of papers, citations, citations per paper, citing articles or the h-index. Clustering researchers in a K versus h plot reveals interesting outliers that suggest that these two indexes can present complementary independent information.

  15. University Students Use of Computers and Mobile Devices for Learning and Their Reading Speed on Different Platforms

    ERIC Educational Resources Information Center

    Mpofu, Bongeka

    2016-01-01

    This research was aimed at the investigation of mobile device and computer use at a higher learning institution. The goal was to determine the current use of computers and mobile devices for learning and the students' reading speed on different platforms. The research was contextualised in a sample of students at the University of South Africa.…

  16. 77 FR 69593 - Atlantic Highly Migratory Species; Exempted Fishing, Scientific Research, Display, and Chartering...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-11-20

    ... platforms while conducting the specified research. As an example, NMFS has considered the recreational and... research and is therefore exempt from regulation. Examples of research conducted under LOAs include tagging... life history studies. Scientific research is not exempt from regulation under ATCA. NMFS issues SRPs...

  17. Physics-based Modeling of Material Behavior and Damage Initiation in Nanoengineered Composites

    NASA Astrophysics Data System (ADS)

    Subramanian, Nithya

    Materials with unprecedented properties are necessary to make dramatic changes in current and future aerospace platforms. Hybrid materials and composites are increasingly being used in aircraft and spacecraft frames; however, future platforms will require an optimal design of novel materials that enable operation in a variety of environments and produce known/predicted damage mechanisms. Nanocomposites and nanoengineered composites with CNTs have the potential to make significant improvements in strength, stiffness, fracture toughness, flame retardancy and resistance to corrosion. Therefore, these materials have generated tremendous scientific and technical interest over the past decade and various architectures are being explored for applications to light-weight airframe structures. However, the success of such materials with significantly improved performance metrics requires careful control of the parameters during synthesis and processing. Their implementation is also limited due to the lack of complete understanding of the effects the nanoparticles impart to the bulk properties of composites. It is common for computational methods to be applied to explain phenomena measured or observed experimentally. Frequently, a given phenomenon or material property is only considered to be fully understood when the associated physics has been identified through accompanying calculations or simulations. The computationally and experimentally integrated research presented in this dissertation provides improved understanding of the mechanical behavior and response including damage and failure in CNT nanocomposites, enhancing confidence in their applications. The computations at the atomistic level helps to understand the underlying mechanochemistry and allow a systematic investigation of the complex CNT architectures and the material performance across a wide range of parameters. Simulation of the bond breakage phenomena and development of the interface to continuum scale damage captures the effects of applied loading and damage precursor and provides insight into the safety of nanoengineered composites under service loads. The validated modeling methodology is expected to be a step in the direction of computationally-assisted design and certification of novel materials, thus liberating the pace of their implementation in future applications.

  18. Application verification research of cloud computing technology in the field of real time aerospace experiment

    NASA Astrophysics Data System (ADS)

    Wan, Junwei; Chen, Hongyan; Zhao, Jing

    2017-08-01

    According to the requirements of real-time, reliability and safety for aerospace experiment, the single center cloud computing technology application verification platform is constructed. At the IAAS level, the feasibility of the cloud computing technology be applied to the field of aerospace experiment is tested and verified. Based on the analysis of the test results, a preliminary conclusion is obtained: Cloud computing platform can be applied to the aerospace experiment computing intensive business. For I/O intensive business, it is recommended to use the traditional physical machine.

  19. Inferring cortical function in the mouse visual system through large-scale systems neuroscience.

    PubMed

    Hawrylycz, Michael; Anastassiou, Costas; Arkhipov, Anton; Berg, Jim; Buice, Michael; Cain, Nicholas; Gouwens, Nathan W; Gratiy, Sergey; Iyer, Ramakrishnan; Lee, Jung Hoon; Mihalas, Stefan; Mitelut, Catalin; Olsen, Shawn; Reid, R Clay; Teeter, Corinne; de Vries, Saskia; Waters, Jack; Zeng, Hongkui; Koch, Christof

    2016-07-05

    The scientific mission of the Project MindScope is to understand neocortex, the part of the mammalian brain that gives rise to perception, memory, intelligence, and consciousness. We seek to quantitatively evaluate the hypothesis that neocortex is a relatively homogeneous tissue, with smaller functional modules that perform a common computational function replicated across regions. We here focus on the mouse as a mammalian model organism with genetics, physiology, and behavior that can be readily studied and manipulated in the laboratory. We seek to describe the operation of cortical circuitry at the computational level by comprehensively cataloging and characterizing its cellular building blocks along with their dynamics and their cell type-specific connectivities. The project is also building large-scale experimental platforms (i.e., brain observatories) to record the activity of large populations of cortical neurons in behaving mice subject to visual stimuli. A primary goal is to understand the series of operations from visual input in the retina to behavior by observing and modeling the physical transformations of signals in the corticothalamic system. We here focus on the contribution that computer modeling and theory make to this long-term effort.

  20. CDAC Student Report: Summary of LLNL Internship

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Herriman, Jane E.

    Multiple objectives motivated me to apply for an internship at LLNL: I wanted to experience the work environment at a national lab, to learn about research and job opportunities at LLNL in particular, and to gain greater experience with code development, particularly within the realm of high performance computing (HPC). This summer I was selected to participate in LLNL's Computational Chemistry and Material Science Summer Institute (CCMS). CCMS is a 10 week program hosted by the Quantum Simulations group leader, Dr. Eric Schwegler. CCMS connects graduate students to mentors at LLNL involved in similar re- search and provides weekly seminarsmore » on a broad array of topics from within chemistry and materials science. Dr. Xavier Andrade and Dr. Erik Draeger served as my co-mentors over the summer, and Dr. Andrade continues to mentor me now that CCMS has concluded. Dr. Andrade is a member of the Quantum Simulations group within the Physical and Life Sciences at LLNL, and Dr. Draeger leads the HPC group within the Center for Applied Scientific Computing (CASC). The two have worked together to develop Qb@ll, an open-source first principles molecular dynamics code that was the platform for my summer research project.« less

  1. XVIS: Visualization for the Extreme-Scale Scientific-Computation Ecosystem Final Scientific/Technical Report

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Geveci, Berk; Maynard, Robert

    The XVis project brings together the key elements of research to enable scientific discovery at extreme scale. Scientific computing will no longer be purely about how fast computations can be performed. Energy constraints, processor changes, and I/O limitations necessitate significant changes in both the software applications used in scientific computation and the ways in which scientists use them. Components for modeling, simulation, analysis, and visualization must work together in a computational ecosystem, rather than working independently as they have in the past. The XVis project brought together collaborators from predominant DOE projects for visualization on accelerators and combining their respectivemore » features into a new visualization toolkit called VTK-m.« less

  2. GATECloud.net: a platform for large-scale, open-source text processing on the cloud.

    PubMed

    Tablan, Valentin; Roberts, Ian; Cunningham, Hamish; Bontcheva, Kalina

    2013-01-28

    Cloud computing is increasingly being regarded as a key enabler of the 'democratization of science', because on-demand, highly scalable cloud computing facilities enable researchers anywhere to carry out data-intensive experiments. In the context of natural language processing (NLP), algorithms tend to be complex, which makes their parallelization and deployment on cloud platforms a non-trivial task. This study presents a new, unique, cloud-based platform for large-scale NLP research--GATECloud. net. It enables researchers to carry out data-intensive NLP experiments by harnessing the vast, on-demand compute power of the Amazon cloud. Important infrastructural issues are dealt with by the platform, completely transparently for the researcher: load balancing, efficient data upload and storage, deployment on the virtual machines, security and fault tolerance. We also include a cost-benefit analysis and usage evaluation.

  3. Clinicians’ Expectations of Web 2.0 as a Mechanism for Knowledge Transfer of Stroke Best Practices

    PubMed Central

    David, Isabelle; Rochette, Annie

    2012-01-01

    Background Health professionals are increasingly encouraged to adopt an evidence-based practice to ensure greater efficiency of their services. To promote this practice, several strategies exist: distribution of educational materials, local consensus processes, educational outreach visits, local opinion leaders, and reminders. Despite these strategies, gaps continue to be observed between practice and scientific evidence. Therefore, it is important to implement innovative knowledge transfer strategies that will change health professionals’ practices. Through its interactive capacities, Web 2.0 applications are worth exploring. As an example, virtual communities of practice have already begun to influence professional practice. Objective This study was initially developed to help design a Web 2.0 platform for health professionals working with stroke patients. The aim was to gain a better understanding of professionals’ perceptions of Web 2.0 before the development of the platform. Methods A qualitative study following a phenomenological approach was chosen. We conducted individual semi-structured interviews with clinicians and managers. Interview transcripts were subjected to a content analysis. Results Twenty-four female clinicians and managers in Quebec, Canada, aged 28-66 participated. Most participants identified knowledge transfer as the most useful outcome of a Web 2.0 platform. Respondents also expressed their need for a user-friendly platform. Accessibility to a computer and the Internet, features of the Web 2.0 platform, user support, technology skills, and previous technological experience were found to influence perceived ease of use and usefulness. Our results show that the perceived lack of time of health professionals has an influence on perceived behavioral intention to use it despite favorable perception of the usefulness of the Web 2.0 platform. Conclusions In conclusion, female health professionals in Quebec believe that Web 2.0 may be a useful mechanism for knowledge transfer. However, lack of time and lack of technological skills may limit their use of a future Web 2.0 platform. Further studies are required with other populations and in other regions to confirm these findings. PMID:23195753

  4. Ensuring correct rollback recovery in distributed shared memory systems

    NASA Technical Reports Server (NTRS)

    Janssens, Bob; Fuchs, W. Kent

    1995-01-01

    Distributed shared memory (DSM) implemented on a cluster of workstations is an increasingly attractive platform for executing parallel scientific applications. Checkpointing and rollback techniques can be used in such a system to allow the computation to progress in spite of the temporary failure of one or more processing nodes. This paper presents the design of an independent checkpointing method for DSM that takes advantage of DSM's specific properties to reduce error-free and rollback overhead. The scheme reduces the dependencies that need to be considered for correct rollback to those resulting from transfers of pages. Furthermore, in-transit messages can be recovered without the use of logging. We extend the scheme to a DSM implementation using lazy release consistency, where the frequency of dependencies is further reduced.

  5. Automated sample area definition for high-throughput microscopy.

    PubMed

    Zeder, M; Ellrott, A; Amann, R

    2011-04-01

    High-throughput screening platforms based on epifluorescence microscopy are powerful tools in a variety of scientific fields. Although some applications are based on imaging geometrically defined samples such as microtiter plates, multiwell slides, or spotted gene arrays, others need to cope with inhomogeneously located samples on glass slides. The analysis of microbial communities in aquatic systems by sample filtration on membrane filters followed by multiple fluorescent staining, or the investigation of tissue sections are examples. Therefore, we developed a strategy for flexible and fast definition of sample locations by the acquisition of whole slide overview images and automated sample recognition by image analysis. Our approach was tested on different microscopes and the computer programs are freely available (http://www.technobiology.ch). Copyright © 2011 International Society for Advancement of Cytometry.

  6. Information visualisation for science and policy: engaging users and avoiding bias.

    PubMed

    McInerny, Greg J; Chen, Min; Freeman, Robin; Gavaghan, David; Meyer, Miriah; Rowland, Francis; Spiegelhalter, David J; Stefaner, Moritz; Tessarolo, Geizi; Hortal, Joaquin

    2014-03-01

    Visualisations and graphics are fundamental to studying complex subject matter. However, beyond acknowledging this value, scientists and science-policy programmes rarely consider how visualisations can enable discovery, create engaging and robust reporting, or support online resources. Producing accessible and unbiased visualisations from complicated, uncertain data requires expertise and knowledge from science, policy, computing, and design. However, visualisation is rarely found in our scientific training, organisations, or collaborations. As new policy programmes develop [e.g., the Intergovernmental Platform on Biodiversity and Ecosystem Services (IPBES)], we need information visualisation to permeate increasingly both the work of scientists and science policy. The alternative is increased potential for missed discoveries, miscommunications, and, at worst, creating a bias towards the research that is easiest to display. Copyright © 2014 Elsevier Ltd. All rights reserved.

  7. A study on specialist or special disease clinics based on big data.

    PubMed

    Fang, Zhuyuan; Fan, Xiaowei; Chen, Gong

    2014-09-01

    Correlation analysis and processing of massive medical information can be implemented through big data technology to find the relevance of different factors in the life cycle of a disease and to provide the basis for scientific research and clinical practice. This paper explores the concept of constructing a big medical data platform and introduces the clinical model construction. Medical data can be collected and consolidated by distributed computing technology. Through analysis technology, such as artificial neural network and grey model, a medical model can be built. Big data analysis, such as Hadoop, can be used to construct early prediction and intervention models as well as clinical decision-making model for specialist and special disease clinics. It establishes a new model for common clinical research for specialist and special disease clinics.

  8. Ethical sharing of health data in online platforms - which values should be considered?

    PubMed

    Riso, Brígida; Tupasela, Aaro; Vears, Danya F; Felzmann, Heike; Cockbain, Julian; Loi, Michele; Kongsholm, Nana C H; Zullo, Silvia; Rakic, Vojin

    2017-08-21

    Intensified and extensive data production and data storage are characteristics of contemporary western societies. Health data sharing is increasing with the growth of Information and Communication Technology (ICT) platforms devoted to the collection of personal health and genomic data. However, the sensitive and personal nature of health data poses ethical challenges when data is disclosed and shared even if for scientific research purposes.With this in mind, the Science and Values Working Group of the COST Action CHIP ME 'Citizen's Health through public-private Initiatives: Public health, Market and Ethical perspectives' (IS 1303) identified six core values they considered to be essential for the ethical sharing of health data using ICT platforms. We believe that using this ethical framework will promote respectful scientific practices in order to maintain individuals' trust in research.We use these values to analyse five ICT platforms and explore how emerging data sharing platforms are reconfiguring the data sharing experience from a range of perspectives. We discuss which types of values, rights and responsibilities they entail and enshrine within their philosophy or outlook on what it means to share personal health information. Through this discussion we address issues of the design and the development process of personal health data and patient-oriented infrastructures, as well as new forms of technologically-mediated empowerment.

  9. Sea truth and environmental characterization studies of Mobile Bay, Alabama, utilizing ERTS-1, data collection platforms

    NASA Technical Reports Server (NTRS)

    Schroeder, W. W.

    1977-01-01

    The paper reports on the scientific results obtained during a feasibility study that evaluated the potential of using ERTS data collection platforms (DCPs) in the coastal environment of Mobile Bay, Alabama. The utility of instrumented buoys operated in a coastal marine environment as ERTS DCPs is demonstrated. It is shown that these platforms are capable of providing both sea-truth data for ERTS imagery studies and time-series data for event monitoring and/or environmental characterization studies.

  10. Open chemistry: RESTful web APIs, JSON, NWChem and the modern web application.

    PubMed

    Hanwell, Marcus D; de Jong, Wibe A; Harris, Christopher J

    2017-10-30

    An end-to-end platform for chemical science research has been developed that integrates data from computational and experimental approaches through a modern web-based interface. The platform offers an interactive visualization and analytics environment that functions well on mobile, laptop and desktop devices. It offers pragmatic solutions to ensure that large and complex data sets are more accessible. Existing desktop applications/frameworks were extended to integrate with high-performance computing resources, and offer command-line tools to automate interaction-connecting distributed teams to this software platform on their own terms. The platform was developed openly, and all source code hosted on the GitHub platform with automated deployment possible using Ansible coupled with standard Ubuntu-based machine images deployed to cloud machines. The platform is designed to enable teams to reap the benefits of the connected web-going beyond what conventional search and analytics platforms offer in this area. It also has the goal of offering federated instances, that can be customized to the sites/research performed. Data gets stored using JSON, extending upon previous approaches using XML, building structures that support computational chemistry calculations. These structures were developed to make it easy to process data across different languages, and send data to a JavaScript-based web client.

  11. Joint the Center for Applied Scientific Computing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gamblin, Todd; Bremer, Timo; Van Essen, Brian

    The Center for Applied Scientific Computing serves as Livermore Lab’s window to the broader computer science, computational physics, applied mathematics, and data science research communities. In collaboration with academic, industrial, and other government laboratory partners, we conduct world-class scientific research and development on problems critical to national security. CASC applies the power of high-performance computing and the efficiency of modern computational methods to the realms of stockpile stewardship, cyber and energy security, and knowledge discovery for intelligence applications.

  12. Embedded-Based Graphics Processing Unit Cluster Platform for Multiple Sequence Alignments

    PubMed Central

    Wei, Jyh-Da; Cheng, Hui-Jun; Lin, Chun-Yuan; Ye, Jin; Yeh, Kuan-Yu

    2017-01-01

    High-end graphics processing units (GPUs), such as NVIDIA Tesla/Fermi/Kepler series cards with thousands of cores per chip, are widely applied to high-performance computing fields in a decade. These desktop GPU cards should be installed in personal computers/servers with desktop CPUs, and the cost and power consumption of constructing a GPU cluster platform are very high. In recent years, NVIDIA releases an embedded board, called Jetson Tegra K1 (TK1), which contains 4 ARM Cortex-A15 CPUs and 192 Compute Unified Device Architecture cores (belong to Kepler GPUs). Jetson Tegra K1 has several advantages, such as the low cost, low power consumption, and high applicability, and it has been applied into several specific applications. In our previous work, a bioinformatics platform with a single TK1 (STK platform) was constructed, and this previous work is also used to prove that the Web and mobile services can be implemented in the STK platform with a good cost-performance ratio by comparing a STK platform with the desktop CPU and GPU. In this work, an embedded-based GPU cluster platform will be constructed with multiple TK1s (MTK platform). Complex system installation and setup are necessary procedures at first. Then, 2 job assignment modes are designed for the MTK platform to provide services for users. Finally, ClustalW v2.0.11 and ClustalWtk will be ported to the MTK platform. The experimental results showed that the speedup ratios achieved 5.5 and 4.8 times for ClustalW v2.0.11 and ClustalWtk, respectively, by comparing 6 TK1s with a single TK1. The MTK platform is proven to be useful for multiple sequence alignments. PMID:28835734

  13. Embedded-Based Graphics Processing Unit Cluster Platform for Multiple Sequence Alignments.

    PubMed

    Wei, Jyh-Da; Cheng, Hui-Jun; Lin, Chun-Yuan; Ye, Jin; Yeh, Kuan-Yu

    2017-01-01

    High-end graphics processing units (GPUs), such as NVIDIA Tesla/Fermi/Kepler series cards with thousands of cores per chip, are widely applied to high-performance computing fields in a decade. These desktop GPU cards should be installed in personal computers/servers with desktop CPUs, and the cost and power consumption of constructing a GPU cluster platform are very high. In recent years, NVIDIA releases an embedded board, called Jetson Tegra K1 (TK1), which contains 4 ARM Cortex-A15 CPUs and 192 Compute Unified Device Architecture cores (belong to Kepler GPUs). Jetson Tegra K1 has several advantages, such as the low cost, low power consumption, and high applicability, and it has been applied into several specific applications. In our previous work, a bioinformatics platform with a single TK1 (STK platform) was constructed, and this previous work is also used to prove that the Web and mobile services can be implemented in the STK platform with a good cost-performance ratio by comparing a STK platform with the desktop CPU and GPU. In this work, an embedded-based GPU cluster platform will be constructed with multiple TK1s (MTK platform). Complex system installation and setup are necessary procedures at first. Then, 2 job assignment modes are designed for the MTK platform to provide services for users. Finally, ClustalW v2.0.11 and ClustalWtk will be ported to the MTK platform. The experimental results showed that the speedup ratios achieved 5.5 and 4.8 times for ClustalW v2.0.11 and ClustalWtk, respectively, by comparing 6 TK1s with a single TK1. The MTK platform is proven to be useful for multiple sequence alignments.

  14. GPU-based High-Performance Computing for Radiation Therapy

    PubMed Central

    Jia, Xun; Ziegenhein, Peter; Jiang, Steve B.

    2014-01-01

    Recent developments in radiotherapy therapy demand high computation powers to solve challenging problems in a timely fashion in a clinical environment. Graphics processing unit (GPU), as an emerging high-performance computing platform, has been introduced to radiotherapy. It is particularly attractive due to its high computational power, small size, and low cost for facility deployment and maintenance. Over the past a few years, GPU-based high-performance computing in radiotherapy has experienced rapid developments. A tremendous amount of studies have been conducted, in which large acceleration factors compared with the conventional CPU platform have been observed. In this article, we will first give a brief introduction to the GPU hardware structure and programming model. We will then review the current applications of GPU in major imaging-related and therapy-related problems encountered in radiotherapy. A comparison of GPU with other platforms will also be presented. PMID:24486639

  15. OpenACC performance for simulating 2D radial dambreak using FVM HLLE flux

    NASA Astrophysics Data System (ADS)

    Gunawan, P. H.; Pahlevi, M. R.

    2018-03-01

    The aim of this paper is to investigate the performances of openACC platform for computing 2D radial dambreak. Here, the shallow water equation will be used to describe and simulate 2D radial dambreak with finite volume method (FVM) using HLLE flux. OpenACC is a parallel computing platform based on GPU cores. Indeed, from this research this platform is used to minimize computational time on the numerical scheme performance. The results show the using OpenACC, the computational time is reduced. For the dry and wet radial dambreak simulations using 2048 grids, the computational time of parallel is obtained 575.984 s and 584.830 s respectively for both simulations. These results show the successful of OpenACC when they are compared with the serial time of dry and wet radial dambreak simulations which are collected 28047.500 s and 29269.40 s respectively.

  16. MACBenAbim: A Multi-platform Mobile Application for searching keyterms in Computational Biology and Bioinformatics.

    PubMed

    Oluwagbemi, Olugbenga O; Adewumi, Adewole; Esuruoso, Abimbola

    2012-01-01

    Computational biology and bioinformatics are gradually gaining grounds in Africa and other developing nations of the world. However, in these countries, some of the challenges of computational biology and bioinformatics education are inadequate infrastructures, and lack of readily-available complementary and motivational tools to support learning as well as research. This has lowered the morale of many promising undergraduates, postgraduates and researchers from aspiring to undertake future study in these fields. In this paper, we developed and described MACBenAbim (Multi-platform Mobile Application for Computational Biology and Bioinformatics), a flexible user-friendly tool to search for, define and describe the meanings of keyterms in computational biology and bioinformatics, thus expanding the frontiers of knowledge of the users. This tool also has the capability of achieving visualization of results on a mobile multi-platform context. MACBenAbim is available from the authors for non-commercial purposes.

  17. Workload Characterization of CFD Applications Using Partial Differential Equation Solvers

    NASA Technical Reports Server (NTRS)

    Waheed, Abdul; Yan, Jerry; Saini, Subhash (Technical Monitor)

    1998-01-01

    Workload characterization is used for modeling and evaluating of computing systems at different levels of detail. We present workload characterization for a class of Computational Fluid Dynamics (CFD) applications that solve Partial Differential Equations (PDEs). This workload characterization focuses on three high performance computing platforms: SGI Origin2000, EBM SP-2, a cluster of Intel Pentium Pro bases PCs. We execute extensive measurement-based experiments on these platforms to gather statistics of system resource usage, which results in workload characterization. Our workload characterization approach yields a coarse-grain resource utilization behavior that is being applied for performance modeling and evaluation of distributed high performance metacomputing systems. In addition, this study enhances our understanding of interactions between PDE solver workloads and high performance computing platforms and is useful for tuning these applications.

  18. 78 FR 41046 - Advanced Scientific Computing Advisory Committee

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-07-09

    ... Services Administration, notice is hereby given that the Advanced Scientific Computing Advisory Committee will be renewed for a two-year period beginning on July 1, 2013. The Committee will provide advice to the Director, Office of Science (DOE), on the Advanced Scientific Computing Research Program managed...

  19. Interplanetary monitoring platform engineering history and achievements

    NASA Technical Reports Server (NTRS)

    Butler, P. M.

    1980-01-01

    In the fall of 1979, last of ten Interplanetary Monitoring Platform Satellite (IMP) missions ended a ten year series of flights dedicated to obtaining new knowledge of the radiation effects in outer space and of solar phenomena during a period of maximum solar flare activity. The technological achievements and scientific accomplishments from the IMP program are described.

  20. Sustainable access to data, products, services and software from the European seismological Research Infrastructures: the EPOS TCS Seismology

    NASA Astrophysics Data System (ADS)

    Haslinger, Florian; Dupont, Aurelien; Michelini, Alberto; Rietbrock, Andreas; Sleeman, Reinoud; Wiemer, Stefan; Basili, Roberto; Bossu, Rémy; Cakti, Eser; Cotton, Fabrice; Crawford, Wayne; Diaz, Jordi; Garth, Tom; Locati, Mario; Luzi, Lucia; Pinho, Rui; Pitilakis, Kyriazis; Strollo, Angelo

    2016-04-01

    Easy, efficient and comprehensive access to data, data products, scientific services and scientific software is a key ingredient in enabling research at the frontiers of science. Organizing this access across the European Research Infrastructures in the field of seismology, so that it best serves user needs, takes advantage of state-of-the-art ICT solutions, provides cross-domain interoperability, and is organizationally and financially sustainable in the long term, is the core challenge of the implementation phase of the Thematic Core Service (TCS) Seismology within the EPOS-IP project. Building upon the existing European-level infrastructures ORFEUS for seismological waveforms, EMSC for seismological products, and EFEHR for seismological hazard and risk information, and implementing a pilot Computational Earth Science service starting from the results of the VERCE project, the work within the EPOS-IP project focuses on improving and extending the existing services, aligning them with global developments, to at the end produce a well coordinated framework that is technically, organizationally, and financially integrated with the EPOS architecture. This framework needs to respect the roles and responsibilities of the underlying national research infrastructures that are the data owners and main providers of data and products, and allow for active input and feedback from the (scientific) user community. At the same time, it needs to remain flexible enough to cope with unavoidable challenges in the availability of resources and dynamics of contributors. The technical work during the next years is organized in four areas: - constructing the next generation software architecture for the European Integrated (waveform) Data Archive EIDA, developing advanced metadata and station information services, fully integrate strong motion waveforms and derived parametric engineering-domain data, and advancing the integration of mobile (temporary) networks and OBS deployments in EIDA; - further development and expansion of services to access seismological products of scientific interest as provided by the community by implementing a common collection and development (IT) platform, improvements in the earthquake information services e.g. by introducing more robust quality indicators and diversifying collection and dissemination mechanisms, as well as improving historical earthquake data services; - development of a comprehensive suite of earthquake hazard products, tools, and services harmonized on the European level and available through a common access platform, encompassing information on seismic sources, seismogenic faults, ground-motion prediction equations, geotechnical information, and strong-motion recordings in buildings, together with an interface to earthquake risk; - a portal implementation of computational seismology tools and services, specifically for seismic waveform propagation in complex 3D media following the results of the VERCE project, and initiating the inclusion of further suitable codes on that portal in discussion with the community, forming the basis of EPOS computational earth science infrastructure. This will be accompanied by development and implementation of integrated and interoperable metadata structures, adequate and referencable persistent identifiers, and appropriate user access and authorization mechanisms. Here we present further detail on the work plan with the attempt to foster interaction with the target user community on the spectrum of services as well as on feedback mechanisms and governance.

  1. Low-cost embedded systems for democratizing ocean sensor technology in the coastal zone

    NASA Astrophysics Data System (ADS)

    Glazer, B. T.; Lio, H. I.

    2017-12-01

    Environmental sciences suffer from undersampling. Enabling sustained and unattended data collection in the coastal zone typically involves expensive instrumentation and infrastructure deployed as cabled observatories or moorings with little flexibility in deployment location following initial installation. High costs of commercially-available or custom instruments have limited the number of sensor sites that can be targeted by academic researchers, and have also limited engagement with the public. We have developed a novel, low-cost, open-source sensor and software platform to enable wireless data transfer of biogeochemical sensors in the coastal zone. The platform is centered upon widely available, low-cost, single board computers and microcontrollers. We have used a blend of on-hand research-grade sensors and low-cost open-source electronics that can be assembled by tech-savvy non-engineers. Robust, open-source code that remains customizable for specific miniNode configurations can match a specific site's measurement needs, depending on the scientific research priorities. We have demonstrated prototype capabilities and versatility through lab testing and field deployments of multiple sensor nodes with multiple sensor inputs, all of which are streaming near-real-time data from Kaneohe Bay over wireless RF links to a shore-based base station.

  2. Approaching the exa-scale: a real-world evaluation of rendering extremely large data sets

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Patchett, John M; Ahrens, James P; Lo, Li - Ta

    2010-10-15

    Extremely large scale analysis is becoming increasingly important as supercomputers and their simulations move from petascale to exascale. The lack of dedicated hardware acceleration for rendering on today's supercomputing platforms motivates our detailed evaluation of the possibility of interactive rendering on the supercomputer. In order to facilitate our understanding of rendering on the supercomputing platform, we focus on scalability of rendering algorithms and architecture envisioned for exascale datasets. To understand tradeoffs for dealing with extremely large datasets, we compare three different rendering algorithms for large polygonal data: software based ray tracing, software based rasterization and hardware accelerated rasterization. We presentmore » a case study of strong and weak scaling of rendering extremely large data on both GPU and CPU based parallel supercomputers using Para View, a parallel visualization tool. Wc use three different data sets: two synthetic and one from a scientific application. At an extreme scale, algorithmic rendering choices make a difference and should be considered while approaching exascale computing, visualization, and analysis. We find software based ray-tracing offers a viable approach for scalable rendering of the projected future massive data sizes.« less

  3. ACQ4: an open-source software platform for data acquisition and analysis in neurophysiology research

    PubMed Central

    Campagnola, Luke; Kratz, Megan B.; Manis, Paul B.

    2014-01-01

    The complexity of modern neurophysiology experiments requires specialized software to coordinate multiple acquisition devices and analyze the collected data. We have developed ACQ4, an open-source software platform for performing data acquisition and analysis in experimental neurophysiology. This software integrates the tasks of acquiring, managing, and analyzing experimental data. ACQ4 has been used primarily for standard patch-clamp electrophysiology, laser scanning photostimulation, multiphoton microscopy, intrinsic imaging, and calcium imaging. The system is highly modular, which facilitates the addition of new devices and functionality. The modules included with ACQ4 provide for rapid construction of acquisition protocols, live video display, and customizable analysis tools. Position-aware data collection allows automated construction of image mosaics and registration of images with 3-dimensional anatomical atlases. ACQ4 uses free and open-source tools including Python, NumPy/SciPy for numerical computation, PyQt for the user interface, and PyQtGraph for scientific graphics. Supported hardware includes cameras, patch clamp amplifiers, scanning mirrors, lasers, shutters, Pockels cells, motorized stages, and more. ACQ4 is available for download at http://www.acq4.org. PMID:24523692

  4. A Specification for a Godunov-type Eulerian 2-D Hydrocode, Revision 0

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nystrom, William D; Robey, Jonathan M

    2012-05-01

    The purpose of this code specification is to describe an algorithm for solving the Euler equations of hydrodynamics in a 2D rectangular region in sufficient detail to allow a software developer to produce an implementation on their target platform using their programming language of choice without requiring detailed knowledge and experience in the field of computational fluid dynamics. It should be possible for a software developer who is proficient in the programming language of choice and is knowledgable of the target hardware to produce an efficient implementation of this specification if they also possess a thorough working knowledge of parallelmore » programming and have some experience in scientific programming using fields and meshes. On modern architectures, it will be important to focus on issues related to the exploitation of the fine grain parallelism and data locality present in this algorithm. This specification aims to make that task easier by presenting the essential details of the algorithm in a systematic and language neutral manner while also avoiding the inclusion of implementation details that would likely be specific to a particular type of programming paradigm or platform architecture.« less

  5. An integrated compact airborne multispectral imaging system using embedded computer

    NASA Astrophysics Data System (ADS)

    Zhang, Yuedong; Wang, Li; Zhang, Xuguo

    2015-08-01

    An integrated compact airborne multispectral imaging system using embedded computer based control system was developed for small aircraft multispectral imaging application. The multispectral imaging system integrates CMOS camera, filter wheel with eight filters, two-axis stabilized platform, miniature POS (position and orientation system) and embedded computer. The embedded computer has excellent universality and expansibility, and has advantages in volume and weight for airborne platform, so it can meet the requirements of control system of the integrated airborne multispectral imaging system. The embedded computer controls the camera parameters setting, filter wheel and stabilized platform working, image and POS data acquisition, and stores the image and data. The airborne multispectral imaging system can connect peripheral device use the ports of the embedded computer, so the system operation and the stored image data management are easy. This airborne multispectral imaging system has advantages of small volume, multi-function, and good expansibility. The imaging experiment results show that this system has potential for multispectral remote sensing in applications such as resource investigation and environmental monitoring.

  6. Next Generation Workload Management System For Big Data on Heterogeneous Distributed Computing

    DOE PAGES

    Klimentov, A.; Buncic, P.; De, K.; ...

    2015-05-22

    The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe, and were recently credited for the discovery of a Higgs boson. ATLAS and ALICE are the largest collaborations ever assembled in the sciences and are at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, both experiments rely on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses PanDA (Production and Data Analysis) Workload Managementmore » System (WMS) for managing the workflow for all data processing on hundreds of data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. The scale is demonstrated by the following numbers: PanDA manages O(10 2) sites, O(10 5) cores, O(10 8) jobs per year, O(10 3) users, and ATLAS data volume is O(10 17) bytes. In 2013 we started an ambitious program to expand PanDA to all available computing resources, including opportunistic use of commercial and academic clouds and Leadership Computing Facilities (LCF). The project titled 'Next Generation Workload Management and Analysis System for Big Data' (BigPanDA) is funded by DOE ASCR and HEP. Extending PanDA to clouds and LCF presents new challenges in managing heterogeneity and supporting workflow. The BigPanDA project is underway to setup and tailor PanDA at the Oak Ridge Leadership Computing Facility (OLCF) and at the National Research Center "Kurchatov Institute" together with ALICE distributed computing and ORNL computing professionals. Our approach to integration of HPC platforms at the OLCF and elsewhere is to reuse, as much as possible, existing components of the PanDA system. Finally, we will present our current accomplishments with running the PanDA WMS at OLCF and other supercomputers and demonstrate our ability to use PanDA as a portal independent of the computing facilities infrastructure for High Energy and Nuclear Physics as well as other data-intensive science applications.« less

  7. Next Generation Workload Management System For Big Data on Heterogeneous Distributed Computing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Klimentov, A.; Buncic, P.; De, K.

    The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe, and were recently credited for the discovery of a Higgs boson. ATLAS and ALICE are the largest collaborations ever assembled in the sciences and are at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, both experiments rely on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses PanDA (Production and Data Analysis) Workload Managementmore » System (WMS) for managing the workflow for all data processing on hundreds of data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. The scale is demonstrated by the following numbers: PanDA manages O(10 2) sites, O(10 5) cores, O(10 8) jobs per year, O(10 3) users, and ATLAS data volume is O(10 17) bytes. In 2013 we started an ambitious program to expand PanDA to all available computing resources, including opportunistic use of commercial and academic clouds and Leadership Computing Facilities (LCF). The project titled 'Next Generation Workload Management and Analysis System for Big Data' (BigPanDA) is funded by DOE ASCR and HEP. Extending PanDA to clouds and LCF presents new challenges in managing heterogeneity and supporting workflow. The BigPanDA project is underway to setup and tailor PanDA at the Oak Ridge Leadership Computing Facility (OLCF) and at the National Research Center "Kurchatov Institute" together with ALICE distributed computing and ORNL computing professionals. Our approach to integration of HPC platforms at the OLCF and elsewhere is to reuse, as much as possible, existing components of the PanDA system. Finally, we will present our current accomplishments with running the PanDA WMS at OLCF and other supercomputers and demonstrate our ability to use PanDA as a portal independent of the computing facilities infrastructure for High Energy and Nuclear Physics as well as other data-intensive science applications.« less

  8. ESA's Multi-mission Sentinel-1 Toolbox

    NASA Astrophysics Data System (ADS)

    Veci, Luis; Lu, Jun; Foumelis, Michael; Engdahl, Marcus

    2017-04-01

    The Sentinel-1 Toolbox is a new open source software for scientific learning, research and exploitation of the large archives of Sentinel and heritage missions. The Toolbox is based on the proven BEAM/NEST architecture inheriting all current NEST functionality including multi-mission support for most civilian satellite SAR missions. The project is funded through ESA's Scientific Exploitation of Operational Missions (SEOM). The Sentinel-1 Toolbox will strive to serve the SEOM mandate by providing leading-edge software to the science and application users in support of ESA's operational SAR mission as well as by educating and growing a SAR user community. The Toolbox consists of a collection of processing tools, data product readers and writers and a display and analysis application. A common architecture for all Sentinel Toolboxes is being jointly developed by Brockmann Consult, Array Systems Computing and C-S called the Sentinel Application Platform (SNAP). The SNAP architecture is ideal for Earth Observation processing and analysis due the following technological innovations: Extensibility, Portability, Modular Rich Client Platform, Generic EO Data Abstraction, Tiled Memory Management, and a Graph Processing Framework. The project has developed new tools for working with Sentinel-1 data in particular for working with the new Interferometric TOPSAR mode. TOPSAR Complex Coregistration and a complete Interferometric processing chain has been implemented for Sentinel-1 TOPSAR data. To accomplish this, a coregistration following the Spectral Diversity[4] method has been developed as well as special azimuth handling in the coherence, interferogram and spectral filter operators. The Toolbox includes reading of L0, L1 and L2 products in SAFE format, calibration and de-noising, slice product assembling, TOPSAR deburst and sub-swath merging, terrain flattening radiometric normalization, and visualization for L2 OCN products. The Toolbox also provides several new tools for exploitation of polarimetric data including speckle filters, decompositions, and classifiers. The Toolbox will also include tools for large data stacks, supervised and unsupervised classification, improved vector handling and change detection. Architectural improvements such as smart memory configuration, task queuing, and optimizations for complex data will provide better support and performance for very large products and stacks.In addition, a Cloud Exploitation Platform Extension (CEP) has been developed to add the capability to smoothly utilize a cloud computing platform where EO data repositories and high performance processing capabilities are available. The extension to the SENTINEL Application Platform would facilitate entry into cloud processing services for supporting bulk processing on high performance clusters. Since December 2016, the COMET-LiCS InSAR portal (http://comet.nerc.ac.uk/COMET-LiCS-portal/) has been live, delivering interferograms and coherence estimates over the entire Alpine-Himalayan belt. The portal already contains tens of thousands of products, which can be browsed in a user-friendly portal, and downloaded for free by the general public. For our processing, we use the facilities at the Climate and Environmental Monitoring from Space (CEMS). Here we have large storage and processing facilities to our disposal, and a complete duplicate of the Sentinel-1 archive is maintained. This greatly simplifies the infrastructure we had to develop for automated processing of large areas. Here we will give an overview of the current status of the processing system, as well as discuss future plans. We will cover the infrastructure we developed to automatically produce interferograms and its challenges, and the processing strategy for time series analysis. We will outline the objectives of the system in the near and distant future, and a roadmap for its continued development. Finally, we will highlight some of the scientific results and projects linked to the system.

  9. Cloud hosting of the IPython Notebook to Provide Collaborative Research Environments for Big Data Analysis

    NASA Astrophysics Data System (ADS)

    Kershaw, Philip; Lawrence, Bryan; Gomez-Dans, Jose; Holt, John

    2015-04-01

    We explore how the popular IPython Notebook computing system can be hosted on a cloud platform to provide a flexible virtual research hosting environment for Earth Observation data processing and analysis and how this approach can be expanded more broadly into a generic SaaS (Software as a Service) offering for the environmental sciences. OPTIRAD (OPTImisation environment for joint retrieval of multi-sensor RADiances) is a project funded by the European Space Agency to develop a collaborative research environment for Data Assimilation of Earth Observation products for land surface applications. Data Assimilation provides a powerful means to combine multiple sources of data and derive new products for this application domain. To be most effective, it requires close collaboration between specialists in this field, land surface modellers and end users of data generated. A goal of OPTIRAD then is to develop a collaborative research environment to engender shared working. Another significant challenge is that of data volume and complexity. Study of land surface requires high spatial and temporal resolutions, a relatively large number of variables and the application of algorithms which are computationally expensive. These problems can be addressed with the application of parallel processing techniques on specialist compute clusters. However, scientific users are often deterred by the time investment required to port their codes to these environments. Even when successfully achieved, it may be difficult to readily change or update. This runs counter to the scientific process of continuous experimentation, analysis and validation. The IPython Notebook provides users with a web-based interface to multiple interactive shells for the Python programming language. Code, documentation and graphical content can be saved and shared making it directly applicable to OPTIRAD's requirements for a shared working environment. Given the web interface it can be readily made into a hosted service with Wakari and Microsoft Azure being notable examples. Cloud-hosting of the Notebook allows the same familiar Python interface to be retained but backed by Cloud Computing attributes of scalability, elasticity and resource pooling. This combination makes it a powerful solution to address the needs of long-tail science users of Big Data: an intuitive interactive interface with which to access powerful compute resources. IPython Notebook can be hosted as a single user desktop environment but the recent development by the IPython community of JupyterHub enables it to be run as a multi-user hosting environment. In addition, IPython.parallel allows the exposition of parallel compute infrastructure through a Python interface. Applying these technologies in combination, a collaborative research environment has been developed for OPTIRAD on the UK JASMIN/CEMS facility's private cloud (http://jasmin.ac.uk). Based on this experience, a generic virtualised solution is under development suitable for use by the wider environmental science community - on both JASMIN and portable to third party cloud platforms.

  10. The Design of a High Performance Earth Imagery and Raster Data Management and Processing Platform

    NASA Astrophysics Data System (ADS)

    Xie, Qingyun

    2016-06-01

    This paper summarizes the general requirements and specific characteristics of both geospatial raster database management system and raster data processing platform from a domain-specific perspective as well as from a computing point of view. It also discusses the need of tight integration between the database system and the processing system. These requirements resulted in Oracle Spatial GeoRaster, a global scale and high performance earth imagery and raster data management and processing platform. The rationale, design, implementation, and benefits of Oracle Spatial GeoRaster are described. Basically, as a database management system, GeoRaster defines an integrated raster data model, supports image compression, data manipulation, general and spatial indices, content and context based queries and updates, versioning, concurrency, security, replication, standby, backup and recovery, multitenancy, and ETL. It provides high scalability using computer and storage clustering. As a raster data processing platform, GeoRaster provides basic operations, image processing, raster analytics, and data distribution featuring high performance computing (HPC). Specifically, HPC features include locality computing, concurrent processing, parallel processing, and in-memory computing. In addition, the APIs and the plug-in architecture are discussed.

  11. Novel Scientific Visualization Interfaces for Interactive Information Visualization and Sharing

    NASA Astrophysics Data System (ADS)

    Demir, I.; Krajewski, W. F.

    2012-12-01

    As geoscientists are confronted with increasingly massive datasets from environmental observations to simulations, one of the biggest challenges is having the right tools to gain scientific insight from the data and communicate the understanding to stakeholders. Recent developments in web technologies make it easy to manage, visualize and share large data sets with general public. Novel visualization techniques and dynamic user interfaces allow users to interact with data, and modify the parameters to create custom views of the data to gain insight from simulations and environmental observations. This requires developing new data models and intelligent knowledge discovery techniques to explore and extract information from complex computational simulations or large data repositories. Scientific visualization will be an increasingly important component to build comprehensive environmental information platforms. This presentation provides an overview of the trends and challenges in the field of scientific visualization, and demonstrates information visualization and communication tools in the Iowa Flood Information System (IFIS), developed within the light of these challenges. The IFIS is a web-based platform developed by the Iowa Flood Center (IFC) to provide access to and visualization of flood inundation maps, real-time flood conditions, flood forecasts both short-term and seasonal, and other flood-related data for communities in Iowa. The key element of the system's architecture is the notion of community. Locations of the communities, those near streams and rivers, define basin boundaries. The IFIS provides community-centric watershed and river characteristics, weather (rainfall) conditions, and streamflow data and visualization tools. Interactive interfaces allow access to inundation maps for different stage and return period values, and flooding scenarios with contributions from multiple rivers. Real-time and historical data of water levels, gauge heights, and rainfall conditions are available in the IFIS. 2D and 3D interactive visualizations in the IFIS make the data more understandable to general public. Users are able to filter data sources for their communities and selected rivers. The data and information on IFIS is also accessible through web services and mobile applications. The IFIS is optimized for various browsers and screen sizes to provide access through multiple platforms including tablets and mobile devices. Multiple view modes in the IFIS accommodate different user types from general public to researchers and decision makers by providing different level of tools and details. River view mode allows users to visualize data from multiple IFC bridge sensors and USGS stream gauges to follow flooding condition along a river. The IFIS will help communities make better-informed decisions on the occurrence of floods, and will alert communities in advance to help minimize damage of floods.

  12. Sentinel-1 Interferometry from the Cloud to the Scientist

    NASA Astrophysics Data System (ADS)

    Garron, J.; Stoner, C.; Johnston, A.; Arko, S. A.

    2017-12-01

    Big data problems and solutions are growing in the technological and scientific sectors daily. Cloud computing is a vertically and horizontally scalable solution available now for archiving and processing large volumes of data quickly, without significant on-site computing hardware costs. Be that as it may, the conversion of scientific data processors to these powerful platforms requires not only the proof of concept, but the demonstration of credibility in an operational setting. The Alaska Satellite Facility (ASF) Distributed Active Archive Center (DAAC), in partnership with NASA's Jet Propulsion Laboratory, is exploring the functional architecture of Amazon Web Services cloud computing environment for the processing, distribution and archival of Synthetic Aperture Radar data in preparation for the NASA-ISRO Synthetic Aperture Radar (NISAR) Mission. Leveraging built-in AWS services for logging, monitoring and dashboarding, the GRFN (Getting Ready for NISAR) team has built a scalable processing, distribution and archival system of Sentinel-1 L2 interferograms produced using the ISCE algorithm. This cloud-based functional prototype provides interferograms over selected global land deformation features (volcanoes, land subsidence, seismic zones) and are accessible to scientists via NASA's EarthData Search client and the ASF DAACs primary SAR interface, Vertex, for direct download. The interferograms are produced using nearest-neighbor logic for identifying pairs of granules for interferometric processing, creating deep stacks of BETA products from almost every satellite orbit for scientists to explore. This presentation highlights the functional lessons learned to date from this exercise, including the cost analysis of various data lifecycle policies as implemented through AWS. While demonstrating the architecture choices in support of efficient big science data management, we invite feedback and questions about the process and products from the InSAR community.

  13. Digital Rocks Portal: a sustainable platform for imaged dataset sharing, translation and automated analysis

    NASA Astrophysics Data System (ADS)

    Prodanovic, M.; Esteva, M.; Hanlon, M.; Nanda, G.; Agarwal, P.

    2015-12-01

    Recent advances in imaging have provided a wealth of 3D datasets that reveal pore space microstructure (nm to cm length scale) and allow investigation of nonlinear flow and mechanical phenomena from first principles using numerical approaches. This framework has popularly been called "digital rock physics". Researchers, however, have trouble storing and sharing the datasets both due to their size and the lack of standardized image types and associated metadata for volumetric datasets. This impedes scientific cross-validation of the numerical approaches that characterize large scale porous media properties, as well as development of multiscale approaches required for correct upscaling. A single research group typically specializes in an imaging modality and/or related modeling on a single length scale, and lack of data-sharing infrastructure makes it difficult to integrate different length scales. We developed a sustainable, open and easy-to-use repository called the Digital Rocks Portal, that (1) organizes images and related experimental measurements of different porous materials, (2) improves access to them for a wider community of geosciences or engineering researchers not necessarily trained in computer science or data analysis. Once widely accepter, the repository will jumpstart productivity and enable scientific inquiry and engineering decisions founded on a data-driven basis. This is the first repository of its kind. We show initial results on incorporating essential software tools and pipelines that make it easier for researchers to store and reuse data, and for educators to quickly visualize and illustrate concepts to a wide audience. For data sustainability and continuous access, the portal is implemented within the reliable, 24/7 maintained High Performance Computing Infrastructure supported by the Texas Advanced Computing Center (TACC) at the University of Texas at Austin. Long-term storage is provided through the University of Texas System Research Cyber-infrastructure initiative.

  14. Introductory Tools for Radiative Transfer Models

    NASA Astrophysics Data System (ADS)

    Feldman, D.; Kuai, L.; Natraj, V.; Yung, Y.

    2006-12-01

    Satellite data are currently so voluminous that, despite their unprecedented quality and potential for scientific application, only a small fraction is analyzed due to two factors: researchers' computational constraints and a relatively small number of researchers actively utilizing the data. Ultimately it is hoped that the terabytes of unanalyzed data being archived can receive scientific scrutiny but this will require a popularization of the methods associated with the analysis. Since a large portion of complexity is associated with the proper implementation of the radiative transfer model, it is reasonable and appropriate to make the model as accessible as possible to general audiences. Unfortunately, the algorithmic and conceptual details that are necessary for state-of-the-art analysis also tend to frustrate the accessibility for those new to remote sensing. Several efforts have been made to have web- based radiative transfer calculations, and these are useful for limited calculations, but analysis of more than a few spectra requires the utilization of home- or server-based computing resources. We present a system that is designed to allow for easier access to radiative transfer models with implementation on a home computing platform in the hopes that this system can be utilized in and expanded upon in advanced high school and introductory college settings. This learning-by-doing process is aided through the use of several powerful tools. The first is a wikipedia-style introduction to the salient features of radiative transfer that references the seminal works in the field and refers to more complicated calculations and algorithms sparingly5. The second feature is a technical forum, commonly referred to as a tiki-wiki, that addresses technical and conceptual questions through public postings, private messages, and a ranked searching routine. Together, these tools may be able to facilitate greater interest in the field of remote sensing.

  15. GeNNet: an integrated platform for unifying scientific workflows and graph databases for transcriptome data analysis

    PubMed Central

    Gadelha, Luiz; Ribeiro-Alves, Marcelo; Porto, Fábio

    2017-01-01

    There are many steps in analyzing transcriptome data, from the acquisition of raw data to the selection of a subset of representative genes that explain a scientific hypothesis. The data produced can be represented as networks of interactions among genes and these may additionally be integrated with other biological databases, such as Protein-Protein Interactions, transcription factors and gene annotation. However, the results of these analyses remain fragmented, imposing difficulties, either for posterior inspection of results, or for meta-analysis by the incorporation of new related data. Integrating databases and tools into scientific workflows, orchestrating their execution, and managing the resulting data and its respective metadata are challenging tasks. Additionally, a great amount of effort is equally required to run in-silico experiments to structure and compose the information as needed for analysis. Different programs may need to be applied and different files are produced during the experiment cycle. In this context, the availability of a platform supporting experiment execution is paramount. We present GeNNet, an integrated transcriptome analysis platform that unifies scientific workflows with graph databases for selecting relevant genes according to the evaluated biological systems. It includes GeNNet-Wf, a scientific workflow that pre-loads biological data, pre-processes raw microarray data and conducts a series of analyses including normalization, differential expression inference, clusterization and gene set enrichment analysis. A user-friendly web interface, GeNNet-Web, allows for setting parameters, executing, and visualizing the results of GeNNet-Wf executions. To demonstrate the features of GeNNet, we performed case studies with data retrieved from GEO, particularly using a single-factor experiment in different analysis scenarios. As a result, we obtained differentially expressed genes for which biological functions were analyzed. The results are integrated into GeNNet-DB, a database about genes, clusters, experiments and their properties and relationships. The resulting graph database is explored with queries that demonstrate the expressiveness of this data model for reasoning about gene interaction networks. GeNNet is the first platform to integrate the analytical process of transcriptome data with graph databases. It provides a comprehensive set of tools that would otherwise be challenging for non-expert users to install and use. Developers can add new functionality to components of GeNNet. The derived data allows for testing previous hypotheses about an experiment and exploring new ones through the interactive graph database environment. It enables the analysis of different data on humans, rhesus, mice and rat coming from Affymetrix platforms. GeNNet is available as an open source platform at https://github.com/raquele/GeNNet and can be retrieved as a software container with the command docker pull quelopes/gennet. PMID:28695067

  16. GeNNet: an integrated platform for unifying scientific workflows and graph databases for transcriptome data analysis.

    PubMed

    Costa, Raquel L; Gadelha, Luiz; Ribeiro-Alves, Marcelo; Porto, Fábio

    2017-01-01

    There are many steps in analyzing transcriptome data, from the acquisition of raw data to the selection of a subset of representative genes that explain a scientific hypothesis. The data produced can be represented as networks of interactions among genes and these may additionally be integrated with other biological databases, such as Protein-Protein Interactions, transcription factors and gene annotation. However, the results of these analyses remain fragmented, imposing difficulties, either for posterior inspection of results, or for meta-analysis by the incorporation of new related data. Integrating databases and tools into scientific workflows, orchestrating their execution, and managing the resulting data and its respective metadata are challenging tasks. Additionally, a great amount of effort is equally required to run in-silico experiments to structure and compose the information as needed for analysis. Different programs may need to be applied and different files are produced during the experiment cycle. In this context, the availability of a platform supporting experiment execution is paramount. We present GeNNet, an integrated transcriptome analysis platform that unifies scientific workflows with graph databases for selecting relevant genes according to the evaluated biological systems. It includes GeNNet-Wf, a scientific workflow that pre-loads biological data, pre-processes raw microarray data and conducts a series of analyses including normalization, differential expression inference, clusterization and gene set enrichment analysis. A user-friendly web interface, GeNNet-Web, allows for setting parameters, executing, and visualizing the results of GeNNet-Wf executions. To demonstrate the features of GeNNet, we performed case studies with data retrieved from GEO, particularly using a single-factor experiment in different analysis scenarios. As a result, we obtained differentially expressed genes for which biological functions were analyzed. The results are integrated into GeNNet-DB, a database about genes, clusters, experiments and their properties and relationships. The resulting graph database is explored with queries that demonstrate the expressiveness of this data model for reasoning about gene interaction networks. GeNNet is the first platform to integrate the analytical process of transcriptome data with graph databases. It provides a comprehensive set of tools that would otherwise be challenging for non-expert users to install and use. Developers can add new functionality to components of GeNNet. The derived data allows for testing previous hypotheses about an experiment and exploring new ones through the interactive graph database environment. It enables the analysis of different data on humans, rhesus, mice and rat coming from Affymetrix platforms. GeNNet is available as an open source platform at https://github.com/raquele/GeNNet and can be retrieved as a software container with the command docker pull quelopes/gennet.

  17. Museum-University Partnerships as a New Platform for Public Engagement with Scientific Research

    ERIC Educational Resources Information Center

    Bell, Jamie; Chesebrough, David; Cryan, Jason; Koster, Emlyn

    2016-01-01

    A growing trend in natural history museums, science museums, and science centers is the establishment of innovative new partnerships with universities to bring scientific research to the public in compelling and transformative ways. The strengths of both kinds of institutions are leveraged in effective and publicly visible programs, activities,…

  18. The Generation Challenge Programme Platform: Semantic Standards and Workbench for Crop Science

    PubMed Central

    Bruskiewich, Richard; Senger, Martin; Davenport, Guy; Ruiz, Manuel; Rouard, Mathieu; Hazekamp, Tom; Takeya, Masaru; Doi, Koji; Satoh, Kouji; Costa, Marcos; Simon, Reinhard; Balaji, Jayashree; Akintunde, Akinnola; Mauleon, Ramil; Wanchana, Samart; Shah, Trushar; Anacleto, Mylah; Portugal, Arllet; Ulat, Victor Jun; Thongjuea, Supat; Braak, Kyle; Ritter, Sebastian; Dereeper, Alexis; Skofic, Milko; Rojas, Edwin; Martins, Natalia; Pappas, Georgios; Alamban, Ryan; Almodiel, Roque; Barboza, Lord Hendrix; Detras, Jeffrey; Manansala, Kevin; Mendoza, Michael Jonathan; Morales, Jeffrey; Peralta, Barry; Valerio, Rowena; Zhang, Yi; Gregorio, Sergio; Hermocilla, Joseph; Echavez, Michael; Yap, Jan Michael; Farmer, Andrew; Schiltz, Gary; Lee, Jennifer; Casstevens, Terry; Jaiswal, Pankaj; Meintjes, Ayton; Wilkinson, Mark; Good, Benjamin; Wagner, James; Morris, Jane; Marshall, David; Collins, Anthony; Kikuchi, Shoshi; Metz, Thomas; McLaren, Graham; van Hintum, Theo

    2008-01-01

    The Generation Challenge programme (GCP) is a global crop research consortium directed toward crop improvement through the application of comparative biology and genetic resources characterization to plant breeding. A key consortium research activity is the development of a GCP crop bioinformatics platform to support GCP research. This platform includes the following: (i) shared, public platform-independent domain models, ontology, and data formats to enable interoperability of data and analysis flows within the platform; (ii) web service and registry technologies to identify, share, and integrate information across diverse, globally dispersed data sources, as well as to access high-performance computational (HPC) facilities for computationally intensive, high-throughput analyses of project data; (iii) platform-specific middleware reference implementations of the domain model integrating a suite of public (largely open-access/-source) databases and software tools into a workbench to facilitate biodiversity analysis, comparative analysis of crop genomic data, and plant breeding decision making. PMID:18483570

  19. Social Computing as Next-Gen Learning Paradigm: A Platform and Applications

    NASA Astrophysics Data System (ADS)

    Margherita, Alessandro; Taurino, Cesare; Del Vecchio, Pasquale

    As a field at the intersection between computer science and people behavior, social computing can contribute significantly in the endeavor of innovating how individuals and groups interact for learning and working purposes. In particular, the generation of Internet applications tagged as web 2.0 provides an opportunity to create new “environments” where people can exchange knowledge and experience, create new knowledge and learn together. This chapter illustrates the design and application of a prototypal platform which embeds tools such as blog, wiki, folksonomy and RSS in a unique web-based system. This platform has been developed to support a case-based and project-driven learning strategy for the development of business and technology management competencies in undergraduate and graduate education programs. A set of illustrative scenarios are described to show how a learning community can be promoted, created, and sustained through the technological platform.

  20. A generic, cost-effective, and scalable cell lineage analysis platform

    PubMed Central

    Biezuner, Tamir; Spiro, Adam; Raz, Ofir; Amir, Shiran; Milo, Lilach; Adar, Rivka; Chapal-Ilani, Noa; Berman, Veronika; Fried, Yael; Ainbinder, Elena; Cohen, Galit; Barr, Haim M.; Halaban, Ruth; Shapiro, Ehud

    2016-01-01

    Advances in single-cell genomics enable commensurate improvements in methods for uncovering lineage relations among individual cells. Current sequencing-based methods for cell lineage analysis depend on low-resolution bulk analysis or rely on extensive single-cell sequencing, which is not scalable and could be biased by functional dependencies. Here we show an integrated biochemical-computational platform for generic single-cell lineage analysis that is retrospective, cost-effective, and scalable. It consists of a biochemical-computational pipeline that inputs individual cells, produces targeted single-cell sequencing data, and uses it to generate a lineage tree of the input cells. We validated the platform by applying it to cells sampled from an ex vivo grown tree and analyzed its feasibility landscape by computer simulations. We conclude that the platform may serve as a generic tool for lineage analysis and thus pave the way toward large-scale human cell lineage discovery. PMID:27558250

  1. An interactive parallel programming environment applied in atmospheric science

    NASA Technical Reports Server (NTRS)

    vonLaszewski, G.

    1996-01-01

    This article introduces an interactive parallel programming environment (IPPE) that simplifies the generation and execution of parallel programs. One of the tasks of the environment is to generate message-passing parallel programs for homogeneous and heterogeneous computing platforms. The parallel programs are represented by using visual objects. This is accomplished with the help of a graphical programming editor that is implemented in Java and enables portability to a wide variety of computer platforms. In contrast to other graphical programming systems, reusable parts of the programs can be stored in a program library to support rapid prototyping. In addition, runtime performance data on different computing platforms is collected in a database. A selection process determines dynamically the software and the hardware platform to be used to solve the problem in minimal wall-clock time. The environment is currently being tested on a Grand Challenge problem, the NASA four-dimensional data assimilation system.

  2. Scalable and massively parallel Monte Carlo photon transport simulations for heterogeneous computing platforms

    NASA Astrophysics Data System (ADS)

    Yu, Leiming; Nina-Paravecino, Fanny; Kaeli, David; Fang, Qianqian

    2018-01-01

    We present a highly scalable Monte Carlo (MC) three-dimensional photon transport simulation platform designed for heterogeneous computing systems. Through the development of a massively parallel MC algorithm using the Open Computing Language framework, this research extends our existing graphics processing unit (GPU)-accelerated MC technique to a highly scalable vendor-independent heterogeneous computing environment, achieving significantly improved performance and software portability. A number of parallel computing techniques are investigated to achieve portable performance over a wide range of computing hardware. Furthermore, multiple thread-level and device-level load-balancing strategies are developed to obtain efficient simulations using multiple central processing units and GPUs.

  3. Cellular computational platform and neurally inspired elements thereof

    DOEpatents

    Okandan, Murat

    2016-11-22

    A cellular computational platform is disclosed that includes a multiplicity of functionally identical, repeating computational hardware units that are interconnected electrically and optically. Each computational hardware unit includes a reprogrammable local memory and has interconnections to other such units that have reconfigurable weights. Each computational hardware unit is configured to transmit signals into the network for broadcast in a protocol-less manner to other such units in the network, and to respond to protocol-less broadcast messages that it receives from the network. Each computational hardware unit is further configured to reprogram the local memory in response to incoming electrical and/or optical signals.

  4. Whole earth modeling: developing and disseminating scientific software for computational geophysics.

    NASA Astrophysics Data System (ADS)

    Kellogg, L. H.

    2016-12-01

    Historically, a great deal of specialized scientific software for modeling and data analysis has been developed by individual researchers or small groups of scientists working on their own specific research problems. As the magnitude of available data and computer power has increased, so has the complexity of scientific problems addressed by computational methods, creating both a need to sustain existing scientific software, and expand its development to take advantage of new algorithms, new software approaches, and new computational hardware. To that end, communities like the Computational Infrastructure for Geodynamics (CIG) have been established to support the use of best practices in scientific computing for solid earth geophysics research and teaching. Working as a scientific community enables computational geophysicists to take advantage of technological developments, improve the accuracy and performance of software, build on prior software development, and collaborate more readily. The CIG community, and others, have adopted an open-source development model, in which code is developed and disseminated by the community in an open fashion, using version control and software repositories like Git. One emerging issue is how to adequately identify and credit the intellectual contributions involved in creating open source scientific software. The traditional method of disseminating scientific ideas, peer reviewed publication, was not designed for review or crediting scientific software, although emerging publication strategies such software journals are attempting to address the need. We are piloting an integrated approach in which authors are identified and credited as scientific software is developed and run. Successful software citation requires integration with the scholarly publication and indexing mechanisms as well, to assign credit, ensure discoverability, and provide provenance for software.

  5. MSD-MAP: A Network-Based Systems Biology Platform for Predicting Disease-Metabolite Links.

    PubMed

    Wathieu, Henri; Issa, Naiem T; Mohandoss, Manisha; Byers, Stephen W; Dakshanamurthy, Sivanesan

    2017-01-01

    Cancer-associated metabolites result from cell-wide mechanisms of dysregulation. The field of metabolomics has sought to identify these aberrant metabolites as disease biomarkers, clues to understanding disease mechanisms, or even as therapeutic agents. This study was undertaken to reliably predict metabolites associated with colorectal, esophageal, and prostate cancers. Metabolite and disease biological action networks were compared in a computational platform called MSD-MAP (Multi Scale Disease-Metabolite Association Platform). Using differential gene expression analysis with patient-based RNAseq data from The Cancer Genome Atlas, genes up- or down-regulated in cancer compared to normal tissue were identified. Relational databases were used to map biological entities including pathways, functions, and interacting proteins, to those differential disease genes. Similar relational maps were built for metabolites, stemming from known and in silico predicted metabolite-protein associations. The hypergeometric test was used to find statistically significant relationships between disease and metabolite biological signatures at each tier, and metabolites were assessed for multi-scale association with each cancer. Metabolite networks were also directly associated with various other diseases using a disease functional perturbation database. Our platform recapitulated metabolite-disease links that have been empirically verified in the scientific literature, with network-based mapping of jointly-associated biological activity also matching known disease mechanisms. This was true for colorectal, esophageal, and prostate cancers, using metabolite action networks stemming from both predicted and known functional protein associations. By employing systems biology concepts, MSD-MAP reliably predicted known cancermetabolite links, and may serve as a predictive tool to streamline conventional metabolomic profiling methodologies. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.

  6. Support for Taverna workflows in the VPH-Share cloud platform.

    PubMed

    Kasztelnik, Marek; Coto, Ernesto; Bubak, Marian; Malawski, Maciej; Nowakowski, Piotr; Arenas, Juan; Saglimbeni, Alfredo; Testi, Debora; Frangi, Alejandro F

    2017-07-01

    To address the increasing need for collaborative endeavours within the Virtual Physiological Human (VPH) community, the VPH-Share collaborative cloud platform allows researchers to expose and share sequences of complex biomedical processing tasks in the form of computational workflows. The Taverna Workflow System is a very popular tool for orchestrating complex biomedical & bioinformatics processing tasks in the VPH community. This paper describes the VPH-Share components that support the building and execution of Taverna workflows, and explains how they interact with other VPH-Share components to improve the capabilities of the VPH-Share platform. Taverna workflow support is delivered by the Atmosphere cloud management platform and the VPH-Share Taverna plugin. These components are explained in detail, along with the two main procedures that were developed to enable this seamless integration: workflow composition and execution. 1) Seamless integration of VPH-Share with other components and systems. 2) Extended range of different tools for workflows. 3) Successful integration of scientific workflows from other VPH projects. 4) Execution speed improvement for medical applications. The presented workflow integration provides VPH-Share users with a wide range of different possibilities to compose and execute workflows, such as desktop or online composition, online batch execution, multithreading, remote execution, etc. The specific advantages of each supported tool are presented, as are the roles of Atmosphere and the VPH-Share plugin within the VPH-Share project. The combination of the VPH-Share plugin and Atmosphere engenders the VPH-Share infrastructure with far more flexible, powerful and usable capabilities for the VPH-Share community. As both components can continue to evolve and improve independently, we acknowledge that further improvements are still to be developed and will be described. Copyright © 2017 Elsevier B.V. All rights reserved.

  7. Advocating for responsible oil and natural gas extraction policies; FracTracker as a mechanism for overcoming the barriers to scientific advocacy for academics and communities

    NASA Astrophysics Data System (ADS)

    Ferrar, K. J.; Malone, S.; Kelso, M.; Lenker, B.

    2013-12-01

    The inability to translate data to scientific information that can readily be incorporated by citizens into the public arena is an obstacle for science-based advocacy. This issue is particularly poignant for shale oil and natural gas development via hydraulic fracturing, as the issue has become highly politicized. Barriers to engaging in policy debate are different but highly related for community members and scientists. For citizens and interest groups, barriers including accessibility, public awareness and data presentation limit the motivation for community involvement in political interactions. To overcome such barriers, social researchers call for public engagement to move upstream and many call for a broad engagement of scientists in science-based advocacy. Furthermore surveys have shown that citizens, interest groups, and decision-makers share a broad desire for scientists to engage in environmental policy development. Regardless, scientists face a number of perceived barriers, with academics expressing the most resistance to overcoming the tension created by adherence to the scientific method and the need to engage with the broader society, described by Schneider (1990) as the 'double ethical bind'. For the scientific community the appeal of public dissemination of information beyond the scope of academic journals is limited for a number of reasons. Barriers include preservation of credibility, peer attitudes, training, and career trajectory. The result is a lack of translated information available to the public. This systematic analysis of the FracTracker platform provides an evaluation of where the features of the public engagement, GIS platform has been successful at overcoming these barriers to public dissemination, where the platform needs further development or is ill-suited to address these issues, and the development of FracTracker as an outlet for scientific researchers to engage with citizens. The analysis will also provide insight into what information the public is most interested in concerning hydraulic fracturing and unconventional oil and natural gas development. The evaluations in consideration of these core barriers to public engagement and science-based advocacy will provide insight into the future development of programs, as well as the improvement of the FracTracker platform.

  8. Report to the Institutional Computing Executive Group (ICEG) August 14, 2006

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Carnes, B

    We have delayed this report from its normal distribution schedule for two reasons. First, due to the coverage provided in the White Paper on Institutional Capability Computing Requirements distributed in August 2005, we felt a separate 2005 ICEG report would not be value added. Second, we wished to provide some specific information about the Peloton procurement and we have just now reached a point in the process where we can make some definitive statements. The Peloton procurement will result in an almost complete replacement of current M&IC systems. We have plans to retire MCR, iLX, and GPS. We will replacemore » them with new parallel and serial capacity systems based on the same node architecture in the new Peloton capability system named ATLAS. We are currently adding the first users to the Green Data Oasis, a large file system on the open network that will provide the institution with external collaboration data sharing. Only Thunder will remain from the current M&IC system list and it will be converted from Capability to Capacity. We are confident that we are entering a challenging yet rewarding new phase for the M&IC program. Institutional computing has been an essential component of our S&T investment strategy and has helped us achieve recognition in many scientific and technical forums. Through consistent institutional investments, M&IC has grown into a powerful unclassified computing resource that is being used across the Lab to push the limits of computing and its application to simulation science. With the addition of Peloton, the Laboratory will significantly increase the broad-based computing resources available to meet the ever-increasing demand for the large scale simulations indispensable to advancing all scientific disciplines. All Lab research efforts are bolstered through the long term development of mission driven scalable applications and platforms. The new systems will soon be fully utilized and will position Livermore to extend the outstanding science and technology breakthroughs the M&IC program has enabled to date.« less

  9. Integration of Panda Workload Management System with supercomputers

    NASA Astrophysics Data System (ADS)

    De, K.; Jha, S.; Klimentov, A.; Maeno, T.; Mashinistov, R.; Nilsson, P.; Novikov, A.; Oleynik, D.; Panitkin, S.; Poyda, A.; Read, K. F.; Ryabinkin, E.; Teslyuk, A.; Velikhov, V.; Wells, J. C.; Wenaus, T.

    2016-09-01

    The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe, and were recently credited for the discovery of a Higgs boson. ATLAS, one of the largest collaborations ever assembled in the sciences, is at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, the ATLAS experiment is relying on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses PanDA (Production and Data Analysis) Workload Management System for managing the workflow for all data processing on over 140 data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. While PanDA currently uses more than 250000 cores with a peak performance of 0.3+ petaFLOPS, next LHC data taking runs will require more resources than Grid computing can possibly provide. To alleviate these challenges, LHC experiments are engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. We will describe a project aimed at integration of PanDA WMS with supercomputers in United States, Europe and Russia (in particular with Titan supercomputer at Oak Ridge Leadership Computing Facility (OLCF), Supercomputer at the National Research Center "Kurchatov Institute", IT4 in Ostrava, and others). The current approach utilizes a modified PanDA pilot framework for job submission to the supercomputers batch queues and local data management, with light-weight MPI wrappers to run singlethreaded workloads in parallel on Titan's multi-core worker nodes. This implementation was tested with a variety of Monte-Carlo workloads on several supercomputing platforms. We will present our current accomplishments in running PanDA WMS at supercomputers and demonstrate our ability to use PanDA as a portal independent of the computing facility's infrastructure for High Energy and Nuclear Physics, as well as other data-intensive science applications, such as bioinformatics and astro-particle physics.

  10. The Effect of In-Service Training of Computer Science Teachers on Scratch Programming Language Skills Using an Electronic Learning Platform on Programming Skills and the Attitudes towards Teaching Programming

    ERIC Educational Resources Information Center

    Alkaria, Ahmed; Alhassan, Riyadh

    2017-01-01

    This study was conducted to examine the effect of in-service training of computer science teachers in Scratch language using an electronic learning platform on acquiring programming skills and attitudes towards teaching programming. The sample of this study consisted of 40 middle school computer science teachers. They were assigned into two…

  11. Accelerating the Pace of Protein Functional Annotation With Intel Xeon Phi Coprocessors.

    PubMed

    Feinstein, Wei P; Moreno, Juana; Jarrell, Mark; Brylinski, Michal

    2015-06-01

    Intel Xeon Phi is a new addition to the family of powerful parallel accelerators. The range of its potential applications in computationally driven research is broad; however, at present, the repository of scientific codes is still relatively limited. In this study, we describe the development and benchmarking of a parallel version of eFindSite, a structural bioinformatics algorithm for the prediction of ligand-binding sites in proteins. Implemented for the Intel Xeon Phi platform, the parallelization of the structure alignment portion of eFindSite using pragma-based OpenMP brings about the desired performance improvements, which scale well with the number of computing cores. Compared to a serial version, the parallel code runs 11.8 and 10.1 times faster on the CPU and the coprocessor, respectively; when both resources are utilized simultaneously, the speedup is 17.6. For example, ligand-binding predictions for 501 benchmarking proteins are completed in 2.1 hours on a single Stampede node equipped with the Intel Xeon Phi card compared to 3.1 hours without the accelerator and 36.8 hours required by a serial version. In addition to the satisfactory parallel performance, porting existing scientific codes to the Intel Xeon Phi architecture is relatively straightforward with a short development time due to the support of common parallel programming models by the coprocessor. The parallel version of eFindSite is freely available to the academic community at www.brylinski.org/efindsite.

  12. Physicists Get INSPIREd: INSPIRE Project and Grid Applications

    NASA Astrophysics Data System (ADS)

    Klem, Jukka; Iwaszkiewicz, Jan

    2011-12-01

    INSPIRE is the new high-energy physics scientific information system developed by CERN, DESY, Fermilab and SLAC. INSPIRE combines the curated and trusted contents of SPIRES database with Invenio digital library technology. INSPIRE contains the entire HEP literature with about one million records and in addition to becoming the reference HEP scientific information platform, it aims to provide new kinds of data mining services and metrics to assess the impact of articles and authors. Grid and cloud computing provide new opportunities to offer better services in areas that require large CPU and storage resources including document Optical Character Recognition (OCR) processing, full-text indexing of articles and improved metrics. D4Science-II is a European project that develops and operates an e-Infrastructure supporting Virtual Research Environments (VREs). It develops an enabling technology (gCube) which implements a mechanism for facilitating the interoperation of its e-Infrastructure with other autonomously running data e-Infrastructures. As a result, this creates the core of an e-Infrastructure ecosystem. INSPIRE is one of the e-Infrastructures participating in D4Science-II project. In the context of the D4Science-II project, the INSPIRE e-Infrastructure makes available some of its resources and services to other members of the resulting ecosystem. Moreover, it benefits from the ecosystem via a dedicated Virtual Organization giving access to an array of resources ranging from computing and storage resources of grid infrastructures to data and services.

  13. A sensor architecture for real-time, in situ measurement of overlake evaporation on the Laurentian Great Lakes

    NASA Astrophysics Data System (ADS)

    Kerkez, B.; Fries, K. J.; Gronewold, A.; Lenters, J. D.

    2014-12-01

    While overlake evaporation is a major component of the Great Lakes' water balance, our scientific understanding of the climatic drivers of evaporation and its effects on water levels is significantly impeded by limited data. Existing measurement methods, such as eddy covariance, are not easily implemented in offshore applications. As such, there are only a handful of sites making direct, overlake measurements of evaporation on the entire Great Lakes, where the lake surface area comprises nearly one third of the entire basin. Long-term forecasts of water levels are thus very uncertain, particularly relating to climatic forcing, which is known to be a major driver of evaporation. We present a novel sensor architecture which is deployed on buoys, both tethered and drifting, to provide real-time measurements of overlake evaporation across the Great Lakes. Our system is comprised of a hierarchy of low-power, cost-effective sensor nodes, which carry out on-board computations to estimate evaporation in real-time. An ultra-low power microcontroller samples a suite of sensors to compute evaporation based on the Bowen ratio energy budget approach. The readings are then transmitted via satellite modules to a cloud-based server infrastructure for real-time updated scientific analysis and forecasting. Initial assessment of our new satellite drifter platform indicates robust field performance, validating its use in ongoing efforts to deploy a large-scale evaporation observation network across the Great Lakes basin.

  14. Grid heterogeneity in in-silico experiments: an exploration of drug screening using DOCK on cloud environments.

    PubMed

    Yim, Wen-Wai; Chien, Shu; Kusumoto, Yasuyuki; Date, Susumu; Haga, Jason

    2010-01-01

    Large-scale in-silico screening is a necessary part of drug discovery and Grid computing is one answer to this demand. A disadvantage of using Grid computing is the heterogeneous computational environments characteristic of a Grid. In our study, we have found that for the molecular docking simulation program DOCK, different clusters within a Grid organization can yield inconsistent results. Because DOCK in-silico virtual screening (VS) is currently used to help select chemical compounds to test with in-vitro experiments, such differences have little effect on the validity of using virtual screening before subsequent steps in the drug discovery process. However, it is difficult to predict whether the accumulation of these discrepancies over sequentially repeated VS experiments will significantly alter the results if VS is used as the primary means for identifying potential drugs. Moreover, such discrepancies may be unacceptable for other applications requiring more stringent thresholds. This highlights the need for establishing a more complete solution to provide the best scientific accuracy when executing an application across Grids. One possible solution to platform heterogeneity in DOCK performance explored in our study involved the use of virtual machines as a layer of abstraction. This study investigated the feasibility and practicality of using virtual machine and recent cloud computing technologies in a biological research application. We examined the differences and variations of DOCK VS variables, across a Grid environment composed of different clusters, with and without virtualization. The uniform computer environment provided by virtual machines eliminated inconsistent DOCK VS results caused by heterogeneous clusters, however, the execution time for the DOCK VS increased. In our particular experiments, overhead costs were found to be an average of 41% and 2% in execution time for two different clusters, while the actual magnitudes of the execution time costs were minimal. Despite the increase in overhead, virtual clusters are an ideal solution for Grid heterogeneity. With greater development of virtual cluster technology in Grid environments, the problem of platform heterogeneity may be eliminated through virtualization, allowing greater usage of VS, and will benefit all Grid applications in general.

  15. Integrating Data Base into the Elementary School Science Program.

    ERIC Educational Resources Information Center

    Schlenker, Richard M.

    This document describes seven science activities that combine scientific principles and computers. The objectives for the activities are to show students how the computer can be used as a tool to store and arrange scientific data, provide students with experience using the computer as a tool to manage scientific data, and provide students with…

  16. Environmental Detectives--The Development of an Augmented Reality Platform for Environmental Simulations

    ERIC Educational Resources Information Center

    Klopfer, Eric; Squire, Kurt

    2008-01-01

    The form factors of handheld computers make them increasingly popular among K-12 educators. Although some compelling examples of educational software for handhelds exist, we believe that the potential of this platform are just being discovered. This paper reviews innovative applications for mobile computing for both education and entertainment…

  17. Beam Dynamics Simulation Platform and Studies of Beam Breakup in Dielectric Wakefield Structures

    NASA Astrophysics Data System (ADS)

    Schoessow, P.; Kanareykin, A.; Jing, C.; Kustov, A.; Altmark, A.; Gai, W.

    2010-11-01

    A particle-Green's function beam dynamics code (BBU-3000) to study beam breakup effects is incorporated into a parallel computing framework based on the Boinc software environment, and supports both task farming on a heterogeneous cluster and local grid computing. User access to the platform is through a web browser.

  18. Assessing the Decision Process towards Bring Your Own Device

    ERIC Educational Resources Information Center

    Koester, Richard F.

    2017-01-01

    Information technology continues to evolve to the point where mobile technologies--such as smart phones, tablets, and ultra-mobile computers have the embedded flexibility and power to be a ubiquitous platform to fulfill the entire user's computing needs. Mobile technology users view these platforms as adaptable enough to be the single solution for…

  19. Multivariate Gradient Analysis for Evaluating and Visualizing a Learning System Platform for Computer Programming

    ERIC Educational Resources Information Center

    Mather, Richard

    2015-01-01

    This paper explores the application of canonical gradient analysis to evaluate and visualize student performance and acceptance of a learning system platform. The subject of evaluation is a first year BSc module for computer programming. This uses "Ceebot," an animated and immersive game-like development environment. Multivariate…

  20. The community FabLab platform: applications and implications in biomedical engineering.

    PubMed

    Stephenson, Makeda K; Dow, Douglas E

    2014-01-01

    Skill development in science, technology, engineering and math (STEM) education present one of the most formidable challenges of modern society. The Community FabLab platform presents a viable solution. Each FabLab contains a suite of modern computer numerical control (CNC) equipment, electronics and computing hardware and design, programming, computer aided design (CAD) and computer aided machining (CAM) software. FabLabs are community and educational resources and open to the public. Development of STEM based workforce skills such as digital fabrication and advanced manufacturing can be enhanced using this platform. Particularly notable is the potential of the FabLab platform in STEM education. The active learning environment engages and supports a diversity of learners, while the iterative learning that is supported by the FabLab rapid prototyping platform facilitates depth of understanding, creativity, innovation and mastery. The product and project based learning that occurs in FabLabs develops in the student a personal sense of accomplishment, self-awareness, command of the material and technology. This helps build the interest and confidence necessary to excel in STEM and throughout life. Finally the introduction and use of relevant technologies at every stage of the education process ensures technical familiarity and a broad knowledge base needed for work in STEM based fields. Biomedical engineering education strives to cultivate broad technical adeptness, creativity, interdisciplinary thought, and an ability to form deep conceptual understanding of complex systems. The FabLab platform is well designed to enhance biomedical engineering education.

Top