grid workflow management: Topics by Science.gov

Sample records for grid workflow management

Kwf-Grid workflow management system for Earth science applications

NASA Astrophysics Data System (ADS)

Tran, V.; Hluchy, L.

2009-04-01

In this paper, we present workflow management tool for Earth science applications in EGEE. The workflow management tool was originally developed within K-wf Grid project for GT4 middleware and has many advanced features like semi-automatic workflow composition, user-friendly GUI for managing workflows, knowledge management. In EGEE, we are porting the workflow management tool to gLite middleware for Earth science applications K-wf Grid workflow management system was developed within "Knowledge-based Workflow System for Grid Applications" under the 6th Framework Programme. The workflow mangement system intended to - semi-automatically compose a workflow of Grid services, - execute the composed workflow application in a Grid computing environment, - monitor the performance of the Grid infrastructure and the Grid applications, - analyze the resulting monitoring information, - capture the knowledge that is contained in the information by means of intelligent agents, - and finally to reuse the joined knowledge gathered from all participating users in a collaborative way in order to efficiently construct workflows for new Grid applications. Kwf Grid workflow engines can support different types of jobs (e.g. GRAM job, web services) in a workflow. New class of gLite job has been added to the system, allows system to manage and execute gLite jobs in EGEE infrastructure. The GUI has been adapted to the requirements of EGEE users, new credential management servlet is added to portal. Porting K-wf Grid workflow management system to gLite would allow EGEE users to use the system and benefit from its avanced features. The system is primarly tested and evaluated with applications from ES clusters.
Towards Dynamic Authentication in the Grid — Secure and Mobile Business Workflows Using GSet

NASA Astrophysics Data System (ADS)

Mangler, Jürgen; Schikuta, Erich; Witzany, Christoph; Jorns, Oliver; Ul Haq, Irfan; Wanek, Helmut

Until now, the research community mainly focused on the technical aspects of Grid computing and neglected commercial issues. However, recently the community tends to accept that the success of the Grid is crucially based on commercial exploitation. In our vision Foster's and Kesselman's statement "The Grid is all about sharing." has to be extended by "... and making money out of it!". To allow for the realization of this vision the trust-worthyness of the underlying technology needs to be ensured. This can be achieved by the use of gSET (Gridified Secure Electronic Transaction) as a basic technology for trust management and secure accounting in the presented Grid based workflow. We present a framework, conceptually and technically, from the area of the Mobile-Grid, which justifies the Grid infrastructure as a viable platform to enable commercially successful business workflows.
Cyberinfrastructure for End-to-End Environmental Explorations

NASA Astrophysics Data System (ADS)

Merwade, V.; Kumar, S.; Song, C.; Zhao, L.; Govindaraju, R.; Niyogi, D.

2007-12-01

The design and implementation of a cyberinfrastructure for End-to-End Environmental Exploration (C4E4) is presented. The C4E4 framework addresses the need for an integrated data/computation platform for studying broad environmental impacts by combining heterogeneous data resources with state-of-the-art modeling and visualization tools. With Purdue being a TeraGrid Resource Provider, C4E4 builds on top of the Purdue TeraGrid data management system and Grid resources, and integrates them through a service-oriented workflow system. It allows researchers to construct environmental workflows for data discovery, access, transformation, modeling, and visualization. Using the C4E4 framework, we have implemented an end-to-end SWAT simulation and analysis workflow that connects our TeraGrid data and computation resources. It enables researchers to conduct comprehensive studies on the impact of land management practices in the St. Joseph watershed using data from various sources in hydrologic, atmospheric, agricultural, and other related disciplines.
gProcess and ESIP Platforms for Satellite Imagery Processing over the Grid

NASA Astrophysics Data System (ADS)

Bacu, Victor; Gorgan, Dorian; Rodila, Denisa; Pop, Florin; Neagu, Gabriel; Petcu, Dana

2010-05-01

The Environment oriented Satellite Data Processing Platform (ESIP) is developed through the SEE-GRID-SCI (SEE-GRID eInfrastructure for regional eScience) co-funded by the European Commission through FP7 [1]. The gProcess Platform [2] is a set of tools and services supporting the development and the execution over the Grid of the workflow based processing, and particularly the satelite imagery processing. The ESIP [3], [4] is build on top of the gProcess platform by adding a set of satellite image processing software modules and meteorological algorithms. The satellite images can reveal and supply important information on earth surface parameters, climate data, pollution level, weather conditions that can be used in different research areas. Generally, the processing algorithms of the satellite images can be decomposed in a set of modules that forms a graph representation of the processing workflow. Two types of workflows can be defined in the gProcess platform: abstract workflow (PDG - Process Description Graph), in which the user defines conceptually the algorithm, and instantiated workflow (iPDG - instantiated PDG), which is the mapping of the PDG pattern on particular satellite image and meteorological data [5]. The gProcess platform allows the definition of complex workflows by combining data resources, operators, services and sub-graphs. The gProcess platform is developed for the gLite middleware that is available in EGEE and SEE-GRID infrastructures [6]. gProcess exposes the specific functionality through web services [7]. The Editor Web Service retrieves information on available resources that are used to develop complex workflows (available operators, sub-graphs, services, supported resources, etc.). The Manager Web Service deals with resources management (uploading new resources such as workflows, operators, services, data, etc.) and in addition retrieves information on workflows. The Executor Web Service manages the execution of the instantiated workflows on the Grid infrastructure. In addition, this web service monitors the execution and generates statistical data that are important to evaluate performances and to optimize execution. The Viewer Web Service allows access to input and output data. To prove and to validate the utility of the gProcess and ESIP platforms there were developed the GreenView and GreenLand applications. The GreenView related functionality includes the refinement of some meteorological data such as temperature, and the calibration of the satellite images based on field measurements. The GreenLand application performs the classification of the satellite images by using a set of vegetation indices. The gProcess and ESIP platforms are used as well in GiSHEO project [8] to support the processing of Earth Observation data over the Grid in eGLE (GiSHEO eLearning Environment). Experiments of performance assessment were conducted and they have revealed that the workflow-based execution could improve the execution time of a satellite image processing algorithm [9]. It is not a reliable solution to execute all the workflow nodes on different machines. The execution of some nodes can be more time consuming and they will be performed in a longer time than other nodes. The total execution time will be affected because some nodes will slow down the execution. It is important to correctly balance the workflow nodes. Based on some optimization strategy the workflow nodes can be grouped horizontally, vertically or in a hybrid approach. In this way, those operators will be executed on one machine and also the data transfer between workflow nodes will be lower. The dynamic nature of the Grid infrastructure makes it more exposed to the occurrence of failures. These failures can occur at worker node, services availability, storage element, etc. Currently gProcess has support for some basic error prevention and error management solutions. In future, some more advanced error prevention and management solutions will be integrated in the gProcess platform. References [1] SEE-GRID-SCI Project, http://www.see-grid-sci.eu/ [2] Bacu V., Stefanut T., Rodila D., Gorgan D., Process Description Graph Composition by gProcess Platform. HiPerGRID - 3rd International Workshop on High Performance Grid Middleware, 28 May, Bucharest. Proceedings of CSCS-17 Conference, Vol.2., ISSN 2066-4451, pp. 423-430, (2009). [3] ESIP Platform, http://wiki.egee-see.org/index.php/JRA1_Commonalities [4] Gorgan D., Bacu V., Rodila D., Pop Fl., Petcu D., Experiments on ESIP - Environment oriented Satellite Data Processing Platform. SEE-GRID-SCI User Forum, 9-10 Dec 2009, Bogazici University, Istanbul, Turkey, ISBN: 978-975-403-510-0, pp. 157-166 (2009). [5] Radu, A., Bacu, V., Gorgan, D., Diagrammatic Description of Satellite Image Processing Workflow. Workshop on Grid Computing Applications Development (GridCAD) at the SYNASC Symposium, 28 September 2007, Timisoara, IEEE Computer Press, ISBN 0-7695-3078-8, 2007, pp. 341-348 (2007). [6] Gorgan D., Bacu V., Stefanut T., Rodila D., Mihon D., Grid based Satellite Image Processing Platform for Earth Observation Applications Development. IDAACS'2009 - IEEE Fifth International Workshop on "Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications", 21-23 September, Cosenza, Italy, IEEE Published in Computer Press, 247-252 (2009). [7] Rodila D., Bacu V., Gorgan D., Integration of Satellite Image Operators as Workflows in the gProcess Application. Proceedings of ICCP2009 - IEEE 5th International Conference on Intelligent Computer Communication and Processing, 27-29 Aug, 2009 Cluj-Napoca. ISBN: 978-1-4244-5007-7, pp. 355-358 (2009). [8] GiSHEO consortium, Project site, http://gisheo.info.uvt.ro [9] Bacu V., Gorgan D., Graph Based Evaluation of Satellite Imagery Processing over Grid. ISPDC 2008 - 7th International Symposium on Parallel and Distributed Computing, July 1-5, 2008, Krakow, Poland. IEEE Computer Society 2008, ISBN: 978-0-7695-3472-5, pp. 147-154.
Pervasive access to images and data--the use of computing grids and mobile/wireless devices across healthcare enterprises.

PubMed

Pohjonen, Hanna; Ross, Peeter; Blickman, Johan G; Kamman, Richard

2007-01-01

Emerging technologies are transforming the workflows in healthcare enterprises. Computing grids and handheld mobile/wireless devices are providing clinicians with enterprise-wide access to all patient data and analysis tools on a pervasive basis. In this paper, emerging technologies are presented that provide computing grids and streaming-based access to image and data management functions, and system architectures that enable pervasive computing on a cost-effective basis. Finally, the implications of such technologies are investigated regarding the positive impacts on clinical workflows.
IceProd 2 Usage Experience

NASA Astrophysics Data System (ADS)

Delventhal, D.; Schultz, D.; Diaz Velez, J. C.

2017-10-01

IceProd is a data processing and management framework developed by the IceCube Neutrino Observatory for processing of Monte Carlo simulations, detector data, and data driven analysis. It runs as a separate layer on top of grid and batch systems. This is accomplished by a set of daemons which process job workflow, maintaining configuration and status information on the job before, during, and after processing. IceProd can also manage complex workflow DAGs across distributed computing grids in order to optimize usage of resources. IceProd has recently been rewritten to increase its scaling capabilities, handle user analysis workflows together with simulation production, and facilitate the integration with 3rd party scheduling tools. IceProd 2, the second generation of IceProd, has been running in production for several months now. We share our experience setting up the system and things we’ve learned along the way.
ScyFlow: An Environment for the Visual Specification and Execution of Scientific Workflows

NASA Technical Reports Server (NTRS)

McCann, Karen M.; Yarrow, Maurice; DeVivo, Adrian; Mehrotra, Piyush

2004-01-01

With the advent of grid technologies, scientists and engineers are building more and more complex applications to utilize distributed grid resources. The core grid services provide a path for accessing and utilizing these resources in a secure and seamless fashion. However what the scientists need is an environment that will allow them to specify their application runs at a high organizational level, and then support efficient execution across any given set or sets of resources. We have been designing and implementing ScyFlow, a dual-interface architecture (both GUT and APT) that addresses this problem. The scientist/user specifies the application tasks along with the necessary control and data flow, and monitors and manages the execution of the resulting workflow across the distributed resources. In this paper, we utilize two scenarios to provide the details of the two modules of the project, the visual editor and the runtime workflow engine.
Grid infrastructure for automatic processing of SAR data for flood applications

NASA Astrophysics Data System (ADS)

Kussul, Natalia; Skakun, Serhiy; Shelestov, Andrii

2010-05-01

More and more geosciences applications are being put on to the Grids. Due to the complexity of geosciences applications that is caused by complex workflow, the use of computationally intensive environmental models, the need of management and integration of heterogeneous data sets, Grid offers solutions to tackle these problems. Many geosciences applications, especially those related to the disaster management and mitigations require the geospatial services to be delivered in proper time. For example, information on flooded areas should be provided to corresponding organizations (local authorities, civil protection agencies, UN agencies etc.) no more than in 24 h to be able to effectively allocate resources required to mitigate the disaster. Therefore, providing infrastructure and services that will enable automatic generation of products based on the integration of heterogeneous data represents the tasks of great importance. In this paper we present Grid infrastructure for automatic processing of synthetic-aperture radar (SAR) satellite images to derive flood products. In particular, we use SAR data acquired by ESA's ENVSAT satellite, and neural networks to derive flood extent. The data are provided in operational mode from ESA rolling archive (within ESA Category-1 grant). We developed a portal that is based on OpenLayers frameworks and provides access point to the developed services. Through the portal the user can define geographical region and search for the required data. Upon selection of data sets a workflow is automatically generated and executed on the resources of Grid infrastructure. For workflow execution and management we use Karajan language. The workflow of SAR data processing consists of the following steps: image calibration, image orthorectification, image processing with neural networks, topographic effects removal, geocoding and transformation to lat/long projection, and visualisation. These steps are executed by different software, and can be executed by different resources of the Grid system. The resulting geospatial services are available in various OGC standards such as KML and WMS. Currently, the Grid infrastructure integrates the resources of several geographically distributed organizations, in particular: Space Research Institute NASU-NSAU (Ukraine) with deployed computational and storage nodes based on Globus Toolkit 4 (htpp://www.globus.org) and gLite 3 (http://glite.web.cern.ch) middleware, access to geospatial data and a Grid portal; Institute of Cybernetics of NASU (Ukraine) with deployed computational and storage nodes (SCIT-1/2/3 clusters) based on Globus Toolkit 4 middleware and access to computational resources (approximately 500 processors); Center of Earth Observation and Digital Earth Chinese Academy of Sciences (CEODE-CAS, China) with deployed computational nodes based on Globus Toolkit 4 middleware and access to geospatial data (approximately 16 processors). We are currently adding new geospatial services based on optical satellite data, namely MODIS. This work is carried out jointly with the CEODE-CAS. Using workflow patterns that were developed for SAR data processing we are building new workflows for optical data processing.
SigWin-detector: a Grid-enabled workflow for discovering enriched windows of genomic features related to DNA sequences.

PubMed

Inda, Márcia A; van Batenburg, Marinus F; Roos, Marco; Belloum, Adam S Z; Vasunin, Dmitry; Wibisono, Adianto; van Kampen, Antoine H C; Breit, Timo M

2008-08-08

Chromosome location is often used as a scaffold to organize genomic information in both the living cell and molecular biological research. Thus, ever-increasing amounts of data about genomic features are stored in public databases and can be readily visualized by genome browsers. To perform in silico experimentation conveniently with this genomics data, biologists need tools to process and compare datasets routinely and explore the obtained results interactively. The complexity of such experimentation requires these tools to be based on an e-Science approach, hence generic, modular, and reusable. A virtual laboratory environment with workflows, workflow management systems, and Grid computation are therefore essential. Here we apply an e-Science approach to develop SigWin-detector, a workflow-based tool that can detect significantly enriched windows of (genomic) features in a (DNA) sequence in a fast and reproducible way. For proof-of-principle, we utilize a biological use case to detect regions of increased and decreased gene expression (RIDGEs and anti-RIDGEs) in human transcriptome maps. We improved the original method for RIDGE detection by replacing the costly step of estimation by random sampling with a faster analytical formula for computing the distribution of the null hypothesis being tested and by developing a new algorithm for computing moving medians. SigWin-detector was developed using the WS-VLAM workflow management system and consists of several reusable modules that are linked together in a basic workflow. The configuration of this basic workflow can be adapted to satisfy the requirements of the specific in silico experiment. As we show with the results from analyses in the biological use case on RIDGEs, SigWin-detector is an efficient and reusable Grid-based tool for discovering windows enriched for features of a particular type in any sequence of values. Thus, SigWin-detector provides the proof-of-principle for the modular e-Science based concept of integrative bioinformatics experimentation.
Grid-based platform for training in Earth Observation

NASA Astrophysics Data System (ADS)

Petcu, Dana; Zaharie, Daniela; Panica, Silviu; Frincu, Marc; Neagul, Marian; Gorgan, Dorian; Stefanut, Teodor

2010-05-01

GiSHEO platform [1] providing on-demand services for training and high education in Earth Observation is developed, in the frame of an ESA funded project through its PECS programme, to respond to the needs of powerful education resources in remote sensing field. It intends to be a Grid-based platform of which potential for experimentation and extensibility are the key benefits compared with a desktop software solution. Near-real time applications requiring simultaneous multiple short-time-response data-intensive tasks, as in the case of a short time training event, are the ones that are proved to be ideal for this platform. The platform is based on Globus Toolkit 4 facilities for security and process management, and on the clusters of four academic institutions involved in the project. The authorization uses a VOMS service. The main public services are the followings: the EO processing services (represented through special WSRF-type services); the workflow service exposing a particular workflow engine; the data indexing and discovery service for accessing the data management mechanisms; the processing services, a collection allowing easy access to the processing platform. The WSRF-type services for basic satellite image processing are reusing free image processing tools, OpenCV and GDAL. New algorithms and workflows were develop to tackle with challenging problems like detecting the underground remains of old fortifications, walls or houses. More details can be found in [2]. Composed services can be specified through workflows and are easy to be deployed. The workflow engine, OSyRIS (Orchestration System using a Rule based Inference Solution), is based on DROOLS, and a new rule-based workflow language, SILK (SImple Language for worKflow), has been built. Workflow creation in SILK can be done with or without a visual designing tools. The basics of SILK are the tasks and relations (rules) between them. It is similar with the SCUFL language, but not relying on XML in order to allow the introduction of more workflow specific issues. Moreover, an event-condition-action (ECA) approach allows a greater flexibility when expressing data and task dependencies, as well as the creation of adaptive workflows which can react to changes in the configuration of the Grid or in the workflow itself. Changes inside the grid are handled by creating specific rules which allow resource selection based on various task scheduling criteria. Modifications of the workflow are usually accomplished either by inserting or retracting at runtime rules belonging to it or by modifying the executor of the task in case a better one is found. The former implies changes in its structure while the latter does not necessarily mean changes of the resource but more precisely changes of the algorithm used for solving the task. More details can be found in [3]. Another important platform component is the data indexing and storage service, GDIS, providing features for data storage, indexing data using a specialized RDBMS, finding data by various conditions, querying external services and keeping track of temporary data generated by other components. The data storage component part of GDIS is responsible for storing the data by using available storage backends such as local disk file systems (ext3), local cluster storage (GFS) or distributed file systems (HDFS). A front-end GridFTP service is capable of interacting with the storage domains on behalf of the clients and in a uniform way and also enforces the security restrictions provided by other specialized services and related with data access. The data indexing is performed by PostGIS. An advanced and flexible interface for searching the project's geographical repository is built around a custom query language (LLQL - Lisp Like Query Language) designed to provide fine grained access to the data in the repository and to query external services (e.g. for exploiting the connection with GENESI-DR catalog). More details can be found in [4]. The Workload Management System (WMS) provides two types of resource managers. The first one will be based on Condor HTC and use Condor as a job manager for task dispatching and working nodes (for development purposes) while the second one will use GT4 GRAM (for production purposes). The WMS main component, the Grid Task Dispatcher (GTD), is responsible for the interaction with other internal services as the composition engine in order to facilitate access to the processing platform. Its main responsibilities are to receive tasks from the workflow engine or directly from user interface, to use a task description language (the ClassAd meta language in case of Condor HTC) for job units, to submit and check the status of jobs inside the workload management system and to retrieve job logs for debugging purposes. More details can be found in [4]. A particular component of the platform is eGLE, the eLearning environment. It provides the functionalities necessary to create the visual appearance of the lessons through the usage of visual containers like tools, patterns and templates. The teacher uses the platform for testing the already created lessons, as well as for developing new lesson resources, such as new images and workflows describing graph-based processing. The students execute the lessons or describe and experiment with new workflows or different data. The eGLE database includes several workflow-based lesson descriptions, teaching materials and lesson resources, selected satellite and spatial data. More details can be found in [5]. A first training event of using the platform was organized in September 2009 during 11th SYNASC symposium (links to the demos, testing interface, and exercises are available on project site [1]). The eGLE component was presented at 4th GPC conference in May 2009. Moreover, the functionality of the platform will be presented as demo in April 2010 at 5th EGEE User forum. References: [1] GiSHEO consortium, Project site, http://gisheo.info.uvt.ro [2] D. Petcu, D. Zaharie, M. Neagul, S. Panica, M. Frincu, D. Gorgan, T. Stefanut, V. Bacu, Remote Sensed Image Processing on Grids for Training in Earth Observation. In Image Processing, V. Kordic (ed.), In-Tech, January 2010. [3] M. Neagul, S. Panica, D. Petcu, D. Zaharie, D. Gorgan, Web and Grid Services for Training in Earth Observation, IDAACS 2009, IEEE Computer Press, 241-246 [4] M. Frincu, S. Panica, M. Neagul, D. Petcu, Gisheo: On Demand Grid Service Based Platform for EO Data Processing. HiperGrid 2009, Politehnica Press, 415-422. [5] D. Gorgan, T. Stefanut, V. Bacu, Grid Based Training Environment for Earth Observation, GPC 2009, LNCS 5529, 98-109
Pegasus Workflow Management System: Helping Applications From Earth and Space

NASA Astrophysics Data System (ADS)

Mehta, G.; Deelman, E.; Vahi, K.; Silva, F.

2010-12-01

Pegasus WMS is a Workflow Management System that can manage large-scale scientific workflows across Grid, local and Cloud resources simultaneously. Pegasus WMS provides a means for representing the workflow of an application in an abstract XML form, agnostic of the resources available to run it and the location of data and executables. It then compiles these workflows into concrete plans by querying catalogs and farming computations across local and distributed computing resources, as well as emerging commercial and community cloud environments in an easy and reliable manner. Pegasus WMS optimizes the execution as well as data movement by leveraging existing Grid and cloud technologies via a flexible pluggable interface and provides advanced features like reusing existing data, automatic cleanup of generated data, and recursive workflows with deferred planning. It also captures all the provenance of the workflow from the planning stage to the execution of the generated data, helping scientists to accurately measure performance metrics of their workflow as well as data reproducibility issues. Pegasus WMS was initially developed as part of the GriPhyN project to support large-scale high-energy physics and astrophysics experiments. Direct funding from the NSF enabled support for a wide variety of applications from diverse domains including earthquake simulation, bacterial RNA studies, helioseismology and ocean modeling. Earthquake Simulation: Pegasus WMS was recently used in a large scale production run in 2009 by the Southern California Earthquake Centre to run 192 million loosely coupled tasks and about 2000 tightly coupled MPI style tasks on National Cyber infrastructure for generating a probabilistic seismic hazard map of the Southern California region. SCEC ran 223 workflows over a period of eight weeks, using on average 4,420 cores, with a peak of 14,540 cores. A total of 192 million files were produced totaling about 165TB out of which 11TB of data was saved. Astrophysics: The Laser Interferometer Gravitational-Wave Observatory (LIGO) uses Pegasus WMS to search for binary inspiral gravitational waves. A month of LIGO data requires many thousands of jobs, running for days on hundreds of CPUs on the LIGO Data Grid (LDG) and Open Science Grid (OSG). Ocean Temperature Forecast: Researchers at the Jet Propulsion Laboratory are exploring Pegasus WMS to run ocean forecast ensembles of the California coastal region. These models produce a number of daily forecasts for water temperature, salinity, and other measures. Helioseismology: The Solar Dynamics Observatory (SDO) is NASA's most important solar physics mission of this coming decade. Pegasus WMS is being used to analyze the data from SDO, which will be predominantly used to learn about solar magnetic activity and to probe the internal structure and dynamics of the Sun with helioseismology. Bacterial RNA studies: SIPHT is an application in bacterial genomics, which predicts sRNA (small non-coding RNAs)-encoding genes in bacteria. This project currently provides a web-based interface using Pegasus WMS at the backend to facilitate large-scale execution of the workflows on varied resources and provide better notifications of task/workflow completion.
Grid workflow validation using ontology-based tacit knowledge: A case study for quantitative remote sensing applications

NASA Astrophysics Data System (ADS)

Liu, Jia; Liu, Longli; Xue, Yong; Dong, Jing; Hu, Yingcui; Hill, Richard; Guang, Jie; Li, Chi

2017-01-01

Workflow for remote sensing quantitative retrieval is the ;bridge; between Grid services and Grid-enabled application of remote sensing quantitative retrieval. Workflow averts low-level implementation details of the Grid and hence enables users to focus on higher levels of application. The workflow for remote sensing quantitative retrieval plays an important role in remote sensing Grid and Cloud computing services, which can support the modelling, construction and implementation of large-scale complicated applications of remote sensing science. The validation of workflow is important in order to support the large-scale sophisticated scientific computation processes with enhanced performance and to minimize potential waste of time and resources. To research the semantic correctness of user-defined workflows, in this paper, we propose a workflow validation method based on tacit knowledge research in the remote sensing domain. We first discuss the remote sensing model and metadata. Through detailed analysis, we then discuss the method of extracting the domain tacit knowledge and expressing the knowledge with ontology. Additionally, we construct the domain ontology with Protégé. Through our experimental study, we verify the validity of this method in two ways, namely data source consistency error validation and parameters matching error validation.
A high throughput geocomputing system for remote sensing quantitative retrieval and a case study

NASA Astrophysics Data System (ADS)

Xue, Yong; Chen, Ziqiang; Xu, Hui; Ai, Jianwen; Jiang, Shuzheng; Li, Yingjie; Wang, Ying; Guang, Jie; Mei, Linlu; Jiao, Xijuan; He, Xingwei; Hou, Tingting

2011-12-01

The quality and accuracy of remote sensing instruments have been improved significantly, however, rapid processing of large-scale remote sensing data becomes the bottleneck for remote sensing quantitative retrieval applications. The remote sensing quantitative retrieval is a data-intensive computation application, which is one of the research issues of high throughput computation. The remote sensing quantitative retrieval Grid workflow is a high-level core component of remote sensing Grid, which is used to support the modeling, reconstruction and implementation of large-scale complex applications of remote sensing science. In this paper, we intend to study middleware components of the remote sensing Grid - the dynamic Grid workflow based on the remote sensing quantitative retrieval application on Grid platform. We designed a novel architecture for the remote sensing Grid workflow. According to this architecture, we constructed the Remote Sensing Information Service Grid Node (RSSN) with Condor. We developed a graphic user interface (GUI) tools to compose remote sensing processing Grid workflows, and took the aerosol optical depth (AOD) retrieval as an example. The case study showed that significant improvement in the system performance could be achieved with this implementation. The results also give a perspective on the potential of applying Grid workflow practices to remote sensing quantitative retrieval problems using commodity class PCs.
Integrating Clinical Trial Imaging Data Resources Using Service-Oriented Architecture and Grid Computing

PubMed Central

Cladé, Thierry; Snyder, Joshua C.

2010-01-01

Clinical trials which use imaging typically require data management and workflow integration across several parties. We identify opportunities for all parties involved to realize benefits with a modular interoperability model based on service-oriented architecture and grid computing principles. We discuss middleware products for implementation of this model, and propose caGrid as an ideal candidate due to its healthcare focus; free, open source license; and mature developer tools and support. PMID:20449775
IceProd 2: A Next Generation Data Analysis Framework for the IceCube Neutrino Observatory

NASA Astrophysics Data System (ADS)

Schultz, D.

2015-12-01

We describe the overall structure and new features of the second generation of IceProd, a data processing and management framework. IceProd was developed by the IceCube Neutrino Observatory for processing of Monte Carlo simulations, detector data, and analysis levels. It runs as a separate layer on top of grid and batch systems. This is accomplished by a set of daemons which process job workflow, maintaining configuration and status information on the job before, during, and after processing. IceProd can also manage complex workflow DAGs across distributed computing grids in order to optimize usage of resources. IceProd is designed to be very light-weight; it runs as a python application fully in user space and can be set up easily. For the initial completion of this second version of IceProd, improvements have been made to increase security, reliability, scalability, and ease of use.
Web service module for access to g-Lite

NASA Astrophysics Data System (ADS)

Goranova, R.; Goranov, G.

2012-10-01

G-Lite is a lightweight grid middleware for grid computing installed on all clusters of the European Grid Infrastructure (EGI). The middleware is partially service-oriented and does not provide well-defined Web services for job management. The existing Web services in the environment cannot be directly used by grid users for building service compositions in the EGI. In this article we present a module of well-defined Web services for job management in the EGI. We describe the architecture of the module and the design of the developed Web services. The presented Web services are composable and can participate in service compositions (workflows). An example of usage of the module with tools for service compositions in g-Lite is shown.
A virtual data language and system for scientific workflow management in data grid environments

NASA Astrophysics Data System (ADS)

Zhao, Yong

With advances in scientific instrumentation and simulation, scientific data is growing fast in both size and analysis complexity. So-called Data Grids aim to provide high performance, distributed data analysis infrastructure for data- intensive sciences, where scientists distributed worldwide need to extract information from large collections of data, and to share both data products and the resources needed to produce and store them. However, the description, composition, and execution of even logically simple scientific workflows are often complicated by the need to deal with "messy" issues like heterogeneous storage formats and ad-hoc file system structures. We show how these difficulties can be overcome via a typed workflow notation called virtual data language, within which issues of physical representation are cleanly separated from logical typing, and by the implementation of this notation within the context of a powerful virtual data system that supports distributed execution. The resulting language and system are capable of expressing complex workflows in a simple compact form, enacting those workflows in distributed environments, monitoring and recording the execution processes, and tracing the derivation history of data products. We describe the motivation, design, implementation, and evaluation of the virtual data language and system, and the application of the virtual data paradigm in various science disciplines, including astronomy, cognitive neuroscience.
Cloud Bursting with GlideinWMS: Means to satisfy ever increasing computing needs for Scientific Workflows

NASA Astrophysics Data System (ADS)

Mhashilkar, Parag; Tiradani, Anthony; Holzman, Burt; Larson, Krista; Sfiligoi, Igor; Rynge, Mats

2014-06-01

Scientific communities have been in the forefront of adopting new technologies and methodologies in the computing. Scientific computing has influenced how science is done today, achieving breakthroughs that were impossible to achieve several decades ago. For the past decade several such communities in the Open Science Grid (OSG) and the European Grid Infrastructure (EGI) have been using GlideinWMS to run complex application workflows to effectively share computational resources over the grid. GlideinWMS is a pilot-based workload management system (WMS) that creates on demand, a dynamically sized overlay HTCondor batch system on grid resources. At present, the computational resources shared over the grid are just adequate to sustain the computing needs. We envision that the complexity of the science driven by "Big Data" will further push the need for computational resources. To fulfill their increasing demands and/or to run specialized workflows, some of the big communities like CMS are investigating the use of cloud computing as Infrastructure-As-A-Service (IAAS) with GlideinWMS as a potential alternative to fill the void. Similarly, communities with no previous access to computing resources can use GlideinWMS to setup up a batch system on the cloud infrastructure. To enable this, the architecture of GlideinWMS has been extended to enable support for interfacing GlideinWMS with different Scientific and commercial cloud providers like HLT, FutureGrid, FermiCloud and Amazon EC2. In this paper, we describe a solution for cloud bursting with GlideinWMS. The paper describes the approach, architectural changes and lessons learned while enabling support for cloud infrastructures in GlideinWMS.
Cloud Bursting with GlideinWMS: Means to satisfy ever increasing computing needs for Scientific Workflows

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mhashilkar, Parag; Tiradani, Anthony; Holzman, Burt

Scientific communities have been in the forefront of adopting new technologies and methodologies in the computing. Scientific computing has influenced how science is done today, achieving breakthroughs that were impossible to achieve several decades ago. For the past decade several such communities in the Open Science Grid (OSG) and the European Grid Infrastructure (EGI) have been using GlideinWMS to run complex application workflows to effectively share computational resources over the grid. GlideinWMS is a pilot-based workload management system (WMS) that creates on demand, a dynamically sized overlay HTCondor batch system on grid resources. At present, the computational resources shared overmore » the grid are just adequate to sustain the computing needs. We envision that the complexity of the science driven by 'Big Data' will further push the need for computational resources. To fulfill their increasing demands and/or to run specialized workflows, some of the big communities like CMS are investigating the use of cloud computing as Infrastructure-As-A-Service (IAAS) with GlideinWMS as a potential alternative to fill the void. Similarly, communities with no previous access to computing resources can use GlideinWMS to setup up a batch system on the cloud infrastructure. To enable this, the architecture of GlideinWMS has been extended to enable support for interfacing GlideinWMS with different Scientific and commercial cloud providers like HLT, FutureGrid, FermiCloud and Amazon EC2. In this paper, we describe a solution for cloud bursting with GlideinWMS. The paper describes the approach, architectural changes and lessons learned while enabling support for cloud infrastructures in GlideinWMS.« less
From the desktop to the grid: scalable bioinformatics via workflow conversion.

PubMed

de la Garza, Luis; Veit, Johannes; Szolek, Andras; Röttig, Marc; Aiche, Stephan; Gesing, Sandra; Reinert, Knut; Kohlbacher, Oliver

2016-03-12

Reproducibility is one of the tenets of the scientific method. Scientific experiments often comprise complex data flows, selection of adequate parameters, and analysis and visualization of intermediate and end results. Breaking down the complexity of such experiments into the joint collaboration of small, repeatable, well defined tasks, each with well defined inputs, parameters, and outputs, offers the immediate benefit of identifying bottlenecks, pinpoint sections which could benefit from parallelization, among others. Workflows rest upon the notion of splitting complex work into the joint effort of several manageable tasks. There are several engines that give users the ability to design and execute workflows. Each engine was created to address certain problems of a specific community, therefore each one has its advantages and shortcomings. Furthermore, not all features of all workflow engines are royalty-free -an aspect that could potentially drive away members of the scientific community. We have developed a set of tools that enables the scientific community to benefit from workflow interoperability. We developed a platform-free structured representation of parameters, inputs, outputs of command-line tools in so-called Common Tool Descriptor documents. We have also overcome the shortcomings and combined the features of two royalty-free workflow engines with a substantial user community: the Konstanz Information Miner, an engine which we see as a formidable workflow editor, and the Grid and User Support Environment, a web-based framework able to interact with several high-performance computing resources. We have thus created a free and highly accessible way to design workflows on a desktop computer and execute them on high-performance computing resources. Our work will not only reduce time spent on designing scientific workflows, but also make executing workflows on remote high-performance computing resources more accessible to technically inexperienced users. We strongly believe that our efforts not only decrease the turnaround time to obtain scientific results but also have a positive impact on reproducibility, thus elevating the quality of obtained scientific results.

Digital Library Storage using iRODS Data Grids

NASA Astrophysics Data System (ADS)

Hedges, Mark; Blanke, Tobias; Hasan, Adil

Digital repository software provides a powerful and flexible infrastructure for managing and delivering complex digital resources and metadata. However, issues can arise in managing the very large, distributed data files that may constitute these resources. This paper describes an implementation approach that combines the Fedora digital repository software with a storage layer implemented as a data grid, using the iRODS middleware developed by DICE (Data Intensive Cyber Environments) as the successor to SRB. This approach allows us to use Fedoras flexible architecture to manage the structure of resources and to provide application- layer services to users. The grid-based storage layer provides efficient support for managing and processing the underlying distributed data objects, which may be very large (e.g. audio-visual material). The Rule Engine built into iRODS is used to integrate complex workflows at the data level that need not be visible to users, e.g. digital preservation functionality.
Potential of knowledge discovery using workflows implemented in the C3Grid

NASA Astrophysics Data System (ADS)

Engel, Thomas; Fink, Andreas; Ulbrich, Uwe; Schartner, Thomas; Dobler, Andreas; Fritzsch, Bernadette; Hiller, Wolfgang; Bräuer, Benny

2013-04-01

With the increasing number of climate simulations, reanalyses and observations, new infrastructures to search and analyse distributed data are necessary. In recent years, the Grid architecture became an important technology to fulfill these demands. For the German project "Collaborative Climate Community Data and Processing Grid" (C3Grid) computer scientists and meteorologists developed a system that offers its users a webinterface to search and download climate data and use implemented analysis tools (called workflows) to further investigate them. In this contribution, two workflows that are implemented in the C3Grid architecture are presented: the Cyclone Tracking (CT) and Stormtrack workflow. They shall serve as an example on how to perform numerous investigations on midlatitude winterstorms on a large amount of analysis and climate model data without having an insight into the data source, program code and a low-to-moderate understanding of the theortical background. CT is based on the work of Murray and Simmonds (1991) to identify and track local minima in the mean sea level pressure (MSLP) field of the selected dataset. Adjustable thresholds for the curvature of the isobars as well as the minimum lifetime of a cyclone allow the distinction of weak subtropical heat low systems and stronger midlatitude cyclones e.g. in the Northern Atlantic. The user gets the resulting track data including statistics about the track density, average central pressure, average central curvature, cyclogenesis and cyclolysis as well as pre-built visualizations of these results. Stormtrack calculates the 2.5-6 day bandpassfiltered standard deviation of the geopotential height on a selected pressure level. Although this workflow needs much less computational effort compared to CT it shows structures that are in good agreement with the track density of the CT workflow. To what extent changes in the mid-level tropospheric storm track are reflected in trough density and intensity alteration of surface cyclones. A specific feature of C3Grid is the flexible Workflow Scheduling Service (WSS) which also allows for automated nightly analysis runs of CT, Stormtrack, etc. with different input parameter sets. The statistical results of these workflows can be accumulated afterwards by a scheduled final analysis step, thereby providing a tool for data intensive analytics for the massive amounts of climate model data accessible through C3Grid. First tests with these automated analysis workflows show promising results to speed up the investigation of high volume modeling data. This example is relevant to the thorough analysis of future changes in storminess in Europe and is just one example of the potential of knowledge discovery using automated workflows implemented in the C3Grid architecture.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Garzoglio, Gabriele

The Fermilab Grid and Cloud Computing Department and the KISTI Global Science experimental Data hub Center are working on a multi-year Collaborative Research and Development Agreement.With the knowledge developed in the first year on how to provision and manage a federation of virtual machines through Cloud management systems. In this second year, we expanded the work on provisioning and federation, increasing both scale and diversity of solutions, and we started to build on-demand services on the established fabric, introducing the paradigm of Platform as a Service to assist with the execution of scientific workflows. We have enabled scientific workflows ofmore » stakeholders to run on multiple cloud resources at the scale of 1,000 concurrent machines. The demonstrations have been in the areas of (a) Virtual Infrastructure Automation and Provisioning, (b) Interoperability and Federation of Cloud Resources, and (c) On-demand Services for ScientificWorkflows.« less
Grid workflow job execution service 'Pilot'

NASA Astrophysics Data System (ADS)

Shamardin, Lev; Kryukov, Alexander; Demichev, Andrey; Ilyin, Vyacheslav

2011-12-01

'Pilot' is a grid job execution service for workflow jobs. The main goal for the service is to automate computations with multiple stages since they can be expressed as simple workflows. Each job is a directed acyclic graph of tasks and each task is an execution of something on a grid resource (or 'computing element'). Tasks may be submitted to any WS-GRAM (Globus Toolkit 4) service. The target resources for the tasks execution are selected by the Pilot service from the set of available resources which match the specific requirements from the task and/or job definition. Some simple conditional execution logic is also provided. The 'Pilot' service is built on the REST concepts and provides a simple API through authenticated HTTPS. This service is deployed and used in production in a Russian national grid project GridNNN.
A Workflow-based Intelligent Network Data Movement Advisor with End-to-end Performance Optimization

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhu, Michelle M.; Wu, Chase Q.

2013-11-07

Next-generation eScience applications often generate large amounts of simulation, experimental, or observational data that must be shared and managed by collaborative organizations. Advanced networking technologies and services have been rapidly developed and deployed to facilitate such massive data transfer. However, these technologies and services have not been fully utilized mainly because their use typically requires significant domain knowledge and in many cases application users are even not aware of their existence. By leveraging the functionalities of an existing Network-Aware Data Movement Advisor (NADMA) utility, we propose a new Workflow-based Intelligent Network Data Movement Advisor (WINDMA) with end-to-end performance optimization formore » this DOE funded project. This WINDMA system integrates three major components: resource discovery, data movement, and status monitoring, and supports the sharing of common data movement workflows through account and database management. This system provides a web interface and interacts with existing data/space management and discovery services such as Storage Resource Management, transport methods such as GridFTP and GlobusOnline, and network resource provisioning brokers such as ION and OSCARS. We demonstrate the efficacy of the proposed transport-support workflow system in several use cases based on its implementation and deployment in DOE wide-area networks.« less
AstroGrid: Taverna in the Virtual Observatory .

NASA Astrophysics Data System (ADS)

Benson, K. M.; Walton, N. A.

This paper reports on the implementation of the Taverna workbench by AstroGrid, a tool for designing and executing workflows of tasks in the Virtual Observatory. The workflow approach helps astronomers perform complex task sequences with little technical effort. Visual approach to workflow construction streamlines highly complex analysis over public and private data and uses computational resources as minimal as a desktop computer. Some integration issues and future work are discussed in this article.
Workflow management in large distributed systems

NASA Astrophysics Data System (ADS)

Legrand, I.; Newman, H.; Voicu, R.; Dobre, C.; Grigoras, C.

2011-12-01

The MonALISA (Monitoring Agents using a Large Integrated Services Architecture) framework provides a distributed service system capable of controlling and optimizing large-scale, data-intensive applications. An essential part of managing large-scale, distributed data-processing facilities is a monitoring system for computing facilities, storage, networks, and the very large number of applications running on these systems in near realtime. All this monitoring information gathered for all the subsystems is essential for developing the required higher-level services—the components that provide decision support and some degree of automated decisions—and for maintaining and optimizing workflow in large-scale distributed systems. These management and global optimization functions are performed by higher-level agent-based services. We present several applications of MonALISA's higher-level services including optimized dynamic routing, control, data-transfer scheduling, distributed job scheduling, dynamic allocation of storage resource to running jobs and automated management of remote services among a large set of grid facilities.
AstroGrid-D: Grid technology for astronomical science

NASA Astrophysics Data System (ADS)

Enke, Harry; Steinmetz, Matthias; Adorf, Hans-Martin; Beck-Ratzka, Alexander; Breitling, Frank; Brüsemeister, Thomas; Carlson, Arthur; Ensslin, Torsten; Högqvist, Mikael; Nickelt, Iliya; Radke, Thomas; Reinefeld, Alexander; Reiser, Angelika; Scholl, Tobias; Spurzem, Rainer; Steinacker, Jürgen; Voges, Wolfgang; Wambsganß, Joachim; White, Steve

2011-02-01

We present status and results of AstroGrid-D, a joint effort of astrophysicists and computer scientists to employ grid technology for scientific applications. AstroGrid-D provides access to a network of distributed machines with a set of commands as well as software interfaces. It allows simple use of computer and storage facilities and to schedule or monitor compute tasks and data management. It is based on the Globus Toolkit middleware (GT4). Chapter 1 describes the context which led to the demand for advanced software solutions in Astrophysics, and we state the goals of the project. We then present characteristic astrophysical applications that have been implemented on AstroGrid-D in chapter 2. We describe simulations of different complexity, compute-intensive calculations running on multiple sites (Section 2.1), and advanced applications for specific scientific purposes (Section 2.2), such as a connection to robotic telescopes (Section 2.2.3). We can show from these examples how grid execution improves e.g. the scientific workflow. Chapter 3 explains the software tools and services that we adapted or newly developed. Section 3.1 is focused on the administrative aspects of the infrastructure, to manage users and monitor activity. Section 3.2 characterises the central components of our architecture: The AstroGrid-D information service to collect and store metadata, a file management system, the data management system, and a job manager for automatic submission of compute tasks. We summarise the successfully established infrastructure in chapter 4, concluding with our future plans to establish AstroGrid-D as a platform of modern e-Astronomy.
Initial steps towards a production platform for DNA sequence analysis on the grid.

PubMed

Luyf, Angela C M; van Schaik, Barbera D C; de Vries, Michel; Baas, Frank; van Kampen, Antoine H C; Olabarriaga, Silvia D

2010-12-14

Bioinformatics is confronted with a new data explosion due to the availability of high throughput DNA sequencers. Data storage and analysis becomes a problem on local servers, and therefore it is needed to switch to other IT infrastructures. Grid and workflow technology can help to handle the data more efficiently, as well as facilitate collaborations. However, interfaces to grids are often unfriendly to novice users. In this study we reused a platform that was developed in the VL-e project for the analysis of medical images. Data transfer, workflow execution and job monitoring are operated from one graphical interface. We developed workflows for two sequence alignment tools (BLAST and BLAT) as a proof of concept. The analysis time was significantly reduced. All workflows and executables are available for the members of the Dutch Life Science Grid and the VL-e Medical virtual organizations All components are open source and can be transported to other grid infrastructures. The availability of in-house expertise and tools facilitates the usage of grid resources by new users. Our first results indicate that this is a practical, powerful and scalable solution to address the capacity and collaboration issues raised by the deployment of next generation sequencers. We currently adopt this methodology on a daily basis for DNA sequencing and other applications. More information and source code is available via http://www.bioinformaticslaboratory.nl/
Task Management in the New ATLAS Production System

NASA Astrophysics Data System (ADS)

De, K.; Golubkov, D.; Klimentov, A.; Potekhin, M.; Vaniachine, A.; Atlas Collaboration

2014-06-01

This document describes the design of the new Production System of the ATLAS experiment at the LHC [1]. The Production System is the top level workflow manager which translates physicists' needs for production level processing and analysis into actual workflows executed across over a hundred Grid sites used globally by ATLAS. As the production workload increased in volume and complexity in recent years (the ATLAS production tasks count is above one million, with each task containing hundreds or thousands of jobs) there is a need to upgrade the Production System to meet the challenging requirements of the next LHC run while minimizing the operating costs. In the new design, the main subsystems are the Database Engine for Tasks (DEFT) and the Job Execution and Definition Interface (JEDI). Based on users' requests, DEFT manages inter-dependent groups of tasks (Meta-Tasks) and generates corresponding data processing workflows. The JEDI component then dynamically translates the task definitions from DEFT into actual workload jobs executed in the PanDA Workload Management System [2]. We present the requirements, design parameters, basics of the object model and concrete solutions utilized in building the new Production System and its components.
Managing the CMS Data and Monte Carlo Processing during LHC Run 2

NASA Astrophysics Data System (ADS)

Wissing, C.; CMS Collaboration

2017-10-01

In order to cope with the challenges expected during the LHC Run 2 CMS put in a number of enhancements into the main software packages and the tools used for centrally managed processing. In the presentation we will highlight these improvements that allow CMS to deal with the increased trigger output rate, the increased pileup and the evolution in computing technology. The overall system aims at high flexibility, improved operational flexibility and largely automated procedures. The tight coupling of workflow classes to types of sites has been drastically relaxed. Reliable and high-performing networking between most of the computing sites and the successful deployment of a data-federation allow the execution of workflows using remote data access. That required the development of a largely automatized system to assign workflows and to handle necessary pre-staging of data. Another step towards flexibility has been the introduction of one large global HTCondor Pool for all types of processing workflows and analysis jobs. Besides classical Grid resources also some opportunistic resources as well as Cloud resources have been integrated into that Pool, which gives reach to more than 200k CPU cores.
Incorporating Brokers within Collaboration Environments

NASA Astrophysics Data System (ADS)

Rajasekar, A.; Moore, R.; de Torcy, A.

2013-12-01

A collaboration environment, such as the integrated Rule Oriented Data System (iRODS - http://irods.diceresearch.org), provides interoperability mechanisms for accessing storage systems, authentication systems, messaging systems, information catalogs, networks, and policy engines from a wide variety of clients. The interoperability mechanisms function as brokers, translating actions requested by clients to the protocol required by a specific technology. The iRODS data grid is used to enable collaborative research within hydrology, seismology, earth science, climate, oceanography, plant biology, astronomy, physics, and genomics disciplines. Although each domain has unique resources, data formats, semantics, and protocols, the iRODS system provides a generic framework that is capable of managing collaborative research initiatives that span multiple disciplines. Each interoperability mechanism (broker) is linked to a name space that enables unified access across the heterogeneous systems. The collaboration environment provides not only support for brokers, but also support for virtualization of name spaces for users, files, collections, storage systems, metadata, and policies. The broker enables access to data or information in a remote system using the appropriate protocol, while the collaboration environment provides a uniform naming convention for accessing and manipulating each object. Within the NSF DataNet Federation Consortium project (http://www.datafed.org), three basic types of interoperability mechanisms have been identified and applied: 1) drivers for managing manipulation at the remote resource (such as data subsetting), 2) micro-services that execute the protocol required by the remote resource, and 3) policies for controlling the execution. For example, drivers have been written for manipulating NetCDF and HDF formatted files within THREDDS servers. Micro-services have been written that manage interactions with the CUAHSI data repository, the DataONE information catalog, and the GeoBrain broker. Policies have been written that manage transfer of messages between an iRODS message queue and the Advanced Message Queuing Protocol. Examples of these brokering mechanisms will be presented. The DFC collaboration environment serves as the intermediary between community resources and compute grids, enabling reproducible data-driven research. It is possible to create an analysis workflow that retrieves data subsets from a remote server, assemble the required input files, automate the execution of the workflow, automatically track the provenance of the workflow, and share the input files, workflow, and output files. A collaborator can re-execute a shared workflow, compare results, change input files, and re-execute an analysis.
Design and implementation of GRID-based PACS in a hospital with multiple imaging departments

NASA Astrophysics Data System (ADS)

Yang, Yuanyuan; Jin, Jin; Sun, Jianyong; Zhang, Jianguo

2008-03-01

Usually, there were multiple clinical departments providing imaging-enabled healthcare services in enterprise healthcare environment, such as radiology, oncology, pathology, and cardiology, the picture archiving and communication system (PACS) is now required to support not only radiology-based image display, workflow and data flow management, but also to have more specific expertise imaging processing and management tools for other departments providing imaging-guided diagnosis and therapy, and there were urgent demand to integrate the multiple PACSs together to provide patient-oriented imaging services for enterprise collaborative healthcare. In this paper, we give the design method and implementation strategy of developing grid-based PACS (Grid-PACS) for a hospital with multiple imaging departments or centers. The Grid-PACS functions as a middleware between the traditional PACS archiving servers and workstations or image viewing clients and provide DICOM image communication and WADO services to the end users. The images can be stored in distributed multiple archiving servers, but can be managed with central mode. The grid-based PACS has auto image backup and disaster recovery services and can provide best image retrieval path to the image requesters based on the optimal algorithms. The designed grid-based PACS has been implemented in Shanghai Huadong Hospital and been running for two years smoothly.
OSG-GEM: Gene Expression Matrix Construction Using the Open Science Grid.

PubMed

Poehlman, William L; Rynge, Mats; Branton, Chris; Balamurugan, D; Feltus, Frank A

2016-01-01

High-throughput DNA sequencing technology has revolutionized the study of gene expression while introducing significant computational challenges for biologists. These computational challenges include access to sufficient computer hardware and functional data processing workflows. Both these challenges are addressed with our scalable, open-source Pegasus workflow for processing high-throughput DNA sequence datasets into a gene expression matrix (GEM) using computational resources available to U.S.-based researchers on the Open Science Grid (OSG). We describe the usage of the workflow (OSG-GEM), discuss workflow design, inspect performance data, and assess accuracy in mapping paired-end sequencing reads to a reference genome. A target OSG-GEM user is proficient with the Linux command line and possesses basic bioinformatics experience. The user may run this workflow directly on the OSG or adapt it to novel computing environments.
OSG-GEM: Gene Expression Matrix Construction Using the Open Science Grid

PubMed Central

Poehlman, William L.; Rynge, Mats; Branton, Chris; Balamurugan, D.; Feltus, Frank A.

2016-01-01

High-throughput DNA sequencing technology has revolutionized the study of gene expression while introducing significant computational challenges for biologists. These computational challenges include access to sufficient computer hardware and functional data processing workflows. Both these challenges are addressed with our scalable, open-source Pegasus workflow for processing high-throughput DNA sequence datasets into a gene expression matrix (GEM) using computational resources available to U.S.-based researchers on the Open Science Grid (OSG). We describe the usage of the workflow (OSG-GEM), discuss workflow design, inspect performance data, and assess accuracy in mapping paired-end sequencing reads to a reference genome. A target OSG-GEM user is proficient with the Linux command line and possesses basic bioinformatics experience. The user may run this workflow directly on the OSG or adapt it to novel computing environments. PMID:27499617
Integration of Earth System Models and Workflow Management under iRODS for the Northeast Regional Earth System Modeling Project

NASA Astrophysics Data System (ADS)

Lengyel, F.; Yang, P.; Rosenzweig, B.; Vorosmarty, C. J.

2012-12-01

The Northeast Regional Earth System Model (NE-RESM, NSF Award #1049181) integrates weather research and forecasting models, terrestrial and aquatic ecosystem models, a water balance/transport model, and mesoscale and energy systems input-out economic models developed by interdisciplinary research team from academia and government with expertise in physics, biogeochemistry, engineering, energy, economics, and policy. NE-RESM is intended to forecast the implications of planning decisions on the region's environment, ecosystem services, energy systems and economy through the 21st century. Integration of model components and the development of cyberinfrastructure for interacting with the system is facilitated with the integrated Rule Oriented Data System (iRODS), a distributed data grid that provides archival storage with metadata facilities and a rule-based workflow engine for automating and auditing scientific workflows.
The QuakeSim Project: Web Services for Managing Geophysical Data and Applications

NASA Astrophysics Data System (ADS)

Pierce, Marlon E.; Fox, Geoffrey C.; Aktas, Mehmet S.; Aydin, Galip; Gadgil, Harshawardhan; Qi, Zhigang; Sayar, Ahmet

2008-04-01

We describe our distributed systems research efforts to build the “cyberinfrastructure” components that constitute a geophysical Grid, or more accurately, a Grid of Grids. Service-oriented computing principles are used to build a distributed infrastructure of Web accessible components for accessing data and scientific applications. Our data services fall into two major categories: Archival, database-backed services based around Geographical Information System (GIS) standards from the Open Geospatial Consortium, and streaming services that can be used to filter and route real-time data sources such as Global Positioning System data streams. Execution support services include application execution management services and services for transferring remote files. These data and execution service families are bound together through metadata information and workflow services for service orchestration. Users may access the system through the QuakeSim scientific Web portal, which is built using a portlet component approach.
High-Performance Compute Infrastructure in Astronomy: 2020 Is Only Months Away

NASA Astrophysics Data System (ADS)

Berriman, B.; Deelman, E.; Juve, G.; Rynge, M.; Vöckler, J. S.

2012-09-01

By 2020, astronomy will be awash with as much as 60 PB of public data. Full scientific exploitation of such massive volumes of data will require high-performance computing on server farms co-located with the data. Development of this computing model will be a community-wide enterprise that has profound cultural and technical implications. Astronomers must be prepared to develop environment-agnostic applications that support parallel processing. The community must investigate the applicability and cost-benefit of emerging technologies such as cloud computing to astronomy, and must engage the Computer Science community to develop science-driven cyberinfrastructure such as workflow schedulers and optimizers. We report here the results of collaborations between a science center, IPAC, and a Computer Science research institute, ISI. These collaborations may be considered pathfinders in developing a high-performance compute infrastructure in astronomy. These collaborations investigated two exemplar large-scale science-driver workflow applications: 1) Calculation of an infrared atlas of the Galactic Plane at 18 different wavelengths by placing data from multiple surveys on a common plate scale and co-registering all the pixels; 2) Calculation of an atlas of periodicities present in the public Kepler data sets, which currently contain 380,000 light curves. These products have been generated with two workflow applications, written in C for performance and designed to support parallel processing on multiple environments and platforms, but with different compute resource needs: the Montage image mosaic engine is I/O-bound, and the NASA Star and Exoplanet Database periodogram code is CPU-bound. Our presentation will report cost and performance metrics and lessons-learned for continuing development. Applicability of Cloud Computing: Commercial Cloud providers generally charge for all operations, including processing, transfer of input and output data, and for storage of data, and so the costs of running applications vary widely according to how they use resources. The cloud is well suited to processing CPU-bound (and memory bound) workflows such as the periodogram code, given the relatively low cost of processing in comparison with I/O operations. I/O-bound applications such as Montage perform best on high-performance clusters with fast networks and parallel file-systems. Science-driven Cyberinfrastructure: Montage has been widely used as a driver application to develop workflow management services, such as task scheduling in distributed environments, designing fault tolerance techniques for job schedulers, and developing workflow orchestration techniques. Running Parallel Applications Across Distributed Cloud Environments: Data processing will eventually take place in parallel distributed across cyber infrastructure environments having different architectures. We have used the Pegasus Work Management System (WMS) to successfully run applications across three very different environments: TeraGrid, OSG (Open Science Grid), and FutureGrid. Provisioning resources across different grids and clouds (also referred to as Sky Computing), involves establishing a distributed environment, where issues of, e.g, remote job submission, data management, and security need to be addressed. This environment also requires building virtual machine images that can run in different environments. Usually, each cloud provides basic images that can be customized with additional software and services. In most of our work, we provisioned compute resources using a custom application, called Wrangler. Pegasus WMS abstracts the architectures of the compute environments away from the end-user, and can be considered a first-generation tool suitable for scientists to run their applications on disparate environments.
Semantics-enabled service discovery framework in the SIMDAT pharma grid.

PubMed

Qu, Cangtao; Zimmermann, Falk; Kumpf, Kai; Kamuzinzi, Richard; Ledent, Valérie; Herzog, Robert

2008-03-01

We present the design and implementation of a semantics-enabled service discovery framework in the data Grids for process and product development using numerical simulation and knowledge discovery (SIMDAT) Pharma Grid, an industry-oriented Grid environment for integrating thousands of Grid-enabled biological data services and analysis services. The framework consists of three major components: the Web ontology language (OWL)-description logic (DL)-based biological domain ontology, OWL Web service ontology (OWL-S)-based service annotation, and semantic matchmaker based on the ontology reasoning. Built upon the framework, workflow technologies are extensively exploited in the SIMDAT to assist biologists in (semi)automatically performing in silico experiments. We present a typical usage scenario through the case study of a biological workflow: IXodus.
Making the most of cloud storage - a toolkit for exploitation by WLCG experiments

NASA Astrophysics Data System (ADS)

Alvarez Ayllon, Alejandro; Arsuaga Rios, Maria; Bitzes, Georgios; Furano, Fabrizio; Keeble, Oliver; Manzi, Andrea

2017-10-01

Understanding how cloud storage can be effectively used, either standalone or in support of its associated compute, is now an important consideration for WLCG. We report on a suite of extensions to familiar tools targeted at enabling the integration of cloud object stores into traditional grid infrastructures and workflows. Notable updates include support for a number of object store flavours in FTS3, Davix and gfal2, including mitigations for lack of vector reads; the extension of Dynafed to operate as a bridge between grid and cloud domains; protocol translation in FTS3; the implementation of extensions to DPM (also implemented by the dCache project) to allow 3rd party transfers over HTTP. The result is a toolkit which facilitates data movement and access between grid and cloud infrastructures, broadening the range of workflows suitable for cloud. We report on deployment scenarios and prototype experience, explaining how, for example, an Amazon S3 or Azure allocation can be exploited by grid workflows.

WRF4SG: A Scientific Gateway for climate experiment workflows

NASA Astrophysics Data System (ADS)

Blanco, Carlos; Cofino, Antonio S.; Fernandez-Quiruelas, Valvanuz

2013-04-01

The Weather Research and Forecasting model (WRF) is a community-driven and public domain model widely used by the weather and climate communities. As opposite to other application-oriented models, WRF provides a flexible and computationally-efficient framework which allows solving a variety of problems for different time-scales, from weather forecast to climate change projection. Furthermore, WRF is also widely used as a research tool in modeling physics, dynamics, and data assimilation by the research community. Climate experiment workflows based on Weather Research and Forecasting (WRF) are nowadays among the one of the most cutting-edge applications. These workflows are complex due to both large storage and the huge number of simulations executed. In order to manage that, we have developed a scientific gateway (SG) called WRF for Scientific Gateway (WRF4SG) based on WS-PGRADE/gUSE and WRF4G frameworks to ease achieve WRF users needs (see [1] and [2]). WRF4SG provides services for different use cases that describe the different interactions between WRF users and the WRF4SG interface in order to show how to run a climate experiment. As WS-PGRADE/gUSE uses portlets (see [1]) to interact with users, its portlets will support these use cases. A typical experiment to be carried on by a WRF user will consist on a high-resolution regional re-forecast. These re-forecasts are common experiments used as input data form wind power energy and natural hazards (wind and precipitation fields). In the cases below, the user is able to access to different resources such as Grid due to the fact that WRF needs a huge amount of computing resources in order to generate useful simulations: * Resource configuration and user authentication: The first step is to authenticate on users' Grid resources by virtual organizations. After login, the user is able to select which virtual organization is going to be used by the experiment. * Data assimilation: In order to assimilate the data sources, the user has to select them browsing through LFC Portlet. * Design Experiment workflow: In order to configure the experiment, the user will define the type of experiment (i.e. re-forecast), and its attributes to simulate. In this case the main attributes are: the field of interest (wind, precipitation, ...), the start and end date simulation and the requirements of the experiment. * Monitor workflow: In order to monitor the experiment the user will receive notification messages based on events and also the gateway will display the progress of the experiment. * Data storage: Like Data assimilation case, the user is able to browse and view the output data simulations using LFC Portlet. The objectives of WRF4SG can be described by considering two goals. The first goal is to show how WRF4SG facilitates to execute, monitor and manage climate workflows based on the WRF4G framework. And the second goal of WRF4SG is to help WRF users to execute their experiment workflows concurrently using heterogeneous computing resources such as HPC and Grid. [1] Kacsuk, P.: P-GRADE portal family for grid infrastructures. Concurrency and Computation: Practice and Experience. 23, 235-245 (2011). [2] http://www.meteo.unican.es/software/wrf4g
A novel approach to optimize workflow in grid-based teleradiology applications.

PubMed

Yılmaz, Ayhan Ozan; Baykal, Nazife

2016-01-01

This study proposes an infrastructure with a reporting workflow optimization algorithm (RWOA) in order to interconnect facilities, reporting units and radiologists on a single access interface, to increase the efficiency of the reporting process by decreasing the medical report turnaround time and to increase the quality of medical reports by determining the optimum match between the inspection and radiologist in terms of subspecialty, workload and response time. Workflow centric network architecture with an enhanced caching, querying and retrieving mechanism is implemented by seamlessly integrating Grid Agent and Grid Manager to conventional digital radiology systems. The inspection and radiologist attributes are modelled using a hierarchical ontology structure. Attribute preferences rated by radiologists and technical experts are formed into reciprocal matrixes and weights for entities are calculated utilizing Analytic Hierarchy Process (AHP). The assignment alternatives are processed by relation-based semantic matching (RBSM) and Integer Linear Programming (ILP). The results are evaluated based on both real case applications and simulated process data in terms of subspecialty, response time and workload success rates. Results obtained using simulated data are compared with the outcomes obtained by applying Round Robin, Shortest Queue and Random distribution policies. The proposed algorithm is also applied to a real case teleradiology application process data where medical reporting workflow was performed based on manual assignments by the chief radiologist for 6225 inspections. RBSM gives the highest subspecialty success rate and integrating ILP with RBSM ratings as RWOA provides a better response time and workload distribution success rate. RWOA based image delivery also prevents bandwidth, storage or hardware related stuck and latencies. When compared with a real case teleradiology application where inspection assignments were performed manually, the proposed solution was found to increase the experience success rate by 13.25%, workload success rate by 63.76% and response time success rate by 120%. The total response time in the real case application data was improved by 22.39%. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Toward server-side, high performance climate change data analytics in the Earth System Grid Federation (ESGF) eco-system

NASA Astrophysics Data System (ADS)

Fiore, Sandro; Williams, Dean; Aloisio, Giovanni

2016-04-01

In many scientific domains such as climate, data is often n-dimensional and requires tools that support specialized data types and primitives to be properly stored, accessed, analysed and visualized. Moreover, new challenges arise in large-scale scenarios and eco-systems where petabytes (PB) of data can be available and data can be distributed and/or replicated (e.g., the Earth System Grid Federation (ESGF) serving the Coupled Model Intercomparison Project, Phase 5 (CMIP5) experiment, providing access to 2.5PB of data for the Intergovernmental Panel on Climate Change (IPCC) Fifth Assessment Report (AR5). Most of the tools currently available for scientific data analysis in the climate domain fail at large scale since they: (1) are desktop based and need the data locally; (2) are sequential, so do not benefit from available multicore/parallel machines; (3) do not provide declarative languages to express scientific data analysis tasks; (4) are domain-specific, which ties their adoption to a specific domain; and (5) do not provide a workflow support, to enable the definition of complex "experiments". The Ophidia project aims at facing most of the challenges highlighted above by providing a big data analytics framework for eScience. Ophidia provides declarative, server-side, and parallel data analysis, jointly with an internal storage model able to efficiently deal with multidimensional data and a hierarchical data organization to manage large data volumes ("datacubes"). The project relies on a strong background of high performance database management and OLAP systems to manage large scientific data sets. It also provides a native workflow management support, to define processing chains and workflows with tens to hundreds of data analytics operators to build real scientific use cases. With regard to interoperability aspects, the talk will present the contribution provided both to the RDA Working Group on Array Databases, and the Earth System Grid Federation (ESGF) Compute Working Team. Also highlighted will be the results of large scale climate model intercomparison data analysis experiments, for example: (1) defined in the context of the EU H2020 INDIGO-DataCloud project; (2) implemented in a real geographically distributed environment involving CMCC (Italy) and LLNL (US) sites; (3) exploiting Ophidia as server-side, parallel analytics engine; and (4) applied on real CMIP5 data sets available through ESGF.
Grid Enabled Geospatial Catalogue Web Service

NASA Technical Reports Server (NTRS)

Chen, Ai-Jun; Di, Li-Ping; Wei, Ya-Xing; Liu, Yang; Bui, Yu-Qi; Hu, Chau-Min; Mehrotra, Piyush

2004-01-01

Geospatial Catalogue Web Service is a vital service for sharing and interoperating volumes of distributed heterogeneous geospatial resources, such as data, services, applications, and their replicas over the web. Based on the Grid technology and the Open Geospatial Consortium (0GC) s Catalogue Service - Web Information Model, this paper proposes a new information model for Geospatial Catalogue Web Service, named as GCWS which can securely provides Grid-based publishing, managing and querying geospatial data and services, and the transparent access to the replica data and related services under the Grid environment. This information model integrates the information model of the Grid Replica Location Service (RLS)/Monitoring & Discovery Service (MDS) with the information model of OGC Catalogue Service (CSW), and refers to the geospatial data metadata standards from IS0 19115, FGDC and NASA EOS Core System and service metadata standards from IS0 191 19 to extend itself for expressing geospatial resources. Using GCWS, any valid geospatial user, who belongs to an authorized Virtual Organization (VO), can securely publish and manage geospatial resources, especially query on-demand data in the virtual community and get back it through the data-related services which provide functions such as subsetting, reformatting, reprojection etc. This work facilitates the geospatial resources sharing and interoperating under the Grid environment, and implements geospatial resources Grid enabled and Grid technologies geospatial enabled. It 2!so makes researcher to focus on science, 2nd not cn issues with computing ability, data locztic, processir,g and management. GCWS also is a key component for workflow-based virtual geospatial data producing.
Data Provenance Hybridization Supporting Extreme-Scale Scientific WorkflowApplications

DOE Office of Scientific and Technical Information (OSTI.GOV)

Elsethagen, Todd O.; Stephan, Eric G.; Raju, Bibi

As high performance computing (HPC) infrastructures continue to grow in capability and complexity, so do the applications that they serve. HPC and distributed-area computing (DAC) (e.g. grid and cloud) users are looking increasingly toward workflow solutions to orchestrate their complex application coupling, pre- and post-processing needs To gain insight and a more quantitative understanding of a workflow’s performance our method includes not only the capture of traditional provenance information, but also the capture and integration of system environment metrics helping to give context and explanation for a workflow’s execution. In this paper, we describe IPPD’s provenance management solution (ProvEn) andmore » its hybrid data store combining both of these data provenance perspectives.« less
Automation and quality assurance of the production cycle

NASA Astrophysics Data System (ADS)

Hajdu, L.; Didenko, L.; Lauret, J.

2010-04-01

Processing datasets on the order of tens of terabytes is an onerous task, faced by production coordinators everywhere. Users solicit data productions and, especially for simulation data, the vast amount of parameters (and sometime incomplete requests) point at the need for a tracking, control and archiving all requests made so a coordinated handling could be made by the production team. With the advent of grid computing the parallel processing power has increased but traceability has also become increasing problematic due to the heterogeneous nature of Grids. Any one of a number of components may fail invalidating the job or execution flow in various stages of completion and re-submission of a few of the multitude of jobs (keeping the entire dataset production consistency) a difficult and tedious process. From the definition of the workflow to its execution, there is a strong need for validation, tracking, monitoring and reporting of problems. To ease the process of requesting production workflow, STAR has implemented several components addressing the full workflow consistency. A Web based online submission request module, implemented using Drupal's Content Management System API, enforces ahead that all parameters are described in advance in a uniform fashion. Upon submission, all jobs are independently tracked and (sometime experiment-specific) discrepancies are detected and recorded providing detailed information on where/how/when the job failed. Aggregate information on success and failure are also provided in near real-time.
MIDG-Emerging grid technologies for multi-site preclinical molecular imaging research communities.

PubMed

Lee, Jasper; Documet, Jorge; Liu, Brent; Park, Ryan; Tank, Archana; Huang, H K

2011-03-01

Molecular imaging is the visualization and identification of specific molecules in anatomy for insight into metabolic pathways, tissue consistency, and tracing of solute transport mechanisms. This paper presents the Molecular Imaging Data Grid (MIDG) which utilizes emerging grid technologies in preclinical molecular imaging to facilitate data sharing and discovery between preclinical molecular imaging facilities and their collaborating investigator institutions to expedite translational sciences research. Grid-enabled archiving, management, and distribution of animal-model imaging datasets help preclinical investigators to monitor, access and share their imaging data remotely, and promote preclinical imaging facilities to share published imaging datasets as resources for new investigators. The system architecture of the Molecular Imaging Data Grid is described in a four layer diagram. A data model for preclinical molecular imaging datasets is also presented based on imaging modalities currently used in a molecular imaging center. The MIDG system components and connectivity are presented. And finally, the workflow steps for grid-based archiving, management, and retrieval of preclincial molecular imaging data are described. Initial performance tests of the Molecular Imaging Data Grid system have been conducted at the USC IPILab using dedicated VMware servers. System connectivity, evaluated datasets, and preliminary results are presented. The results show the system's feasibility, limitations, direction of future research. Translational and interdisciplinary research in medicine is increasingly interested in cellular and molecular biology activity at the preclinical levels, utilizing molecular imaging methods on animal models. The task of integrated archiving, management, and distribution of these preclinical molecular imaging datasets at preclinical molecular imaging facilities is challenging due to disparate imaging systems and multiple off-site investigators. A Molecular Imaging Data Grid design, implementation, and initial evaluation is presented to demonstrate the secure and novel data grid solution for sharing preclinical molecular imaging data across the wide-area-network (WAN).
Advances in Grid Computing for the Fabric for Frontier Experiments Project at Fermilab

NASA Astrophysics Data System (ADS)

Herner, K.; Alba Hernandez, A. F.; Bhat, S.; Box, D.; Boyd, J.; Di Benedetto, V.; Ding, P.; Dykstra, D.; Fattoruso, M.; Garzoglio, G.; Kirby, M.; Kreymer, A.; Levshina, T.; Mazzacane, A.; Mengel, M.; Mhashilkar, P.; Podstavkov, V.; Retzke, K.; Sharma, N.; Teheran, J.

2017-10-01

The Fabric for Frontier Experiments (FIFE) project is a major initiative within the Fermilab Scientific Computing Division charged with leading the computing model for Fermilab experiments. Work within the FIFE project creates close collaboration between experimenters and computing professionals to serve high-energy physics experiments of differing size, scope, and physics area. The FIFE project has worked to develop common tools for job submission, certificate management, software and reference data distribution through CVMFS repositories, robust data transfer, job monitoring, and databases for project tracking. Since the projects inception the experiments under the FIFE umbrella have significantly matured, and present an increasingly complex list of requirements to service providers. To meet these requirements, the FIFE project has been involved in transitioning the Fermilab General Purpose Grid cluster to support a partitionable slot model, expanding the resources available to experiments via the Open Science Grid, assisting with commissioning dedicated high-throughput computing resources for individual experiments, supporting the efforts of the HEP Cloud projects to provision a variety of back end resources, including public clouds and high performance computers, and developing rapid onboarding procedures for new experiments and collaborations. The larger demands also require enhanced job monitoring tools, which the project has developed using such tools as ElasticSearch and Grafana. in helping experiments manage their large-scale production workflows. This group in turn requires a structured service to facilitate smooth management of experiment requests, which FIFE provides in the form of the Production Operations Management Service (POMS). POMS is designed to track and manage requests from the FIFE experiments to run particular workflows, and support troubleshooting and triage in case of problems. Recently a new certificate management infrastructure called Distributed Computing Access with Federated Identities (DCAFI) has been put in place that has eliminated our dependence on a Fermilab-specific third-party Certificate Authority service and better accommodates FIFE collaborators without a Fermilab Kerberos account. DCAFI integrates the existing InCommon federated identity infrastructure, CILogon Basic CA, and a MyProxy service using a new general purpose open source tool. We will discuss the general FIFE onboarding strategy, progress in expanding FIFE experiments presence on the Open Science Grid, new tools for job monitoring, the POMS service, and the DCAFI project.
Autonomic Management of Application Workflows on Hybrid Computing Infrastructure

DOE PAGES

Kim, Hyunjoo; el-Khamra, Yaakoub; Rodero, Ivan; ...

2011-01-01

In this paper, we present a programming and runtime framework that enables the autonomic management of complex application workflows on hybrid computing infrastructures. The framework is designed to address system and application heterogeneity and dynamics to ensure that application objectives and constraints are satisfied. The need for such autonomic system and application management is becoming critical as computing infrastructures become increasingly heterogeneous, integrating different classes of resources from high-end HPC systems to commodity clusters and clouds. For example, the framework presented in this paper can be used to provision the appropriate mix of resources based on application requirements and constraints.more » The framework also monitors the system/application state and adapts the application and/or resources to respond to changing requirements or environment. To demonstrate the operation of the framework and to evaluate its ability, we employ a workflow used to characterize an oil reservoir executing on a hybrid infrastructure composed of TeraGrid nodes and Amazon EC2 instances of various types. Specifically, we show how different applications objectives such as acceleration, conservation and resilience can be effectively achieved while satisfying deadline and budget constraints, using an appropriate mix of dynamically provisioned resources. Our evaluations also demonstrate that public clouds can be used to complement and reinforce the scheduling and usage of traditional high performance computing infrastructure.« less
The GridEcon Platform: A Business Scenario Testbed for Commercial Cloud Services

NASA Astrophysics Data System (ADS)

Risch, Marcel; Altmann, Jörn; Guo, Li; Fleming, Alan; Courcoubetis, Costas

Within this paper, we present the GridEcon Platform, a testbed for designing and evaluating economics-aware services in a commercial Cloud computing setting. The Platform is based on the idea that the exact working of such services is difficult to predict in the context of a market and, therefore, an environment for evaluating its behavior in an emulated market is needed. To identify the components of the GridEcon Platform, a number of economics-aware services and their interactions have been envisioned. The two most important components of the platform are the Marketplace and the Workflow Engine. The Workflow Engine allows the simple composition of a market environment by describing the service interactions between economics-aware services. The Marketplace allows trading goods using different market mechanisms. The capabilities of these components of the GridEcon Platform in conjunction with the economics-aware services are described in this paper in detail. The validation of an implemented market mechanism and a capacity planning service using the GridEcon Platform also demonstrated the usefulness of the GridEcon Platform.
Metadata Management on the SCEC PetaSHA Project: Helping Users Describe, Discover, Understand, and Use Simulation Data in a Large-Scale Scientific Collaboration

NASA Astrophysics Data System (ADS)

Okaya, D.; Deelman, E.; Maechling, P.; Wong-Barnum, M.; Jordan, T. H.; Meyers, D.

2007-12-01

Large scientific collaborations, such as the SCEC Petascale Cyberfacility for Physics-based Seismic Hazard Analysis (PetaSHA) Project, involve interactions between many scientists who exchange ideas and research results. These groups must organize, manage, and make accessible their community materials of observational data, derivative (research) results, computational products, and community software. The integration of scientific workflows as a paradigm to solve complex computations provides advantages of efficiency, reliability, repeatability, choices, and ease of use. The underlying resource needed for a scientific workflow to function and create discoverable and exchangeable products is the construction, tracking, and preservation of metadata. In the scientific workflow environment there is a two-tier structure of metadata. Workflow-level metadata and provenance describe operational steps, identity of resources, execution status, and product locations and names. Domain-level metadata essentially define the scientific meaning of data, codes and products. To a large degree the metadata at these two levels are separate. However, between these two levels is a subset of metadata produced at one level but is needed by the other. This crossover metadata suggests that some commonality in metadata handling is needed. SCEC researchers are collaborating with computer scientists at SDSC, the USC Information Sciences Institute, and Carnegie Mellon Univ. in order to perform earthquake science using high-performance computational resources. A primary objective of the "PetaSHA" collaboration is to perform physics-based estimations of strong ground motion associated with real and hypothetical earthquakes located within Southern California. Construction of 3D earth models, earthquake representations, and numerical simulation of seismic waves are key components of these estimations. Scientific workflows are used to orchestrate the sequences of scientific tasks and to access distributed computational facilities such as the NSF TeraGrid. Different types of metadata are produced and captured within the scientific workflows. One workflow within PetaSHA ("Earthworks") performs a linear sequence of tasks with workflow and seismological metadata preserved. Downstream scientific codes ingest these metadata produced by upstream codes. The seismological metadata uses attribute-value pairing in plain text; an identified need is to use more advanced handling methods. Another workflow system within PetaSHA ("Cybershake") involves several complex workflows in order to perform statistical analysis of ground shaking due to thousands of hypothetical but plausible earthquakes. Metadata management has been challenging due to its construction around a number of legacy scientific codes. We describe difficulties arising in the scientific workflow due to the lack of this metadata and suggest corrective steps, which in some cases include the cultural shift of domain science programmers coding for metadata.
NG6: Integrated next generation sequencing storage and processing environment.

PubMed

Mariette, Jérôme; Escudié, Frédéric; Allias, Nicolas; Salin, Gérald; Noirot, Céline; Thomas, Sylvain; Klopp, Christophe

2012-09-09

Next generation sequencing platforms are now well implanted in sequencing centres and some laboratories. Upcoming smaller scale machines such as the 454 junior from Roche or the MiSeq from Illumina will increase the number of laboratories hosting a sequencer. In such a context, it is important to provide these teams with an easily manageable environment to store and process the produced reads. We describe a user-friendly information system able to manage large sets of sequencing data. It includes, on one hand, a workflow environment already containing pipelines adapted to different input formats (sff, fasta, fastq and qseq), different sequencers (Roche 454, Illumina HiSeq) and various analyses (quality control, assembly, alignment, diversity studies,…) and, on the other hand, a secured web site giving access to the results. The connected user will be able to download raw and processed data and browse through the analysis result statistics. The provided workflows can easily be modified or extended and new ones can be added. Ergatis is used as a workflow building, running and monitoring system. The analyses can be run locally or in a cluster environment using Sun Grid Engine. NG6 is a complete information system designed to answer the needs of a sequencing platform. It provides a user-friendly interface to process, store and download high-throughput sequencing data.
Providing traceability for neuroimaging analyses.

PubMed

McClatchey, Richard; Branson, Andrew; Anjum, Ashiq; Bloodsworth, Peter; Habib, Irfan; Munir, Kamran; Shamdasani, Jetendr; Soomro, Kamran

2013-09-01

With the increasingly digital nature of biomedical data and as the complexity of analyses in medical research increases, the need for accurate information capture, traceability and accessibility has become crucial to medical researchers in the pursuance of their research goals. Grid- or Cloud-based technologies, often based on so-called Service Oriented Architectures (SOA), are increasingly being seen as viable solutions for managing distributed data and algorithms in the bio-medical domain. For neuroscientific analyses, especially those centred on complex image analysis, traceability of processes and datasets is essential but up to now this has not been captured in a manner that facilitates collaborative study. Few examples exist, of deployed medical systems based on Grids that provide the traceability of research data needed to facilitate complex analyses and none have been evaluated in practice. Over the past decade, we have been working with mammographers, paediatricians and neuroscientists in three generations of projects to provide the data management and provenance services now required for 21st century medical research. This paper outlines the finding of a requirements study and a resulting system architecture for the production of services to support neuroscientific studies of biomarkers for Alzheimer's disease. The paper proposes a software infrastructure and services that provide the foundation for such support. It introduces the use of the CRISTAL software to provide provenance management as one of a number of services delivered on a SOA, deployed to manage neuroimaging projects that have been studying biomarkers for Alzheimer's disease. In the neuGRID and N4U projects a Provenance Service has been delivered that captures and reconstructs the workflow information needed to facilitate researchers in conducting neuroimaging analyses. The software enables neuroscientists to track the evolution of workflows and datasets. It also tracks the outcomes of various analyses and provides provenance traceability throughout the lifecycle of their studies. As the Provenance Service has been designed to be generic it can be applied across the medical domain as a reusable tool for supporting medical researchers thus providing communities of researchers for the first time with the necessary tools to conduct widely distributed collaborative programmes of medical analysis. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Automatic Integration Testbeds validation on Open Science Grid

NASA Astrophysics Data System (ADS)

Caballero, J.; Thapa, S.; Gardner, R.; Potekhin, M.

2011-12-01

A recurring challenge in deploying high quality production middleware is the extent to which realistic testing occurs before release of the software into the production environment. We describe here an automated system for validating releases of the Open Science Grid software stack that leverages the (pilot-based) PanDA job management system developed and used by the ATLAS experiment. The system was motivated by a desire to subject the OSG Integration Testbed to more realistic validation tests. In particular those which resemble to every extent possible actual job workflows used by the experiments thus utilizing job scheduling at the compute element (CE), use of the worker node execution environment, transfer of data to/from the local storage element (SE), etc. The context is that candidate releases of OSG compute and storage elements can be tested by injecting large numbers of synthetic jobs varying in complexity and coverage of services tested. The native capabilities of the PanDA system can thus be used to define jobs, monitor their execution, and archive the resulting run statistics including success and failure modes. A repository of generic workflows and job types to measure various metrics of interest has been created. A command-line toolset has been developed so that testbed managers can quickly submit "VO-like" jobs into the system when newly deployed services are ready for testing. A system for automatic submission has been crafted to send jobs to integration testbed sites, collecting the results in a central service and generating regular reports for performance and reliability.
Analyzing data flows of WLCG jobs at batch job level

NASA Astrophysics Data System (ADS)

Kuehn, Eileen; Fischer, Max; Giffels, Manuel; Jung, Christopher; Petzold, Andreas

2015-05-01

With the introduction of federated data access to the workflows of WLCG, it is becoming increasingly important for data centers to understand specific data flows regarding storage element accesses, firewall configurations, as well as the scheduling of batch jobs themselves. As existing batch system monitoring and related system monitoring tools do not support measurements at batch job level, a new tool has been developed and put into operation at the GridKa Tier 1 center for monitoring continuous data streams and characteristics of WLCG jobs and pilots. Long term measurements and data collection are in progress. These measurements already have been proven to be useful analyzing misbehaviors and various issues. Therefore we aim for an automated, realtime approach for anomaly detection. As a requirement, prototypes for standard workflows have to be examined. Based on measurements of several months, different features of HEP jobs are evaluated regarding their effectiveness for data mining approaches to identify these common workflows. The paper will introduce the actual measurement approach and statistics as well as the general concept and first results classifying different HEP job workflows derived from the measurements at GridKa.
iRODS: A Distributed Data Management Cyberinfrastructure for Observatories

NASA Astrophysics Data System (ADS)

Rajasekar, A.; Moore, R.; Vernon, F.

2007-12-01

Large-scale and long-term preservation of both observational and synthesized data requires a system that virtualizes data management concepts. A methodology is needed that can work across long distances in space (distribution) and long-periods in time (preservation). The system needs to manage data stored on multiple types of storage systems including new systems that become available in the future. This concept is called infrastructure independence, and is typically implemented through virtualization mechanisms. Data grids are built upon concepts of data and trust virtualization. These concepts enable the management of collections of data that are distributed across multiple institutions, stored on multiple types of storage systems, and accessed by multiple types of clients. Data virtualization ensures that the name spaces used to identify files, users, and storage systems are persistent, even when files are migrated onto future technology. This is required to preserve authenticity, the link between the record and descriptive and provenance metadata. Trust virtualization ensures that access controls remain invariant as files are moved within the data grid. This is required to track the chain of custody of records over time. The Storage Resource Broker (http://www.sdsc.edu/srb) is one such data grid used in a wide variety of applications in earth and space sciences such as ROADNet (roadnet.ucsd.edu), SEEK (seek.ecoinformatics.org), GEON (www.geongrid.org) and NOAO (www.noao.edu). Recent extensions to data grids provide one more level of virtualization - policy or management virtualization. Management virtualization ensures that execution of management policies can be automated, and that rules can be created that verify assertions about the shared collections of data. When dealing with distributed large-scale data over long periods of time, the policies used to manage the data and provide assurances about the authenticity of the data become paramount. The integrated Rule-Oriented Data System (iRODS) (http://irods.sdsc.edu) provides the mechanisms needed to describe not only management policies, but also to track how the policies are applied and their execution results. The iRODS data grid maps management policies to rules that control the execution of the remote micro-services. As an example, a rule can be created that automatically creates a replica whenever a file is added to a specific collection, or extracts its metadata automatically and registers it in a searchable catalog. For the replication operation, the persistent state information consists of the replica location, the creation date, the owner, the replica size, etc. The mechanism used by iRODS for providing policy virtualization is based on well-defined functions, called micro-services, which are chained into alternative workflows using rules. A rule engine, based on the event-condition-action paradigm executes the rule-based workflows after an event. Rules can be deferred to a pre-determined time or executed on a periodic basis. As the data management policies evolve, the iRODS system can implement new rules, new micro-services, and new state information (metadata content) needed to manage the new policies. Each sub- collection can be managed using a different set of policies. The discussion of the concepts in rule-based policy virtualization and its application to long-term and large-scale data management for observatories such as ORION and NEON will be the basis of the paper.
Workflow management systems in radiology

NASA Astrophysics Data System (ADS)

Wendler, Thomas; Meetz, Kirsten; Schmidt, Joachim

1998-07-01

In a situation of shrinking health care budgets, increasing cost pressure and growing demands to increase the efficiency and the quality of medical services, health care enterprises are forced to optimize or complete re-design their processes. Although information technology is agreed to potentially contribute to cost reduction and efficiency improvement, the real success factors are the re-definition and automation of processes: Business Process Re-engineering and Workflow Management. In this paper we discuss architectures for the use of workflow management systems in radiology. We propose to move forward from information systems in radiology (RIS, PACS) to Radiology Management Systems, in which workflow functionality (process definitions and process automation) is implemented through autonomous workflow management systems (WfMS). In a workflow oriented architecture, an autonomous workflow enactment service communicates with workflow client applications via standardized interfaces. In this paper, we discuss the need for and the benefits of such an approach. The separation of workflow management system and application systems is emphasized, and the consequences that arise for the architecture of workflow oriented information systems. This includes an appropriate workflow terminology, and the definition of standard interfaces for workflow aware application systems. Workflow studies in various institutions have shown that most of the processes in radiology are well structured and suited for a workflow management approach. Numerous commercially available Workflow Management Systems (WfMS) were investigated, and some of them, which are process- oriented and application independent, appear suitable for use in radiology.
Accelerating Cancer Systems Biology Research through Semantic Web Technology

PubMed Central

Wang, Zhihui; Sagotsky, Jonathan; Taylor, Thomas; Shironoshita, Patrick; Deisboeck, Thomas S.

2012-01-01

Cancer systems biology is an interdisciplinary, rapidly expanding research field in which collaborations are a critical means to advance the field. Yet the prevalent database technologies often isolate data rather than making it easily accessible. The Semantic Web has the potential to help facilitate web-based collaborative cancer research by presenting data in a manner that is self-descriptive, human and machine readable, and easily sharable. We have created a semantically linked online Digital Model Repository (DMR) for storing, managing, executing, annotating, and sharing computational cancer models. Within the DMR, distributed, multidisciplinary, and inter-organizational teams can collaborate on projects, without forfeiting intellectual property. This is achieved by the introduction of a new stakeholder to the collaboration workflow, the institutional licensing officer, part of the Technology Transfer Office. Furthermore, the DMR has achieved silver level compatibility with the National Cancer Institute’s caBIG®, so users can not only interact with the DMR through a web browser but also through a semantically annotated and secure web service. We also discuss the technology behind the DMR leveraging the Semantic Web, ontologies, and grid computing to provide secure inter-institutional collaboration on cancer modeling projects, online grid-based execution of shared models, and the collaboration workflow protecting researchers’ intellectual property. PMID:23188758
Accelerating cancer systems biology research through Semantic Web technology.

PubMed

Wang, Zhihui; Sagotsky, Jonathan; Taylor, Thomas; Shironoshita, Patrick; Deisboeck, Thomas S

2013-01-01

Cancer systems biology is an interdisciplinary, rapidly expanding research field in which collaborations are a critical means to advance the field. Yet the prevalent database technologies often isolate data rather than making it easily accessible. The Semantic Web has the potential to help facilitate web-based collaborative cancer research by presenting data in a manner that is self-descriptive, human and machine readable, and easily sharable. We have created a semantically linked online Digital Model Repository (DMR) for storing, managing, executing, annotating, and sharing computational cancer models. Within the DMR, distributed, multidisciplinary, and inter-organizational teams can collaborate on projects, without forfeiting intellectual property. This is achieved by the introduction of a new stakeholder to the collaboration workflow, the institutional licensing officer, part of the Technology Transfer Office. Furthermore, the DMR has achieved silver level compatibility with the National Cancer Institute's caBIG, so users can interact with the DMR not only through a web browser but also through a semantically annotated and secure web service. We also discuss the technology behind the DMR leveraging the Semantic Web, ontologies, and grid computing to provide secure inter-institutional collaboration on cancer modeling projects, online grid-based execution of shared models, and the collaboration workflow protecting researchers' intellectual property. Copyright © 2012 Wiley Periodicals, Inc.
Flexible workflow sharing and execution services for e-scientists

NASA Astrophysics Data System (ADS)

Kacsuk, Péter; Terstyanszky, Gábor; Kiss, Tamas; Sipos, Gergely

2013-04-01

The sequence of computational and data manipulation steps required to perform a specific scientific analysis is called a workflow. Workflows that orchestrate data and/or compute intensive applications on Distributed Computing Infrastructures (DCIs) recently became standard tools in e-science. At the same time the broad and fragmented landscape of workflows and DCIs slows down the uptake of workflow-based work. The development, sharing, integration and execution of workflows is still a challenge for many scientists. The FP7 "Sharing Interoperable Workflow for Large-Scale Scientific Simulation on Available DCIs" (SHIWA) project significantly improved the situation, with a simulation platform that connects different workflow systems, different workflow languages, different DCIs and workflows into a single, interoperable unit. The SHIWA Simulation Platform is a service package, already used by various scientific communities, and used as a tool by the recently started ER-flow FP7 project to expand the use of workflows among European scientists. The presentation will introduce the SHIWA Simulation Platform and the services that ER-flow provides based on the platform to space and earth science researchers. The SHIWA Simulation Platform includes: 1. SHIWA Repository: A database where workflows and meta-data about workflows can be stored. The database is a central repository to discover and share workflows within and among communities . 2. SHIWA Portal: A web portal that is integrated with the SHIWA Repository and includes a workflow executor engine that can orchestrate various types of workflows on various grid and cloud platforms. 3. SHIWA Desktop: A desktop environment that provides similar access capabilities than the SHIWA Portal, however it runs on the users' desktops/laptops instead of a portal server. 4. Workflow engines: the ASKALON, Galaxy, GWES, Kepler, LONI Pipeline, MOTEUR, Pegasus, P-GRADE, ProActive, Triana, Taverna and WS-PGRADE workflow engines are already integrated with the execution engine of the SHIWA Portal. Other engines can be added when required. Through the SHIWA Portal one can define and run simulations on the SHIWA Virtual Organisation, an e-infrastructure that gathers computing and data resources from various DCIs, including the European Grid Infrastructure. The Portal via third party workflow engines provides support for the most widely used academic workflow engines and it can be extended with other engines on demand. Such extensions translate between workflow languages and facilitate the nesting of workflows into larger workflows even when those are written in different languages and require different interpreters for execution. Through the workflow repository and the portal lonely scientists and scientific collaborations can share and offer workflows for reuse and execution. Given the integrated nature of the SHIWA Simulation Platform the shared workflows can be executed online, without installing any special client environment and downloading workflows. The FP7 "Building a European Research Community through Interoperable Workflows and Data" (ER-flow) project disseminates the achievements of the SHIWA project and use these achievements to build workflow user communities across Europe. ER-flow provides application supports to research communities within and beyond the project consortium to develop, share and run workflows with the SHIWA Simulation Platform.

Land Cover Change Community-based Processing and Analysis System (LC-ComPS): Lessons Learned from Technology Infusion

NASA Astrophysics Data System (ADS)

Masek, J.; Rao, A.; Gao, F.; Davis, P.; Jackson, G.; Huang, C.; Weinstein, B.

2008-12-01

The Land Cover Change Community-based Processing and Analysis System (LC-ComPS) combines grid technology, existing science modules, and dynamic workflows to enable users to complete advanced land data processing on data available from local and distributed archives. Changes in land cover represent a direct link between human activities and the global environment, and in turn affect Earth's climate. Thus characterizing land cover change has become a major goal for Earth observation science. Many science algorithms exist to generate new products (e.g., surface reflectance, change detection) used to study land cover change. The overall objective of the LC-ComPS is to release a set of tools and services to the land science community that can be implemented as a flexible LC-ComPS to produce surface reflectance and land-cover change information with ground resolution on the order of Landsat-class instruments. This package includes software modules for pre-processing Landsat-type satellite imagery (calibration, atmospheric correction, orthorectification, precision registration, BRDF correction) for performing land-cover change analysis and includes pre-built workflow chains to automatically generate surface reflectance and land-cover change products based on user input. In order to meet the project objectives, the team created the infrastructure (i.e., client-server system with graphical and machine interfaces) to expand the use of these existing science algorithm capabilities in a community with distributed, large data archives and processing centers. Because of the distributed nature of the user community, grid technology was chosen to unite the dispersed community resources. At that time, grid computing was not used consistently and operationally within the Earth science research community. Therefore, there was a learning curve to configure and implement the underlying public key infrastructure (PKI) interfaces, required for the user authentication, secure file transfer and remote job execution on the grid network of machines. In addition, science support was needed to vet that the grid technology did not have any adverse affects of the science module outputs. Other open source, unproven technologies, such as a workflow package to manage jobs submitted by the user, were infused into the overall system with successful results. This presentation will discuss the basic capabilities of LC-ComPS, explain how the technology was infused, and provide lessons learned for using and integrating the various technologies while developing and operating the system, and finally outline plans moving forward (maintenance and operations decisions) based on the experience to date.
Research and Implementation of Key Technologies in Multi-Agent System to Support Distributed Workflow

NASA Astrophysics Data System (ADS)

Pan, Tianheng

2018-01-01

In recent years, the combination of workflow management system and Multi-agent technology is a hot research field. The problem of lack of flexibility in workflow management system can be improved by introducing multi-agent collaborative management. The workflow management system adopts distributed structure. It solves the problem that the traditional centralized workflow structure is fragile. In this paper, the agent of Distributed workflow management system is divided according to its function. The execution process of each type of agent is analyzed. The key technologies such as process execution and resource management are analyzed.
Assembling Large, Multi-Sensor Climate Datasets Using the SciFlo Grid Workflow System

NASA Astrophysics Data System (ADS)

Wilson, B. D.; Manipon, G.; Xing, Z.; Fetzer, E.

2008-12-01

NASA's Earth Observing System (EOS) is the world's most ambitious facility for studying global climate change. The mandate now is to combine measurements from the instruments on the A-Train platforms (AIRS, AMSR-E, MODIS, MISR, MLS, and CloudSat) and other Earth probes to enable large-scale studies of climate change over periods of years to decades. However, moving from predominantly single-instrument studies to a multi-sensor, measurement-based model for long-duration analysis of important climate variables presents serious challenges for large-scale data mining and data fusion. For example, one might want to compare temperature and water vapor retrievals from one instrument (AIRS) to another instrument (MODIS), and to a model (ECMWF), stratify the comparisons using a classification of the cloud scenes from CloudSat, and repeat the entire analysis over years of AIRS data. To perform such an analysis, one must discover & access multiple datasets from remote sites, find the space/time matchups between instruments swaths and model grids, understand the quality flags and uncertainties for retrieved physical variables, and assemble merged datasets for further scientific and statistical analysis. To meet these large-scale challenges, we are utilizing a Grid computing and dataflow framework, named SciFlo, in which we are deploying a set of versatile and reusable operators for data query, access, subsetting, co-registration, mining, fusion, and advanced statistical analysis. SciFlo is a semantically-enabled ("smart") Grid Workflow system that ties together a peer-to-peer network of computers into an efficient engine for distributed computation. The SciFlo workflow engine enables scientists to do multi-instrument Earth Science by assembling remotely-invokable Web Services (SOAP or http GET URLs), native executables, command-line scripts, and Python codes into a distributed computing flow. A scientist visually authors the graph of operation in the VizFlow GUI, or uses a text editor to modify the simple XML workflow documents. The SciFlo client & server engines optimize the execution of such distributed workflows and allow the user to transparently find and use datasets and operators without worrying about the actual location of the Grid resources. The engine transparently moves data to the operators, and moves operators to the data (on the dozen trusted SciFlo nodes). SciFlo also deploys a variety of Data Grid services to: query datasets in space and time, locate & retrieve on-line data granules, provide on-the-fly variable and spatial subsetting, and perform pairwise instrument matchups for A-Train datasets. These services are combined into efficient workflows to assemble the desired large-scale, merged climate datasets. SciFlo is currently being applied in several large climate studies: comparisons of aerosol optical depth between MODIS, MISR, AERONET ground network, and U. Michigan's IMPACT aerosol transport model; characterization of long-term biases in microwave and infrared instruments (AIRS, MLS) by comparisons to GPS temperature retrievals accurate to 0.1 degrees Kelvin; and construction of a decade-long, multi-sensor water vapor climatology stratified by classified cloud scene by bringing together datasets from AIRS/AMSU, AMSR-E, MLS, MODIS, and CloudSat (NASA MEASUREs grant, Fetzer PI). The presentation will discuss the SciFlo technologies, their application in these distributed workflows, and the many challenges encountered in assembling and analyzing these massive datasets.
Assembling Large, Multi-Sensor Climate Datasets Using the SciFlo Grid Workflow System

NASA Astrophysics Data System (ADS)

Wilson, B.; Manipon, G.; Xing, Z.; Fetzer, E.

2009-04-01

NASA's Earth Observing System (EOS) is an ambitious facility for studying global climate change. The mandate now is to combine measurements from the instruments on the "A-Train" platforms (AIRS, AMSR-E, MODIS, MISR, MLS, and CloudSat) and other Earth probes to enable large-scale studies of climate change over periods of years to decades. However, moving from predominantly single-instrument studies to a multi-sensor, measurement-based model for long-duration analysis of important climate variables presents serious challenges for large-scale data mining and data fusion. For example, one might want to compare temperature and water vapor retrievals from one instrument (AIRS) to another instrument (MODIS), and to a model (ECMWF), stratify the comparisons using a classification of the "cloud scenes" from CloudSat, and repeat the entire analysis over years of AIRS data. To perform such an analysis, one must discover & access multiple datasets from remote sites, find the space/time "matchups" between instruments swaths and model grids, understand the quality flags and uncertainties for retrieved physical variables, assemble merged datasets, and compute fused products for further scientific and statistical analysis. To meet these large-scale challenges, we are utilizing a Grid computing and dataflow framework, named SciFlo, in which we are deploying a set of versatile and reusable operators for data query, access, subsetting, co-registration, mining, fusion, and advanced statistical analysis. SciFlo is a semantically-enabled ("smart") Grid Workflow system that ties together a peer-to-peer network of computers into an efficient engine for distributed computation. The SciFlo workflow engine enables scientists to do multi-instrument Earth Science by assembling remotely-invokable Web Services (SOAP or http GET URLs), native executables, command-line scripts, and Python codes into a distributed computing flow. A scientist visually authors the graph of operation in the VizFlow GUI, or uses a text editor to modify the simple XML workflow documents. The SciFlo client & server engines optimize the execution of such distributed workflows and allow the user to transparently find and use datasets and operators without worrying about the actual location of the Grid resources. The engine transparently moves data to the operators, and moves operators to the data (on the dozen trusted SciFlo nodes). SciFlo also deploys a variety of Data Grid services to: query datasets in space and time, locate & retrieve on-line data granules, provide on-the-fly variable and spatial subsetting, perform pairwise instrument matchups for A-Train datasets, and compute fused products. These services are combined into efficient workflows to assemble the desired large-scale, merged climate datasets. SciFlo is currently being applied in several large climate studies: comparisons of aerosol optical depth between MODIS, MISR, AERONET ground network, and U. Michigan's IMPACT aerosol transport model; characterization of long-term biases in microwave and infrared instruments (AIRS, MLS) by comparisons to GPS temperature retrievals accurate to 0.1 degrees Kelvin; and construction of a decade-long, multi-sensor water vapor climatology stratified by classified cloud scene by bringing together datasets from AIRS/AMSU, AMSR-E, MLS, MODIS, and CloudSat (NASA MEASUREs grant, Fetzer PI). The presentation will discuss the SciFlo technologies, their application in these distributed workflows, and the many challenges encountered in assembling and analyzing these massive datasets.
Low Latency Workflow Scheduling and an Application of Hyperspectral Brightness Temperatures

NASA Astrophysics Data System (ADS)

Nguyen, P. T.; Chapman, D. R.; Halem, M.

2012-12-01

New system analytics for Big Data computing holds the promise of major scientific breakthroughs and discoveries from the exploration and mining of the massive data sets becoming available to the science community. However, such data intensive scientific applications face severe challenges in accessing, managing and analyzing petabytes of data. While the Hadoop MapReduce environment has been successfully applied to data intensive problems arising in business, there are still many scientific problem domains where limitations in the functionality of MapReduce systems prevent its wide adoption by those communities. This is mainly because MapReduce does not readily support the unique science discipline needs such as special science data formats, graphic and computational data analysis tools, maintaining high degrees of computational accuracies, and interfacing with application's existing components across heterogeneous computing processors. We address some of these limitations by exploiting the MapReduce programming model for satellite data intensive scientific problems and address scalability, reliability, scheduling, and data management issues when dealing with climate data records and their complex observational challenges. In addition, we will present techniques to support the unique Earth science discipline needs such as dealing with special science data formats (HDF and NetCDF). We have developed a Hadoop task scheduling algorithm that improves latency by 2x for a scientific workflow including the gridding of the EOS AIRS hyperspectral Brightness Temperatures (BT). This workflow processing algorithm has been tested at the Multicore Computing Center private Hadoop based Intel Nehalem cluster, as well as in a virtual mode under the Open Source Eucalyptus cloud. The 55TB AIRS hyperspectral L1b Brightness Temperature record has been gridded at the resolution of 0.5x1.0 degrees, and we have computed a 0.9 annual anti-correlation to the El Nino Southern oscillation in the Nino 4 region, as well as a 1.9 Kelvin decadal Arctic warming in the 4u and 12u spectral regions. Additionally, we will present the frequency of extreme global warming events by the use of a normalized maximum BT in a grid cell relative to its local standard deviation. A low-latency Hadoop scheduling environment maintains data integrity and fault tolerance in a MapReduce data intensive Cloud environment while improving the "time to solution" metric by 35% when compared to a more traditional parallel processing system for the same dataset. Our next step will be to improve the usability of our Hadoop task scheduling system, to enable rapid prototyping of data intensive experiments by means of processing "kernels". We will report on the performance and experience of implementing these experiments on the NEX testbed, and propose the use of a graphical directed acyclic graph (DAG) interface to help us develop on-demand scientific experiments. Our workflow system works within Hadoop infrastructure as a replacement for the FIFO or FairScheduler, thus the use of Apache "Pig" latin or other Apache tools may also be worth investigating on the NEX system to improve the usability of our workflow scheduling infrastructure for rapid experimentation.
Agile parallel bioinformatics workflow management using Pwrake.

PubMed

Mishima, Hiroyuki; Sasaki, Kensaku; Tanaka, Masahiro; Tatebe, Osamu; Yoshiura, Koh-Ichiro

2011-09-08

In bioinformatics projects, scientific workflow systems are widely used to manage computational procedures. Full-featured workflow systems have been proposed to fulfil the demand for workflow management. However, such systems tend to be over-weighted for actual bioinformatics practices. We realize that quick deployment of cutting-edge software implementing advanced algorithms and data formats, and continuous adaptation to changes in computational resources and the environment are often prioritized in scientific workflow management. These features have a greater affinity with the agile software development method through iterative development phases after trial and error.Here, we show the application of a scientific workflow system Pwrake to bioinformatics workflows. Pwrake is a parallel workflow extension of Ruby's standard build tool Rake, the flexibility of which has been demonstrated in the astronomy domain. Therefore, we hypothesize that Pwrake also has advantages in actual bioinformatics workflows. We implemented the Pwrake workflows to process next generation sequencing data using the Genomic Analysis Toolkit (GATK) and Dindel. GATK and Dindel workflows are typical examples of sequential and parallel workflows, respectively. We found that in practice, actual scientific workflow development iterates over two phases, the workflow definition phase and the parameter adjustment phase. We introduced separate workflow definitions to help focus on each of the two developmental phases, as well as helper methods to simplify the descriptions. This approach increased iterative development efficiency. Moreover, we implemented combined workflows to demonstrate modularity of the GATK and Dindel workflows. Pwrake enables agile management of scientific workflows in the bioinformatics domain. The internal domain specific language design built on Ruby gives the flexibility of rakefiles for writing scientific workflows. Furthermore, readability and maintainability of rakefiles may facilitate sharing workflows among the scientific community. Workflows for GATK and Dindel are available at http://github.com/misshie/Workflows.
Agile parallel bioinformatics workflow management using Pwrake

PubMed Central

2011-01-01

Background In bioinformatics projects, scientific workflow systems are widely used to manage computational procedures. Full-featured workflow systems have been proposed to fulfil the demand for workflow management. However, such systems tend to be over-weighted for actual bioinformatics practices. We realize that quick deployment of cutting-edge software implementing advanced algorithms and data formats, and continuous adaptation to changes in computational resources and the environment are often prioritized in scientific workflow management. These features have a greater affinity with the agile software development method through iterative development phases after trial and error. Here, we show the application of a scientific workflow system Pwrake to bioinformatics workflows. Pwrake is a parallel workflow extension of Ruby's standard build tool Rake, the flexibility of which has been demonstrated in the astronomy domain. Therefore, we hypothesize that Pwrake also has advantages in actual bioinformatics workflows. Findings We implemented the Pwrake workflows to process next generation sequencing data using the Genomic Analysis Toolkit (GATK) and Dindel. GATK and Dindel workflows are typical examples of sequential and parallel workflows, respectively. We found that in practice, actual scientific workflow development iterates over two phases, the workflow definition phase and the parameter adjustment phase. We introduced separate workflow definitions to help focus on each of the two developmental phases, as well as helper methods to simplify the descriptions. This approach increased iterative development efficiency. Moreover, we implemented combined workflows to demonstrate modularity of the GATK and Dindel workflows. Conclusions Pwrake enables agile management of scientific workflows in the bioinformatics domain. The internal domain specific language design built on Ruby gives the flexibility of rakefiles for writing scientific workflows. Furthermore, readability and maintainability of rakefiles may facilitate sharing workflows among the scientific community. Workflows for GATK and Dindel are available at http://github.com/misshie/Workflows. PMID:21899774
GUEST EDITOR'S INTRODUCTION: Guest Editor's introduction

NASA Astrophysics Data System (ADS)

Chrysanthis, Panos K.

1996-12-01

Computer Science Department, University of Pittsburgh, Pittsburgh, PA 15260, USA This special issue focuses on current efforts to represent and support workflows that integrate information systems and human resources within a business or manufacturing enterprise. Workflows may also be viewed as an emerging computational paradigm for effective structuring of cooperative applications involving human users and access to diverse data types not necessarily maintained by traditional database management systems. A workflow is an automated organizational process (also called business process) which consists of a set of activities or tasks that need to be executed in a particular controlled order over a combination of heterogeneous database systems and legacy systems. Within workflows, tasks are performed cooperatively by either human or computational agents in accordance with their roles in the organizational hierarchy. The challenge in facilitating the implementation of workflows lies in developing efficient workflow management systems. A workflow management system (also called workflow server, workflow engine or workflow enactment system) provides the necessary interfaces for coordination and communication among human and computational agents to execute the tasks involved in a workflow and controls the execution orderings of tasks as well as the flow of data that these tasks manipulate. That is, the workflow management system is responsible for correctly and reliably supporting the specification, execution, and monitoring of workflows. The six papers selected (out of the twenty-seven submitted for this special issue of Distributed Systems Engineering) address different aspects of these three functional components of a workflow management system. In the first paper, `Correctness issues in workflow management', Kamath and Ramamritham discuss the important issue of correctness in workflow management that constitutes a prerequisite for the use of workflows in the automation of the critical organizational/business processes. In particular, this paper examines the issues of execution atomicity and failure atomicity, differentiating between correctness requirements of system failures and logical failures, and surveys techniques that can be used to ensure data consistency in workflow management systems. While the first paper is concerned with correctness assuming transactional workflows in which selective transactional properties are associated with individual tasks or the entire workflow, the second paper, `Scheduling workflows by enforcing intertask dependencies' by Attie et al, assumes that the tasks can be either transactions or other activities involving legacy systems. This second paper describes the modelling and specification of conditions involving events and dependencies among tasks within a workflow using temporal logic and finite state automata. It also presents a scheduling algorithm that enforces all stated dependencies by executing at any given time only those events that are allowed by all the dependency automata and in an order as specified by the dependencies. In any system with decentralized control, there is a need to effectively cope with the tension that exists between autonomy and consistency requirements. In `A three-level atomicity model for decentralized workflow management systems', Ben-Shaul and Heineman focus on the specific requirement of enforcing failure atomicity in decentralized, autonomous and interacting workflow management systems. Their paper describes a model in which each workflow manager must be able to specify the sequence of tasks that comprise an atomic unit for the purposes of correctness, and the degrees of local and global atomicity for the purpose of cooperation with other workflow managers. The paper also discusses a realization of this model in which treaties and summits provide an agreement mechanism, while underlying transaction managers are responsible for maintaining failure atomicity. The fourth and fifth papers are experience papers describing a workflow management system and a large scale workflow application, respectively. Schill and Mittasch, in `Workflow management systems on top of OSF DCE and OMG CORBA', describe a decentralized workflow management system and discuss its implementation using two standardized middleware platforms, namely, OSF DCE and OMG CORBA. The system supports a new approach to workflow management, introducing several new concepts such as data type management for integrating various types of data and quality of service for various services provided by servers. A problem common to both database applications and workflows is the handling of missing and incomplete information. This is particularly pervasive in an `electronic market' with a huge number of retail outlets producing and exchanging volumes of data, the application discussed in `Information flow in the DAMA project beyond database managers: information flow managers'. Motivated by the need for a method that allows a task to proceed in a timely manner if not all data produced by other tasks are available by its deadline, Russell et al propose an architectural framework and a language that can be used to detect, approximate and, later on, to adjust missing data if necessary. The final paper, `The evolution towards flexible workflow systems' by Nutt, is complementary to the other papers and is a survey of issues and of work related to both workflow and computer supported collaborative work (CSCW) areas. In particular, the paper provides a model and a categorization of the dimensions which workflow management and CSCW systems share. Besides summarizing the recent advancements towards efficient workflow management, the papers in this special issue suggest areas open to investigation and it is our hope that they will also provide the stimulus for further research and development in the area of workflow management systems.
Auspice: Automatic Service Planning in Cloud/Grid Environments

NASA Astrophysics Data System (ADS)

Chiu, David; Agrawal, Gagan

Recent scientific advances have fostered a mounting number of services and data sets available for utilization. These resources, though scattered across disparate locations, are often loosely coupled both semantically and operationally. This loosely coupled relationship implies the possibility of linking together operations and data sets to answer queries. This task, generally known as automatic service composition, therefore abstracts the process of complex scientific workflow planning from the user. We have been exploring a metadata-driven approach toward automatic service workflow composition, among other enabling mechanisms, in our system, Auspice: Automatic Service Planning in Cloud/Grid Environments. In this paper, we present a complete overview of our system's unique features and outlooks for future deployment as the Cloud computing paradigm becomes increasingly eminent in enabling scientific computing.
A characterization of workflow management systems for extreme-scale applications

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ferreira da Silva, Rafael; Filgueira, Rosa; Pietri, Ilia

We present that the automation of the execution of computational tasks is at the heart of improving scientific productivity. Over the last years, scientific workflows have been established as an important abstraction that captures data processing and computation of large and complex scientific applications. By allowing scientists to model and express entire data processing steps and their dependencies, workflow management systems relieve scientists from the details of an application and manage its execution on a computational infrastructure. As the resource requirements of today’s computational and data science applications that process vast amounts of data keep increasing, there is a compellingmore » case for a new generation of advances in high-performance computing, commonly termed as extreme-scale computing, which will bring forth multiple challenges for the design of workflow applications and management systems. This paper presents a novel characterization of workflow management systems using features commonly associated with extreme-scale computing applications. We classify 15 popular workflow management systems in terms of workflow execution models, heterogeneous computing environments, and data access methods. Finally, the paper also surveys workflow applications and identifies gaps for future research on the road to extreme-scale workflows and management systems.« less
A characterization of workflow management systems for extreme-scale applications

DOE PAGES

Ferreira da Silva, Rafael; Filgueira, Rosa; Pietri, Ilia; ...

2017-02-16

We present that the automation of the execution of computational tasks is at the heart of improving scientific productivity. Over the last years, scientific workflows have been established as an important abstraction that captures data processing and computation of large and complex scientific applications. By allowing scientists to model and express entire data processing steps and their dependencies, workflow management systems relieve scientists from the details of an application and manage its execution on a computational infrastructure. As the resource requirements of today’s computational and data science applications that process vast amounts of data keep increasing, there is a compellingmore » case for a new generation of advances in high-performance computing, commonly termed as extreme-scale computing, which will bring forth multiple challenges for the design of workflow applications and management systems. This paper presents a novel characterization of workflow management systems using features commonly associated with extreme-scale computing applications. We classify 15 popular workflow management systems in terms of workflow execution models, heterogeneous computing environments, and data access methods. Finally, the paper also surveys workflow applications and identifies gaps for future research on the road to extreme-scale workflows and management systems.« less
Synchrotron Imaging Computations on the Grid without the Computing Element

NASA Astrophysics Data System (ADS)

Curri, A.; Pugliese, R.; Borghes, R.; Kourousias, G.

2011-12-01

Besides the heavy use of the Grid in the Synchrotron Radiation Facility (SRF) Elettra, additional special requirements from the beamlines had to be satisfied through a novel solution that we present in this work. In the traditional Grid Computing paradigm the computations are performed on the Worker Nodes of the grid element known as the Computing Element. A Grid middleware extension that our team has been working on, is that of the Instrument Element. In general it is used to Grid-enable instrumentation; and it can be seen as a neighbouring concept to that of the traditional Control Systems. As a further extension we demonstrate the Instrument Element as the steering mechanism for a series of computations. In our deployment it interfaces a Control System that manages a series of computational demanding Scientific Imaging tasks in an online manner. The instrument control in Elettra is done through a suitable Distributed Control System, a common approach in the SRF community. The applications that we present are for a beamline working in medical imaging. The solution resulted to a substantial improvement of a Computed Tomography workflow. The near-real-time requirements could not have been easily satisfied from our Grid's middleware (gLite) due to the various latencies often occurred during the job submission and queuing phases. Moreover the required deployment of a set of TANGO devices could not have been done in a standard gLite WN. Besides the avoidance of certain core Grid components, the Grid Security infrastructure has been utilised in the final solution.
Differentiated protection services with failure probability guarantee for workflow-based applications

NASA Astrophysics Data System (ADS)

Zhong, Yaoquan; Guo, Wei; Jin, Yaohui; Sun, Weiqiang; Hu, Weisheng

2010-12-01

A cost-effective and service-differentiated provisioning strategy is very desirable to service providers so that they can offer users satisfactory services, while optimizing network resource allocation. Providing differentiated protection services to connections for surviving link failure has been extensively studied in recent years. However, the differentiated protection services for workflow-based applications, which consist of many interdependent tasks, have scarcely been studied. This paper investigates the problem of providing differentiated services for workflow-based applications in optical grid. In this paper, we develop three differentiated protection services provisioning strategies which can provide security level guarantee and network-resource optimization for workflow-based applications. The simulation demonstrates that these heuristic algorithms provide protection cost-effectively while satisfying the applications' failure probability requirements.
The ATLAS Production System Evolution: New Data Processing and Analysis Paradigm for the LHC Run2 and High-Luminosity

NASA Astrophysics Data System (ADS)

Barreiro, F. H.; Borodin, M.; De, K.; Golubkov, D.; Klimentov, A.; Maeno, T.; Mashinistov, R.; Padolski, S.; Wenaus, T.; ATLAS Collaboration

2017-10-01

The second generation of the ATLAS Production System called ProdSys2 is a distributed workload manager that runs daily hundreds of thousands of jobs, from dozens of different ATLAS specific workflows, across more than hundred heterogeneous sites. It achieves high utilization by combining dynamic job definition based on many criteria, such as input and output size, memory requirements and CPU consumption, with manageable scheduling policies and by supporting different kind of computational resources, such as GRID, clouds, supercomputers and volunteer-computers. The system dynamically assigns a group of jobs (task) to a group of geographically distributed computing resources. Dynamic assignment and resources utilization is one of the major features of the system, it didn’t exist in the earliest versions of the production system where Grid resources topology was predefined using national or/and geographical pattern. Production System has a sophisticated job fault-recovery mechanism, which efficiently allows to run multi-Terabyte tasks without human intervention. We have implemented “train” model and open-ended production which allow to submit tasks automatically as soon as new set of data is available and to chain physics groups data processing and analysis with central production by the experiment. We present an overview of the ATLAS Production System and its major components features and architecture: task definition, web user interface and monitoring. We describe the important design decisions and lessons learned from an operational experience during the first year of LHC Run2. We also report the performance of the designed system and how various workflows, such as data (re)processing, Monte-Carlo and physics group production, users analysis, are scheduled and executed within one production system on heterogeneous computing resources.
The swiss army knife of job submission tools: grid-control

NASA Astrophysics Data System (ADS)

Stober, F.; Fischer, M.; Schleper, P.; Stadie, H.; Garbers, C.; Lange, J.; Kovalchuk, N.

2017-10-01

grid-control is a lightweight and highly portable open source submission tool that supports all common workflows in high energy physics (HEP). It has been used by a sizeable number of HEP analyses to process tasks that sometimes consist of up to 100k jobs. grid-control is built around a powerful plugin and configuration system, that allows users to easily specify all aspects of the desired workflow. Job submission to a wide range of local or remote batch systems or grid middleware is supported. Tasks can be conveniently specified through the parameter space that will be processed, which can consist of any number of variables and data sources with complex dependencies on each other. Dataset information is processed through a configurable pipeline of dataset filters, partition plugins and partition filters. The partition plugins can take the number of files, size of the work units, metadata or combinations thereof into account. All changes to the input datasets or variables are propagated through the processing pipeline and can transparently trigger adjustments to the parameter space and the job submission. While the core functionality is completely experiment independent, full integration with the CMS computing environment is provided by a small set of plugins.
Bioinformatics workflows and web services in systems biology made easy for experimentalists.

PubMed

Jimenez, Rafael C; Corpas, Manuel

2013-01-01

Workflows are useful to perform data analysis and integration in systems biology. Workflow management systems can help users create workflows without any previous knowledge in programming and web services. However the computational skills required to build such workflows are usually above the level most biological experimentalists are comfortable with. In this chapter we introduce workflow management systems that reuse existing workflows instead of creating them, making it easier for experimentalists to perform computational tasks.
Radiology information system: a workflow-based approach.

PubMed

Zhang, Jinyan; Lu, Xudong; Nie, Hongchao; Huang, Zhengxing; van der Aalst, W M P

2009-09-01

Introducing workflow management technology in healthcare seems to be prospective in dealing with the problem that the current healthcare Information Systems cannot provide sufficient support for the process management, although several challenges still exist. The purpose of this paper is to study the method of developing workflow-based information system in radiology department as a use case. First, a workflow model of typical radiology process was established. Second, based on the model, the system could be designed and implemented as a group of loosely coupled components. Each component corresponded to one task in the process and could be assembled by the workflow management system. The legacy systems could be taken as special components, which also corresponded to the tasks and were integrated through transferring non-work- flow-aware interfaces to the standard ones. Finally, a workflow dashboard was designed and implemented to provide an integral view of radiology processes. The workflow-based Radiology Information System was deployed in the radiology department of Zhejiang Chinese Medicine Hospital in China. The results showed that it could be adjusted flexibly in response to the needs of changing process, and enhance the process management in the department. It can also provide a more workflow-aware integration method, comparing with other methods such as IHE-based ones. The workflow-based approach is a new method of developing radiology information system with more flexibility, more functionalities of process management and more workflow-aware integration. The work of this paper is an initial endeavor for introducing workflow management technology in healthcare.
VOMS/VOMRS utilization patterns and convergence plan

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ceccanti, A.; /INFN, CNAF; Ciaschini, V.

2010-01-01

The Grid community uses two well-established registration services, which allow users to be authenticated under the auspices of Virtual Organizations (VOs). The Virtual Organization Membership Service (VOMS), developed in the context of the Enabling Grid for E-sciencE (EGEE) project, is an Attribute Authority service that issues attributes expressing membership information of a subject within a VO. VOMS allows to partition users in groups, assign them roles and free-form attributes which are then used to drive authorization decisions. The VOMS administrative application, VOMS-Admin, manages and populates the VOMS database with membership information. The Virtual Organization Management Registration Service (VOMRS), developed atmore » Fermilab, extends the basic registration and management functionalities present in VOMS-Admin. It implements a registration workflow that requires VO usage policy acceptance and membership approval by administrators. VOMRS supports management of multiple grid certificates, and handling users' request for group and role assignments, and membership status. VOMRS is capable of interfacing to local systems with personnel information (e.g. the CERN Human Resource Database) and of pulling relevant member information from them. VOMRS synchronizes the relevant subset of information with VOMS. The recent development of new features in VOMS-Admin raises the possibility of rationalizing the support and converging on a single solution by continuing and extending existing collaborations between EGEE and OSG. Such strategy is supported by WLCG, OSG, US CMS, US Atlas, and other stakeholders worldwide. In this paper, we will analyze features in use by major experiments and the use cases for registration addressed by the mature single solution.« less
VOMS/VOMRS utilization patterns and convergence plan

NASA Astrophysics Data System (ADS)

Ceccanti, A.; Ciaschini, V.; Dimou, M.; Garzoglio, G.; Levshina, T.; Traylen, S.; Venturi, V.

2010-04-01

The Grid community uses two well-established registration services, which allow users to be authenticated under the auspices of Virtual Organizations (VOs). The Virtual Organization Membership Service (VOMS), developed in the context of the Enabling Grid for E-sciencE (EGEE) project, is an Attribute Authority service that issues attributes expressing membership information of a subject within a VO. VOMS allows to partition users in groups, assign them roles and free-form attributes which are then used to drive authorization decisions. The VOMS administrative application, VOMS-Admin, manages and populates the VOMS database with membership information. The Virtual Organization Management Registration Service (VOMRS), developed at Fermilab, extends the basic registration and management functionalities present in VOMS-Admin. It implements a registration workflow that requires VO usage policy acceptance and membership approval by administrators. VOMRS supports management of multiple grid certificates, and handling users' request for group and role assignments, and membership status. VOMRS is capable of interfacing to local systems with personnel information (e.g. the CERN Human Resource Database) and of pulling relevant member information from them. VOMRS synchronizes the relevant subset of information with VOMS. The recent development of new features in VOMS-Admin raises the possibility of rationalizing the support and converging on a single solution by continuing and extending existing collaborations between EGEE and OSG. Such strategy is supported by WLCG, OSG, US CMS, US Atlas, and other stakeholders worldwide. In this paper, we will analyze features in use by major experiments and the use cases for registration addressed by the mature single solution.
Progress in digital color workflow understanding in the International Color Consortium (ICC) Workflow WG

NASA Astrophysics Data System (ADS)

McCarthy, Ann

2006-01-01

The ICC Workflow WG serves as the bridge between ICC color management technologies and use of those technologies in real world color production applications. ICC color management is applicable to and is used in a wide range of color systems, from highly specialized digital cinema color special effects to high volume publications printing to home photography. The ICC Workflow WG works to align ICC technologies so that the color management needs of these diverse use case systems are addressed in an open, platform independent manner. This report provides a high level summary of the ICC Workflow WG objectives and work to date, focusing on the ways in which workflow can impact image quality and color systems performance. The 'ICC Workflow Primitives' and 'ICC Workflow Patterns and Dimensions' workflow models are covered in some detail. Consider the questions, "How much of dissatisfaction with color management today is the result of 'the wrong color transformation at the wrong time' and 'I can't get to the right conversion at the right point in my work process'?" Put another way, consider how image quality through a workflow can be negatively affected when the coordination and control level of the color management system is not sufficient.

Reproducible Large-Scale Neuroimaging Studies with the OpenMOLE Workflow Management System.

PubMed

Passerat-Palmbach, Jonathan; Reuillon, Romain; Leclaire, Mathieu; Makropoulos, Antonios; Robinson, Emma C; Parisot, Sarah; Rueckert, Daniel

2017-01-01

OpenMOLE is a scientific workflow engine with a strong emphasis on workload distribution. Workflows are designed using a high level Domain Specific Language (DSL) built on top of Scala. It exposes natural parallelism constructs to easily delegate the workload resulting from a workflow to a wide range of distributed computing environments. OpenMOLE hides the complexity of designing complex experiments thanks to its DSL. Users can embed their own applications and scale their pipelines from a small prototype running on their desktop computer to a large-scale study harnessing distributed computing infrastructures, simply by changing a single line in the pipeline definition. The construction of the pipeline itself is decoupled from the execution context. The high-level DSL abstracts the underlying execution environment, contrary to classic shell-script based pipelines. These two aspects allow pipelines to be shared and studies to be replicated across different computing environments. Workflows can be run as traditional batch pipelines or coupled with OpenMOLE's advanced exploration methods in order to study the behavior of an application, or perform automatic parameter tuning. In this work, we briefly present the strong assets of OpenMOLE and detail recent improvements targeting re-executability of workflows across various Linux platforms. We have tightly coupled OpenMOLE with CARE, a standalone containerization solution that allows re-executing on a Linux host any application that has been packaged on another Linux host previously. The solution is evaluated against a Python-based pipeline involving packages such as scikit-learn as well as binary dependencies. All were packaged and re-executed successfully on various HPC environments, with identical numerical results (here prediction scores) obtained on each environment. Our results show that the pair formed by OpenMOLE and CARE is a reliable solution to generate reproducible results and re-executable pipelines. A demonstration of the flexibility of our solution showcases three neuroimaging pipelines harnessing distributed computing environments as heterogeneous as local clusters or the European Grid Infrastructure (EGI).
Reproducible Large-Scale Neuroimaging Studies with the OpenMOLE Workflow Management System

PubMed Central

Passerat-Palmbach, Jonathan; Reuillon, Romain; Leclaire, Mathieu; Makropoulos, Antonios; Robinson, Emma C.; Parisot, Sarah; Rueckert, Daniel

2017-01-01

OpenMOLE is a scientific workflow engine with a strong emphasis on workload distribution. Workflows are designed using a high level Domain Specific Language (DSL) built on top of Scala. It exposes natural parallelism constructs to easily delegate the workload resulting from a workflow to a wide range of distributed computing environments. OpenMOLE hides the complexity of designing complex experiments thanks to its DSL. Users can embed their own applications and scale their pipelines from a small prototype running on their desktop computer to a large-scale study harnessing distributed computing infrastructures, simply by changing a single line in the pipeline definition. The construction of the pipeline itself is decoupled from the execution context. The high-level DSL abstracts the underlying execution environment, contrary to classic shell-script based pipelines. These two aspects allow pipelines to be shared and studies to be replicated across different computing environments. Workflows can be run as traditional batch pipelines or coupled with OpenMOLE's advanced exploration methods in order to study the behavior of an application, or perform automatic parameter tuning. In this work, we briefly present the strong assets of OpenMOLE and detail recent improvements targeting re-executability of workflows across various Linux platforms. We have tightly coupled OpenMOLE with CARE, a standalone containerization solution that allows re-executing on a Linux host any application that has been packaged on another Linux host previously. The solution is evaluated against a Python-based pipeline involving packages such as scikit-learn as well as binary dependencies. All were packaged and re-executed successfully on various HPC environments, with identical numerical results (here prediction scores) obtained on each environment. Our results show that the pair formed by OpenMOLE and CARE is a reliable solution to generate reproducible results and re-executable pipelines. A demonstration of the flexibility of our solution showcases three neuroimaging pipelines harnessing distributed computing environments as heterogeneous as local clusters or the European Grid Infrastructure (EGI). PMID:28381997
Managing large-scale workflow execution from resource provisioning to provenance tracking: The CyberShake example

USGS Publications Warehouse

Deelman, E.; Callaghan, S.; Field, E.; Francoeur, H.; Graves, R.; Gupta, N.; Gupta, V.; Jordan, T.H.; Kesselman, C.; Maechling, P.; Mehringer, J.; Mehta, G.; Okaya, D.; Vahi, K.; Zhao, L.

2006-01-01

This paper discusses the process of building an environment where large-scale, complex, scientific analysis can be scheduled onto a heterogeneous collection of computational and storage resources. The example application is the Southern California Earthquake Center (SCEC) CyberShake project, an analysis designed to compute probabilistic seismic hazard curves for sites in the Los Angeles area. We explain which software tools were used to build to the system, describe their functionality and interactions. We show the results of running the CyberShake analysis that included over 250,000 jobs using resources available through SCEC and the TeraGrid. ?? 2006 IEEE.
Distributed Trust Management for Validating SLA Choreographies

NASA Astrophysics Data System (ADS)

Haq, Irfan Ul; Alnemr, Rehab; Paschke, Adrian; Schikuta, Erich; Boley, Harold; Meinel, Christoph

For business workflow automation in a service-enriched environment such as a grid or a cloud, services scattered across heterogeneous Virtual Organizations (VOs) can be aggregated in a producer-consumer manner, building hierarchical structures of added value. In order to preserve the supply chain, the Service Level Agreements (SLAs) corresponding to the underlying choreography of services should also be incrementally aggregated. This cross-VO hierarchical SLA aggregation requires validation, for which a distributed trust system becomes a prerequisite. Elaborating our previous work on rule-based SLA validation, we propose a hybrid distributed trust model. This new model is based on Public Key Infrastructure (PKI) and reputation-based trust systems. It helps preventing SLA violations by identifying violation-prone services at service selection stage and actively contributes in breach management at the time of penalty enforcement.
A three-level atomicity model for decentralized workflow management systems

NASA Astrophysics Data System (ADS)

Ben-Shaul, Israel Z.; Heineman, George T.

1996-12-01

A workflow management system (WFMS) employs a workflow manager (WM) to execute and automate the various activities within a workflow. To protect the consistency of data, the WM encapsulates each activity with a transaction; a transaction manager (TM) then guarantees the atomicity of activities. Since workflows often group several activities together, the TM is responsible for guaranteeing the atomicity of these units. There are scalability issues, however, with centralized WFMSs. Decentralized WFMSs provide an architecture for multiple autonomous WFMSs to interoperate, thus accommodating multiple workflows and geographically-dispersed teams. When atomic units are composed of activities spread across multiple WFMSs, however, there is a conflict between global atomicity and local autonomy of each WFMS. This paper describes a decentralized atomicity model that enables workflow administrators to specify the scope of multi-site atomicity based upon the desired semantics of multi-site tasks in the decentralized WFMS. We describe an architecture that realizes our model and execution paradigm.
Generic worklist handler for workflow-enabled products

NASA Astrophysics Data System (ADS)

Schmidt, Joachim; Meetz, Kirsten; Wendler, Thomas

1999-07-01

Workflow management (WfM) is an emerging field of medical information technology. It appears as a promising key technology to model, optimize and automate processes, for the sake of improved efficiency, reduced costs and improved patient care. The Application of WfM concepts requires the standardization of architectures and interfaces. A component of central interest proposed in this report is a generic work list handler: A standardized interface between a workflow enactment service and application system. Application systems with embedded work list handlers will be called 'Workflow Enabled Application Systems'. In this paper we discus functional requirements of work list handlers, as well as their integration into workflow architectures and interfaces. To lay the foundation for this specification, basic workflow terminology, the fundamentals of workflow management and - later in the paper - the available standards as defined by the Workflow Management Coalition are briefly reviewed.
Analysis, Mining and Visualization Service at NCSA

NASA Astrophysics Data System (ADS)

Wilhelmson, R.; Cox, D.; Welge, M.

2004-12-01

NCSA's goal is to create a balanced system that fully supports high-end computing as well as: 1) high-end data management and analysis; 2) visualization of massive, highly complex data collections; 3) large databases; 4) geographically distributed Grid computing; and 5) collaboratories, all based on a secure computational environment and driven with workflow-based services. To this end NCSA has defined a new technology path that includes the integration and provision of cyberservices in support of data analysis, mining, and visualization. NCSA has begun to develop and apply a data mining system-NCSA Data-to-Knowledge (D2K)-in conjunction with both the application and research communities. NCSA D2K will enable the formation of model-based application workflows and visual programming interfaces for rapid data analysis. The Java-based D2K framework, which integrates analytical data mining methods with data management, data transformation, and information visualization tools, will be configurable from the cyberservices (web and grid services, tools, ..) viewpoint to solve a wide range of important data mining problems. This effort will use modules, such as a new classification methods for the detection of high-risk geoscience events, and existing D2K data management, machine learning, and information visualization modules. A D2K cyberservices interface will be developed to seamlessly connect client applications with remote back-end D2K servers, providing computational resources for data mining and integration with local or remote data stores. This work is being coordinated with SDSC's data and services efforts. The new NCSA Visualization embedded workflow environment (NVIEW) will be integrated with D2K functionality to tightly couple informatics and scientific visualization with the data analysis and management services. Visualization services will access and filter disparate data sources, simplifying tasks such as fusing related data from distinct sources into a coherent visual representation. This approach enables collaboration among geographically dispersed researchers via portals and front-end clients, and the coupling with data management services enables recording associations among datasets and building annotation systems into visualization tools and portals, giving scientists a persistent, shareable, virtual lab notebook. To facilitate provision of these cyberservices to the national community, NCSA will be providing a computational environment for large-scale data assimilation, analysis, mining, and visualization. This will be initially implemented on the new 512 processor shared memory SGI's recently purchased by NCSA. In addition to standard batch capabilities, NCSA will provide on-demand capabilities for those projects requiring rapid response (e.g., development of severe weather, earthquake events) for decision makers. It will also be used for non-sequential interactive analysis of data sets where it is important have access to large data volumes over space and time.
An Auto-management Thesis Program WebMIS Based on Workflow

NASA Astrophysics Data System (ADS)

Chang, Li; Jie, Shi; Weibo, Zhong

An auto-management WebMIS based on workflow for bachelor thesis program is given in this paper. A module used for workflow dispatching is designed and realized using MySQL and J2EE according to the work principle of workflow engine. The module can automatively dispatch the workflow according to the date of system, login information and the work status of the user. The WebMIS changes the management from handwork to computer-work which not only standardizes the thesis program but also keeps the data and documents clean and consistent.
Evolution of user analysis on the grid in ATLAS

NASA Astrophysics Data System (ADS)

Dewhurst, A.; Legger, F.; ATLAS Collaboration

2017-10-01

More than one thousand physicists analyse data collected by the ATLAS experiment at the Large Hadron Collider (LHC) at CERN through 150 computing facilities around the world. Efficient distributed analysis requires optimal resource usage and the interplay of several factors: robust grid and software infrastructures, and system capability to adapt to different workloads. The continuous automatic validation of grid sites and the user support provided by a dedicated team of expert shifters have been proven to provide a solid distributed analysis system for ATLAS users. Typical user workflows on the grid, and their associated metrics, are discussed. Measurements of user job performance and typical requirements are also shown.
Towards a Scalable and Adaptive Application Support Platform for Large-Scale Distributed E-Sciences in High-Performance Network Environments

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wu, Chase Qishi; Zhu, Michelle Mengxia

The advent of large-scale collaborative scientific applications has demonstrated the potential for broad scientific communities to pool globally distributed resources to produce unprecedented data acquisition, movement, and analysis. System resources including supercomputers, data repositories, computing facilities, network infrastructures, storage systems, and display devices have been increasingly deployed at national laboratories and academic institutes. These resources are typically shared by large communities of users over Internet or dedicated networks and hence exhibit an inherent dynamic nature in their availability, accessibility, capacity, and stability. Scientific applications using either experimental facilities or computation-based simulations with various physical, chemical, climatic, and biological models featuremore » diverse scientific workflows as simple as linear pipelines or as complex as a directed acyclic graphs, which must be executed and supported over wide-area networks with massively distributed resources. Application users oftentimes need to manually configure their computing tasks over networks in an ad hoc manner, hence significantly limiting the productivity of scientists and constraining the utilization of resources. The success of these large-scale distributed applications requires a highly adaptive and massively scalable workflow platform that provides automated and optimized computing and networking services. This project is to design and develop a generic Scientific Workflow Automation and Management Platform (SWAMP), which contains a web-based user interface specially tailored for a target application, a set of user libraries, and several easy-to-use computing and networking toolkits for application scientists to conveniently assemble, execute, monitor, and control complex computing workflows in heterogeneous high-performance network environments. SWAMP will enable the automation and management of the entire process of scientific workflows with the convenience of a few mouse clicks while hiding the implementation and technical details from end users. Particularly, we will consider two types of applications with distinct performance requirements: data-centric and service-centric applications. For data-centric applications, the main workflow task involves large-volume data generation, catalog, storage, and movement typically from supercomputers or experimental facilities to a team of geographically distributed users; while for service-centric applications, the main focus of workflow is on data archiving, preprocessing, filtering, synthesis, visualization, and other application-specific analysis. We will conduct a comprehensive comparison of existing workflow systems and choose the best suited one with open-source code, a flexible system structure, and a large user base as the starting point for our development. Based on the chosen system, we will develop and integrate new components including a black box design of computing modules, performance monitoring and prediction, and workflow optimization and reconfiguration, which are missing from existing workflow systems. A modular design for separating specification, execution, and monitoring aspects will be adopted to establish a common generic infrastructure suited for a wide spectrum of science applications. We will further design and develop efficient workflow mapping and scheduling algorithms to optimize the workflow performance in terms of minimum end-to-end delay, maximum frame rate, and highest reliability. We will develop and demonstrate the SWAMP system in a local environment, the grid network, and the 100Gpbs Advanced Network Initiative (ANI) testbed. The demonstration will target scientific applications in climate modeling and high energy physics and the functions to be demonstrated include workflow deployment, execution, steering, and reconfiguration. Throughout the project period, we will work closely with the science communities in the fields of climate modeling and high energy physics including Spallation Neutron Source (SNS) and Large Hadron Collider (LHC) projects to mature the system for production use.« less
Distributed Monitoring Infrastructure for Worldwide LHC Computing Grid

NASA Astrophysics Data System (ADS)

Andrade, P.; Babik, M.; Bhatt, K.; Chand, P.; Collados, D.; Duggal, V.; Fuente, P.; Hayashi, S.; Imamagic, E.; Joshi, P.; Kalmady, R.; Karnani, U.; Kumar, V.; Lapka, W.; Quick, R.; Tarragon, J.; Teige, S.; Triantafyllidis, C.

2012-12-01

The journey of a monitoring probe from its development phase to the moment its execution result is presented in an availability report is a complex process. It goes through multiple phases such as development, testing, integration, release, deployment, execution, data aggregation, computation, and reporting. Further, it involves people with different roles (developers, site managers, VO[1] managers, service managers, management), from different middleware providers (ARC[2], dCache[3], gLite[4], UNICORE[5] and VDT[6]), consortiums (WLCG[7], EMI[11], EGI[15], OSG[13]), and operational teams (GOC[16], OMB[8], OTAG[9], CSIRT[10]). The seamless harmonization of these distributed actors is in daily use for monitoring of the WLCG infrastructure. In this paper we describe the monitoring of the WLCG infrastructure from the operational perspective. We explain the complexity of the journey of a monitoring probe from its execution on a grid node to the visualization on the MyWLCG[27] portal where it is exposed to other clients. This monitoring workflow profits from the interoperability established between the SAM[19] and RSV[20] frameworks. We show how these two distributed structures are capable of uniting technologies and hiding the complexity around them, making them easy to be used by the community. Finally, the different supported deployment strategies, tailored not only for monitoring the entire infrastructure but also for monitoring sites and virtual organizations, are presented and the associated operational benefits highlighted.
Public storage for the Open Science Grid

NASA Astrophysics Data System (ADS)

Levshina, T.; Guru, A.

2014-06-01

The Open Science Grid infrastructure doesn't provide efficient means to manage public storage offered by participating sites. A Virtual Organization that relies on opportunistic storage has difficulties finding appropriate storage, verifying its availability, and monitoring its utilization. The involvement of the production manager, site administrators and VO support personnel is required to allocate or rescind storage space. One of the main requirements for Public Storage implementation is that it should use SRM or GridFTP protocols to access the Storage Elements provided by the OSG Sites and not put any additional burden on sites. By policy, no new services related to Public Storage can be installed and run on OSG sites. Opportunistic users also have difficulties in accessing the OSG Storage Elements during the execution of jobs. A typical users' data management workflow includes pre-staging common data on sites before a job's execution, then storing for a subsequent download to a local institution the output data produced by a job on a worker node. When the amount of data is significant, the only means to temporarily store the data is to upload it to one of the Storage Elements. In order to do that, a user's job should be aware of the storage location, availability, and free space. After a successful data upload, users must somehow keep track of the data's location for future access. In this presentation we propose solutions for storage management and data handling issues in the OSG. We are investigating the feasibility of using the integrated Rule-Oriented Data System developed at RENCI as a front-end service to the OSG SEs. The current architecture, state of deployment and performance test results will be discussed. We will also provide examples of current usage of the system by beta-users.
An integrated workflow for stress and flow modelling using outcrop-derived discrete fracture networks

NASA Astrophysics Data System (ADS)

Bisdom, K.; Nick, H. M.; Bertotti, G.

2017-06-01

Fluid flow in naturally fractured reservoirs is often controlled by subseismic-scale fracture networks. Although the fracture network can be partly sampled in the direct vicinity of wells, the inter-well scale network is poorly constrained in fractured reservoir models. Outcrop analogues can provide data for populating domains of the reservoir model where no direct measurements are available. However, extracting relevant statistics from large outcrops representative of inter-well scale fracture networks remains challenging. Recent advances in outcrop imaging provide high-resolution datasets that can cover areas of several hundred by several hundred meters, i.e. the domain between adjacent wells, but even then, data from the high-resolution models is often upscaled to reservoir flow grids, resulting in loss of accuracy. We present a workflow that uses photorealistic georeferenced outcrop models to construct geomechanical and fluid flow models containing thousands of discrete fractures covering sufficiently large areas, that does not require upscaling to model permeability. This workflow seamlessly integrates geomechanical Finite Element models with flow models that take into account stress-sensitive fracture permeability and matrix flow to determine the full permeability tensor. The applicability of this workflow is illustrated using an outcropping carbonate pavement in the Potiguar basin in Brazil, from which 1082 fractures are digitised. The permeability tensor for a range of matrix permeabilities shows that conventional upscaling to effective grid properties leads to potential underestimation of the true permeability and the orientation of principal permeabilities. The presented workflow yields the full permeability tensor model of discrete fracture networks with stress-induced apertures, instead of relying on effective properties as most conventional flow models do.
Progress on the Fabric for Frontier Experiments Project at Fermilab

NASA Astrophysics Data System (ADS)

Box, Dennis; Boyd, Joseph; Dykstra, Dave; Garzoglio, Gabriele; Herner, Kenneth; Kirby, Michael; Kreymer, Arthur; Levshina, Tanya; Mhashilkar, Parag; Sharma, Neha

2015-12-01

The FabrIc for Frontier Experiments (FIFE) project is an ambitious, major-impact initiative within the Fermilab Scientific Computing Division designed to lead the computing model for Fermilab experiments. FIFE is a collaborative effort between experimenters and computing professionals to design and develop integrated computing models for experiments of varying needs and infrastructure. The major focus of the FIFE project is the development, deployment, and integration of Open Science Grid solutions for high throughput computing, data management, database access and collaboration within experiment. To accomplish this goal, FIFE has developed workflows that utilize Open Science Grid sites along with dedicated and commercial cloud resources. The FIFE project has made significant progress integrating into experiment computing operations several services including new job submission services, software and reference data distribution through CVMFS repositories, flexible data transfer client, and access to opportunistic resources on the Open Science Grid. The progress with current experiments and plans for expansion with additional projects will be discussed. FIFE has taken a leading role in the definition of the computing model for Fermilab experiments, aided in the design of computing for experiments beyond Fermilab, and will continue to define the future direction of high throughput computing for future physics experiments worldwide.
Common Data Models and Efficient Reproducible Workflows for Distributed Ocean Model Skill Assessment

NASA Astrophysics Data System (ADS)

Signell, R. P.; Snowden, D. P.; Howlett, E.; Fernandes, F. A.

2014-12-01

Model skill assessment requires discovery, access, analysis, and visualization of information from both sensors and models, and traditionally has been possible only by a few experts. The US Integrated Ocean Observing System (US-IOOS) consists of 17 Federal Agencies and 11 Regional Associations that produce data from various sensors and numerical models; exactly the information required for model skill assessment. US-IOOS is seeking to develop documented skill assessment workflows that are standardized, efficient, and reproducible so that a much wider community can participate in the use and assessment of model results. Standardization requires common data models for observational and model data. US-IOOS relies on the CF Conventions for observations and structured grid data, and on the UGRID Conventions for unstructured (e.g. triangular) grid data. This allows applications to obtain only the data they require in a uniform and parsimonious way using web services: OPeNDAP for model output and OGC Sensor Observation Service (SOS) for observed data. Reproducibility is enabled with IPython Notebooks shared on GitHub (http://github.com/ioos). These capture the entire skill assessment workflow, including user input, search, access, analysis, and visualization, ensuring that workflows are self-documenting and reproducible by anyone, using free software. Python packages for common data models are Pyugrid and the British Met Office Iris package. Python packages required to run the workflows (pyugrid, pyoos, and the British Met Office Iris package) are also available on GitHub and on Binstar.org so that users can run scenarios using the free Anaconda Python distribution. Hosted services such as Wakari enable anyone to reproduce these workflows for free, without installing any software locally, using just their web browser. We are also experimenting with Wakari Enterprise, which allows multi-user access from a web browser to an IPython Server running where large quantities of model output reside, increasing the efficiency. The open development and distribution of these workflows, and the software on which they depend, is an educational resource for those new to the field and a center of focus where practitioners can contribute new software and ideas.
Neuroimaging Study Designs, Computational Analyses and Data Provenance Using the LONI Pipeline

PubMed Central

Dinov, Ivo; Lozev, Kamen; Petrosyan, Petros; Liu, Zhizhong; Eggert, Paul; Pierce, Jonathan; Zamanyan, Alen; Chakrapani, Shruthi; Van Horn, John; Parker, D. Stott; Magsipoc, Rico; Leung, Kelvin; Gutman, Boris; Woods, Roger; Toga, Arthur

2010-01-01

Modern computational neuroscience employs diverse software tools and multidisciplinary expertise to analyze heterogeneous brain data. The classical problems of gathering meaningful data, fitting specific models, and discovering appropriate analysis and visualization tools give way to a new class of computational challenges—management of large and incongruous data, integration and interoperability of computational resources, and data provenance. We designed, implemented and validated a new paradigm for addressing these challenges in the neuroimaging field. Our solution is based on the LONI Pipeline environment [3], [4], a graphical workflow environment for constructing and executing complex data processing protocols. We developed study-design, database and visual language programming functionalities within the LONI Pipeline that enable the construction of complete, elaborate and robust graphical workflows for analyzing neuroimaging and other data. These workflows facilitate open sharing and communication of data and metadata, concrete processing protocols, result validation, and study replication among different investigators and research groups. The LONI Pipeline features include distributed grid-enabled infrastructure, virtualized execution environment, efficient integration, data provenance, validation and distribution of new computational tools, automated data format conversion, and an intuitive graphical user interface. We demonstrate the new LONI Pipeline features using large scale neuroimaging studies based on data from the International Consortium for Brain Mapping [5] and the Alzheimer's Disease Neuroimaging Initiative [6]. User guides, forums, instructions and downloads of the LONI Pipeline environment are available at http://pipeline.loni.ucla.edu. PMID:20927408
High-volume workflow management in the ITN/FBI system

NASA Astrophysics Data System (ADS)

Paulson, Thomas L.

1997-02-01

The Identification Tasking and Networking (ITN) Federal Bureau of Investigation system will manage the processing of more than 70,000 submissions per day. The workflow manager controls the routing of each submission through a combination of automated and manual processing steps whose exact sequence is dynamically determined by the results at each step. For most submissions, one or more of the steps involve the visual comparison of fingerprint images. The ITN workflow manager is implemented within a scaleable client/server architecture. The paper describes the key aspects of the ITN workflow manager design which allow the high volume of daily processing to be successfully accomplished.
Guest editorial. Integrated healthcare information systems.

PubMed

Li, Ling; Ge, Ri-Li; Zhou, Shang-Ming; Valerdi, Ricardo

2012-07-01

The use of integrated information systems for healthcare has been started more than a decade ago. In recent years, rapid advances in information integration methods have spurred tremendous growth in the use of integrated information systems in healthcare delivery. Various techniques have been used for probing such integrated systems. These techniques include service-oriented architecture (SOA), EAI, workflow management, grid computing, and others. Many applications require a combination of these techniques, which gives rise to the emergence of enterprise systems in healthcare. Development of the techniques originated from different disciplines has the potential to significantly improve the performance of enterprise systems in healthcare. This editorial paper briefly introduces the enterprise systems in the perspective of healthcare informatics.
A Workflow to Improve the Alignment of Prostate Imaging with Whole-mount Histopathology.

PubMed

Yamamoto, Hidekazu; Nir, Dror; Vyas, Lona; Chang, Richard T; Popert, Rick; Cahill, Declan; Challacombe, Ben; Dasgupta, Prokar; Chandra, Ashish

2014-08-01

Evaluation of prostate imaging tests against whole-mount histology specimens requires accurate alignment between radiologic and histologic data sets. Misalignment results in false-positive and -negative zones as assessed by imaging. We describe a workflow for three-dimensional alignment of prostate imaging data against whole-mount prostatectomy reference specimens and assess its performance against a standard workflow. Ethical approval was granted. Patients underwent motorized transrectal ultrasound (Prostate Histoscanning) to generate a three-dimensional image of the prostate before radical prostatectomy. The test workflow incorporated steps for axial alignment between imaging and histology, size adjustments following formalin fixation, and use of custom-made parallel cutters and digital caliper instruments. The control workflow comprised freehand cutting and assumed homogeneous block thicknesses at the same relative angles between pathology and imaging sections. Thirty radical prostatectomy specimens were histologically and radiologically processed, either by an alignment-optimized workflow (n = 20) or a control workflow (n = 10). The optimized workflow generated tissue blocks of heterogeneous thicknesses but with no significant drifting in the cutting plane. The control workflow resulted in significantly nonparallel blocks, accurately matching only one out of four histology blocks to their respective imaging data. The image-to-histology alignment accuracy was 20% greater in the optimized workflow (P < .0001), with higher sensitivity (85% vs. 69%) and specificity (94% vs. 73%) for margin prediction in a 5 × 5-mm grid analysis. A significantly better alignment was observed in the optimized workflow. Evaluation of prostate imaging biomarkers using whole-mount histology references should include a test-to-reference spatial alignment workflow. Copyright © 2014 AUR. Published by Elsevier Inc. All rights reserved.
Context-aware workflow management of mobile health applications.

PubMed

Salden, Alfons; Poortinga, Remco

2006-01-01

We propose a medical application management architecture that allows medical (IT) experts readily designing, developing and deploying context-aware mobile health (m-health) applications or services. In particular, we elaborate on how our application workflow management architecture enables chaining, coordinating, composing, and adapting context-sensitive medical application components such that critical Quality of Service (QoS) and Quality of Context (QoC) requirements typical for m-health applications or services can be met. This functional architectural support requires learning modules for distilling application-critical selection of attention and anticipation models. These models will help medical experts constructing and adjusting on-the-fly m-health application workflows and workflow strategies. We illustrate our context-aware workflow management paradigm for a m-health data delivery problem, in which optimal communication network configurations have to be determined.

EIAGRID: In-field optimization of seismic data acquisition by real-time subsurface imaging using a remote GRID computing environment.

NASA Astrophysics Data System (ADS)

Heilmann, B. Z.; Vallenilla Ferrara, A. M.

2009-04-01

The constant growth of contaminated sites, the unsustainable use of natural resources, and, last but not least, the hydrological risk related to extreme meteorological events and increased climate variability are major environmental issues of today. Finding solutions for these complex problems requires an integrated cross-disciplinary approach, providing a unified basis for environmental science and engineering. In computer science, grid computing is emerging worldwide as a formidable tool allowing distributed computation and data management with administratively-distant resources. Utilizing these modern High Performance Computing (HPC) technologies, the GRIDA3 project bundles several applications from different fields of geoscience aiming to support decision making for reasonable and responsible land use and resource management. In this abstract we present a geophysical application called EIAGRID that uses grid computing facilities to perform real-time subsurface imaging by on-the-fly processing of seismic field data and fast optimization of the processing workflow. Even though, seismic reflection profiling has a broad application range spanning from shallow targets in a few meters depth to targets in a depth of several kilometers, it is primarily used by the hydrocarbon industry and hardly for environmental purposes. The complexity of data acquisition and processing poses severe problems for environmental and geotechnical engineering: Professional seismic processing software is expensive to buy and demands large experience from the user. In-field processing equipment needed for real-time data Quality Control (QC) and immediate optimization of the acquisition parameters is often not available for this kind of studies. As a result, the data quality will be suboptimal. In the worst case, a crucial parameter such as receiver spacing, maximum offset, or recording time turns out later to be inappropriate and the complete acquisition campaign has to be repeated. The EIAGRID portal provides an innovative solution to this problem combining state-of-the-art data processing methods and modern remote grid computing technology. In field-processing equipment is substituted by remote access to high performance grid computing facilities. The latter can be ubiquitously controlled by a user-friendly web-browser interface accessed from the field by any mobile computer using wireless data transmission technology such as UMTS (Universal Mobile Telecommunications System) or HSUPA/HSDPA (High-Speed Uplink/Downlink Packet Access). The complexity of data-manipulation and processing and thus also the time demanding user interaction is minimized by a data-driven, and highly automated velocity analysis and imaging approach based on the Common-Reflection-Surface (CRS) stack. Furthermore, the huge computing power provided by the grid deployment allows parallel testing of alternative processing sequences and parameter settings, a feature which considerably reduces the turn-around times. A shared data storage using georeferencing tools and data grid technology is under current development. It will allow to publish already accomplished projects, making results, processing workflows and parameter settings available in a transparent and reproducible way. Creating a unified database shared by all users will facilitate complex studies and enable the use of data-crossing techniques to incorporate results of other environmental applications hosted on the GRIDA3 portal.
Medication Management: The Macrocognitive Workflow of Older Adults With Heart Failure.

PubMed

Mickelson, Robin S; Unertl, Kim M; Holden, Richard J

2016-10-12

Older adults with chronic disease struggle to manage complex medication regimens. Health information technology has the potential to improve medication management, but only if it is based on a thorough understanding of the complexity of medication management workflow as it occurs in natural settings. Prior research reveals that patient work related to medication management is complex, cognitive, and collaborative. Macrocognitive processes are theorized as how people individually and collaboratively think in complex, adaptive, and messy nonlaboratory settings supported by artifacts. The objective of this research was to describe and analyze the work of medication management by older adults with heart failure, using a macrocognitive workflow framework. We interviewed and observed 61 older patients along with 30 informal caregivers about self-care practices including medication management. Descriptive qualitative content analysis methods were used to develop categories, subcategories, and themes about macrocognitive processes used in medication management workflow. We identified 5 high-level macrocognitive processes affecting medication management-sensemaking, planning, coordination, monitoring, and decision making-and 15 subprocesses. Data revealed workflow as occurring in a highly collaborative, fragile system of interacting people, artifacts, time, and space. Process breakdowns were common and patients had little support for macrocognitive workflow from current tools. Macrocognitive processes affected medication management performance. Describing and analyzing this performance produced recommendations for technology supporting collaboration and sensemaking, decision making and problem detection, and planning and implementation.
MetaGenSense: A web-application for analysis and exploration of high throughput sequencing metagenomic data

PubMed Central

Denis, Jean-Baptiste; Vandenbogaert, Mathias; Caro, Valérie

2016-01-01

The detection and characterization of emerging infectious agents has been a continuing public health concern. High Throughput Sequencing (HTS) or Next-Generation Sequencing (NGS) technologies have proven to be promising approaches for efficient and unbiased detection of pathogens in complex biological samples, providing access to comprehensive analyses. As NGS approaches typically yield millions of putatively representative reads per sample, efficient data management and visualization resources have become mandatory. Most usually, those resources are implemented through a dedicated Laboratory Information Management System (LIMS), solely to provide perspective regarding the available information. We developed an easily deployable web-interface, facilitating management and bioinformatics analysis of metagenomics data-samples. It was engineered to run associated and dedicated Galaxy workflows for the detection and eventually classification of pathogens. The web application allows easy interaction with existing Galaxy metagenomic workflows, facilitates the organization, exploration and aggregation of the most relevant sample-specific sequences among millions of genomic sequences, allowing them to determine their relative abundance, and associate them to the most closely related organism or pathogen. The user-friendly Django-Based interface, associates the users’ input data and its metadata through a bio-IT provided set of resources (a Galaxy instance, and both sufficient storage and grid computing power). Galaxy is used to handle and analyze the user’s input data from loading, indexing, mapping, assembly and DB-searches. Interaction between our application and Galaxy is ensured by the BioBlend library, which gives API-based access to Galaxy’s main features. Metadata about samples, runs, as well as the workflow results are stored in the LIMS. For metagenomic classification and exploration purposes, we show, as a proof of concept, that integration of intuitive exploratory tools, like Krona for representation of taxonomic classification, can be achieved very easily. In the trend of Galaxy, the interface enables the sharing of scientific results to fellow team members. PMID:28451381
MetaGenSense: A web-application for analysis and exploration of high throughput sequencing metagenomic data.

PubMed

Correia, Damien; Doppelt-Azeroual, Olivia; Denis, Jean-Baptiste; Vandenbogaert, Mathias; Caro, Valérie

2015-01-01

The detection and characterization of emerging infectious agents has been a continuing public health concern. High Throughput Sequencing (HTS) or Next-Generation Sequencing (NGS) technologies have proven to be promising approaches for efficient and unbiased detection of pathogens in complex biological samples, providing access to comprehensive analyses. As NGS approaches typically yield millions of putatively representative reads per sample, efficient data management and visualization resources have become mandatory. Most usually, those resources are implemented through a dedicated Laboratory Information Management System (LIMS), solely to provide perspective regarding the available information. We developed an easily deployable web-interface, facilitating management and bioinformatics analysis of metagenomics data-samples. It was engineered to run associated and dedicated Galaxy workflows for the detection and eventually classification of pathogens. The web application allows easy interaction with existing Galaxy metagenomic workflows, facilitates the organization, exploration and aggregation of the most relevant sample-specific sequences among millions of genomic sequences, allowing them to determine their relative abundance, and associate them to the most closely related organism or pathogen. The user-friendly Django-Based interface, associates the users' input data and its metadata through a bio-IT provided set of resources (a Galaxy instance, and both sufficient storage and grid computing power). Galaxy is used to handle and analyze the user's input data from loading, indexing, mapping, assembly and DB-searches. Interaction between our application and Galaxy is ensured by the BioBlend library, which gives API-based access to Galaxy's main features. Metadata about samples, runs, as well as the workflow results are stored in the LIMS. For metagenomic classification and exploration purposes, we show, as a proof of concept, that integration of intuitive exploratory tools, like Krona for representation of taxonomic classification, can be achieved very easily. In the trend of Galaxy, the interface enables the sharing of scientific results to fellow team members.
International Symposium on Grids and Clouds (ISGC) 2014

NASA Astrophysics Data System (ADS)

The International Symposium on Grids and Clouds (ISGC) 2014 will be held at Academia Sinica in Taipei, Taiwan from 23-28 March 2014, with co-located events and workshops. The conference is hosted by the Academia Sinica Grid Computing Centre (ASGC).“Bringing the data scientist to global e-Infrastructures” is the theme of ISGC 2014. The last decade has seen the phenomenal growth in the production of data in all forms by all research communities to produce a deluge of data from which information and knowledge need to be extracted. Key to this success will be the data scientist - educated to use advanced algorithms, applications and infrastructures - collaborating internationally to tackle society’s challenges. ISGC 2014 will bring together researchers working in all aspects of data science from different disciplines around the world to collaborate and educate themselves in the latest achievements and techniques being used to tackle the data deluge. In addition to the regular workshops, technical presentations and plenary keynotes, ISGC this year will focus on how to grow the data science community by considering the educational foundation needed for tomorrow’s data scientist. Topics of discussion include Physics (including HEP) and Engineering Applications, Biomedicine & Life Sciences Applications, Earth & Environmental Sciences & Biodiversity Applications, Humanities & Social Sciences Application, Virtual Research Environment (including Middleware, tools, services, workflow, ... etc.), Data Management, Big Data, Infrastructure & Operations Management, Infrastructure Clouds and Virtualisation, Interoperability, Business Models & Sustainability, Highly Distributed Computing Systems, and High Performance & Technical Computing (HPTC).
Inferring Clinical Workflow Efficiency via Electronic Medical Record Utilization

PubMed Central

Chen, You; Xie, Wei; Gunter, Carl A; Liebovitz, David; Mehrotra, Sanjay; Zhang, He; Malin, Bradley

2015-01-01

Complexity in clinical workflows can lead to inefficiency in making diagnoses, ineffectiveness of treatment plans and uninformed management of healthcare organizations (HCOs). Traditional strategies to manage workflow complexity are based on measuring the gaps between workflows defined by HCO administrators and the actual processes followed by staff in the clinic. However, existing methods tend to neglect the influences of EMR systems on the utilization of workflows, which could be leveraged to optimize workflows facilitated through the EMR. In this paper, we introduce a framework to infer clinical workflows through the utilization of an EMR and show how such workflows roughly partition into four types according to their efficiency. Our framework infers workflows at several levels of granularity through data mining technologies. We study four months of EMR event logs from a large medical center, including 16,569 inpatient stays, and illustrate that over approximately 95% of workflows are efficient and that 80% of patients are on such workflows. At the same time, we show that the remaining 5% of workflows may be inefficient due to a variety of factors, such as complex patients. PMID:26958173
Medication Management: The Macrocognitive Workflow of Older Adults With Heart Failure

PubMed Central

2016-01-01

Background Older adults with chronic disease struggle to manage complex medication regimens. Health information technology has the potential to improve medication management, but only if it is based on a thorough understanding of the complexity of medication management workflow as it occurs in natural settings. Prior research reveals that patient work related to medication management is complex, cognitive, and collaborative. Macrocognitive processes are theorized as how people individually and collaboratively think in complex, adaptive, and messy nonlaboratory settings supported by artifacts. Objective The objective of this research was to describe and analyze the work of medication management by older adults with heart failure, using a macrocognitive workflow framework. Methods We interviewed and observed 61 older patients along with 30 informal caregivers about self-care practices including medication management. Descriptive qualitative content analysis methods were used to develop categories, subcategories, and themes about macrocognitive processes used in medication management workflow. Results We identified 5 high-level macrocognitive processes affecting medication management—sensemaking, planning, coordination, monitoring, and decision making—and 15 subprocesses. Data revealed workflow as occurring in a highly collaborative, fragile system of interacting people, artifacts, time, and space. Process breakdowns were common and patients had little support for macrocognitive workflow from current tools. Conclusions Macrocognitive processes affected medication management performance. Describing and analyzing this performance produced recommendations for technology supporting collaboration and sensemaking, decision making and problem detection, and planning and implementation. PMID:27733331
A Model of Workflow Composition for Emergency Management

NASA Astrophysics Data System (ADS)

Xin, Chen; Bin-ge, Cui; Feng, Zhang; Xue-hui, Xu; Shan-shan, Fu

The common-used workflow technology is not flexible enough in dealing with concurrent emergency situations. The paper proposes a novel model for defining emergency plans, in which workflow segments appear as a constituent part. A formal abstraction, which contains four operations, is defined to compose workflow segments under constraint rule. The software system of the business process resources construction and composition is implemented and integrated into Emergency Plan Management Application System.
Worklist handling in workflow-enabled radiological application systems

NASA Astrophysics Data System (ADS)

Wendler, Thomas; Meetz, Kirsten; Schmidt, Joachim; von Berg, Jens

2000-05-01

For the next generation integrated information systems for health care applications, more emphasis has to be put on systems which, by design, support the reduction of cost, the increase inefficiency and the improvement of the quality of services. A substantial contribution to this will be the modeling. optimization, automation and enactment of processes in health care institutions. One of the perceived key success factors for the system integration of processes will be the application of workflow management, with workflow management systems as key technology components. In this paper we address workflow management in radiology. We focus on an important aspect of workflow management, the generation and handling of worklists, which provide workflow participants automatically with work items that reflect tasks to be performed. The display of worklists and the functions associated with work items are the visible part for the end-users of an information system using a workflow management approach. Appropriate worklist design and implementation will influence user friendliness of a system and will largely influence work efficiency. Technically, in current imaging department information system environments (modality-PACS-RIS installations), a data-driven approach has been taken: Worklist -- if present at all -- are generated from filtered views on application data bases. In a future workflow-based approach, worklists will be generated by autonomous workflow services based on explicit process models and organizational models. This process-oriented approach will provide us with an integral view of entire health care processes or sub- processes. The paper describes the basic mechanisms of this approach and summarizes its benefits.
Collaborative Science Using Web Services and the SciFlo Grid Dataflow Engine

NASA Astrophysics Data System (ADS)

Wilson, B. D.; Manipon, G.; Xing, Z.; Yunck, T.

2006-12-01

The General Earth Science Investigation Suite (GENESIS) project is a NASA-sponsored partnership between the Jet Propulsion Laboratory, academia, and NASA data centers to develop a new suite of Web Services tools to facilitate multi-sensor investigations in Earth System Science. The goal of GENESIS is to enable large-scale, multi-instrument atmospheric science using combined datasets from the AIRS, MODIS, MISR, and GPS sensors. Investigations include cross-comparison of spaceborne climate sensors, cloud spectral analysis, study of upper troposphere-stratosphere water transport, study of the aerosol indirect cloud effect, and global climate model validation. The challenges are to bring together very large datasets, reformat and understand the individual instrument retrievals, co-register or re-grid the retrieved physical parameters, perform computationally-intensive data fusion and data mining operations, and accumulate complex statistics over months to years of data. To meet these challenges, we have developed a Grid computing and dataflow framework, named SciFlo, in which we are deploying a set of versatile and reusable operators for data access, subsetting, registration, mining, fusion, compression, and advanced statistical analysis. SciFlo leverages remote Web Services, called via Simple Object Access Protocol (SOAP) or REST (one-line) URLs, and the Grid Computing standards (WS-* &Globus Alliance toolkits), and enables scientists to do multi-instrument Earth Science by assembling reusable Web Services and native executables into a distributed computing flow (tree of operators). The SciFlo client &server engines optimize the execution of such distributed data flows and allow the user to transparently find and use datasets and operators without worrying about the actual location of the Grid resources. In particular, SciFlo exploits the wealth of datasets accessible by OpenGIS Consortium (OGC) Web Mapping Servers & Web Coverage Servers (WMS/WCS), and by Open Data Access Protocol (OpenDAP) servers. The scientist injects a distributed computation into the Grid by simply filling out an HTML form or directly authoring the underlying XML dataflow document, and results are returned directly to the scientist's desktop. Once an analysis has been specified for a chunk or day of data, it can be easily repeated with different control parameters or over months of data. Recently, the Earth Science Information Partners (ESIP) Federation sponsored a collaborative activity in which several ESIP members advertised their respective WMS/WCS and SOAP services, developed some collaborative science scenarios for atmospheric and aerosol science, and then choreographed services from multiple groups into demonstration workflows using the SciFlo engine and a Business Process Execution Language (BPEL) workflow engine. For several scenarios, the same collaborative workflow was executed in three ways: using hand-coded scripts, by executing a SciFlo document, and by executing a BPEL workflow document. We will discuss the lessons learned from this activity, the need for standardized interfaces (like WMS/WCS), the difficulty in agreeing on even simple XML formats and interfaces, and further collaborations that are being pursued.
a Schema for Extraction of Indoor Pedestrian Navigation Grid Network from Floor Plans

NASA Astrophysics Data System (ADS)

Niu, Lei; Song, Yiquan

2016-06-01

The requirement of the indoor navigation related tasks such emergency evacuation calls for efficient solutions for handling data sources. Therefore, the navigation grid extraction from existing floor plans draws attentions. To this, we have to thoroughly analyse the source data, such as Autocad dxf files. Then, we could establish a sounding navigation solution, which firstly complements the basic navigation rectangle boundaries, secondly subdivides these rectangles and finally generates accessible networks with these refined rectangles. Test files are introduced to validate the whole workflow and evaluate the solution performance. In conclusion, we have achieved the preliminary step of forming up accessible network from the navigation grids.
Grist : grid-based data mining for astronomy

NASA Technical Reports Server (NTRS)

Jacob, Joseph C.; Katz, Daniel S.; Miller, Craig D.; Walia, Harshpreet; Williams, Roy; Djorgovski, S. George; Graham, Matthew J.; Mahabal, Ashish; Babu, Jogesh; Berk, Daniel E. Vanden;

2004-01-01

The Grist project is developing a grid-technology based system as a research environment for astronomy with massive and complex datasets. This knowledge extraction system will consist of a library of distributed grid services controlled by a workflow system, compliant with standards emerging from the grid computing, web services, and virtual observatory communities. This new technology is being used to find high redshift quasars, study peculiar variable objects, search for transients in real time, and fit SDSS QSO spectra to measure black hole masses. Grist services are also a component of the 'hyperatlas' project to serve high-resolution multi-wavelength imagery over the Internet. In support of these science and outreach objectives, the Grist framework will provide the enabling fabric to tie together distributed grid services in the areas of data access, federation, mining, subsetting, source extraction, image mosaicking, statistics, and visualization.

Grist: Grid-based Data Mining for Astronomy

NASA Astrophysics Data System (ADS)

Jacob, J. C.; Katz, D. S.; Miller, C. D.; Walia, H.; Williams, R. D.; Djorgovski, S. G.; Graham, M. J.; Mahabal, A. A.; Babu, G. J.; vanden Berk, D. E.; Nichol, R.

2005-12-01

The Grist project is developing a grid-technology based system as a research environment for astronomy with massive and complex datasets. This knowledge extraction system will consist of a library of distributed grid services controlled by a workflow system, compliant with standards emerging from the grid computing, web services, and virtual observatory communities. This new technology is being used to find high redshift quasars, study peculiar variable objects, search for transients in real time, and fit SDSS QSO spectra to measure black hole masses. Grist services are also a component of the ``hyperatlas'' project to serve high-resolution multi-wavelength imagery over the Internet. In support of these science and outreach objectives, the Grist framework will provide the enabling fabric to tie together distributed grid services in the areas of data access, federation, mining, subsetting, source extraction, image mosaicking, statistics, and visualization.
Transformation of OODT CAS to Perform Larger Tasks

NASA Technical Reports Server (NTRS)

Mattmann, Chris; Freeborn, Dana; Crichton, Daniel; Hughes, John; Ramirez, Paul; Hardman, Sean; Woollard, David; Kelly, Sean

2008-01-01

A computer program denoted OODT CAS has been transformed to enable performance of larger tasks that involve greatly increased data volumes and increasingly intensive processing of data on heterogeneous, geographically dispersed computers. Prior to the transformation, OODT CAS (also alternatively denoted, simply, 'CAS') [wherein 'OODT' signifies 'Object-Oriented Data Technology' and 'CAS' signifies 'Catalog and Archive Service'] was a proven software component used to manage scientific data from spaceflight missions. In the transformation, CAS was split into two separate components representing its canonical capabilities: file management and workflow management. In addition, CAS was augmented by addition of a resource-management component. This third component enables CAS to manage heterogeneous computing by use of diverse resources, including high-performance clusters of computers, commodity computing hardware, and grid computing infrastructures. CAS is now more easily maintainable, evolvable, and reusable. These components can be used separately or, taking advantage of synergies, can be used together. Other elements of the transformation included addition of a separate Web presentation layer that supports distribution of data products via Really Simple Syndication (RSS) feeds, and provision for full Resource Description Framework (RDF) exports of metadata.
Design and implementation of workflow engine for service-oriented architecture

NASA Astrophysics Data System (ADS)

Peng, Shuqing; Duan, Huining; Chen, Deyun

2009-04-01

As computer network is developed rapidly and in the situation of the appearance of distribution specialty in enterprise application, traditional workflow engine have some deficiencies, such as complex structure, bad stability, poor portability, little reusability and difficult maintenance. In this paper, in order to improve the stability, scalability and flexibility of workflow management system, a four-layer architecture structure of workflow engine based on SOA is put forward according to the XPDL standard of Workflow Management Coalition, the route control mechanism in control model is accomplished and the scheduling strategy of cyclic routing and acyclic routing is designed, and the workflow engine which adopts the technology such as XML, JSP, EJB and so on is implemented.
A Portable Regional Weather and Climate Downscaling System Using GEOS-5, LIS-6, WRF, and the NASA Workflow Tool

NASA Astrophysics Data System (ADS)

Kemp, E. M.; Putman, W. M.; Gurganus, J.; Burns, R. W.; Damon, M. R.; McConaughy, G. R.; Seablom, M. S.; Wojcik, G. S.

2009-12-01

We present a regional downscaling system (RDS) suitable for high-resolution weather and climate simulations in multiple supercomputing environments. The RDS is built on the NASA Workflow Tool, a software framework for configuring, running, and managing computer models on multiple platforms with a graphical user interface. The Workflow Tool is used to run the NASA Goddard Earth Observing System Model Version 5 (GEOS-5), a global atmospheric-ocean model for weather and climate simulations down to 1/4 degree resolution; the NASA Land Information System Version 6 (LIS-6), a land surface modeling system that can simulate soil temperature and moisture profiles; and the Weather Research and Forecasting (WRF) community model, a limited-area atmospheric model for weather and climate simulations down to 1-km resolution. The Workflow Tool allows users to customize model settings to user needs; saves and organizes simulation experiments; distributes model runs across different computer clusters (e.g., the DISCOVER cluster at Goddard Space Flight Center, the Cray CX-1 Desktop Supercomputer, etc.); and handles all file transfers and network communications (e.g., scp connections). Together, the RDS is intended to aid researchers by making simulations as easy as possible to generate on the computer resources available. Initial conditions for LIS-6 and GEOS-5 are provided by Modern Era Retrospective-Analysis for Research and Applications (MERRA) reanalysis data stored on DISCOVER. The LIS-6 is first run for 2-4 years forced by MERRA atmospheric analyses, generating initial conditions for the WRF soil physics. GEOS-5 is then initialized from MERRA data and run for the period of interest. Large-scale atmospheric data, sea-surface temperatures, and sea ice coverage from GEOS-5 are used as boundary conditions for WRF, which is run for the same period of interest. Multiply nested grids are used for both LIS-6 and WRF, with the innermost grid run at a resolution sufficient for typical local weather features (terrain, convection, etc.) All model runs, restarts, and file transfers are coordinated by the Workflow Tool. Two use cases are being pursued. First, the RDS generates regional climate simulations down to 4-km for the Chesapeake Bay region, with WRF output provided as input to more specialized models (e.g., ocean/lake, hydrological, marine biology, and air pollution). This will allow assessment of climate impact on local interests (e.g., changes in Bay water levels and temperatures, innundation, fish kills, etc.) Second, the RDS generates high-resolution hurricane simulations in the tropical North Atlantic. This use case will support Observing System Simulation Experiments (OSSEs) of dynamically-targeted lidar observations as part of the NASA Sensor Web Simulator project. Sample results will be presented at the AGU Fall Meeting.
GO2OGS 1.0: a versatile workflow to integrate complex geological information with fault data into numerical simulation models

NASA Astrophysics Data System (ADS)

Fischer, T.; Naumov, D.; Sattler, S.; Kolditz, O.; Walther, M.

2015-11-01

We offer a versatile workflow to convert geological models built with the ParadigmTM GOCAD© (Geological Object Computer Aided Design) software into the open-source VTU (Visualization Toolkit unstructured grid) format for usage in numerical simulation models. Tackling relevant scientific questions or engineering tasks often involves multidisciplinary approaches. Conversion workflows are needed as a way of communication between the diverse tools of the various disciplines. Our approach offers an open-source, platform-independent, robust, and comprehensible method that is potentially useful for a multitude of environmental studies. With two application examples in the Thuringian Syncline, we show how a heterogeneous geological GOCAD model including multiple layers and faults can be used for numerical groundwater flow modeling, in our case employing the OpenGeoSys open-source numerical toolbox for groundwater flow simulations. The presented workflow offers the chance to incorporate increasingly detailed data, utilizing the growing availability of computational power to simulate numerical models.
Progress on the FabrIc for Frontier Experiments project at Fermilab

DOE PAGES

Box, Dennis; Boyd, Joseph; Dykstra, Dave; ...

2015-12-23

The FabrIc for Frontier Experiments (FIFE) project is an ambitious, major-impact initiative within the Fermilab Scientific Computing Division designed to lead the computing model for Fermilab experiments. FIFE is a collaborative effort between experimenters and computing professionals to design and develop integrated computing models for experiments of varying needs and infrastructure. The major focus of the FIFE project is the development, deployment, and integration of Open Science Grid solutions for high throughput computing, data management, database access and collaboration within experiment. To accomplish this goal, FIFE has developed workflows that utilize Open Science Grid sites along with dedicated and commercialmore » cloud resources. The FIFE project has made significant progress integrating into experiment computing operations several services including new job submission services, software and reference data distribution through CVMFS repositories, flexible data transfer client, and access to opportunistic resources on the Open Science Grid. Hence, the progress with current experiments and plans for expansion with additional projects will be discussed. FIFE has taken a leading role in the definition of the computing model for Fermilab experiments, aided in the design of computing for experiments beyond Fermilab, and will continue to define the future direction of high throughput computing for future physics experiments worldwide« less
BioAcoustica: a free and open repository and analysis platform for bioacoustics

PubMed Central

Baker, Edward; Price, Ben W.; Rycroft, S. D.; Smith, Vincent S.

2015-01-01

We describe an online open repository and analysis platform, BioAcoustica (http://bio.acousti.ca), for recordings of wildlife sounds. Recordings can be annotated using a crowdsourced approach, allowing voice introductions and sections with extraneous noise to be removed from analyses. This system is based on the Scratchpads virtual research environment, the BioVeL portal and the Taverna workflow management tool, which allows for analysis of recordings using a grid computing service. At present the analyses include spectrograms, oscillograms and dominant frequency analysis. Further analyses can be integrated to meet the needs of specific researchers or projects. Researchers can upload and annotate their recordings to supplement traditional publication. Database URL: http://bio.acousti.ca PMID:26055102
Scientific Data Management (SDM) Center for Enabling Technologies. Final Report, 2007-2012

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ludascher, Bertram; Altintas, Ilkay

Our contributions to advancing the State of the Art in scientific workflows have focused on the following areas: Workflow development; Generic workflow components and templates; Provenance collection and analysis; and, Workflow reliability and fault tolerance.

The FIFE Project at Fermilab

DOE Office of Scientific and Technical Information (OSTI.GOV)

Box, D.; Boyd, J.; Di Benedetto, V.

2016-01-01

The FabrIc for Frontier Experiments (FIFE) project is an initiative within the Fermilab Scientific Computing Division designed to steer the computing model for non-LHC Fermilab experiments across multiple physics areas. FIFE is a collaborative effort between experimenters and computing professionals to design and develop integrated computing models for experiments of varying size, needs, and infrastructure. The major focus of the FIFE project is the development, deployment, and integration of solutions for high throughput computing, data management, database access and collaboration management within an experiment. To accomplish this goal, FIFE has developed workflows that utilize Open Science Grid compute sites alongmore » with dedicated and commercial cloud resources. The FIFE project has made significant progress integrating into experiment computing operations several services including a common job submission service, software and reference data distribution through CVMFS repositories, flexible and robust data transfer clients, and access to opportunistic resources on the Open Science Grid. The progress with current experiments and plans for expansion with additional projects will be discussed. FIFE has taken the leading role in defining the computing model for Fermilab experiments, aided in the design of experiments beyond those hosted at Fermilab, and will continue to define the future direction of high throughput computing for future physics experiments worldwide.« less
Workflow technology: the new frontier. How to overcome the barriers and join the future.

PubMed

Shefter, Susan M

2006-01-01

Hospitals are catching up to the business world in the introduction of technology systems that support professional practice and workflow. The field of case management is highly complex and interrelates with diverse groups in diverse locations. The last few years have seen the introduction of Workflow Technology Tools, which can improve the quality and efficiency of discharge planning by the case manager. Despite the availability of these wonderful new programs, many case managers are hesitant to adopt the new technology and workflow. For a myriad of reasons, a computer-based workflow system can seem like a brick wall. This article discusses, from a practitioner's point of view, how professionals can gain confidence and skill to get around the brick wall and join the future.
Common Workflow Service: Standards Based Solution for Managing Operational Processes

NASA Astrophysics Data System (ADS)

Tinio, A. W.; Hollins, G. A.

2017-06-01

The Common Workflow Service is a collaborative and standards-based solution for managing mission operations processes using techniques from the Business Process Management (BPM) discipline. This presentation describes the CWS and its benefits.
Accessing eSDO Solar Image Processing and Visualization through AstroGrid

NASA Astrophysics Data System (ADS)

Auden, E.; Dalla, S.

2008-08-01

The eSDO project is funded by the UK's Science and Technology Facilities Council (STFC) to integrate Solar Dynamics Observatory (SDO) data, algorithms, and visualization tools with the UK's Virtual Observatory project, AstroGrid. In preparation for the SDO launch in January 2009, the eSDO team has developed nine algorithms covering coronal behaviour, feature recognition, and global / local helioseismology. Each of these algorithms has been deployed as an AstroGrid Common Execution Architecture (CEA) application so that they can be included in complex VO workflows. In addition, the PLASTIC-enabled eSDO "Streaming Tool" online movie application allows users to search multi-instrument solar archives through AstroGrid web services and visualise the image data through galleries, an interactive movie viewing applet, and QuickTime movies generated on-the-fly.
An architecture model for multiple disease management information systems.

PubMed

Chen, Lichin; Yu, Hui-Chu; Li, Hao-Chun; Wang, Yi-Van; Chen, Huang-Jen; Wang, I-Ching; Wang, Chiou-Shiang; Peng, Hui-Yu; Hsu, Yu-Ling; Chen, Chi-Huang; Chuang, Lee-Ming; Lee, Hung-Chang; Chung, Yufang; Lai, Feipei

2013-04-01

Disease management is a program which attempts to overcome the fragmentation of healthcare system and improve the quality of care. Many studies have proven the effectiveness of disease management. However, the case managers were spending the majority of time in documentation, coordinating the members of the care team. They need a tool to support them with daily practice and optimizing the inefficient workflow. Several discussions have indicated that information technology plays an important role in the era of disease management. Whereas applications have been developed, it is inefficient to develop information system for each disease management program individually. The aim of this research is to support the work of disease management, reform the inefficient workflow, and propose an architecture model that enhance on the reusability and time saving of information system development. The proposed architecture model had been successfully implemented into two disease management information system, and the result was evaluated through reusability analysis, time consumed analysis, pre- and post-implement workflow analysis, and user questionnaire survey. The reusability of the proposed model was high, less than half of the time was consumed, and the workflow had been improved. The overall user aspect is positive. The supportiveness during daily workflow is high. The system empowers the case managers with better information and leads to better decision making.
Design and implementation of a secure workflow system based on PKI/PMI

NASA Astrophysics Data System (ADS)

Yan, Kai; Jiang, Chao-hui

2013-03-01

As the traditional workflow system in privilege management has the following weaknesses: low privilege management efficiency, overburdened for administrator, lack of trust authority etc. A secure workflow model based on PKI/PMI is proposed after studying security requirements of the workflow systems in-depth. This model can achieve static and dynamic authorization after verifying user's ID through PKC and validating user's privilege information by using AC in workflow system. Practice shows that this system can meet the security requirements of WfMS. Moreover, it can not only improve system security, but also ensures integrity, confidentiality, availability and non-repudiation of the data in the system.
Innovations in Medication Preparation Safety and Wastage Reduction: Use of a Workflow Management System in a Pediatric Hospital.

PubMed

Davis, Stephen Jerome; Hurtado, Josephine; Nguyen, Rosemary; Huynh, Tran; Lindon, Ivan; Hudnall, Cedric; Bork, Sara

2017-01-01

Background: USP <797> regulatory requirements have mandated that pharmacies improve aseptic techniques and cleanliness of the medication preparation areas. In addition, the Institute for Safe Medication Practices (ISMP) recommends that technology and automation be used as much as possible for preparing and verifying compounded sterile products. Objective: To determine the benefits associated with the implementation of the workflow management system, such as reducing medication preparation and delivery errors, reducing quantity and frequency of medication errors, avoiding costs, and enhancing the organization's decision to move toward positive patient identification (PPID). Methods: At Texas Children's Hospital, data were collected and analyzed from January 2014 through August 2014 in the pharmacy areas in which the workflow management system would be implemented. Data were excluded for September 2014 during the workflow management system oral liquid implementation phase. Data were collected and analyzed from October 2014 through June 2015 to determine whether the implementation of the workflow management system reduced the quantity and frequency of reported medication errors. Data collected and analyzed during the study period included the quantity of doses prepared, number of incorrect medication scans, number of doses discontinued from the workflow management system queue, and the number of doses rejected. Data were collected and analyzed to identify patterns of incorrect medication scans, to determine reasons for rejected medication doses, and to determine the reduction in wasted medications. Results: During the 17-month study period, the pharmacy department dispensed 1,506,220 oral liquid and injectable medication doses. From October 2014 through June 2015, the pharmacy department dispensed 826,220 medication doses that were prepared and checked via the workflow management system. Of those 826,220 medication doses, there were 16 reported incorrect volume errors. The error rate after the implementation of the workflow management system averaged 8.4%, which was a 1.6% reduction. After the implementation of the workflow management system, the average number of reported oral liquid medication and injectable medication errors decreased to 0.4 and 0.2 times per week, respectively. Conclusion: The organization was able to achieve its purpose and goal of improving the provision of quality pharmacy care through optimal medication use and safety by reducing medication preparation errors. Error rates decreased and the workflow processes were streamlined, which has led to seamless operations within the pharmacy department. There has been significant cost avoidance and waste reduction and enhanced interdepartmental satisfaction due to the reduction of reported medication errors.
PANORAMA: An approach to performance modeling and diagnosis of extreme-scale workflows

DOE Office of Scientific and Technical Information (OSTI.GOV)

Deelman, Ewa; Carothers, Christopher; Mandal, Anirban

Here we report that computational science is well established as the third pillar of scientific discovery and is on par with experimentation and theory. However, as we move closer toward the ability to execute exascale calculations and process the ensuing extreme-scale amounts of data produced by both experiments and computations alike, the complexity of managing the compute and data analysis tasks has grown beyond the capabilities of domain scientists. Therefore, workflow management systems are absolutely necessary to ensure current and future scientific discoveries. A key research question for these workflow management systems concerns the performance optimization of complex calculation andmore » data analysis tasks. The central contribution of this article is a description of the PANORAMA approach for modeling and diagnosing the run-time performance of complex scientific workflows. This approach integrates extreme-scale systems testbed experimentation, structured analytical modeling, and parallel systems simulation into a comprehensive workflow framework called Pegasus for understanding and improving the overall performance of complex scientific workflows.« less
PANORAMA: An approach to performance modeling and diagnosis of extreme-scale workflows

DOE PAGES

Deelman, Ewa; Carothers, Christopher; Mandal, Anirban; ...

2015-07-14

Here we report that computational science is well established as the third pillar of scientific discovery and is on par with experimentation and theory. However, as we move closer toward the ability to execute exascale calculations and process the ensuing extreme-scale amounts of data produced by both experiments and computations alike, the complexity of managing the compute and data analysis tasks has grown beyond the capabilities of domain scientists. Therefore, workflow management systems are absolutely necessary to ensure current and future scientific discoveries. A key research question for these workflow management systems concerns the performance optimization of complex calculation andmore » data analysis tasks. The central contribution of this article is a description of the PANORAMA approach for modeling and diagnosing the run-time performance of complex scientific workflows. This approach integrates extreme-scale systems testbed experimentation, structured analytical modeling, and parallel systems simulation into a comprehensive workflow framework called Pegasus for understanding and improving the overall performance of complex scientific workflows.« less
Developing a geoscience knowledge framework for a national geological survey organisation

NASA Astrophysics Data System (ADS)

Howard, Andrew S.; Hatton, Bill; Reitsma, Femke; Lawrie, Ken I. G.

2009-04-01

Geological survey organisations (GSOs) are established by most nations to provide a geoscience knowledge base for effective decision-making on mitigating the impacts of natural hazards and global change, and on sustainable management of natural resources. The value of the knowledge base as a national asset is continually enhanced by the exchange of knowledge between GSOs as data and information providers and the stakeholder community as knowledge 'users and exploiters'. Geological maps and associated narrative texts typically form the core of national geoscience knowledge bases, but have some inherent limitations as methods of capturing and articulating knowledge. Much knowledge about the three-dimensional (3D) spatial interpretation and its derivation and uncertainty, and the wider contextual value of the knowledge, remains intangible in the minds of the mapping geologist in implicit and tacit form. To realise the value of these knowledge assets, the British Geological Survey (BGS) has established a workflow-based cyber-infrastructure to enhance its knowledge management and exchange capability. Future geoscience surveys in the BGS will contribute to a national, 3D digital knowledge base on UK geology, with the associated implicit and tacit information captured as metadata, qualitative assessments of uncertainty, and documented workflows and best practice. Knowledge-based decision-making at all levels of society requires both the accessibility and reliability of knowledge to be enhanced in the grid-based world. Establishment of collaborative cyber-infrastructures and ontologies for geoscience knowledge management and exchange will ensure that GSOs, as knowledge-based organisations, can make their contribution to this wider goal.
Spatial Data Exploring by Satellite Image Distributed Processing

NASA Astrophysics Data System (ADS)

Mihon, V. D.; Colceriu, V.; Bektas, F.; Allenbach, K.; Gvilava, M.; Gorgan, D.

2012-04-01

Our society needs and environmental predictions encourage the applications development, oriented on supervising and analyzing different Earth Science related phenomena. Satellite images could be explored for discovering information concerning land cover, hydrology, air quality, and water and soil pollution. Spatial and environment related data could be acquired by imagery classification consisting of data mining throughout the multispectral bands. The process takes in account a large set of variables such as satellite image types (e.g. MODIS, Landsat), particular geographic area, soil composition, vegetation cover, and generally the context (e.g. clouds, snow, and season). All these specific and variable conditions require flexible tools and applications to support an optimal search for the appropriate solutions, and high power computation resources. The research concerns with experiments on solutions of using the flexible and visual descriptions of the satellite image processing over distributed infrastructures (e.g. Grid, Cloud, and GPU clusters). This presentation highlights the Grid based implementation of the GreenLand application. The GreenLand application development is based on simple, but powerful, notions of mathematical operators and workflows that are used in distributed and parallel executions over the Grid infrastructure. Currently it is used in three major case studies concerning with Istanbul geographical area, Rioni River in Georgia, and Black Sea catchment region. The GreenLand application offers a friendly user interface for viewing and editing workflows and operators. The description involves the basic operators provided by GRASS [1] library as well as many other image related operators supported by the ESIP platform [2]. The processing workflows are represented as directed graphs giving the user a fast and easy way to describe complex parallel algorithms, without having any prior knowledge of any programming language or application commands. Also this Web application does not require any kind of install for what the house-hold user is concerned. It is a remote application which may be accessed over the Internet. Currently the GreenLand application is available through the BSC-OS Portal provided by the enviroGRIDS FP7 project [3]. This presentation aims to highlight the challenges and issues of flexible description of the Grid based processing of satellite images, interoperability with other software platforms available in the portal, as well as the particular requirements of the Black Sea related use cases.
Modelling and analysis of workflow for lean supply chains

NASA Astrophysics Data System (ADS)

Ma, Jinping; Wang, Kanliang; Xu, Lida

2011-11-01

Cross-organisational workflow systems are a component of enterprise information systems which support collaborative business process among organisations in supply chain. Currently, the majority of workflow systems is developed in perspectives of information modelling without considering actual requirements of supply chain management. In this article, we focus on the modelling and analysis of the cross-organisational workflow systems in the context of lean supply chain (LSC) using Petri nets. First, the article describes the assumed conditions of cross-organisation workflow net according to the idea of LSC and then discusses the standardisation of collaborating business process between organisations in the context of LSC. Second, the concept of labelled time Petri nets (LTPNs) is defined through combining labelled Petri nets with time Petri nets, and the concept of labelled time workflow nets (LTWNs) is also defined based on LTPNs. Cross-organisational labelled time workflow nets (CLTWNs) is then defined based on LTWNs. Third, the article proposes the notion of OR-silent CLTWNS and a verifying approach to the soundness of LTWNs and CLTWNs. Finally, this article illustrates how to use the proposed method by a simple example. The purpose of this research is to establish a formal method of modelling and analysis of workflow systems for LSC. This study initiates a new perspective of research on cross-organisational workflow management and promotes operation management of LSC in real world settings.
Workflow Management Systems for Molecular Dynamics on Leadership Computers

NASA Astrophysics Data System (ADS)

Wells, Jack; Panitkin, Sergey; Oleynik, Danila; Jha, Shantenu

Molecular Dynamics (MD) simulations play an important role in a range of disciplines from Material Science to Biophysical systems and account for a large fraction of cycles consumed on computing resources. Increasingly science problems require the successful execution of ''many'' MD simulations as opposed to a single MD simulation. There is a need to provide scalable and flexible approaches to the execution of the workload. We present preliminary results on the Titan computer at the Oak Ridge Leadership Computing Facility that demonstrate a general capability to manage workload execution agnostic of a specific MD simulation kernel or execution pattern, and in a manner that integrates disparate grid-based and supercomputing resources. Our results build upon our extensive experience of distributed workload management in the high-energy physics ATLAS project using PanDA (Production and Distributed Analysis System), coupled with recent conceptual advances in our understanding of workload management on heterogeneous resources. We will discuss how we will generalize these initial capabilities towards a more production level service on DOE leadership resources. This research is sponsored by US DOE/ASCR and used resources of the OLCF computing facility.
geoknife: Reproducible web-processing of large gridded datasets

USGS Publications Warehouse

Read, Jordan S.; Walker, Jordan I.; Appling, Alison P.; Blodgett, David L.; Read, Emily K.; Winslow, Luke A.

2016-01-01

Geoprocessing of large gridded data according to overlap with irregular landscape features is common to many large-scale ecological analyses. The geoknife R package was created to facilitate reproducible analyses of gridded datasets found on the U.S. Geological Survey Geo Data Portal web application or elsewhere, using a web-enabled workflow that eliminates the need to download and store large datasets that are reliably hosted on the Internet. The package provides access to several data subset and summarization algorithms that are available on remote web processing servers. Outputs from geoknife include spatial and temporal data subsets, spatially-averaged time series values filtered by user-specified areas of interest, and categorical coverage fractions for various land-use types.
Development of a user customizable imaging informatics-based intelligent workflow engine system to enhance rehabilitation clinical trials

NASA Astrophysics Data System (ADS)

Wang, Ximing; Martinez, Clarisa; Wang, Jing; Liu, Ye; Liu, Brent

2014-03-01

Clinical trials usually have a demand to collect, track and analyze multimedia data according to the workflow. Currently, the clinical trial data management requirements are normally addressed with custom-built systems. Challenges occur in the workflow design within different trials. The traditional pre-defined custom-built system is usually limited to a specific clinical trial and normally requires time-consuming and resource-intensive software development. To provide a solution, we present a user customizable imaging informatics-based intelligent workflow engine system for managing stroke rehabilitation clinical trials with intelligent workflow. The intelligent workflow engine provides flexibility in building and tailoring the workflow in various stages of clinical trials. By providing a solution to tailor and automate the workflow, the system will save time and reduce errors for clinical trials. Although our system is designed for clinical trials for rehabilitation, it may be extended to other imaging based clinical trials as well.
Accessible high performance computing solutions for near real-time image processing for time critical applications

NASA Astrophysics Data System (ADS)

Bielski, Conrad; Lemoine, Guido; Syryczynski, Jacek

2009-09-01

High Performance Computing (HPC) hardware solutions such as grid computing and General Processing on a Graphics Processing Unit (GPGPU) are now accessible to users with general computing needs. Grid computing infrastructures in the form of computing clusters or blades are becoming common place and GPGPU solutions that leverage the processing power of the video card are quickly being integrated into personal workstations. Our interest in these HPC technologies stems from the need to produce near real-time maps from a combination of pre- and post-event satellite imagery in support of post-disaster management. Faster processing provides a twofold gain in this situation: 1. critical information can be provided faster and 2. more elaborate automated processing can be performed prior to providing the critical information. In our particular case, we test the use of the PANTEX index which is based on analysis of image textural measures extracted using anisotropic, rotation-invariant GLCM statistics. The use of this index, applied in a moving window, has been shown to successfully identify built-up areas in remotely sensed imagery. Built-up index image masks are important input to the structuring of damage assessment interpretation because they help optimise the workload. The performance of computing the PANTEX workflow is compared on two different HPC hardware architectures: (1) a blade server with 4 blades, each having dual quad-core CPUs and (2) a CUDA enabled GPU workstation. The reference platform is a dual CPU-quad core workstation and the PANTEX workflow total computing time is measured. Furthermore, as part of a qualitative evaluation, the differences in setting up and configuring various hardware solutions and the related software coding effort is presented.
Large scale and cloud-based multi-model analytics experiments on climate change data in the Earth System Grid Federation

NASA Astrophysics Data System (ADS)

Fiore, Sandro; Płóciennik, Marcin; Doutriaux, Charles; Blanquer, Ignacio; Barbera, Roberto; Donvito, Giacinto; Williams, Dean N.; Anantharaj, Valentine; Salomoni, Davide D.; Aloisio, Giovanni

2017-04-01

In many scientific domains such as climate, data is often n-dimensional and requires tools that support specialized data types and primitives to be properly stored, accessed, analysed and visualized. Moreover, new challenges arise in large-scale scenarios and eco-systems where petabytes (PB) of data can be available and data can be distributed and/or replicated, such as the Earth System Grid Federation (ESGF) serving the Coupled Model Intercomparison Project, Phase 5 (CMIP5) experiment, providing access to 2.5PB of data for the Intergovernmental Panel on Climate Change (IPCC) Fifth Assessment Report (AR5). A case study on climate models intercomparison data analysis addressing several classes of multi-model experiments is being implemented in the context of the EU H2020 INDIGO-DataCloud project. Such experiments require the availability of large amount of data (multi-terabyte order) related to the output of several climate models simulations as well as the exploitation of scientific data management tools for large-scale data analytics. More specifically, the talk discusses in detail a use case on precipitation trend analysis in terms of requirements, architectural design solution, and infrastructural implementation. The experiment has been tested and validated on CMIP5 datasets, in the context of a large scale distributed testbed across EU and US involving three ESGF sites (LLNL, ORNL, and CMCC) and one central orchestrator site (PSNC). The general "environment" of the case study relates to: (i) multi-model data analysis inter-comparison challenges; (ii) addressed on CMIP5 data; and (iii) which are made available through the IS-ENES/ESGF infrastructure. The added value of the solution proposed in the INDIGO-DataCloud project are summarized in the following: (i) it implements a different paradigm (from client- to server-side); (ii) it intrinsically reduces data movement; (iii) it makes lightweight the end-user setup; (iv) it fosters re-usability (of data, final/intermediate products, workflows, sessions, etc.) since everything is managed on the server-side; (v) it complements, extends and interoperates with the ESGF stack; (vi) it provides a "tool" for scientists to run multi-model experiments, and finally; and (vii) it can drastically reduce the time-to-solution for these experiments from weeks to hours. At the time the contribution is being written, the proposed testbed represents the first concrete implementation of a distributed multi-model experiment in the ESGF/CMIP context joining server-side and parallel processing, end-to-end workflow management and cloud computing. As opposed to the current scenario based on search & discovery, data download, and client-based data analysis, the INDIGO-DataCloud architectural solution described in this contribution addresses the scientific computing & analytics requirements by providing a paradigm shift based on server-side and high performance big data frameworks jointly with two-level workflow management systems realized at the PaaS level via a cloud infrastructure.
NeuroManager: a workflow analysis based simulation management engine for computational neuroscience

PubMed Central

Stockton, David B.; Santamaria, Fidel

2015-01-01

We developed NeuroManager, an object-oriented simulation management software engine for computational neuroscience. NeuroManager automates the workflow of simulation job submissions when using heterogeneous computational resources, simulators, and simulation tasks. The object-oriented approach (1) provides flexibility to adapt to a variety of neuroscience simulators, (2) simplifies the use of heterogeneous computational resources, from desktops to super computer clusters, and (3) improves tracking of simulator/simulation evolution. We implemented NeuroManager in MATLAB, a widely used engineering and scientific language, for its signal and image processing tools, prevalence in electrophysiology analysis, and increasing use in college Biology education. To design and develop NeuroManager we analyzed the workflow of simulation submission for a variety of simulators, operating systems, and computational resources, including the handling of input parameters, data, models, results, and analyses. This resulted in 22 stages of simulation submission workflow. The software incorporates progress notification, automatic organization, labeling, and time-stamping of data and results, and integrated access to MATLAB's analysis and visualization tools. NeuroManager provides users with the tools to automate daily tasks, and assists principal investigators in tracking and recreating the evolution of research projects performed by multiple people. Overall, NeuroManager provides the infrastructure needed to improve workflow, manage multiple simultaneous simulations, and maintain provenance of the potentially large amounts of data produced during the course of a research project. PMID:26528175
NeuroManager: a workflow analysis based simulation management engine for computational neuroscience.

PubMed

Stockton, David B; Santamaria, Fidel

2015-01-01

We developed NeuroManager, an object-oriented simulation management software engine for computational neuroscience. NeuroManager automates the workflow of simulation job submissions when using heterogeneous computational resources, simulators, and simulation tasks. The object-oriented approach (1) provides flexibility to adapt to a variety of neuroscience simulators, (2) simplifies the use of heterogeneous computational resources, from desktops to super computer clusters, and (3) improves tracking of simulator/simulation evolution. We implemented NeuroManager in MATLAB, a widely used engineering and scientific language, for its signal and image processing tools, prevalence in electrophysiology analysis, and increasing use in college Biology education. To design and develop NeuroManager we analyzed the workflow of simulation submission for a variety of simulators, operating systems, and computational resources, including the handling of input parameters, data, models, results, and analyses. This resulted in 22 stages of simulation submission workflow. The software incorporates progress notification, automatic organization, labeling, and time-stamping of data and results, and integrated access to MATLAB's analysis and visualization tools. NeuroManager provides users with the tools to automate daily tasks, and assists principal investigators in tracking and recreating the evolution of research projects performed by multiple people. Overall, NeuroManager provides the infrastructure needed to improve workflow, manage multiple simultaneous simulations, and maintain provenance of the potentially large amounts of data produced during the course of a research project.
Nexus: A modular workflow management system for quantum simulation codes

NASA Astrophysics Data System (ADS)

Krogel, Jaron T.

2016-01-01

The management of simulation workflows represents a significant task for the individual computational researcher. Automation of the required tasks involved in simulation work can decrease the overall time to solution and reduce sources of human error. A new simulation workflow management system, Nexus, is presented to address these issues. Nexus is capable of automated job management on workstations and resources at several major supercomputing centers. Its modular design allows many quantum simulation codes to be supported within the same framework. Current support includes quantum Monte Carlo calculations with QMCPACK, density functional theory calculations with Quantum Espresso or VASP, and quantum chemical calculations with GAMESS. Users can compose workflows through a transparent, text-based interface, resembling the input file of a typical simulation code. A usage example is provided to illustrate the process.

Biowep: a workflow enactment portal for bioinformatics applications.

PubMed

Romano, Paolo; Bartocci, Ezio; Bertolini, Guglielmo; De Paoli, Flavio; Marra, Domenico; Mauri, Giancarlo; Merelli, Emanuela; Milanesi, Luciano

2007-03-08

The huge amount of biological information, its distribution over the Internet and the heterogeneity of available software tools makes the adoption of new data integration and analysis network tools a necessity in bioinformatics. ICT standards and tools, like Web Services and Workflow Management Systems (WMS), can support the creation and deployment of such systems. Many Web Services are already available and some WMS have been proposed. They assume that researchers know which bioinformatics resources can be reached through a programmatic interface and that they are skilled in programming and building workflows. Therefore, they are not viable to the majority of unskilled researchers. A portal enabling these to take profit from new technologies is still missing. We designed biowep, a web based client application that allows for the selection and execution of a set of predefined workflows. The system is available on-line. Biowep architecture includes a Workflow Manager, a User Interface and a Workflow Executor. The task of the Workflow Manager is the creation and annotation of workflows. These can be created by using either the Taverna Workbench or BioWMS. Enactment of workflows is carried out by FreeFluo for Taverna workflows and by BioAgent/Hermes, a mobile agent-based middleware, for BioWMS ones. Main workflows' processing steps are annotated on the basis of their input and output, elaboration type and application domain by using a classification of bioinformatics data and tasks. The interface supports users authentication and profiling. Workflows can be selected on the basis of users' profiles and can be searched through their annotations. Results can be saved. We developed a web system that support the selection and execution of predefined workflows, thus simplifying access for all researchers. The implementation of Web Services allowing specialized software to interact with an exhaustive set of biomedical databases and analysis software and the creation of effective workflows can significantly improve automation of in-silico analysis. Biowep is available for interested researchers as a reference portal. They are invited to submit their workflows to the workflow repository. Biowep is further being developed in the sphere of the Laboratory of Interdisciplinary Technologies in Bioinformatics - LITBIO.
Biowep: a workflow enactment portal for bioinformatics applications

PubMed Central

Romano, Paolo; Bartocci, Ezio; Bertolini, Guglielmo; De Paoli, Flavio; Marra, Domenico; Mauri, Giancarlo; Merelli, Emanuela; Milanesi, Luciano

2007-01-01

Background The huge amount of biological information, its distribution over the Internet and the heterogeneity of available software tools makes the adoption of new data integration and analysis network tools a necessity in bioinformatics. ICT standards and tools, like Web Services and Workflow Management Systems (WMS), can support the creation and deployment of such systems. Many Web Services are already available and some WMS have been proposed. They assume that researchers know which bioinformatics resources can be reached through a programmatic interface and that they are skilled in programming and building workflows. Therefore, they are not viable to the majority of unskilled researchers. A portal enabling these to take profit from new technologies is still missing. Results We designed biowep, a web based client application that allows for the selection and execution of a set of predefined workflows. The system is available on-line. Biowep architecture includes a Workflow Manager, a User Interface and a Workflow Executor. The task of the Workflow Manager is the creation and annotation of workflows. These can be created by using either the Taverna Workbench or BioWMS. Enactment of workflows is carried out by FreeFluo for Taverna workflows and by BioAgent/Hermes, a mobile agent-based middleware, for BioWMS ones. Main workflows' processing steps are annotated on the basis of their input and output, elaboration type and application domain by using a classification of bioinformatics data and tasks. The interface supports users authentication and profiling. Workflows can be selected on the basis of users' profiles and can be searched through their annotations. Results can be saved. Conclusion We developed a web system that support the selection and execution of predefined workflows, thus simplifying access for all researchers. The implementation of Web Services allowing specialized software to interact with an exhaustive set of biomedical databases and analysis software and the creation of effective workflows can significantly improve automation of in-silico analysis. Biowep is available for interested researchers as a reference portal. They are invited to submit their workflows to the workflow repository. Biowep is further being developed in the sphere of the Laboratory of Interdisciplinary Technologies in Bioinformatics – LITBIO. PMID:17430563
Content and Workflow Management for Library Websites: Case Studies

ERIC Educational Resources Information Center

Yu, Holly, Ed.

2005-01-01

Using database-driven web pages or web content management (WCM) systems to manage increasingly diverse web content and to streamline workflows is a commonly practiced solution recognized in libraries today. However, limited library web content management models and funding constraints prevent many libraries from purchasing commercially available…
Scientific Workflow Management in Proteomics

PubMed Central

de Bruin, Jeroen S.; Deelder, André M.; Palmblad, Magnus

2012-01-01

Data processing in proteomics can be a challenging endeavor, requiring extensive knowledge of many different software packages, all with different algorithms, data format requirements, and user interfaces. In this article we describe the integration of a number of existing programs and tools in Taverna Workbench, a scientific workflow manager currently being developed in the bioinformatics community. We demonstrate how a workflow manager provides a single, visually clear and intuitive interface to complex data analysis tasks in proteomics, from raw mass spectrometry data to protein identifications and beyond. PMID:22411703
Flexible Workflow Software enables the Management of an Increased Volume and Heterogeneity of Sensors, and evolves with the Expansion of Complex Ocean Observatory Infrastructures.

NASA Astrophysics Data System (ADS)

Tomlin, M. C.; Jenkyns, R.

2015-12-01

Ocean Networks Canada (ONC) collects data from observatories in the northeast Pacific, Salish Sea, Arctic Ocean, Atlantic Ocean, and land-based sites in British Columbia. Data are streamed, collected autonomously, or transmitted via satellite from a variety of instruments. The Software Engineering group at ONC develops and maintains Oceans 2.0, an in-house software system that acquires and archives data from sensors, and makes data available to scientists, the public, government and non-government agencies. The Oceans 2.0 workflow tool was developed by ONC to manage a large volume of tasks and processes required for instrument installation, recovery and maintenance activities. Since 2013, the workflow tool has supported 70 expeditions and grown to include 30 different workflow processes for the increasing complexity of infrastructures at ONC. The workflow tool strives to keep pace with an increasing heterogeneity of sensors, connections and environments by supporting versioning of existing workflows, and allowing the creation of new processes and tasks. Despite challenges in training and gaining mutual support from multidisciplinary teams, the workflow tool has become invaluable in project management in an innovative setting. It provides a collective place to contribute to ONC's diverse projects and expeditions and encourages more repeatable processes, while promoting interactions between the multidisciplinary teams who manage various aspects of instrument development and the data they produce. The workflow tool inspires documentation of terminologies and procedures, and effectively links to other tools at ONC such as JIRA, Alfresco and Wiki. Motivated by growing sensor schemes, modes of collecting data, archiving, and data distribution at ONC, the workflow tool ensures that infrastructure is managed completely from instrument purchase to data distribution. It integrates all areas of expertise and helps fulfill ONC's mandate to offer quality data to users.
Workflow Automation: A Collective Case Study

ERIC Educational Resources Information Center

Harlan, Jennifer

2013-01-01

Knowledge management has proven to be a sustainable competitive advantage for many organizations. Knowledge management systems are abundant, with multiple functionalities. The literature reinforces the use of workflow automation with knowledge management systems to benefit organizations; however, it was not known if process automation yielded…
Bringing the CMS distributed computing system into scalable operations

NASA Astrophysics Data System (ADS)

Belforte, S.; Fanfani, A.; Fisk, I.; Flix, J.; Hernández, J. M.; Kress, T.; Letts, J.; Magini, N.; Miccio, V.; Sciabà, A.

2010-04-01

Establishing efficient and scalable operations of the CMS distributed computing system critically relies on the proper integration, commissioning and scale testing of the data and workload management tools, the various computing workflows and the underlying computing infrastructure, located at more than 50 computing centres worldwide and interconnected by the Worldwide LHC Computing Grid. Computing challenges periodically undertaken by CMS in the past years with increasing scale and complexity have revealed the need for a sustained effort on computing integration and commissioning activities. The Processing and Data Access (PADA) Task Force was established at the beginning of 2008 within the CMS Computing Program with the mandate of validating the infrastructure for organized processing and user analysis including the sites and the workload and data management tools, validating the distributed production system by performing functionality, reliability and scale tests, helping sites to commission, configure and optimize the networking and storage through scale testing data transfers and data processing, and improving the efficiency of accessing data across the CMS computing system from global transfers to local access. This contribution reports on the tools and procedures developed by CMS for computing commissioning and scale testing as well as the improvements accomplished towards efficient, reliable and scalable computing operations. The activities include the development and operation of load generators for job submission and data transfers with the aim of stressing the experiment and Grid data management and workload management systems, site commissioning procedures and tools to monitor and improve site availability and reliability, as well as activities targeted to the commissioning of the distributed production, user analysis and monitoring systems.
Coarsening of three-dimensional structured and unstructured grids for subsurface flow

NASA Astrophysics Data System (ADS)

Aarnes, Jørg Espen; Hauge, Vera Louise; Efendiev, Yalchin

2007-11-01

We present a generic, semi-automated algorithm for generating non-uniform coarse grids for modeling subsurface flow. The method is applicable to arbitrary grids and does not impose smoothness constraints on the coarse grid. One therefore avoids conventional smoothing procedures that are commonly used to ensure that the grids obtained with standard coarsening procedures are not too rough. The coarsening algorithm is very simple and essentially involves only two parameters that specify the level of coarsening. Consequently the algorithm allows the user to specify the simulation grid dynamically to fit available computer resources, and, e.g., use the original geomodel as input for flow simulations. This is of great importance since coarse grid-generation is normally the most time-consuming part of an upscaling phase, and therefore the main obstacle that has prevented simulation workflows with user-defined resolution. We apply the coarsening algorithm to a series of two-phase flow problems on both structured (Cartesian) and unstructured grids. The numerical results demonstrate that one consistently obtains significantly more accurate results using the proposed non-uniform coarsening strategy than with corresponding uniform coarse grids with roughly the same number of cells.
Alchemist multimodal workflows for diabetic retinopathy research, disease prevention and investigational drug discovery.

PubMed

Riposan, Adina; Taylor, Ian; Owens, David R; Rana, Omer; Conley, Edward C

2007-01-01

In this paper we present mechanisms for imaging and spectral data discovery, as applied to the early detection of pathologic mechanisms underlying diabetic retinopathy in research and clinical trial scenarios. We discuss the Alchemist framework, built using a generic peer-to-peer architecture, supporting distributed database queries and complex search algorithms based on workflow. The Alchemist is a domain-independent search mechanism that can be applied to search and data discovery scenarios in many areas. We illustrate Alchemist's ability to perform complex searches composed as a collection of peer-to-peer overlays, Grid-based services and workflows, e.g. applied to image and spectral data discovery, as applied to the early detection and prevention of retinal disease and investigational drug discovery. The Alchemist framework is built on top of decentralised technologies and uses industry standards such as Web services and SOAP for messaging.
Nexus: a modular workflow management system for quantum simulation codes

DOE PAGES

Krogel, Jaron T.

2015-08-24

The management of simulation workflows is a significant task for the individual computational researcher. Automation of the required tasks involved in simulation work can decrease the overall time to solution and reduce sources of human error. A new simulation workflow management system, Nexus, is presented to address these issues. Nexus is capable of automated job management on workstations and resources at several major supercomputing centers. Its modular design allows many quantum simulation codes to be supported within the same framework. Current support includes quantum Monte Carlo calculations with QMCPACK, density functional theory calculations with Quantum Espresso or VASP, and quantummore » chemical calculations with GAMESS. Users can compose workflows through a transparent, text-based interface, resembling the input file of a typical simulation code. A usage example is provided to illustrate the process.« less
JMS: An Open Source Workflow Management System and Web-Based Cluster Front-End for High Performance Computing.

PubMed

Brown, David K; Penkler, David L; Musyoka, Thommas M; Bishop, Özlem Tastan

2015-01-01

Complex computational pipelines are becoming a staple of modern scientific research. Often these pipelines are resource intensive and require days of computing time. In such cases, it makes sense to run them over high performance computing (HPC) clusters where they can take advantage of the aggregated resources of many powerful computers. In addition to this, researchers often want to integrate their workflows into their own web servers. In these cases, software is needed to manage the submission of jobs from the web interface to the cluster and then return the results once the job has finished executing. We have developed the Job Management System (JMS), a workflow management system and web interface for high performance computing (HPC). JMS provides users with a user-friendly web interface for creating complex workflows with multiple stages. It integrates this workflow functionality with the resource manager, a tool that is used to control and manage batch jobs on HPC clusters. As such, JMS combines workflow management functionality with cluster administration functionality. In addition, JMS provides developer tools including a code editor and the ability to version tools and scripts. JMS can be used by researchers from any field to build and run complex computational pipelines and provides functionality to include these pipelines in external interfaces. JMS is currently being used to house a number of bioinformatics pipelines at the Research Unit in Bioinformatics (RUBi) at Rhodes University. JMS is an open-source project and is freely available at https://github.com/RUBi-ZA/JMS.
JMS: An Open Source Workflow Management System and Web-Based Cluster Front-End for High Performance Computing

PubMed Central

Brown, David K.; Penkler, David L.; Musyoka, Thommas M.; Bishop, Özlem Tastan

2015-01-01

Complex computational pipelines are becoming a staple of modern scientific research. Often these pipelines are resource intensive and require days of computing time. In such cases, it makes sense to run them over high performance computing (HPC) clusters where they can take advantage of the aggregated resources of many powerful computers. In addition to this, researchers often want to integrate their workflows into their own web servers. In these cases, software is needed to manage the submission of jobs from the web interface to the cluster and then return the results once the job has finished executing. We have developed the Job Management System (JMS), a workflow management system and web interface for high performance computing (HPC). JMS provides users with a user-friendly web interface for creating complex workflows with multiple stages. It integrates this workflow functionality with the resource manager, a tool that is used to control and manage batch jobs on HPC clusters. As such, JMS combines workflow management functionality with cluster administration functionality. In addition, JMS provides developer tools including a code editor and the ability to version tools and scripts. JMS can be used by researchers from any field to build and run complex computational pipelines and provides functionality to include these pipelines in external interfaces. JMS is currently being used to house a number of bioinformatics pipelines at the Research Unit in Bioinformatics (RUBi) at Rhodes University. JMS is an open-source project and is freely available at https://github.com/RUBi-ZA/JMS. PMID:26280450
An Integrated Cyberenvironment for Event-Driven Environmental Observatory Research and Education

NASA Astrophysics Data System (ADS)

Myers, J.; Minsker, B.; Butler, R.

2006-12-01

National environmental observatories will soon provide large-scale data from diverse sensor networks and community models. While much attention is focused on piping data from sensors to archives and users, truly integrating these resources into the everyday research activities of scientists and engineers across the community, and enabling their results and innovations to be brought back into the observatory, also critical to long-term success of the observatories, is often neglected. This talk will give an overview of the Environmental Cyberinfrastructure Demonstrator (ECID) Cyberenvironment for observatory-centric environmental research and education, under development at the National Center for Supercomputing Applications (NCSA), which is designed to address these issues. Cyberenvironments incorporate collaboratory and grid technologies, web services, and other cyberinfrastructure into an overall framework that balances needs for efficient coordination and the ability to innovate. They are designed to support the full scientific lifecycle both in terms of individual experiments moving from data to workflows to publication and at the macro level where new discoveries lead to additional data, models, tools, and conceptual frameworks that augment and evolve community-scale systems such as observatories. The ECID cyberenvironment currently integrates five major components a collaborative portal, workflow engine, event manager, metadata repository, and social network personalization capabilities - that have novel features inspired by the Cyberenvironment concept and enabling powerful environmental research scenarios. A summary of these components and the overall cyberenvironment will be given in this talk, while other posters will give details on several of the components. The summary will be presented within the context of environmental use case scenarios created in collaboration with researchers from the WATERS (WATer and Environmental Research Systems) Network, a joint National Science Foundation-funded initiative of the hydrology and environmental engineering communities. The use case scenarios include identifying sensor anomalies in point- and streaming sensor data and notifying data managers in near-real time; and referring users of data or data products (e.g., workflows, publications) to related data or data products.
A Tool Supporting Collaborative Data Analytics Workflow Design and Management

NASA Astrophysics Data System (ADS)

Zhang, J.; Bao, Q.; Lee, T. J.

2016-12-01

Collaborative experiment design could significantly enhance the sharing and adoption of the data analytics algorithms and models emerged in Earth science. Existing data-oriented workflow tools, however, are not suitable to support collaborative design of such a workflow, to name a few, to support real-time co-design; to track how a workflow evolves over time based on changing designs contributed by multiple Earth scientists; and to capture and retrieve collaboration knowledge on workflow design (discussions that lead to a design). To address the aforementioned challenges, we have designed and developed a technique supporting collaborative data-oriented workflow composition and management, as a key component toward supporting big data collaboration through the Internet. Reproducibility and scalability are two major targets demanding fundamental infrastructural support. One outcome of the project os a software tool, supporting an elastic number of groups of Earth scientists to collaboratively design and compose data analytics workflows through the Internet. Instead of recreating the wheel, we have extended an existing workflow tool VisTrails into an online collaborative environment as a proof of concept.
International Symposium on Grids and Clouds (ISGC) 2016

NASA Astrophysics Data System (ADS)

The International Symposium on Grids and Clouds (ISGC) 2016 will be held at Academia Sinica in Taipei, Taiwan from 13-18 March 2016, with co-located events and workshops. The conference is hosted by the Academia Sinica Grid Computing Centre (ASGC). The theme of ISGC 2016 focuses on“Ubiquitous e-infrastructures and Applications”. Contemporary research is impossible without a strong IT component - researchers rely on the existence of stable and widely available e-infrastructures and their higher level functions and properties. As a result of these expectations, e-Infrastructures are becoming ubiquitous, providing an environment that supports large scale collaborations that deal with global challenges as well as smaller and temporal research communities focusing on particular scientific problems. To support those diversified communities and their needs, the e-Infrastructures themselves are becoming more layered and multifaceted, supporting larger groups of applications. Following the call for the last year conference, ISGC 2016 continues its aim to bring together users and application developers with those responsible for the development and operation of multi-purpose ubiquitous e-Infrastructures. Topics of discussion include Physics (including HEP) and Engineering Applications, Biomedicine & Life Sciences Applications, Earth & Environmental Sciences & Biodiversity Applications, Humanities, Arts, and Social Sciences (HASS) Applications, Virtual Research Environment (including Middleware, tools, services, workflow, etc.), Data Management, Big Data, Networking & Security, Infrastructure & Operations, Infrastructure Clouds and Virtualisation, Interoperability, Business Models & Sustainability, Highly Distributed Computing Systems, and High Performance & Technical Computing (HPTC), etc.
Lessons from implementing a combined workflow-informatics system for diabetes management.

PubMed

Zai, Adrian H; Grant, Richard W; Estey, Greg; Lester, William T; Andrews, Carl T; Yee, Ronnie; Mort, Elizabeth; Chueh, Henry C

2008-01-01

Shortcomings surrounding the care of patients with diabetes have been attributed largely to a fragmented, disorganized, and duplicative health care system that focuses more on acute conditions and complications than on managing chronic disease. To address these shortcomings, we developed a diabetes registry population management application to change the way our staff manages patients with diabetes. Use of this new application has helped us coordinate the responsibilities for intervening and monitoring patients in the registry among different users. Our experiences using this combined workflow-informatics intervention system suggest that integrating a chronic disease registry into clinical workflow for the treatment of chronic conditions creates a useful and efficient tool for managing disease.
Quality Metadata Management for Geospatial Scientific Workflows: from Retrieving to Assessing with Online Tools

NASA Astrophysics Data System (ADS)

Leibovici, D. G.; Pourabdollah, A.; Jackson, M.

2011-12-01

Experts and decision-makers use or develop models to monitor global and local changes of the environment. Their activities require the combination of data and processing services in a flow of operations and spatial data computations: a geospatial scientific workflow. The seamless ability to generate, re-use and modify a geospatial scientific workflow is an important requirement but the quality of outcomes is equally much important [1]. Metadata information attached to the data and processes, and particularly their quality, is essential to assess the reliability of the scientific model that represents a workflow [2]. Managing tools, dealing with qualitative and quantitative metadata measures of the quality associated with a workflow, are, therefore, required for the modellers. To ensure interoperability, ISO and OGC standards [3] are to be adopted, allowing for example one to define metadata profiles and to retrieve them via web service interfaces. However these standards need a few extensions when looking at workflows, particularly in the context of geoprocesses metadata. We propose to fill this gap (i) at first through the provision of a metadata profile for the quality of processes, and (ii) through providing a framework, based on XPDL [4], to manage the quality information. Web Processing Services are used to implement a range of metadata analyses on the workflow in order to evaluate and present quality information at different levels of the workflow. This generates the metadata quality, stored in the XPDL file. The focus is (a) on the visual representations of the quality, summarizing the retrieved quality information either from the standardized metadata profiles of the components or from non-standard quality information e.g., Web 2.0 information, and (b) on the estimated qualities of the outputs derived from meta-propagation of uncertainties (a principle that we have introduced [5]). An a priori validation of the future decision-making supported by the outputs of the workflow once run, is then provided using the meta-propagated qualities, obtained without running the workflow [6], together with the visualization pointing out the need to improve the workflow with better data or better processes on the workflow graph itself. [1] Leibovici, DG, Hobona, G Stock, K Jackson, M (2009) Qualifying geospatial workfow models for adaptive controlled validity and accuracy. In: IEEE 17th GeoInformatics, 1-5 [2] Leibovici, DG, Pourabdollah, A (2010a) Workflow Uncertainty using a Metamodel Framework and Metadata for Data and Processes. OGC TC/PC Meetings, September 2010, Toulouse, France [3] OGC (2011) www.opengeospatial.org [4] XPDL (2008) Workflow Process Definition Interface - XML Process Definition Language.Workflow Management Coalition, Document WfMC-TC-1025, 2008 [5] Leibovici, DG Pourabdollah, A Jackson, M (2011) Meta-propagation of Uncertainties for Scientific Workflow Management in Interoperable Spatial Data Infrastructures. In: Proceedings of the European Geosciences Union (EGU2011), April 2011, Austria [6] Pourabdollah, A Leibovici, DG Jackson, M (2011) MetaPunT: an Open Source tool for Meta-Propagation of uncerTainties in Geospatial Processing. In: Proceedings of OSGIS2011, June 2011, Nottingham, UK
Selective Capture of Histidine-tagged Proteins from Cell Lysates Using TEM grids Modified with NTA-Graphene Oxide

NASA Astrophysics Data System (ADS)

Benjamin, Christopher J.; Wright, Kyle J.; Bolton, Scott C.; Hyun, Seok-Hee; Krynski, Kyle; Grover, Mahima; Yu, Guimei; Guo, Fei; Kinzer-Ursem, Tamara L.; Jiang, Wen; Thompson, David H.

2016-10-01

We report the fabrication of transmission electron microscopy (TEM) grids bearing graphene oxide (GO) sheets that have been modified with Nα, Nα-dicarboxymethyllysine (NTA) and deactivating agents to block non-selective binding between GO-NTA sheets and non-target proteins. The resulting GO-NTA-coated grids with these improved antifouling properties were then used to isolate His6-T7 bacteriophage and His6-GroEL directly from cell lysates. To demonstrate the utility and simplified workflow enabled by these grids, we performed cryo-electron microscopy (cryo-EM) of His6-GroEL obtained from clarified E. coli lysates. Single particle analysis produced a 3D map with a gold standard resolution of 8.1 Å. We infer from these findings that TEM grids modified with GO-NTA are a useful tool that reduces background and improves both the speed and simplicity of biological sample preparation for high-resolution structure elucidation by cryo-EM.
Selective Capture of Histidine-tagged Proteins from Cell Lysates Using TEM grids Modified with NTA-Graphene Oxide.

PubMed

Benjamin, Christopher J; Wright, Kyle J; Bolton, Scott C; Hyun, Seok-Hee; Krynski, Kyle; Grover, Mahima; Yu, Guimei; Guo, Fei; Kinzer-Ursem, Tamara L; Jiang, Wen; Thompson, David H

2016-10-17

We report the fabrication of transmission electron microscopy (TEM) grids bearing graphene oxide (GO) sheets that have been modified with N α , N α -dicarboxymethyllysine (NTA) and deactivating agents to block non-selective binding between GO-NTA sheets and non-target proteins. The resulting GO-NTA-coated grids with these improved antifouling properties were then used to isolate His 6 -T7 bacteriophage and His 6 -GroEL directly from cell lysates. To demonstrate the utility and simplified workflow enabled by these grids, we performed cryo-electron microscopy (cryo-EM) of His 6 -GroEL obtained from clarified E. coli lysates. Single particle analysis produced a 3D map with a gold standard resolution of 8.1 Å. We infer from these findings that TEM grids modified with GO-NTA are a useful tool that reduces background and improves both the speed and simplicity of biological sample preparation for high-resolution structure elucidation by cryo-EM.
The architecture of a virtual grid GIS server

NASA Astrophysics Data System (ADS)

Wu, Pengfei; Fang, Yu; Chen, Bin; Wu, Xi; Tian, Xiaoting

2008-10-01

The grid computing technology provides the service oriented architecture for distributed applications. The virtual Grid GIS server is the distributed and interoperable enterprise application GIS architecture running in the grid environment, which integrates heterogeneous GIS platforms. All sorts of legacy GIS platforms join the grid as members of GIS virtual organization. Based on Microkernel we design the ESB and portal GIS service layer, which compose Microkernel GIS. Through web portals, portal GIS services and mediation of service bus, following the principle of SoC, we separate business logic from implementing logic. Microkernel GIS greatly reduces the coupling degree between applications and GIS platforms. The enterprise applications are independent of certain GIS platforms, and making the application developers to pay attention to the business logic. Via configuration and orchestration of a set of fine-grained services, the system creates GIS Business, which acts as a whole WebGIS request when activated. In this way, the system satisfies a business workflow directly and simply, with little or no new code.

The CMS dataset bookkeeping service

NASA Astrophysics Data System (ADS)

Afaq, A.; Dolgert, A.; Guo, Y.; Jones, C.; Kosyakov, S.; Kuznetsov, V.; Lueking, L.; Riley, D.; Sekhri, V.

2008-07-01

The CMS Dataset Bookkeeping Service (DBS) has been developed to catalog all CMS event data from Monte Carlo and Detector sources. It provides the ability to identify MC or trigger source, track data provenance, construct datasets for analysis, and discover interesting data. CMS requires processing and analysis activities at various service levels and the DBS system provides support for localized processing or private analysis, as well as global access for CMS users at large. Catalog entries can be moved among the various service levels with a simple set of migration tools, thus forming a loose federation of databases. DBS is available to CMS users via a Python API, Command Line, and a Discovery web page interfaces. The system is built as a multi-tier web application with Java servlets running under Tomcat, with connections via JDBC to Oracle or MySQL database backends. Clients connect to the service through HTTP or HTTPS with authentication provided by GRID certificates and authorization through VOMS. DBS is an integral part of the overall CMS Data Management and Workflow Management systems.
CMS distributed data analysis with CRAB3

NASA Astrophysics Data System (ADS)

Mascheroni, M.; Balcas, J.; Belforte, S.; Bockelman, B. P.; Hernandez, J. M.; Ciangottini, D.; Konstantinov, P. B.; Silva, J. M. D.; Ali, M. A. B. M.; Melo, A. M.; Riahi, H.; Tanasijczuk, A. J.; Yusli, M. N. B.; Wolf, M.; Woodard, A. E.; Vaandering, E.

2015-12-01

The CMS Remote Analysis Builder (CRAB) is a distributed workflow management tool which facilitates analysis tasks by isolating users from the technical details of the Grid infrastructure. Throughout LHC Run 1, CRAB has been successfully employed by an average of 350 distinct users each week executing about 200,000 jobs per day. CRAB has been significantly upgraded in order to face the new challenges posed by LHC Run 2. Components of the new system include 1) a lightweight client, 2) a central primary server which communicates with the clients through a REST interface, 3) secondary servers which manage user analysis tasks and submit jobs to the CMS resource provisioning system, and 4) a central service to asynchronously move user data from temporary storage in the execution site to the desired storage location. The new system improves the robustness, scalability and sustainability of the service. Here we provide an overview of the new system, operation, and user support, report on its current status, and identify lessons learned from the commissioning phase and production roll-out.
Workflow computing. Improving management and efficiency of pathology diagnostic services.

PubMed

Buffone, G J; Moreau, D; Beck, J R

1996-04-01

Traditionally, information technology in health care has helped practitioners to collect, store, and present information and also to add a degree of automation to simple tasks (instrument interfaces supporting result entry, for example). Thus commercially available information systems do little to support the need to model, execute, monitor, coordinate, and revise the various complex clinical processes required to support health-care delivery. Workflow computing, which is already implemented and improving the efficiency of operations in several nonmedical industries, can address the need to manage complex clinical processes. Workflow computing not only provides a means to define and manage the events, roles, and information integral to health-care delivery but also supports the explicit implementation of policy or rules appropriate to the process. This article explains how workflow computing may be applied to health-care and the inherent advantages of the technology, and it defines workflow system requirements for use in health-care delivery with special reference to diagnostic pathology.
Integration of virtualized worker nodes in standard batch systems

NASA Astrophysics Data System (ADS)

Büge, Volker; Hessling, Hermann; Kemp, Yves; Kunze, Marcel; Oberst, Oliver; Quast, Günter; Scheurer, Armin; Synge, Owen

2010-04-01

Current experiments in HEP only use a limited number of operating system flavours. Their software might only be validated on one single OS platform. Resource providers might have other operating systems of choice for the installation of the batch infrastructure. This is especially the case if a cluster is shared with other communities, or communities that have stricter security requirements. One solution would be to statically divide the cluster into separated sub-clusters. In such a scenario, no opportunistic distribution of the load can be achieved, resulting in a poor overall utilization efficiency. Another approach is to make the batch system aware of virtualization, and to provide each community with its favoured operating system in a virtual machine. Here, the scheduler has full flexibility, resulting in a better overall efficiency of the resources. In our contribution, we present a lightweight concept for the integration of virtual worker nodes into standard batch systems. The virtual machines are started on the worker nodes just before jobs are executed there. No meta-scheduling is introduced. We demonstrate two prototype implementations, one based on the Sun Grid Engine (SGE), the other using Maui/Torque as a batch system. Both solutions support local job as well as Grid job submission. The hypervisors currently used are Xen and KVM, a port to another system is easily envisageable. To better handle different virtual machines on the physical host, the management solution VmImageManager is developed. We will present first experience from running the two prototype implementations. In a last part, we will show the potential future use of this lightweight concept when integrated into high-level (i.e. Grid) work-flows.
Linked Environments for Atmospheric Discovery (LEAD): A Cyberinfrastructure for Mesoscale Meteorology Research and Education

NASA Astrophysics Data System (ADS)

Droegemeier, K.

2004-12-01

A new National Science Foundation Large Information Technology Research (ITR) grant - known as Linked Environments for Atmospheric Discovery (LEAD) - has been funded to facilitate the identification, access, preparation, assimilation, prediction, management, analysis, mining, and visualization of a broad array of meteorological data and model output, independent of format and physical location. A transforming element of LEAD is dynamic workflow orchestration and data management, which will allow use of analysis tools, forecast models, and data repositories as dynamically adaptive, on-demand systems that can a) change configuration rapidly and automatically in response to weather; b) continually be steered by new data; c) respond to decision-driven inputs from users; d) initiate other processes automatically; and e) steer remote observing technologies to optimize data collection for the problem at hand. Having been in operation for slightly more than a year, LEAD has created a technology roadmap and architecture for developing its capabilities and placing them within the academic and research environment. Further, much of the LEAD infrastructure being developed for the WRF model, particularly workflow orchestration, will play a significant role in the nascent WRF Developmental Test Bed Center located at NCAR. This paper updates the status of LEAD (e.g., the topics noted above), its ties with other community activities (e.g., CONDUIT, THREDDS, MADIS, NOMADS), and the manner in which LEAD technologies will be made available for general use. Each component LEAD application is being created as a standards-based Web service that can be run in stand-alone configuration or chained together to build an end-to-end environment for on-demand, real time NWP. We describe in this paper the concepts, implementation plans, and expected impacts of LEAD, the underpinning of which will be a series of interconnected, heterogeneous virtual IT "Grid environments" designed to provide a complete framework for mesoscale meteorology research and education. A set of Integrated Grid and Web Services Testbeds will maintain a rolling archive of several months of recent data, provide tools for operating on them, and serve as an infrastructure (i.e., a mini Grid) for developing distributed Web services capabilities. Education Testbeds will integrate education and outreach throughout the entire LEAD program, and will help shape LEAD research into applications that are congruent with pedagogic requirements, national standards, and evaluation metrics. Ultimately, the LEAD environments will enable researchers, educators, and students to run atmospheric models and other tools in much more realistic, real time settings than is now possible, with emphasis on the use of locally or otherwise uniquely available data.
Enabling Efficient Climate Science Workflows in High Performance Computing Environments

NASA Astrophysics Data System (ADS)

Krishnan, H.; Byna, S.; Wehner, M. F.; Gu, J.; O'Brien, T. A.; Loring, B.; Stone, D. A.; Collins, W.; Prabhat, M.; Liu, Y.; Johnson, J. N.; Paciorek, C. J.

2015-12-01

A typical climate science workflow often involves a combination of acquisition of data, modeling, simulation, analysis, visualization, publishing, and storage of results. Each of these tasks provide a myriad of challenges when running on a high performance computing environment such as Hopper or Edison at NERSC. Hurdles such as data transfer and management, job scheduling, parallel analysis routines, and publication require a lot of forethought and planning to ensure that proper quality control mechanisms are in place. These steps require effectively utilizing a combination of well tested and newly developed functionality to move data, perform analysis, apply statistical routines, and finally, serve results and tools to the greater scientific community. As part of the CAlibrated and Systematic Characterization, Attribution and Detection of Extremes (CASCADE) project we highlight a stack of tools our team utilizes and has developed to ensure that large scale simulation and analysis work are commonplace and provide operations that assist in everything from generation/procurement of data (HTAR/Globus) to automating publication of results to portals like the Earth Systems Grid Federation (ESGF), all while executing everything in between in a scalable environment in a task parallel way (MPI). We highlight the use and benefit of these tools by showing several climate science analysis use cases they have been applied to.
DREAM: Distributed Resources for the Earth System Grid Federation (ESGF) Advanced Management

NASA Astrophysics Data System (ADS)

Williams, D. N.

2015-12-01

The data associated with climate research is often generated, accessed, stored, and analyzed on a mix of unique platforms. The volume, variety, velocity, and veracity of this data creates unique challenges as climate research attempts to move beyond stand-alone platforms to a system that truly integrates dispersed resources. Today, sharing data across multiple facilities is often a challenge due to the large variance in supporting infrastructures. This results in data being accessed and downloaded many times, which requires significant amounts of resources, places a heavy analytic development burden on the end users, and mismanaged resources. Working across U.S. federal agencies, international agencies, and multiple worldwide data centers, and spanning seven international network organizations, the Earth System Grid Federation (ESGF) has begun to solve this problem. Its architecture employs a system of geographically distributed peer nodes that are independently administered yet united by common federation protocols and application programming interfaces. However, significant challenges remain, including workflow provenance, modular and flexible deployment, scalability of a diverse set of computational resources, and more. Expanding on the existing ESGF, the Distributed Resources for the Earth System Grid Federation Advanced Management (DREAM) will ensure that the access, storage, movement, and analysis of the large quantities of data that are processed and produced by diverse science projects can be dynamically distributed with proper resource management. This system will enable data from an infinite number of diverse sources to be organized and accessed from anywhere on any device (including mobile platforms). The approach offers a powerful roadmap for the creation and integration of a unified knowledge base of an entire ecosystem, including its many geophysical, geographical, social, political, agricultural, energy, transportation, and cyber aspects. The resulting aggregation of data combined with analytics services has the potential to generate an informational universe and knowledge system of unprecedented size and value to the scientific community, downstream applications, decision makers, and the public.
wft4galaxy: a workflow testing tool for galaxy.

PubMed

Piras, Marco Enrico; Pireddu, Luca; Zanetti, Gianluigi

2017-12-01

Workflow managers for scientific analysis provide a high-level programming platform facilitating standardization, automation, collaboration and access to sophisticated computing resources. The Galaxy workflow manager provides a prime example of this type of platform. As compositions of simpler tools, workflows effectively comprise specialized computer programs implementing often very complex analysis procedures. To date, no simple way to automatically test Galaxy workflows and ensure their correctness has appeared in the literature. With wft4galaxy we offer a tool to bring automated testing to Galaxy workflows, making it feasible to bring continuous integration to their development and ensuring that defects are detected promptly. wft4galaxy can be easily installed as a regular Python program or launched directly as a Docker container-the latter reducing installation effort to a minimum. Available at https://github.com/phnmnl/wft4galaxy under the Academic Free License v3.0. marcoenrico.piras@crs4.it. © The Author 2017. Published by Oxford University Press.
SU-F-T-251: The Quality Assurance for the Heavy Patient Load Department in the Developing Country: The Primary Experience of An Entire Workflow QA Process Management in Radiotherapy

DOE Office of Scientific and Technical Information (OSTI.GOV)

Xie, J; Wang, J; Peng, J

Purpose: To implement an entire workflow quality assurance (QA) process in the radiotherapy department and to reduce the error rates of radiotherapy based on the entire workflow management in the developing country. Methods: The entire workflow QA process management starts from patient registration to the end of last treatment including all steps through the entire radiotherapy process. Error rate of chartcheck is used to evaluate the the entire workflow QA process. Two to three qualified senior medical physicists checked the documents before the first treatment fraction of every patient. Random check of the treatment history during treatment was also performed.more » A total of around 6000 patients treatment data before and after implementing the entire workflow QA process were compared from May, 2014 to December, 2015. Results: A systemic checklist was established. It mainly includes patient’s registration, treatment plan QA, information exporting to OIS(Oncology Information System), documents of treatment QAand QA of the treatment history. The error rate derived from the chart check decreases from 1.7% to 0.9% after our the entire workflow QA process. All checked errors before the first treatment fraction were corrected as soon as oncologist re-confirmed them and reinforce staff training was accordingly followed to prevent those errors. Conclusion: The entire workflow QA process improved the safety, quality of radiotherapy in our department and we consider that our QA experience can be applicable for the heavily-loaded radiotherapy departments in developing country.« less
Workflow as a Service in the Cloud: Architecture and Scheduling Algorithms.

PubMed

Wang, Jianwu; Korambath, Prakashan; Altintas, Ilkay; Davis, Jim; Crawl, Daniel

2014-01-01

With more and more workflow systems adopting cloud as their execution environment, it becomes increasingly challenging on how to efficiently manage various workflows, virtual machines (VMs) and workflow execution on VM instances. To make the system scalable and easy-to-extend, we design a Workflow as a Service (WFaaS) architecture with independent services. A core part of the architecture is how to efficiently respond continuous workflow requests from users and schedule their executions in the cloud. Based on different targets, we propose four heuristic workflow scheduling algorithms for the WFaaS architecture, and analyze the differences and best usages of the algorithms in terms of performance, cost and the price/performance ratio via experimental studies.
Ergonomic design for dental offices.

PubMed

Ahearn, David J; Sanders, Martha J; Turcotte, Claudia

2010-01-01

The increasing complexity of the dental office environment influences productivity and workflow for dental clinicians. Advances in technology, and with it the range of products needed to provide services, have led to sprawl in operatory setups and the potential for awkward postures for dental clinicians during the delivery of oral health services. Although ergonomics often addresses the prevention of musculoskeletal disorders for specific populations of workers, concepts of workflow and productivity are integral to improved practice in work environments. This article provides suggestions for improving workflow and productivity for dental clinicians. The article applies ergonomic principles to dental practice issues such as equipment and supply management, office design, and workflow management. Implications for improved ergonomic processes and future research are explored.
Cyberinfrastructure for Atmospheric Discovery

NASA Astrophysics Data System (ADS)

Wilhelmson, R.; Moore, C. W.

2004-12-01

Each year across the United States, floods, tornadoes, hail, strong winds, lightning, hurricanes, and winter storms cause hundreds of deaths, routinely disrupt transportation and commerce, and result in billions of dollars in annual economic losses . MEAD and LEAD are two recent efforts aimed at developing the cyberinfrastructure for studying and forecasting these events through collection, integration, and analysis of observational data coupled with numerical simulation, data mining, and visualization. MEAD (Modeling Environment for Atmospheric Discovery) has been funded for two years as an NCSA (National Center for Supercomputing Applications) Alliance Expedition. The goal of this expedition has been the development/adaptation of cyberinfrastructure that will enable research simulations, datamining, machine learning and visualization of hurricanes and storms utilizing the high performance computing environments including the TeraGrid. Portal grid and web infrastructure are being tested that will enable launching of hundreds of individual WRF (Weather Research and Forecasting) simulations. In a similar way, multiple Regional Ocean Modeling System (ROMS) or WRF/ROMS simulations can be carried out. Metadata and the resulting large volumes of data will then be made available for further study and for educational purposes using analysis, mining, and visualization services. Initial coupling of the ROMS and WRF codes has been completed and parallel I/O is being implemented for these models. Management of these activities (services) are being enabled through Grid workflow technologies (e.g. OGCE). LEAD (Linked Environments for Atmospheric Discovery) is a recently funded 5-year, large NSF ITR grant that involves 9 institutions who are developing a comprehensive national cyberinfrastructure in mesoscale meteorology, particularly one that can interoperate with others being developed. LEAD is addressing the fundamental information technology (IT) research challenges needed to create an integrated, scalable for identifying, accessing, preparing, assimilating, predicting, managing, analyzing, mining, and visualizing a broad array of meteorological data and model output, independent of format and physical location. A transforming element of LEAD is Workflow Orchestration for On-Demand, Real-Time, Dynamically-Adaptive Systems (WOORDS), which allows the use of analysis tools, forecast models, and data repositories as dynamically adaptive, on-demand, Grid-enabled systems that can a) change configuration rapidly and automatically in response to weather; b) continually be steered by new data; c) respond to decision-driven inputs from users; d) initiate other processes automatically; and e) steer remote observing technologies to optimize data collection for the problem at hand. Although LEAD efforts are primiarly directed at mesoscale meteorology, the IT services being developed has general applicability to other geoscience and environmental science. Integration of traditional and new data sources is a crucial component in LEAD for data analysis and assimilation, for integration of (ensemble mining) of data from sets of simulations, and for comparing results to observational data. As part of the integration effort, LEAD is creating a myLEAD metadata catalog service: a personal metacatalog that extends the Globus MCS system and is built on top of the OGSA-DAI system developed at the National e-Science Center in Edinburgh, Scotland.
UBioLab: a web-LABoratory for Ubiquitous in-silico experiments.

PubMed

Bartocci, E; Di Berardini, M R; Merelli, E; Vito, L

2012-03-01

The huge and dynamic amount of bioinformatic resources (e.g., data and tools) available nowadays in Internet represents a big challenge for biologists -for what concerns their management and visualization- and for bioinformaticians -for what concerns the possibility of rapidly creating and executing in-silico experiments involving resources and activities spread over the WWW hyperspace. Any framework aiming at integrating such resources as in a physical laboratory has imperatively to tackle -and possibly to handle in a transparent and uniform way- aspects concerning physical distribution, semantic heterogeneity, co-existence of different computational paradigms and, as a consequence, of different invocation interfaces (i.e., OGSA for Grid nodes, SOAP for Web Services, Java RMI for Java objects, etc.). The framework UBioLab has been just designed and developed as a prototype following the above objective. Several architectural features -as those ones of being fully Web-based and of combining domain ontologies, Semantic Web and workflow techniques- give evidence of an effort in such a direction. The integration of a semantic knowledge management system for distributed (bioinformatic) resources, a semantic-driven graphic environment for defining and monitoring ubiquitous workflows and an intelligent agent-based technology for their distributed execution allows UBioLab to be a semantic guide for bioinformaticians and biologists providing (i) a flexible environment for visualizing, organizing and inferring any (semantics and computational) "type" of domain knowledge (e.g., resources and activities, expressed in a declarative form), (ii) a powerful engine for defining and storing semantic-driven ubiquitous in-silico experiments on the domain hyperspace, as well as (iii) a transparent, automatic and distributed environment for correct experiment executions.
Detecting distant homologies on protozoans metabolic pathways using scientific workflows.

PubMed

da Cruz, Sérgio Manuel Serra; Batista, Vanessa; Silva, Edno; Tosta, Frederico; Vilela, Clarissa; Cuadrat, Rafael; Tschoeke, Diogo; Dávila, Alberto M R; Campos, Maria Luiza Machado; Mattoso, Marta

2010-01-01

Bioinformatics experiments are typically composed of programs in pipelines manipulating an enormous quantity of data. An interesting approach for managing those experiments is through workflow management systems (WfMS). In this work we discuss WfMS features to support genome homology workflows and present some relevant issues for typical genomic experiments. Our evaluation used Kepler WfMS to manage a real genomic pipeline, named OrthoSearch, originally defined as a Perl script. We show a case study detecting distant homologies on trypanomatids metabolic pathways. Our results reinforce the benefits of WfMS over script languages and point out challenges to WfMS in distributed environments.
The BioGRID interaction database: 2013 update.

PubMed

Chatr-Aryamontri, Andrew; Breitkreutz, Bobby-Joe; Heinicke, Sven; Boucher, Lorrie; Winter, Andrew; Stark, Chris; Nixon, Julie; Ramage, Lindsay; Kolas, Nadine; O'Donnell, Lara; Reguly, Teresa; Breitkreutz, Ashton; Sellam, Adnane; Chen, Daici; Chang, Christie; Rust, Jennifer; Livstone, Michael; Oughtred, Rose; Dolinski, Kara; Tyers, Mike

2013-01-01

The Biological General Repository for Interaction Datasets (BioGRID: http//thebiogrid.org) is an open access archive of genetic and protein interactions that are curated from the primary biomedical literature for all major model organism species. As of September 2012, BioGRID houses more than 500 000 manually annotated interactions from more than 30 model organisms. BioGRID maintains complete curation coverage of the literature for the budding yeast Saccharomyces cerevisiae, the fission yeast Schizosaccharomyces pombe and the model plant Arabidopsis thaliana. A number of themed curation projects in areas of biomedical importance are also supported. BioGRID has established collaborations and/or shares data records for the annotation of interactions and phenotypes with most major model organism databases, including Saccharomyces Genome Database, PomBase, WormBase, FlyBase and The Arabidopsis Information Resource. BioGRID also actively engages with the text-mining community to benchmark and deploy automated tools to expedite curation workflows. BioGRID data are freely accessible through both a user-defined interactive interface and in batch downloads in a wide variety of formats, including PSI-MI2.5 and tab-delimited files. BioGRID records can also be interrogated and analyzed with a series of new bioinformatics tools, which include a post-translational modification viewer, a graphical viewer, a REST service and a Cytoscape plugin.
SynTrack: DNA Assembly Workflow Management (SynTrack) v2.0.1

DOE Office of Scientific and Technical Information (OSTI.GOV)

MENG, XIANWEI; SIMIRENKO, LISA

2016-12-01

SynTrack is a dynamic, workflow-driven data management system that tracks the DNA build process: Management of the hierarchical relationships of the DNA fragments; Monitoring of process tasks for the assembly of multiple DNA fragments into final constructs; Creations of vendor order forms with selectable building blocks. Organizing plate layouts barcodes for vendor/pcr/fusion/chewback/bioassay/glycerol/master plate maps (default/condensed); Creating or updating Pre-Assembly/Assembly process workflows with selected building blocks; Generating Echo pooling instructions based on plate maps; Tracking of building block orders, received and final assembled for delivering; Bulk updating of colony or PCR amplification information, fusion PCR and chewback results; Updating with QA/QCmore » outcome with .csv & .xlsx template files; Re-work assembly workflow enabled before and after sequencing validation; and Tracking of plate/well data changes and status updates and reporting of master plate status with QC outcomes.« less
Improving diabetes population management efficiency with an informatics solution.

PubMed

Zai, Adrian; Grant, Richard; Andrews, Carl; Yee, Ronnie; Chueh, Henry

2007-10-11

Despite intensive resource use for diabetes management in the U.S., our care continues to fall short of evidence-based goals, partly due to system inefficiencies. Diabetes registries are increasingly being utilized as a critical tool for population level disease management by providing real-time data. Since the successful adoption of a diabetes registry depends on how well it integrates with disease management workflows, we optimized our current diabetes management workflow and designed our registry application around it.
Hermes: Seamless delivery of containerized bioinformatics workflows in hybrid cloud (HTC) environments

NASA Astrophysics Data System (ADS)

Kintsakis, Athanassios M.; Psomopoulos, Fotis E.; Symeonidis, Andreas L.; Mitkas, Pericles A.

Hermes introduces a new "describe once, run anywhere" paradigm for the execution of bioinformatics workflows in hybrid cloud environments. It combines the traditional features of parallelization-enabled workflow management systems and of distributed computing platforms in a container-based approach. It offers seamless deployment, overcoming the burden of setting up and configuring the software and network requirements. Most importantly, Hermes fosters the reproducibility of scientific workflows by supporting standardization of the software execution environment, thus leading to consistent scientific workflow results and accelerating scientific output.
Web-based interactive visualization in a Grid-enabled neuroimaging application using HTML5.

PubMed

Siewert, René; Specovius, Svenja; Wu, Jie; Krefting, Dagmar

2012-01-01

Interactive visualization and correction of intermediate results are required in many medical image analysis pipelines. To allow certain interaction in the remote execution of compute- and data-intensive applications, new features of HTML5 are used. They allow for transparent integration of user interaction into Grid- or Cloud-enabled scientific workflows. Both 2D and 3D visualization and data manipulation can be performed through a scientific gateway without the need to install specific software or web browser plugins. The possibilities of web-based visualization are presented along the FreeSurfer-pipeline, a popular compute- and data-intensive software tool for quantitative neuroimaging.
Flexible Early Warning Systems with Workflows and Decision Tables

NASA Astrophysics Data System (ADS)

Riedel, F.; Chaves, F.; Zeiner, H.

2012-04-01

An essential part of early warning systems and systems for crisis management are decision support systems that facilitate communication and collaboration. Often official policies specify how different organizations collaborate and what information is communicated to whom. For early warning systems it is crucial that information is exchanged dynamically in a timely manner and all participants get exactly the information they need to fulfil their role in the crisis management process. Information technology obviously lends itself to automate parts of the process. We have experienced however that in current operational systems the information logistics processes are hard-coded, even though they are subject to change. In addition, systems are tailored to the policies and requirements of a certain organization and changes can require major software refactoring. We seek to develop a system that can be deployed and adapted to multiple organizations with different dynamic runtime policies. A major requirement for such a system is that changes can be applied locally without affecting larger parts of the system. In addition to the flexibility regarding changes in policies and processes, the system needs to be able to evolve; when new information sources become available, it should be possible to integrate and use these in the decision process. In general, this kind of flexibility comes with a significant increase in complexity. This implies that only IT professionals can maintain a system that can be reconfigured and adapted; end-users are unable to utilise the provided flexibility. In the business world similar problems arise and previous work suggested using business process management systems (BPMS) or workflow management systems (WfMS) to guide and automate early warning processes or crisis management plans. However, the usability and flexibility of current WfMS are limited, because current notations and user interfaces are still not suitable for end-users, and workflows are usually only suited for rigid processes. We show how improvements can be achieved by using decision tables and rule-based adaptive workflows. Decision tables have been shown to be an intuitive tool that can be used by domain experts to express rule sets that can be interpreted automatically at runtime. Adaptive workflows use a rule-based approach to increase the flexibility of workflows by providing mechanisms to adapt workflows based on context changes, human intervention and availability of services. The combination of workflows, decision tables and rule-based adaption creates a framework that opens up new possibilities for flexible and adaptable workflows, especially, for use in early warning and crisis management systems.

RESTFul based heterogeneous Geoprocessing workflow interoperation for Sensor Web Service

NASA Astrophysics Data System (ADS)

Yang, Chao; Chen, Nengcheng; Di, Liping

2012-10-01

Advanced sensors on board satellites offer detailed Earth observations. A workflow is one approach for designing, implementing and constructing a flexible and live link between these sensors' resources and users. It can coordinate, organize and aggregate the distributed sensor Web services to meet the requirement of a complex Earth observation scenario. A RESTFul based workflow interoperation method is proposed to integrate heterogeneous workflows into an interoperable unit. The Atom protocols are applied to describe and manage workflow resources. The XML Process Definition Language (XPDL) and Business Process Execution Language (BPEL) workflow standards are applied to structure a workflow that accesses sensor information and one that processes it separately. Then, a scenario for nitrogen dioxide (NO2) from a volcanic eruption is used to investigate the feasibility of the proposed method. The RESTFul based workflows interoperation system can describe, publish, discover, access and coordinate heterogeneous Geoprocessing workflows.
Production experience with the ATLAS Event Service

NASA Astrophysics Data System (ADS)

Benjamin, D.; Calafiura, P.; Childers, T.; De, K.; Guan, W.; Maeno, T.; Nilsson, P.; Tsulaia, V.; Van Gemmeren, P.; Wenaus, T.; ATLAS Collaboration

2017-10-01

The ATLAS Event Service (AES) has been designed and implemented for efficient running of ATLAS production workflows on a variety of computing platforms, ranging from conventional Grid sites to opportunistic, often short-lived resources, such as spot market commercial clouds, supercomputers and volunteer computing. The Event Service architecture allows real time delivery of fine grained workloads to running payload applications which process dispatched events or event ranges and immediately stream the outputs to highly scalable Object Stores. Thanks to its agile and flexible architecture the AES is currently being used by grid sites for assigning low priority workloads to otherwise idle computing resources; similarly harvesting HPC resources in an efficient back-fill mode; and massively scaling out to the 50-100k concurrent core level on the Amazon spot market to efficiently utilize those transient resources for peak production needs. Platform ports in development include ATLAS@Home (BOINC) and the Google Compute Engine, and a growing number of HPC platforms. After briefly reviewing the concept and the architecture of the Event Service, we will report the status and experience gained in AES commissioning and production operations on supercomputers, and our plans for extending ES application beyond Geant4 simulation to other workflows, such as reconstruction and data analysis.
Workflow as a Service in the Cloud: Architecture and Scheduling Algorithms

PubMed Central

Wang, Jianwu; Korambath, Prakashan; Altintas, Ilkay; Davis, Jim; Crawl, Daniel

2017-01-01

With more and more workflow systems adopting cloud as their execution environment, it becomes increasingly challenging on how to efficiently manage various workflows, virtual machines (VMs) and workflow execution on VM instances. To make the system scalable and easy-to-extend, we design a Workflow as a Service (WFaaS) architecture with independent services. A core part of the architecture is how to efficiently respond continuous workflow requests from users and schedule their executions in the cloud. Based on different targets, we propose four heuristic workflow scheduling algorithms for the WFaaS architecture, and analyze the differences and best usages of the algorithms in terms of performance, cost and the price/performance ratio via experimental studies. PMID:29399237
A Two-Stage Probabilistic Approach to Manage Personal Worklist in Workflow Management Systems

NASA Astrophysics Data System (ADS)

Han, Rui; Liu, Yingbo; Wen, Lijie; Wang, Jianmin

The application of workflow scheduling in managing individual actor's personal worklist is one area that can bring great improvement to business process. However, current deterministic work cannot adapt to the dynamics and uncertainties in the management of personal worklist. For such an issue, this paper proposes a two-stage probabilistic approach which aims at assisting actors to flexibly manage their personal worklists. To be specific, the approach analyzes every activity instance's continuous probability of satisfying deadline at the first stage. Based on this stochastic analysis result, at the second stage, an innovative scheduling strategy is proposed to minimize the overall deadline violation cost for an actor's personal worklist. Simultaneously, the strategy recommends the actor a feasible worklist of activity instances which meet the required bottom line of successful execution. The effectiveness of our approach is evaluated in a real-world workflow management system and with large scale simulation experiments.
Contextual cloud-based service oriented architecture for clinical workflow.

PubMed

Moreno-Conde, Jesús; Moreno-Conde, Alberto; Núñez-Benjumea, Francisco J; Parra-Calderón, Carlos

2015-01-01

Given that acceptance of systems within the healthcare domain multiple papers highlighted the importance of integrating tools with the clinical workflow. This paper analyse how clinical context management could be deployed in order to promote the adoption of cloud advanced services and within the clinical workflow. This deployment will be able to be integrated with the eHealth European Interoperability Framework promoted specifications. Throughout this paper, it is proposed a cloud-based service-oriented architecture. This architecture will implement a context management system aligned with the HL7 standard known as CCOW.
Overcoming Barriers to Technology Adoption in Small Manufacturing Enterprises (SMEs)

DTIC Science & Technology

2003-06-01

automates quote-generation, order - processing workflow management, perform- ance analysis, and accounting functions. Ultimately, it will enable Magdic...that Magdic imple- ment an MES instead. The MES, in addition to solving the problem of document manage- ment, would automate quote-generation, order ... processing , workflow management, perform- ance analysis, and accounting functions. To help Magdic personnel learn about the MES, TIDE personnel provided
Molgenis-impute: imputation pipeline in a box.

PubMed

Kanterakis, Alexandros; Deelen, Patrick; van Dijk, Freerk; Byelas, Heorhiy; Dijkstra, Martijn; Swertz, Morris A

2015-08-19

Genotype imputation is an important procedure in current genomic analysis such as genome-wide association studies, meta-analyses and fine mapping. Although high quality tools are available that perform the steps of this process, considerable effort and expertise is required to set up and run a best practice imputation pipeline, particularly for larger genotype datasets, where imputation has to scale out in parallel on computer clusters. Here we present MOLGENIS-impute, an 'imputation in a box' solution that seamlessly and transparently automates the set up and running of all the steps of the imputation process. These steps include genome build liftover (liftovering), genotype phasing with SHAPEIT2, quality control, sample and chromosomal chunking/merging, and imputation with IMPUTE2. MOLGENIS-impute builds on MOLGENIS-compute, a simple pipeline management platform for submission and monitoring of bioinformatics tasks in High Performance Computing (HPC) environments like local/cloud servers, clusters and grids. All the required tools, data and scripts are downloaded and installed in a single step. Researchers with diverse backgrounds and expertise have tested MOLGENIS-impute on different locations and imputed over 30,000 samples so far using the 1,000 Genomes Project and new Genome of the Netherlands data as the imputation reference. The tests have been performed on PBS/SGE clusters, cloud VMs and in a grid HPC environment. MOLGENIS-impute gives priority to the ease of setting up, configuring and running an imputation. It has minimal dependencies and wraps the pipeline in a simple command line interface, without sacrificing flexibility to adapt or limiting the options of underlying imputation tools. It does not require knowledge of a workflow system or programming, and is targeted at researchers who just want to apply best practices in imputation via simple commands. It is built on the MOLGENIS compute workflow framework to enable customization with additional computational steps or it can be included in other bioinformatics pipelines. It is available as open source from: https://github.com/molgenis/molgenis-imputation.
Integrating DICOM structure reporting (SR) into the medical imaging informatics data grid

NASA Astrophysics Data System (ADS)

Lee, Jasper; Le, Anh; Liu, Brent

2008-03-01

The Medical Imaging Informatics (MI2) Data Grid developed at the USC Image Processing and Informatics Laboratory enables medical images to be shared securely between multiple imaging centers. Current applications include an imaging-based clinical trial setting where multiple field sites perform image acquisition and a centralized radiology core performs image analysis, often using computer-aided diagnosis tools (CAD) that generate a DICOM-SR to report their findings and measurements. As more and more CAD tools are being developed in the radiology field, the generated DICOM Structure Reports (SR) holding key radiological findings and measurements that are not part of the DICOM image need to be integrated into the existing Medical Imaging Informatics Data Grid with the corresponding imaging studies. We will discuss the significance and method involved in adapting DICOM-SR into the Medical Imaging Informatics Data Grid. The result is a MI2 Data Grid repository from which users can send and receive DICOM-SR objects based on the imaging-based clinical trial application. The services required to extract and categorize information from the structured reports will be discussed, and the workflow to store and retrieve a DICOM-SR file into the existing MI2 Data Grid will be shown.
Development of a novel imaging informatics-based system with an intelligent workflow engine (IWEIS) to support imaging-based clinical trials

PubMed Central

Wang, Ximing; Liu, Brent J; Martinez, Clarisa; Zhang, Xuejun; Winstein, Carolee J

2015-01-01

Imaging based clinical trials can benefit from a solution to efficiently collect, analyze, and distribute multimedia data at various stages within the workflow. Currently, the data management needs of these trials are typically addressed with custom-built systems. However, software development of the custom- built systems for versatile workflows can be resource-consuming. To address these challenges, we present a system with a workflow engine for imaging based clinical trials. The system enables a project coordinator to build a data collection and management system specifically related to study protocol workflow without programming. Web Access to DICOM Objects (WADO) module with novel features is integrated to further facilitate imaging related study. The system was initially evaluated by an imaging based rehabilitation clinical trial. The evaluation shows that the cost of the development of system can be much reduced compared to the custom-built system. By providing a solution to customize a system and automate the workflow, the system will save on development time and reduce errors especially for imaging clinical trials. PMID:25870169
GEMSS: grid-infrastructure for medical service provision.

PubMed

Benkner, S; Berti, G; Engelbrecht, G; Fingberg, J; Kohring, G; Middleton, S E; Schmidt, R

2005-01-01

The European GEMSS Project is concerned with the creation of medical Grid service prototypes and their evaluation in a secure service-oriented infrastructure for distributed on demand/supercomputing. Key aspects of the GEMSS Grid middleware include negotiable QoS support for time-critical service provision, flexible support for business models, and security at all levels in order to ensure privacy of patient data as well as compliance to EU law. The GEMSS Grid infrastructure is based on a service-oriented architecture and is being built on top of existing standard Grid and Web technologies. The GEMSS infrastructure offers a generic Grid service provision framework that hides the complexity of transforming existing applications into Grid services. For the development of client-side applications or portals, a pluggable component framework has been developed, providing developers with full control over business processes, service discovery, QoS negotiation, and workflow, while keeping their underlying implementation hidden from view. A first version of the GEMSS Grid infrastructure is operational and has been used for the set-up of a Grid test-bed deploying six medical Grid service prototypes including maxillo-facial surgery simulation, neuro-surgery support, radio-surgery planning, inhaled drug-delivery simulation, cardiovascular simulation and advanced image reconstruction. The GEMSS Grid infrastructure is based on standard Web Services technology with an anticipated future transition path towards the OGSA standard proposed by the Global Grid Forum. GEMSS demonstrates that the Grid can be used to provide medical practitioners and researchers with access to advanced simulation and image processing services for improved preoperative planning and near real-time surgical support.
How to Take HRMS Process Management to the Next Level with Workflow Business Event System

NASA Technical Reports Server (NTRS)

Rajeshuni, Sarala; Yagubian, Aram; Kunamaneni, Krishna

2006-01-01

Oracle Workflow with the Business Event System offers a complete process management solution for enterprises to manage business processes cost-effectively. Using Workflow event messaging, event subscriptions, AQ Servlet and advanced queuing technologies, this presentation will demonstrate the step-by-step design and implementation of system solutions in order to integrate two dissimilar systems and establish communication remotely. As a case study, the presentation walks you through the process of propagating organization name changes in other applications that originated from the HRMS module without changing applications code. The solution can be applied to your particular business cases for streamlining or modifying business processes across Oracle and non-Oracle applications.
Defining Usability Heuristics for Adoption and Efficiency of an Electronic Workflow Document Management System

ERIC Educational Resources Information Center

Fuentes, Steven

2017-01-01

Usability heuristics have been established for different uses and applications as general guidelines for user interfaces. These can affect the implementation of industry solutions and play a significant role regarding cost reduction and process efficiency. The area of electronic workflow document management (EWDM) solutions, also known as…
Towards an intelligent hospital environment: OR of the future.

PubMed

Sutherland, Jeffrey V; van den Heuvel, Willem-Jan; Ganous, Tim; Burton, Matthew M; Kumar, Animesh

2005-01-01

Patients, providers, payers, and government demand more effective and efficient healthcare services, and the healthcare industry needs innovative ways to re-invent core processes. Business process reengineering (BPR) showed adopting new hospital information systems can leverage this transformation and workflow management technologies can automate process management. Our research indicates workflow technologies in healthcare require real time patient monitoring, detection of adverse events, and adaptive responses to breakdown in normal processes. Adaptive workflow systems are rarely implemented making current workflow implementations inappropriate for healthcare. The advent of evidence based medicine, guideline based practice, and better understanding of cognitive workflow combined with novel technologies including Radio Frequency Identification (RFID), mobile/wireless technologies, internet workflow, intelligent agents, and Service Oriented Architectures (SOA) opens up new and exciting ways of automating business processes. Total situational awareness of events, timing, and location of healthcare activities can generate self-organizing change in behaviors of humans and machines. A test bed of a novel approach towards continuous process management was designed for the new Weinburg Surgery Building at the University of Maryland Medical. Early results based on clinical process mapping and analysis of patient flow bottlenecks demonstrated 100% improvement in delivery of supplies and instruments at surgery start time. This work has been directly applied to the design of the DARPA Trauma Pod research program where robotic surgery will be performed on wounded soldiers on the battlefield.
Intelligent Data Analysis in the 21st Century

NASA Astrophysics Data System (ADS)

Cohen, Paul; Adams, Niall

When IDA began, data sets were small and clean, data provenance and management were not significant issues, workflows and grid computing and cloud computing didn’t exist, and the world was not populated with billions of cellphone and computer users. The original conception of intelligent data analysis — automating some of the reasoning of skilled data analysts — has not been updated to account for the dramatic changes in what skilled data analysis means, today. IDA might update its mission to address pressing problems in areas such as climate change, habitat loss, education, and medicine. It might anticipate data analysis opportunities five to ten years out, such as customizing educational trajectories to individual students, and personalizing medical protocols. Such developments will elevate the conference and our community by shifting our focus from arbitrary measures of the performance of isolated algorithms to the practical, societal value of intelligent data analysis systems.
Managing and Communicating Operational Workflow: Designing and Implementing an Electronic Outpatient Whiteboard.

PubMed

Steitz, Bryan D; Weinberg, Stuart T; Danciu, Ioana; Unertl, Kim M

2016-01-01

Healthcare team members in emergency department contexts have used electronic whiteboard solutions to help manage operational workflow for many years. Ambulatory clinic settings have highly complex operational workflow, but are still limited in electronic assistance to communicate and coordinate work activities. To describe and discuss the design, implementation, use, and ongoing evolution of a coordination and collaboration tool supporting ambulatory clinic operational workflow at Vanderbilt University Medical Center (VUMC). The outpatient whiteboard tool was initially designed to support healthcare work related to an electronic chemotherapy order-entry application. After a highly successful initial implementation in an oncology context, a high demand emerged across the organization for the outpatient whiteboard implementation. Over the past 10 years, developers have followed an iterative user-centered design process to evolve the tool. The electronic outpatient whiteboard system supports 194 separate whiteboards and is accessed by over 2800 distinct users on a typical day. Clinics can configure their whiteboards to support unique workflow elements. Since initial release, features such as immunization clinical decision support have been integrated into the system, based on requests from end users. The success of the electronic outpatient whiteboard demonstrates the usefulness of an operational workflow tool within the ambulatory clinic setting. Operational workflow tools can play a significant role in supporting coordination, collaboration, and teamwork in ambulatory healthcare settings.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Garzoglio, Gabriele

The Fermilab Grid and Cloud Computing Department and the KISTI Global Science experimental Data hub Center propose a joint project. The goals are to enable scientific workflows of stakeholders to run on multiple cloud resources by use of (a) Virtual Infrastructure Automation and Provisioning, (b) Interoperability and Federat ion of Cloud Resources , and (c) High-Throughput Fabric Virtualization. This is a matching fund project in which Fermilab and KISTI will contribute equal resources .
Agent-Based Scientific Workflow Composition

NASA Astrophysics Data System (ADS)

Barker, A.; Mann, B.

2006-07-01

Agents are active autonomous entities that interact with one another to achieve their objectives. This paper addresses how these active agents are a natural fit to consume the passive Service Oriented Architecture which is found in Internet and Grid Systems, in order to compose, coordinate and execute e-Science experiments. A framework is introduced which allows an e-Science experiment to be described as a MultiAgent System.
The CMS dataset bookkeeping service

DOE Office of Scientific and Technical Information (OSTI.GOV)

Afaq, Anzar,; /Fermilab; Dolgert, Andrew

2007-10-01

The CMS Dataset Bookkeeping Service (DBS) has been developed to catalog all CMS event data from Monte Carlo and Detector sources. It provides the ability to identify MC or trigger source, track data provenance, construct datasets for analysis, and discover interesting data. CMS requires processing and analysis activities at various service levels and the DBS system provides support for localized processing or private analysis, as well as global access for CMS users at large. Catalog entries can be moved among the various service levels with a simple set of migration tools, thus forming a loose federation of databases. DBS ismore » available to CMS users via a Python API, Command Line, and a Discovery web page interfaces. The system is built as a multi-tier web application with Java servlets running under Tomcat, with connections via JDBC to Oracle or MySQL database backends. Clients connect to the service through HTTP or HTTPS with authentication provided by GRID certificates and authorization through VOMS. DBS is an integral part of the overall CMS Data Management and Workflow Management systems.« less
CMS distributed data analysis with CRAB3

DOE PAGES

Mascheroni, M.; Balcas, J.; Belforte, S.; ...

2015-12-23

The CMS Remote Analysis Builder (CRAB) is a distributed workflow management tool which facilitates analysis tasks by isolating users from the technical details of the Grid infrastructure. Throughout LHC Run 1, CRAB has been successfully employed by an average of 350 distinct users each week executing about 200,000 jobs per day.CRAB has been significantly upgraded in order to face the new challenges posed by LHC Run 2. Components of the new system include 1) a lightweight client, 2) a central primary server which communicates with the clients through a REST interface, 3) secondary servers which manage user analysis tasks andmore » submit jobs to the CMS resource provisioning system, and 4) a central service to asynchronously move user data from temporary storage in the execution site to the desired storage location. Furthermore, the new system improves the robustness, scalability and sustainability of the service.Here we provide an overview of the new system, operation, and user support, report on its current status, and identify lessons learned from the commissioning phase and production roll-out.« less
Confidentiality Protection of User Data and Adaptive Resource Allocation for Managing Multiple Workflow Performance in Service-Based Systems

ERIC Educational Resources Information Center

An, Ho

2012-01-01

In this dissertation, two interrelated problems of service-based systems (SBS) are addressed: protecting users' data confidentiality from service providers, and managing performance of multiple workflows in SBS. Current SBSs pose serious limitations to protecting users' data confidentiality. Since users' sensitive data is sent in…

Changes in the cardiac rehabilitation workflow process needed for the implementation of a self-management system.

PubMed

Wiggers, Anne-Marieke; Vosbergen, Sandra; Kraaijenhagen, Roderik; Jaspers, Monique; Peek, Niels

2013-01-01

E-health interventions are of a growing importance for self-management of chronic conditions. This study aimed to describe the process adaptions that are needed in cardiac rehabilitation (CR) to implement a self-management system, called MyCARDSS. We created a generic workflow model based on interviews and observations at three CR clinics. Subsequently, a workflow model of the ideal situation after implementation of MyCARDSS was created. We found that the implementation will increase the complexity of existing working procedures because 1) not all patients will use MyCARDSS, 2) there is a transfer of tasks and responsibilities from professionals to patients, and 3) information in MyCARDSS needs to be synchronized with the EPR system for professionals.
Optimizing CyberShake Seismic Hazard Workflows for Large HPC Resources

NASA Astrophysics Data System (ADS)

Callaghan, S.; Maechling, P. J.; Juve, G.; Vahi, K.; Deelman, E.; Jordan, T. H.

2014-12-01

The CyberShake computational platform is a well-integrated collection of scientific software and middleware that calculates 3D simulation-based probabilistic seismic hazard curves and hazard maps for the Los Angeles region. Currently each CyberShake model comprises about 235 million synthetic seismograms from about 415,000 rupture variations computed at 286 sites. CyberShake integrates large-scale parallel and high-throughput serial seismological research codes into a processing framework in which early stages produce files used as inputs by later stages. Scientific workflow tools are used to manage the jobs, data, and metadata. The Southern California Earthquake Center (SCEC) developed the CyberShake platform using USC High Performance Computing and Communications systems and open-science NSF resources.CyberShake calculations were migrated to the NSF Track 1 system NCSA Blue Waters when it became operational in 2013, via an interdisciplinary team approach including domain scientists, computer scientists, and middleware developers. Due to the excellent performance of Blue Waters and CyberShake software optimizations, we reduced the makespan (a measure of wallclock time-to-solution) of a CyberShake study from 1467 to 342 hours. We will describe the technical enhancements behind this improvement, including judicious introduction of new GPU software, improved scientific software components, increased workflow-based automation, and Blue Waters-specific workflow optimizations.Our CyberShake performance improvements highlight the benefits of scientific workflow tools. The CyberShake workflow software stack includes the Pegasus Workflow Management System (Pegasus-WMS, which includes Condor DAGMan), HTCondor, and Globus GRAM, with Pegasus-mpi-cluster managing the high-throughput tasks on the HPC resources. The workflow tools handle data management, automatically transferring about 13 TB back to SCEC storage.We will present performance metrics from the most recent CyberShake study, executed on Blue Waters. We will compare the performance of CPU and GPU versions of our large-scale parallel wave propagation code, AWP-ODC-SGT. Finally, we will discuss how these enhancements have enabled SCEC to move forward with plans to increase the CyberShake simulation frequency to 1.0 Hz.
Scaling up ATLAS Event Service to production levels on opportunistic computing platforms

NASA Astrophysics Data System (ADS)

Benjamin, D.; Caballero, J.; Ernst, M.; Guan, W.; Hover, J.; Lesny, D.; Maeno, T.; Nilsson, P.; Tsulaia, V.; van Gemmeren, P.; Vaniachine, A.; Wang, F.; Wenaus, T.; ATLAS Collaboration

2016-10-01

Continued growth in public cloud and HPC resources is on track to exceed the dedicated resources available for ATLAS on the WLCG. Examples of such platforms are Amazon AWS EC2 Spot Instances, Edison Cray XC30 supercomputer, backfill at Tier 2 and Tier 3 sites, opportunistic resources at the Open Science Grid (OSG), and ATLAS High Level Trigger farm between the data taking periods. Because of specific aspects of opportunistic resources such as preemptive job scheduling and data I/O, their efficient usage requires workflow innovations provided by the ATLAS Event Service. Thanks to the finer granularity of the Event Service data processing workflow, the opportunistic resources are used more efficiently. We report on our progress in scaling opportunistic resource usage to double-digit levels in ATLAS production.
Developing science gateways for drug discovery in a grid environment.

PubMed

Pérez-Sánchez, Horacio; Rezaei, Vahid; Mezhuyev, Vitaliy; Man, Duhu; Peña-García, Jorge; den-Haan, Helena; Gesing, Sandra

2016-01-01

Methods for in silico screening of large databases of molecules increasingly complement and replace experimental techniques to discover novel compounds to combat diseases. As these techniques become more complex and computationally costly we are faced with an increasing problem to provide the research community of life sciences with a convenient tool for high-throughput virtual screening on distributed computing resources. To this end, we recently integrated the biophysics-based drug-screening program FlexScreen into a service, applicable for large-scale parallel screening and reusable in the context of scientific workflows. Our implementation is based on Pipeline Pilot and Simple Object Access Protocol and provides an easy-to-use graphical user interface to construct complex workflows, which can be executed on distributed computing resources, thus accelerating the throughput by several orders of magnitude.
Dispel4py: An Open-Source Python library for Data-Intensive Seismology

NASA Astrophysics Data System (ADS)

Filgueira, Rosa; Krause, Amrey; Spinuso, Alessandro; Klampanos, Iraklis; Danecek, Peter; Atkinson, Malcolm

2015-04-01

Scientific workflows are a necessary tool for many scientific communities as they enable easy composition and execution of applications on computing resources while scientists can focus on their research without being distracted by the computation management. Nowadays, scientific communities (e.g. Seismology) have access to a large variety of computing resources and their computational problems are best addressed using parallel computing technology. However, successful use of these technologies requires a lot of additional machinery whose use is not straightforward for non-experts: different parallel frameworks (MPI, Storm, multiprocessing, etc.) must be used depending on the computing resources (local machines, grids, clouds, clusters) where applications are run. This implies that for achieving the best applications' performance, users usually have to change their codes depending on the features of the platform selected for running them. This work presents dispel4py, a new open-source Python library for describing abstract stream-based workflows for distributed data-intensive applications. Special care has been taken to provide dispel4py with the ability to map abstract workflows to different platforms dynamically at run-time. Currently dispel4py has four mappings: Apache Storm, MPI, multi-threading and sequential. The main goal of dispel4py is to provide an easy-to-use tool to develop and test workflows in local resources by using the sequential mode with a small dataset. Later, once a workflow is ready for long runs, it can be automatically executed on different parallel resources. dispel4py takes care of the underlying mappings by performing an efficient parallelisation. Processing Elements (PE) represent the basic computational activities of any dispel4Py workflow, which can be a seismologic algorithm, or a data transformation process. For creating a dispel4py workflow, users only have to write very few lines of code to describe their PEs and how they are connected by using Python, which is widely supported on many platforms and is popular in many scientific domains, such as in geosciences. Once, a dispel4py workflow is written, a user only has to select which mapping they would like to use, and everything else (parallelisation, distribution of data) is carried on by dispel4py without any cost to the user. Among all dispel4py features we would like to highlight the following: * The PEs are connected by streams and not by writing to and reading from intermediate files, avoiding many IO operations. * The PEs can be stored into a registry. Therefore, different users can recombine PEs in many different workflows. * dispel4py has been enriched with a provenance mechanism to support runtime provenance analysis. We have adopted the W3C-PROV data model, which is accessible via a prototypal browser-based user interface and a web API. It supports the users with the visualisation of graphical products and offers combined operations to access and download the data, which may be selectively stored at runtime, into dedicated data archives. dispel4py has been already used by seismologists in the VERCE project to develop different seismic workflows. One of them is the Seismic Ambient Noise Cross-Correlation workflow, which preprocesses and cross-correlates traces from several stations. First, this workflow was tested on a local machine by using a small number of stations as input data. Later, it was executed on different parallel platforms (SuperMUC cluster, and Terracorrelator machine), automatically scaling up by using MPI and multiprocessing mappings and up to 1000 stations as input data. The results show that the dispel4py achieves scalable performance in both mappings tested on different parallel platforms.
UBioLab: a web-laboratory for ubiquitous in-silico experiments.

PubMed

Bartocci, Ezio; Cacciagrano, Diletta; Di Berardini, Maria Rita; Merelli, Emanuela; Vito, Leonardo

2012-07-09

The huge and dynamic amount of bioinformatic resources (e.g., data and tools) available nowadays in Internet represents a big challenge for biologists –for what concerns their management and visualization– and for bioinformaticians –for what concerns the possibility of rapidly creating and executing in-silico experiments involving resources and activities spread over the WWW hyperspace. Any framework aiming at integrating such resources as in a physical laboratory has imperatively to tackle –and possibly to handle in a transparent and uniform way– aspects concerning physical distribution, semantic heterogeneity, co-existence of different computational paradigms and, as a consequence, of different invocation interfaces (i.e., OGSA for Grid nodes, SOAP for Web Services, Java RMI for Java objects, etc.). The framework UBioLab has been just designed and developed as a prototype following the above objective. Several architectural features –as those ones of being fully Web-based and of combining domain ontologies, Semantic Web and workflow techniques– give evidence of an effort in such a direction. The integration of a semantic knowledge management system for distributed (bioinformatic) resources, a semantic-driven graphic environment for defining and monitoring ubiquitous workflows and an intelligent agent-based technology for their distributed execution allows UBioLab to be a semantic guide for bioinformaticians and biologists providing (i) a flexible environment for visualizing, organizing and inferring any (semantics and computational) "type" of domain knowledge (e.g., resources and activities, expressed in a declarative form), (ii) a powerful engine for defining and storing semantic-driven ubiquitous in-silico experiments on the domain hyperspace, as well as (iii) a transparent, automatic and distributed environment for correct experiment executions.
Scientific Data Management (SDM) Center for Enabling Technologies. 2007-2012

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ludascher, Bertram; Altintas, Ilkay

Over the past five years, our activities have both established Kepler as a viable scientific workflow environment and demonstrated its value across multiple science applications. We have published numerous peer-reviewed papers on the technologies highlighted in this short paper and have given Kepler tutorials at SC06,SC07,SC08,and SciDAC 2007. Our outreach activities have allowed scientists to learn best practices and better utilize Kepler to address their individual workflow problems. Our contributions to advancing the state-of-the-art in scientific workflows have focused on the following areas. Progress in each of these areas is described in subsequent sections. Workflow development. The development of amore » deeper understanding of scientific workflows "in the wild" and of the requirements for support tools that allow easy construction of complex scientific workflows; Generic workflow components and templates. The development of generic actors (i.e.workflow components and processes) which can be broadly applied to scientific problems; Provenance collection and analysis. The design of a flexible provenance collection and analysis infrastructure within the workflow environment; and, Workflow reliability and fault tolerance. The improvement of the reliability and fault-tolerance of workflow environments.« less
FluxCTTX: A LIMS-based tool for management and analysis of cytotoxicity assays data

PubMed Central

2015-01-01

Background Cytotoxicity assays have been used by researchers to screen for cytotoxicity in compound libraries. Researchers can either look for cytotoxic compounds or screen "hits" from initial high-throughput drug screens for unwanted cytotoxic effects before investing in their development as a pharmaceutical. These assays may be used as an alternative to animal experimentation and are becoming increasingly important in modern laboratories. However, the execution of these assays in large scale and different laboratories requires, among other things, the management of protocols, reagents, cell lines used as well as the data produced, which can be a challenge. The management of all this information is greatly improved by the utilization of computational tools to save time and guarantee quality. However, a tool that performs this task designed specifically for cytotoxicity assays is not yet available. Results In this work, we have used a workflow based LIMS -- the Flux system -- and the Together Workflow Editor as a framework to develop FluxCTTX, a tool for management of data from cytotoxicity assays performed at different laboratories. The main work is the development of a workflow, which represents all stages of the assay and has been developed and uploaded in Flux. This workflow models the activities of cytotoxicity assays performed as described in the OECD 129 Guidance Document. Conclusions FluxCTTX presents a solution for the management of the data produced by cytotoxicity assays performed at Interlaboratory comparisons. Its adoption will contribute to guarantee the quality of activities in the process of cytotoxicity tests and enforce the use of Good Laboratory Practices (GLP). Furthermore, the workflow developed is complete and can be adapted to other contexts and different tests for management of other types of data. PMID:26696462
Wireless remote control clinical image workflow: utilizing a PDA for offsite distribution

NASA Astrophysics Data System (ADS)

Liu, Brent J.; Documet, Luis; Documet, Jorge; Huang, H. K.; Muldoon, Jean

2004-04-01

Last year we presented in RSNA an application to perform wireless remote control of PACS image distribution utilizing a handheld device such as a Personal Digital Assistant (PDA). This paper describes the clinical experiences including workflow scenarios of implementing the PDA application to route exams from the clinical PACS archive server to various locations for offsite distribution of clinical PACS exams. By utilizing this remote control application, radiologists can manage image workflow distribution with a single wireless handheld device without impacting their clinical workflow on diagnostic PACS workstations. A PDA application was designed and developed to perform DICOM Query and C-Move requests by a physician from a clinical PACS Archive to a CD-burning device for automatic burning of PACS data for the distribution to offsite. In addition, it was also used for convenient routing of historical PACS exams to the local web server, local workstations, and teleradiology systems. The application was evaluated by radiologists as well as other clinical staff who need to distribute PACS exams to offsite referring physician"s offices and offsite radiologists. An application for image workflow management utilizing wireless technology was implemented in a clinical environment and evaluated. A PDA application was successfully utilized to perform DICOM Query and C-Move requests from the clinical PACS archive to various offsite exam distribution devices. Clinical staff can utilize the PDA to manage image workflow and PACS exam distribution conveniently for offsite consultations by referring physicians and radiologists. This solution allows the radiologist to expand their effectiveness in health care delivery both within the radiology department as well as offisite by improving their clinical workflow.
Workflow-enabled distributed component-based information architecture for digital medical imaging enterprises.

PubMed

Wong, Stephen T C; Tjandra, Donny; Wang, Huili; Shen, Weimin

2003-09-01

Few information systems today offer a flexible means to define and manage the automated part of radiology processes, which provide clinical imaging services for the entire healthcare organization. Even fewer of them provide a coherent architecture that can easily cope with heterogeneity and inevitable local adaptation of applications and can integrate clinical and administrative information to aid better clinical, operational, and business decisions. We describe an innovative enterprise architecture of image information management systems to fill the needs. Such a system is based on the interplay of production workflow management, distributed object computing, Java and Web techniques, and in-depth domain knowledge in radiology operations. Our design adapts the approach of "4+1" architectural view. In this new architecture, PACS and RIS become one while the user interaction can be automated by customized workflow process. Clinical service applications are implemented as active components. They can be reasonably substituted by applications of local adaptations and can be multiplied for fault tolerance and load balancing. Furthermore, the workflow-enabled digital radiology system would provide powerful query and statistical functions for managing resources and improving productivity. This paper will potentially lead to a new direction of image information management. We illustrate the innovative design with examples taken from an implemented system.
Use of In-Situ and Remotely Sensed Snow Observations for the National Water Model in Both an Analysis and Calibration Framework.

NASA Astrophysics Data System (ADS)

Karsten, L. R.; Gochis, D.; Dugger, A. L.; McCreight, J. L.; Barlage, M. J.; Fall, G. M.; Olheiser, C.

2017-12-01

Since version 1.0 of the National Water Model (NWM) has gone operational in Summer 2016, several upgrades to the model have occurred to improve hydrologic prediction for the continental United States. Version 1.1 of the NWM (Spring 2017) includes upgrades to parameter datasets impacting land surface hydrologic processes. These parameter datasets were upgraded using an automated calibration workflow that utilizes the Dynamic Data Search (DDS) algorithm to adjust parameter values using observed streamflow. As such, these upgrades to parameter values took advantage of various observations collected for snow analysis. In particular, in-situ SNOTEL observations in the Western US, volunteer in-situ observations across the entire US, gamma-derived snow water equivalent (SWE) observations courtesy of the NWS NOAA Corps program, gridded snow depth and SWE products from the Jet Propulsion Laboratory (JPL) Airborne Snow Observatory (ASO), gridded remotely sensed satellite-based snow products (MODIS,AMSR2,VIIRS,ATMS), and gridded SWE from the NWS Snow Data Assimilation System (SNODAS). This study explores the use of these observations to quantify NWM error and improvements from version 1.0 to version 1.1, along with subsequent work since then. In addition, this study explores the use of snow observations for use within the automated calibration workflow. Gridded parameter fields impacting the accumulation and ablation of snow states in the NWM were adjusted and calibrated using gridded remotely sensed snow states, SNODAS products, and in-situ snow observations. This calibration adjustment took place over various ecological regions in snow-dominated parts of the US for a retrospective period of time to capture a variety of climatological conditions. Specifically, the latest calibrated parameters impacting streamflow were held constant and only parameters impacting snow physics were tuned using snow observations and analysis. The adjusted parameter datasets were then used to force the model over an independent period for analysis against both snow and streamflow observations to see if improvements took place. The goal of this work is to further improve snow physics in the NWM, along with identifying areas where further work will take place in the future, such as data assimilation or further forcing improvements.
Understanding the dispensary workflow at the Birmingham Free Clinic: a proposed framework for an informatics intervention.

PubMed

Fisher, Arielle M; Herbert, Mary I; Douglas, Gerald P

2016-02-19

The Birmingham Free Clinic (BFC) in Pittsburgh, Pennsylvania, USA is a free, walk-in clinic that serves medically uninsured populations through the use of volunteer health care providers and an on-site medication dispensary. The introduction of an electronic medical record (EMR) has improved several aspects of clinic workflow. However, pharmacists' tasks involving medication management and dispensing have become more challenging since EMR implementation due to its inability to support workflows between the medical and pharmaceutical services. To inform the design of a systematic intervention, we conducted a needs assessment study to identify workflow challenges and process inefficiencies in the dispensary. We used contextual inquiry to document the dispensary workflow and facilitate identification of critical aspects of intervention design specific to the user. Pharmacists were observed according to contextual inquiry guidelines. Graphical models were produced to aid data and process visualization. We created a list of themes describing workflow challenges and asked the pharmacists to rank them in order of significance to narrow the scope of intervention design. Three pharmacists were observed at the BFC. Observer notes were documented and analyzed to produce 13 themes outlining the primary challenges pharmacists encounter during dispensation at the BFC. The dispensary workflow is labor intensive, redundant, and inefficient when integrated with the clinical service. Observations identified inefficiencies that may benefit from the introduction of informatics interventions including: medication labeling, insufficient process notification, triple documentation, and inventory control. We propose a system for Prescription Management and General Inventory Control (RxMAGIC). RxMAGIC is a framework designed to mitigate workflow challenges and improve the processes of medication management and inventory control. While RxMAGIC is described in the context of the BFC dispensary, we believe it will be generalizable to pharmacies in other low-resource settings, both domestically and internationally.
HappyFace as a generic monitoring tool for HEP experiments

NASA Astrophysics Data System (ADS)

Kawamura, Gen; Magradze, Erekle; Musheghyan, Haykuhi; Quadt, Arnulf; Rzehorz, Gerhard

2015-12-01

The importance of monitoring on HEP grid computing systems is growing due to a significant increase in their complexity. Computer scientists and administrators have been studying and building effective ways to gather information on and clarify a status of each local grid infrastructure. The HappyFace project aims at making the above-mentioned workflow possible. It aggregates, processes and stores the information and the status of different HEP monitoring resources into the common database of HappyFace. The system displays the information and the status through a single interface. However, this model of HappyFace relied on the monitoring resources which are always under development in the HEP experiments. Consequently, HappyFace needed to have direct access methods to the grid application and grid service layers in the different HEP grid systems. To cope with this issue, we use a reliable HEP software repository, the CernVM File System. We propose a new implementation and an architecture of HappyFace, the so-called grid-enabled HappyFace. It allows its basic framework to connect directly to the grid user applications and the grid collective services, without involving the monitoring resources in the HEP grid systems. This approach gives HappyFace several advantages: Portability, to provide an independent and generic monitoring system among the HEP grid systems. Eunctionality, to allow users to perform various diagnostic tools in the individual HEP grid systems and grid sites. Elexibility, to make HappyFace beneficial and open for the various distributed grid computing environments. Different grid-enabled modules, to connect to the Ganga job monitoring system and to check the performance of grid transfers among the grid sites, have been implemented. The new HappyFace system has been successfully integrated and now it displays the information and the status of both the monitoring resources and the direct access to the grid user applications and the grid collective services.
Developing integrated workflows for the digitisation of herbarium specimens using a modular and scalable approach.

PubMed

Haston, Elspeth; Cubey, Robert; Pullan, Martin; Atkins, Hannah; Harris, David J

2012-01-01

Digitisation programmes in many institutes frequently involve disparate and irregular funding, diverse selection criteria and scope, with different members of staff managing and operating the processes. These factors have influenced the decision at the Royal Botanic Garden Edinburgh to develop an integrated workflow for the digitisation of herbarium specimens which is modular and scalable to enable a single overall workflow to be used for all digitisation projects. This integrated workflow is comprised of three principal elements: a specimen workflow, a data workflow and an image workflow.The specimen workflow is strongly linked to curatorial processes which will impact on the prioritisation, selection and preparation of the specimens. The importance of including a conservation element within the digitisation workflow is highlighted. The data workflow includes the concept of three main categories of collection data: label data, curatorial data and supplementary data. It is shown that each category of data has its own properties which influence the timing of data capture within the workflow. Development of software has been carried out for the rapid capture of curatorial data, and optical character recognition (OCR) software is being used to increase the efficiency of capturing label data and supplementary data. The large number and size of the images has necessitated the inclusion of automated systems within the image workflow.
Building capacity for in-situ phenological observation data to support integrated biodiversity information at local to national scales

NASA Astrophysics Data System (ADS)

Weltzin, J. F.

2016-12-01

Earth observations from a variety of platforms and across a range of scales are required to support research, natural resource management, and policy- and decision-making in a changing world. Integrated earth observation data provides multi-faceted information critical to decision support, vulnerability and change detection, risk assessments, early warning and modeling, simulation and forecasting in the natural resource societal benefit area. The USA National Phenology Network (USA-NPN; www.usanpn.org) is a national-scale science and monitoring initiative focused on phenology - the study of seasonal life-cycle events such as leafing, flowering, reproduction, and migration - as a tool to understand the response of biodiversity to environmental variation and change. USA-NPN provides a hierarchical, national monitoring framework that enables other organizations to leverage the capacity of the Network for their own applications - minimizing investment and duplication of effort - while promoting interoperability and sustainability. Over the last decade, the network has focused on the development of a centralized database for in-situ (ground based) observations of plants and animals, now with 8 M records for the period 1954-present. More recently, we have developed a workflow for the production and validation of spatially gridded phenology products based on models that couple the organismal data with climatological and meteorological data at daily time-steps and relatively fine spatial resolutions ( 2.5 km to 4 km). These gridded data are now ripe for integration with other modeled or earth observation gridded data, e.g., indices of drought impact or land surface reflectance. This greatly broadens capacity to scale organismal observational data to landscapes and regions, and enables novel investigations of biophysical interactions at unprecedented scales, e.g., continental-scale migrations. Sustainability emerges from identification of stakeholder needs, segmentation of target audiences (e.g., data contributors, data consumers), documentation of all aspects of the data production and delivery process, development of collaborative partnerships, enterprise approaches to information management, and excellent customer service.
Managing and Communicating Operational Workflow

PubMed Central

Weinberg, Stuart T.; Danciu, Ioana; Unertl, Kim M.

2016-01-01

Summary Background Healthcare team members in emergency department contexts have used electronic whiteboard solutions to help manage operational workflow for many years. Ambulatory clinic settings have highly complex operational workflow, but are still limited in electronic assistance to communicate and coordinate work activities. Objective To describe and discuss the design, implementation, use, and ongoing evolution of a coordination and collaboration tool supporting ambulatory clinic operational workflow at Vanderbilt University Medical Center (VUMC). Methods The outpatient whiteboard tool was initially designed to support healthcare work related to an electronic chemotherapy order-entry application. After a highly successful initial implementation in an oncology context, a high demand emerged across the organization for the outpatient whiteboard implementation. Over the past 10 years, developers have followed an iterative user-centered design process to evolve the tool. Results The electronic outpatient whiteboard system supports 194 separate whiteboards and is accessed by over 2800 distinct users on a typical day. Clinics can configure their whiteboards to support unique workflow elements. Since initial release, features such as immunization clinical decision support have been integrated into the system, based on requests from end users. Conclusions The success of the electronic outpatient whiteboard demonstrates the usefulness of an operational workflow tool within the ambulatory clinic setting. Operational workflow tools can play a significant role in supporting coordination, collaboration, and teamwork in ambulatory healthcare settings. PMID:27081407
From DNS to RANS: A Multi-model workflow to understand the Influence of Hurricanes on Generating Turbidity Currents in the Gulf of Mexico

NASA Astrophysics Data System (ADS)

Syvitski, J. P.; Arango, H.; Harris, C. K.; Meiburg, E. H.; Jenkins, C. J.; Auad, G.; Hutton, E.; Kniskern, T. A.; Radhakrishnan, S.

2016-12-01

A loosely coupled numerical workflow is developed to address land-sea pathways for sediment routing from terrestrial and coastal sources, across the continental shelf and ultimately down the continental slope canyon system of the northern Gulf of Mexico (GOM). Model simulations represent a range of environmental conditions that might lead to the generation of turbidity-currents. The workflow comprises: 1) A simulator for the water and sediment discharged from rivers into the GOM with WMBsedv2 with calibration using USGS and USACE gauged river data; 2) Domain grids and bathymetry (ETOPO2) for the ocean models and realistic seabed sediment texture grids (dbSEABED) for the sediment transport models; 3) A spectral wave action simulator (10 km resolution) (WaveWatch III) driven by GFDL - GFS winds; 4) A simulator for ocean dynamics (ROMS) forced with ECMWF ERA winds; 5) A simulator for seafloor resuspension and transport (CSTMS); 6) Simulators (HurriSlip) of seafloor failure and flow ignition locations for boundary input to a turbidity current model; and 7) A RANS turbidity current model (TURBINS) to route sediment flows down GOM canyons, providing estimates of bottom shear stresses. TURBINS was developed first as a DNS model and then converted to an LES model wherein a dynamic turbulence closure scheme was employed. Like most DNS to LES model comparisons (these being done by the UCSB team), turbulence scaling allowed for higher Re applications but were found still not capable of simulating field scale (GOM continental canyons) environments. The LES model was next converted to a non-hydrostatic RANS model capable of field scale applications but only with a daisy-chain approach to multiple model runs along the simulated canyon floor. These model adaptations allowed the workflow to be tested for the year 1-Oct-2007 to 30-Sep-2008 that included two domain Hurricanes (Ike and Gustav). The RANS-TURBINS employed further boundary simplifications on both sediment erosion and deposition in line with the ocean model ROMS-CSTMS.
The future of scientific workflows

DOE Office of Scientific and Technical Information (OSTI.GOV)

Deelman, Ewa; Peterka, Tom; Altintas, Ilkay

Today’s computational, experimental, and observational sciences rely on computations that involve many related tasks. The success of a scientific mission often hinges on the computer automation of these workflows. In April 2015, the US Department of Energy (DOE) invited a diverse group of domain and computer scientists from national laboratories supported by the Office of Science, the National Nuclear Security Administration, from industry, and from academia to review the workflow requirements of DOE’s science and national security missions, to assess the current state of the art in science workflows, to understand the impact of emerging extreme-scale computing systems on thosemore » workflows, and to develop requirements for automated workflow management in future and existing environments. This article is a summary of the opinions of over 50 leading researchers attending this workshop. We highlight use cases, computing systems, workflow needs and conclude by summarizing the remaining challenges this community sees that inhibit large-scale scientific workflows from becoming a mainstream tool for extreme-scale science.« less
Applications of the pipeline environment for visual informatics and genomics computations

PubMed Central

2011-01-01

Background Contemporary informatics and genomics research require efficient, flexible and robust management of large heterogeneous data, advanced computational tools, powerful visualization, reliable hardware infrastructure, interoperability of computational resources, and detailed data and analysis-protocol provenance. The Pipeline is a client-server distributed computational environment that facilitates the visual graphical construction, execution, monitoring, validation and dissemination of advanced data analysis protocols. Results This paper reports on the applications of the LONI Pipeline environment to address two informatics challenges - graphical management of diverse genomics tools, and the interoperability of informatics software. Specifically, this manuscript presents the concrete details of deploying general informatics suites and individual software tools to new hardware infrastructures, the design, validation and execution of new visual analysis protocols via the Pipeline graphical interface, and integration of diverse informatics tools via the Pipeline eXtensible Markup Language syntax. We demonstrate each of these processes using several established informatics packages (e.g., miBLAST, EMBOSS, mrFAST, GWASS, MAQ, SAMtools, Bowtie) for basic local sequence alignment and search, molecular biology data analysis, and genome-wide association studies. These examples demonstrate the power of the Pipeline graphical workflow environment to enable integration of bioinformatics resources which provide a well-defined syntax for dynamic specification of the input/output parameters and the run-time execution controls. Conclusions The LONI Pipeline environment http://pipeline.loni.ucla.edu provides a flexible graphical infrastructure for efficient biomedical computing and distributed informatics research. The interactive Pipeline resource manager enables the utilization and interoperability of diverse types of informatics resources. The Pipeline client-server model provides computational power to a broad spectrum of informatics investigators - experienced developers and novice users, user with or without access to advanced computational-resources (e.g., Grid, data), as well as basic and translational scientists. The open development, validation and dissemination of computational networks (pipeline workflows) facilitates the sharing of knowledge, tools, protocols and best practices, and enables the unbiased validation and replication of scientific findings by the entire community. PMID:21791102
e-Science on Earthquake Disaster Mitigation by EUAsiaGrid

NASA Astrophysics Data System (ADS)

Yen, Eric; Lin, Simon; Chen, Hsin-Yen; Chao, Li; Huang, Bor-Shoh; Liang, Wen-Tzong

2010-05-01

Although earthquake is not predictable at this moment, with the aid of accurate seismic wave propagation analysis, we could simulate the potential hazards at all distances from possible fault sources by understanding the source rupture process during large earthquakes. With the integration of strong ground-motion sensor network, earthquake data center and seismic wave propagation analysis over gLite e-Science Infrastructure, we could explore much better knowledge on the impact and vulnerability of potential earthquake hazards. On the other hand, this application also demonstrated the e-Science way to investigate unknown earth structure. Regional integration of earthquake sensor networks could aid in fast event reporting and accurate event data collection. Federation of earthquake data center entails consolidation and sharing of seismology and geology knowledge. Capability building of seismic wave propagation analysis implies the predictability of potential hazard impacts. With gLite infrastructure and EUAsiaGrid collaboration framework, earth scientists from Taiwan, Vietnam, Philippine, Thailand are working together to alleviate potential seismic threats by making use of Grid technologies and also to support seismology researches by e-Science. A cross continental e-infrastructure, based on EGEE and EUAsiaGrid, is established for seismic wave forward simulation and risk estimation. Both the computing challenge on seismic wave analysis among 5 European and Asian partners, and the data challenge for data center federation had been exercised and verified. Seismogram-on-Demand service is also developed for the automatic generation of seismogram on any sensor point to a specific epicenter. To ease the access to all the services based on users workflow and retain the maximal flexibility, a Seismology Science Gateway integating data, computation, workflow, services and user communities would be implemented based on typical use cases. In the future, extension of the earthquake wave propagation to tsunami mitigation would be feasible once the user community support is in place.

Schedule-Aware Workflow Management Systems

NASA Astrophysics Data System (ADS)

Mans, Ronny S.; Russell, Nick C.; van der Aalst, Wil M. P.; Moleman, Arnold J.; Bakker, Piet J. M.

Contemporary workflow management systems offer work-items to users through specific work-lists. Users select the work-items they will perform without having a specific schedule in mind. However, in many environments work needs to be scheduled and performed at particular times. For example, in hospitals many work-items are linked to appointments, e.g., a doctor cannot perform surgery without reserving an operating theater and making sure that the patient is present. One of the problems when applying workflow technology in such domains is the lack of calendar-based scheduling support. In this paper, we present an approach that supports the seamless integration of unscheduled (flow) and scheduled (schedule) tasks. Using CPN Tools we have developed a specification and simulation model for schedule-aware workflow management systems. Based on this a system has been realized that uses YAWL, Microsoft Exchange Server 2007, Outlook, and a dedicated scheduling service. The approach is illustrated using a real-life case study at the AMC hospital in the Netherlands. In addition, we elaborate on the experiences obtained when developing and implementing a system of this scale using formal techniques.
Automated lattice data generation

NASA Astrophysics Data System (ADS)

Ayyar, Venkitesh; Hackett, Daniel C.; Jay, William I.; Neil, Ethan T.

2018-03-01

The process of generating ensembles of gauge configurations (and measuring various observables over them) can be tedious and error-prone when done "by hand". In practice, most of this procedure can be automated with the use of a workflow manager. We discuss how this automation can be accomplished using Taxi, a minimal Python-based workflow manager built for generating lattice data. We present a case study demonstrating this technology.
qPortal: A platform for data-driven biomedical research.

PubMed

Mohr, Christopher; Friedrich, Andreas; Wojnar, David; Kenar, Erhan; Polatkan, Aydin Can; Codrea, Marius Cosmin; Czemmel, Stefan; Kohlbacher, Oliver; Nahnsen, Sven

2018-01-01

Modern biomedical research aims at drawing biological conclusions from large, highly complex biological datasets. It has become common practice to make extensive use of high-throughput technologies that produce big amounts of heterogeneous data. In addition to the ever-improving accuracy, methods are getting faster and cheaper, resulting in a steadily increasing need for scalable data management and easily accessible means of analysis. We present qPortal, a platform providing users with an intuitive way to manage and analyze quantitative biological data. The backend leverages a variety of concepts and technologies, such as relational databases, data stores, data models and means of data transfer, as well as front-end solutions to give users access to data management and easy-to-use analysis options. Users are empowered to conduct their experiments from the experimental design to the visualization of their results through the platform. Here, we illustrate the feature-rich portal by simulating a biomedical study based on publically available data. We demonstrate the software's strength in supporting the entire project life cycle. The software supports the project design and registration, empowers users to do all-digital project management and finally provides means to perform analysis. We compare our approach to Galaxy, one of the most widely used scientific workflow and analysis platforms in computational biology. Application of both systems to a small case study shows the differences between a data-driven approach (qPortal) and a workflow-driven approach (Galaxy). qPortal, a one-stop-shop solution for biomedical projects offers up-to-date analysis pipelines, quality control workflows, and visualization tools. Through intensive user interactions, appropriate data models have been developed. These models build the foundation of our biological data management system and provide possibilities to annotate data, query metadata for statistics and future re-analysis on high-performance computing systems via coupling of workflow management systems. Integration of project and data management as well as workflow resources in one place present clear advantages over existing solutions.
RISA: Remote Interface for Science Analysis

NASA Astrophysics Data System (ADS)

Gabriel, C.; Ibarra, A.; de La Calle, I.; Salgado, J.; Osuna, P.; Tapiador, D.

2008-08-01

The Scientific Analysis System (SAS) is the package for interactive and pipeline data reduction of all XMM-Newton data. Freely distributed by ESA to run under many different operating systems, the SAS has been used by almost every one of the 1600 refereed scientific publications obtained so far from the mission. We are developing RISA, the Remote Interface for Science Analysis, which makes it possible to run SAS through fully configurable web service workflows, enabling observers to access and analyse data making use of all of the existing SAS functionalities, without any installation/download of software/data. The workflows run primarily but not exclusively on the ESAC Grid, which offers scalable processing resources, directly connected to the XMM-Newton Science Archive. A first project internal version of RISA was issued in May 2007, a public release is expected already within this year.
Big Data Challenges in Global Seismic 'Adjoint Tomography' (Invited)

NASA Astrophysics Data System (ADS)

Tromp, J.; Bozdag, E.; Krischer, L.; Lefebvre, M.; Lei, W.; Smith, J.

2013-12-01

The challenge of imaging Earth's interior on a global scale is closely linked to the challenge of handling large data sets. The related iterative workflow involves five distinct phases, namely, 1) data gathering and culling, 2) synthetic seismogram calculations, 3) pre-processing (time-series analysis and time-window selection), 4) data assimilation and adjoint calculations, 5) post-processing (pre-conditioning, regularization, model update). In order to implement this workflow on modern high-performance computing systems, a new seismic data format is being developed. The Adaptable Seismic Data Format (ASDF) is designed to replace currently used data formats with a more flexible format that allows for fast parallel I/O. The metadata is divided into abstract categories, such as "source" and "receiver", along with provenance information for complete reproducibility. The structure of ASDF is designed keeping in mind three distinct applications: earthquake seismology, seismic interferometry, and exploration seismology. Existing time-series analysis tool kits, such as SAC and ObsPy, can be easily interfaced with ASDF so that seismologists can use robust, previously developed software packages. ASDF accommodates an automated, efficient workflow for global adjoint tomography. Manually managing the large number of simulations associated with the workflow can rapidly become a burden, especially with increasing numbers of earthquakes and stations. Therefore, it is of importance to investigate the possibility of automating the entire workflow. Scientific Workflow Management Software (SWfMS) allows users to execute workflows almost routinely. SWfMS provides additional advantages. In particular, it is possible to group independent simulations in a single job to fit the available computational resources. They also give a basic level of fault resilience as the workflow can be resumed at the correct state preceding a failure. Some of the best candidates for our particular workflow are Kepler and Swift, and the latter appears to be the most serious candidate for a large-scale workflow on a single supercomputer, remaining sufficiently simple to accommodate further modifications and improvements.
Developing integrated workflows for the digitisation of herbarium specimens using a modular and scalable approach

PubMed Central

Haston, Elspeth; Cubey, Robert; Pullan, Martin; Atkins, Hannah; Harris, David J

2012-01-01

Abstract Digitisation programmes in many institutes frequently involve disparate and irregular funding, diverse selection criteria and scope, with different members of staff managing and operating the processes. These factors have influenced the decision at the Royal Botanic Garden Edinburgh to develop an integrated workflow for the digitisation of herbarium specimens which is modular and scalable to enable a single overall workflow to be used for all digitisation projects. This integrated workflow is comprised of three principal elements: a specimen workflow, a data workflow and an image workflow. The specimen workflow is strongly linked to curatorial processes which will impact on the prioritisation, selection and preparation of the specimens. The importance of including a conservation element within the digitisation workflow is highlighted. The data workflow includes the concept of three main categories of collection data: label data, curatorial data and supplementary data. It is shown that each category of data has its own properties which influence the timing of data capture within the workflow. Development of software has been carried out for the rapid capture of curatorial data, and optical character recognition (OCR) software is being used to increase the efficiency of capturing label data and supplementary data. The large number and size of the images has necessitated the inclusion of automated systems within the image workflow. PMID:22859881
3D interactive forward and inversion gravity modelling at different scales: From subduction zone modelling to cavity detection.

NASA Astrophysics Data System (ADS)

Götze, Hans-Jürgen; Schmidt, Sabine

2014-05-01

Modern geophysical interpretation requires an interdisciplinary approach, particularly when considering the available amount of 'state of the art' information. A combination of different geophysical surveys employing seismic, gravity and EM, together with geological and petrological studies, can provide new insights into the structures and tectonic evolution of the lithosphere, natural deposits and underground cavities. Interdisciplinary interpretation is essential for any numerical modelling of these structures and the processes acting on them Interactive gravity and magnetic modeling can play an important role in the depth imaging workflow of complex projects. The integration of the workflow and the tools is important to meet the needs of today's more interactive and interpretative depth imaging workflows. For the integration of gravity and magnetic models the software IGMAS+ can play an important role in this workflow. For simplicity the focus is on gravity modeling, but all methods can be applied to the modeling of magnetic data as well. Currently there are three common ways to define a 3D gravity model. Grid based models: Grids define the different geological units. The densities of the geological units are constant. Additional grids can be introduced to subdivide the geological units, making it possible to represent density depth relations. Polyhedral models: The interfaces between different geological units are defined by polyhedral, typically triangles. Voxel models: Each voxel in a regular cube has a density assigned. Spherical Earth modeling: Geophysical investigations may cover huge areas of several thousand square kilometers. The depression of the earth's surface due to the curvature of the Earth is 3 km at a distance of 200 km and 20 km at a distance of 500 km. Interactive inversion: Inversion is typically done in batch where constraints are defined beforehand and then after a few minutes or hours a model fitting the data and constraints is generated. As examples I show results from the Central Andes and the North Sea. Both gravity and geoid of the two areas were investigated with regard to their isostatic state, the crustal density structure and rigidity of the Lithosphere. Modern satellite measurements of the recent ESA campaigns are compared to ground observations in the region. Estimates of stress and GPE (gravitational potential energy) at the western South American margin have been derived from an existing 3D density model. Here, sensitivity studies of gravity and gravity gradients indicate that short wavelength lithospheric structures are more pronounced in the gravity gradient tensor than in the gravity field. A medium size example of the North Sea underground demonstrates how interdisciplinary data sets can support aero gravity investigations. At the micro scale an example from the detection of a crypt (Alversdorf, Northern Germany) is shown.
The Archive Solution for Distributed Workflow Management Agents of the CMS Experiment at LHC

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kuznetsov, Valentin; Fischer, Nils Leif; Guo, Yuyi

The CMS experiment at the CERN LHC developed the Workflow Management Archive system to persistently store unstructured framework job report documents produced by distributed workflow management agents. In this paper we present its architecture, implementation, deployment, and integration with the CMS and CERN computing infrastructures, such as central HDFS and Hadoop Spark cluster. The system leverages modern technologies such as a document oriented database and the Hadoop eco-system to provide the necessary flexibility to reliably process, store, and aggregatemore » $$\\mathcal{O}$$(1M) documents on a daily basis. We describe the data transformation, the short and long term storage layers, the query language, along with the aggregation pipeline developed to visualize various performance metrics to assist CMS data operators in assessing the performance of the CMS computing system.« less
The Archive Solution for Distributed Workflow Management Agents of the CMS Experiment at LHC

DOE PAGES

Kuznetsov, Valentin; Fischer, Nils Leif; Guo, Yuyi

2018-03-19

The CMS experiment at the CERN LHC developed the Workflow Management Archive system to persistently store unstructured framework job report documents produced by distributed workflow management agents. In this paper we present its architecture, implementation, deployment, and integration with the CMS and CERN computing infrastructures, such as central HDFS and Hadoop Spark cluster. The system leverages modern technologies such as a document oriented database and the Hadoop eco-system to provide the necessary flexibility to reliably process, store, and aggregatemore » $$\\mathcal{O}$$(1M) documents on a daily basis. We describe the data transformation, the short and long term storage layers, the query language, along with the aggregation pipeline developed to visualize various performance metrics to assist CMS data operators in assessing the performance of the CMS computing system.« less
A patient workflow management system built on guidelines.

PubMed Central

Dazzi, L.; Fassino, C.; Saracco, R.; Quaglini, S.; Stefanelli, M.

1997-01-01

To provide high quality, shared, and distributed medical care, clinical and organizational issues need to be integrated. This work describes a methodology for developing a Patient Workflow Management System, based on a detailed model of both the medical work process and the organizational structure. We assume that the medical work process is represented through clinical practice guidelines, and that an ontological description of the organization is available. Thus, we developed tools 1) for acquiring the medical knowledge contained into a guideline, 2) to translate the derived formalized guideline into a computational formalism, precisely a Petri Net, 3) to maintain different representation levels. The high level representation guarantees that the Patient Workflow follows the guideline prescriptions, while the low level takes into account the specific organization characteristics and allow allocating resources for managing a specific patient in daily practice. PMID:9357606
Rethinking Clinical Workflow.

PubMed

Schlesinger, Joseph J; Burdick, Kendall; Baum, Sarah; Bellomy, Melissa; Mueller, Dorothee; MacDonald, Alistair; Chern, Alex; Chrouser, Kristin; Burger, Christie

2018-03-01

The concept of clinical workflow borrows from management and leadership principles outside of medicine. The only way to rethink clinical workflow is to understand the neuroscience principles that underlie attention and vigilance. With any implementation to improve practice, there are human factors that can promote or impede progress. Modulating the environment and working as a team to take care of patients is paramount. Clinicians must continually rethink clinical workflow, evaluate progress, and understand that other industries have something to offer. Then, novel approaches can be implemented to take the best care of patients. Copyright © 2017 Elsevier Inc. All rights reserved.
Web-Accessible Scientific Workflow System for Performance Monitoring

DOE Office of Scientific and Technical Information (OSTI.GOV)

Roelof Versteeg; Roelof Versteeg; Trevor Rowe

2006-03-01

We describe the design and implementation of a web accessible scientific workflow system for environmental monitoring. This workflow environment integrates distributed, automated data acquisition with server side data management and information visualization through flexible browser based data access tools. Component technologies include a rich browser-based client (using dynamic Javascript and HTML/CSS) for data selection, a back-end server which uses PHP for data processing, user management, and result delivery, and third party applications which are invoked by the back-end using webservices. This environment allows for reproducible, transparent result generation by a diverse user base. It has been implemented for several monitoringmore » systems with different degrees of complexity.« less
An Extensible Information Grid for Risk Management

NASA Technical Reports Server (NTRS)

Maluf, David A.; Bell, David G.

2003-01-01

This paper describes recent work on developing an extensible information grid for risk management at NASA - a RISK INFORMATION GRID. This grid is being developed by integrating information grid technology with risk management processes for a variety of risk related applications. To date, RISK GRID applications are being developed for three main NASA processes: risk management - a closed-loop iterative process for explicit risk management, program/project management - a proactive process that includes risk management, and mishap management - a feedback loop for learning from historical risks that escaped other processes. This is enabled through an architecture involving an extensible database, structuring information with XML, schemaless mapping of XML, and secure server-mediated communication using standard protocols.
Workflow Management for Complex HEP Analyses

NASA Astrophysics Data System (ADS)

Erdmann, M.; Fischer, R.; Rieger, M.; von Cube, R. F.

2017-10-01

We present the novel Analysis Workflow Management (AWM) that provides users with the tools and competences of professional large scale workflow systems, e.g. Apache’s Airavata[1]. The approach presents a paradigm shift from executing parts of the analysis to defining the analysis. Within AWM an analysis consists of steps. For example, a step defines to run a certain executable for multiple files of an input data collection. Each call to the executable for one of those input files can be submitted to the desired run location, which could be the local computer or a remote batch system. An integrated software manager enables automated user installation of dependencies in the working directory at the run location. Each execution of a step item creates one report for bookkeeping purposes containing error codes and output data or file references. Required files, e.g. created by previous steps, are retrieved automatically. Since data storage and run locations are exchangeable from the steps perspective, computing resources can be used opportunistically. A visualization of the workflow as a graph of the steps in the web browser provides a high-level view on the analysis. The workflow system is developed and tested alongside of a ttbb cross section measurement where, for instance, the event selection is represented by one step and a Bayesian statistical inference is performed by another. The clear interface and dependencies between steps enables a make-like execution of the whole analysis.
From chart tracking to workflow management.

PubMed Central

Srinivasan, P.; Vignes, G.; Venable, C.; Hazelwood, A.; Cade, T.

1994-01-01

The current interest in system-wide integration appears to be based on the assumption that an organization, by digitizing information and accepting a common standard for the exchange of such information, will improve the accessibility of this information and automatically experience benefits resulting from its more productive use. We do not dispute this reasoning, but assert that an organization's capacity for effective change is proportional to the understanding of the current structure among its personnel. Our workflow manager is based on the use of a Parameterized Petri Net (PPN) model which can be configured to represent an arbitrarily detailed picture of an organization. The PPN model can be animated to observe the model organization in action, and the results of the animation analyzed. This simulation is a dynamic ongoing process which changes with the system and allows members of the organization to pose "what if" questions as a means of exploring opportunities for change. We present, the "workflow management system" as the natural successor to the tracking program, incorporating modeling, scheduling, reactive planning, performance evaluation, and simulation. This workflow management system is more than adequate for meeting the needs of a paper chart tracking system, and, as the patient record is computerized, will serve as a planning and evaluation tool in converting the paper-based health information system into a computer-based system. PMID:7950051
Dynamic Voltage Frequency Scaling Simulator for Real Workflows Energy-Aware Management in Green Cloud Computing

PubMed Central

Cotes-Ruiz, Iván Tomás; Prado, Rocío P.; García-Galán, Sebastián; Muñoz-Expósito, José Enrique; Ruiz-Reyes, Nicolás

2017-01-01

Nowadays, the growing computational capabilities of Cloud systems rely on the reduction of the consumed power of their data centers to make them sustainable and economically profitable. The efficient management of computing resources is at the heart of any energy-aware data center and of special relevance is the adaptation of its performance to workload. Intensive computing applications in diverse areas of science generate complex workload called workflows, whose successful management in terms of energy saving is still at its beginning. WorkflowSim is currently one of the most advanced simulators for research on workflows processing, offering advanced features such as task clustering and failure policies. In this work, an expected power-aware extension of WorkflowSim is presented. This new tool integrates a power model based on a computing-plus-communication design to allow the optimization of new management strategies in energy saving considering computing, reconfiguration and networks costs as well as quality of service, and it incorporates the preeminent strategy for on host energy saving: Dynamic Voltage Frequency Scaling (DVFS). The simulator is designed to be consistent in different real scenarios and to include a wide repertory of DVFS governors. Results showing the validity of the simulator in terms of resources utilization, frequency and voltage scaling, power, energy and time saving are presented. Also, results achieved by the intra-host DVFS strategy with different governors are compared to those of the data center using a recent and successful DVFS-based inter-host scheduling strategy as overlapped mechanism to the DVFS intra-host technique. PMID:28085932
Dynamic Voltage Frequency Scaling Simulator for Real Workflows Energy-Aware Management in Green Cloud Computing.

PubMed

Cotes-Ruiz, Iván Tomás; Prado, Rocío P; García-Galán, Sebastián; Muñoz-Expósito, José Enrique; Ruiz-Reyes, Nicolás

2017-01-01

Nowadays, the growing computational capabilities of Cloud systems rely on the reduction of the consumed power of their data centers to make them sustainable and economically profitable. The efficient management of computing resources is at the heart of any energy-aware data center and of special relevance is the adaptation of its performance to workload. Intensive computing applications in diverse areas of science generate complex workload called workflows, whose successful management in terms of energy saving is still at its beginning. WorkflowSim is currently one of the most advanced simulators for research on workflows processing, offering advanced features such as task clustering and failure policies. In this work, an expected power-aware extension of WorkflowSim is presented. This new tool integrates a power model based on a computing-plus-communication design to allow the optimization of new management strategies in energy saving considering computing, reconfiguration and networks costs as well as quality of service, and it incorporates the preeminent strategy for on host energy saving: Dynamic Voltage Frequency Scaling (DVFS). The simulator is designed to be consistent in different real scenarios and to include a wide repertory of DVFS governors. Results showing the validity of the simulator in terms of resources utilization, frequency and voltage scaling, power, energy and time saving are presented. Also, results achieved by the intra-host DVFS strategy with different governors are compared to those of the data center using a recent and successful DVFS-based inter-host scheduling strategy as overlapped mechanism to the DVFS intra-host technique.
Scheduling Multilevel Deadline-Constrained Scientific Workflows on Clouds Based on Cost Optimization

DOE PAGES

Malawski, Maciej; Figiela, Kamil; Bubak, Marian; ...

2015-01-01

This paper presents a cost optimization model for scheduling scientific workflows on IaaS clouds such as Amazon EC2 or RackSpace. We assume multiple IaaS clouds with heterogeneous virtual machine instances, with limited number of instances per cloud and hourly billing. Input and output data are stored on a cloud object store such as Amazon S3. Applications are scientific workflows modeled as DAGs as in the Pegasus Workflow Management System. We assume that tasks in the workflows are grouped into levels of identical tasks. Our model is specified using mathematical programming languages (AMPL and CMPL) and allows us to minimize themore » cost of workflow execution under deadline constraints. We present results obtained using our model and the benchmark workflows representing real scientific applications in a variety of domains. The data used for evaluation come from the synthetic workflows and from general purpose cloud benchmarks, as well as from the data measured in our own experiments with Montage, an astronomical application, executed on Amazon EC2 cloud. We indicate how this model can be used for scenarios that require resource planning for scientific workflows and their ensembles.« less
Digitization workflows for flat sheets and packets of plants, algae, and fungi1

PubMed Central

Nelson, Gil; Sweeney, Patrick; Wallace, Lisa E.; Rabeler, Richard K.; Allard, Dorothy; Brown, Herrick; Carter, J. Richard; Denslow, Michael W.; Ellwood, Elizabeth R.; Germain-Aubrey, Charlotte C.; Gilbert, Ed; Gillespie, Emily; Goertzen, Leslie R.; Legler, Ben; Marchant, D. Blaine; Marsico, Travis D.; Morris, Ashley B.; Murrell, Zack; Nazaire, Mare; Neefus, Chris; Oberreiter, Shanna; Paul, Deborah; Ruhfel, Brad R.; Sasek, Thomas; Shaw, Joey; Soltis, Pamela S.; Watson, Kimberly; Weeks, Andrea; Mast, Austin R.

2015-01-01

Effective workflows are essential components in the digitization of biodiversity specimen collections. To date, no comprehensive, community-vetted workflows have been published for digitizing flat sheets and packets of plants, algae, and fungi, even though latest estimates suggest that only 33% of herbarium specimens have been digitally transcribed, 54% of herbaria use a specimen database, and 24% are imaging specimens. In 2012, iDigBio, the U.S. National Science Foundation’s (NSF) coordinating center and national resource for the digitization of public, nonfederal U.S. collections, launched several working groups to address this deficiency. Here, we report the development of 14 workflow modules with 7–36 tasks each. These workflows represent the combined work of approximately 35 curators, directors, and collections managers representing more than 30 herbaria, including 15 NSF-supported plant-related Thematic Collections Networks and collaboratives. The workflows are provided for download as Portable Document Format (PDF) and Microsoft Word files. Customization of these workflows for specific institutional implementation is encouraged. PMID:26421256
Real-Time System for Water Modeling and Management

NASA Astrophysics Data System (ADS)

Lee, J.; Zhao, T.; David, C. H.; Minsker, B.

2012-12-01

Working closely with the Texas Commission on Environmental Quality (TCEQ) and the University of Texas at Austin (UT-Austin), we are developing a real-time system for water modeling and management using advanced cyberinfrastructure, data integration and geospatial visualization, and numerical modeling. The state of Texas suffered a severe drought in 2011 that cost the state $7.62 billion in agricultural losses (crops and livestock). Devastating situations such as this could potentially be avoided with better water modeling and management strategies that incorporate state of the art simulation and digital data integration. The goal of the project is to prototype a near-real-time decision support system for river modeling and management in Texas that can serve as a national and international model to promote more sustainable and resilient water systems. The system uses National Weather Service current and predicted precipitation data as input to the Noah-MP Land Surface model, which forecasts runoff, soil moisture, evapotranspiration, and water table levels given land surface features. These results are then used by a river model called RAPID, along with an error model currently under development at UT-Austin, to forecast stream flows in the rivers. Model forecasts are visualized as a Web application for TCEQ decision makers, who issue water diversion (withdrawal) permits and any needed drought restrictions; permit holders; and reservoir operation managers. Users will be able to adjust model parameters to predict the impacts of alternative curtailment scenarios or weather forecasts. A real-time optimization system under development will help TCEQ to identify optimal curtailment strategies to minimize impacts on permit holders and protect health and safety. To develop the system we have implemented RAPID as a remotely-executed modeling service using the Cyberintegrator workflow system with input data downloaded from the North American Land Data Assimilation System. The Cyberintegrator workflow system provides RESTful web services for users to provide inputs, execute workflows, and retrieve outputs. Along with REST endpoints, PAW (Publishable Active Workflows) provides the web user interface toolkit for us to develop web applications with scientific workflows. The prototype web application is built on top of workflows with PAW, so that users will have a user-friendly web environment to provide input parameters, execute the model, and visualize/retrieve the results using geospatial mapping tools. In future work the optimization model will be developed and integrated into the workflow.; Real-Time System for Water Modeling and Management

Opportunistic Computing with Lobster: Lessons Learned from Scaling up to 25k Non-Dedicated Cores

NASA Astrophysics Data System (ADS)

Wolf, Matthias; Woodard, Anna; Li, Wenzhao; Hurtado Anampa, Kenyi; Yannakopoulos, Anna; Tovar, Benjamin; Donnelly, Patrick; Brenner, Paul; Lannon, Kevin; Hildreth, Mike; Thain, Douglas

2017-10-01

We previously described Lobster, a workflow management tool for exploiting volatile opportunistic computing resources for computation in HEP. We will discuss the various challenges that have been encountered while scaling up the simultaneous CPU core utilization and the software improvements required to overcome these challenges. Categories: Workflows can now be divided into categories based on their required system resources. This allows the batch queueing system to optimize assignment of tasks to nodes with the appropriate capabilities. Within each category, limits can be specified for the number of running jobs to regulate the utilization of communication bandwidth. System resource specifications for a task category can now be modified while a project is running, avoiding the need to restart the project if resource requirements differ from the initial estimates. Lobster now implements time limits on each task category to voluntarily terminate tasks. This allows partially completed work to be recovered. Workflow dependency specification: One workflow often requires data from other workflows as input. Rather than waiting for earlier workflows to be completed before beginning later ones, Lobster now allows dependent tasks to begin as soon as sufficient input data has accumulated. Resource monitoring: Lobster utilizes a new capability in Work Queue to monitor the system resources each task requires in order to identify bottlenecks and optimally assign tasks. The capability of the Lobster opportunistic workflow management system for HEP computation has been significantly increased. We have demonstrated efficient utilization of 25 000 non-dedicated cores and achieved a data input rate of 30 Gb/s and an output rate of 500GB/h. This has required new capabilities in task categorization, workflow dependency specification, and resource monitoring.
Applying Content Management to Automated Provenance Capture

DOE Office of Scientific and Technical Information (OSTI.GOV)

Schuchardt, Karen L.; Gibson, Tara D.; Stephan, Eric G.

2008-04-10

Workflows and data pipelines are becoming increasingly valuable in both computational and experimen-tal sciences. These automated systems are capable of generating significantly more data within the same amount of time than their manual counterparts. Automatically capturing and recording data prove-nance and annotation as part of these workflows is critical for data management, verification, and dis-semination. Our goal in addressing the provenance challenge was to develop and end-to-end system that demonstrates real-time capture, persistent content management, and ad-hoc searches of both provenance and metadata using open source software and standard protocols. We describe our prototype, which extends the Kepler workflow toolsmore » for the execution environment, the Scientific Annotation Middleware (SAM) content management software for data services, and an existing HTTP-based query protocol. Our implementation offers several unique capabilities, and through the use of standards, is able to pro-vide access to the provenance record to a variety of commonly available client tools.« less
Producing an Infrared Multiwavelength Galactic Plane Atlas Using Montage, Pegasus, and Amazon Web Services

NASA Astrophysics Data System (ADS)

Rynge, M.; Juve, G.; Kinney, J.; Good, J.; Berriman, B.; Merrihew, A.; Deelman, E.

2014-05-01

In this paper, we describe how to leverage cloud resources to generate large-scale mosaics of the galactic plane in multiple wavelengths. Our goal is to generate a 16-wavelength infrared Atlas of the Galactic Plane at a common spatial sampling of 1 arcsec, processed so that they appear to have been measured with a single instrument. This will be achieved by using the Montage image mosaic engine process observations from the 2MASS, GLIMPSE, MIPSGAL, MSX and WISE datasets, over a wavelength range of 1 μm to 24 μm, and by using the Pegasus Workflow Management System for managing the workload. When complete, the Atlas will be made available to the community as a data product. We are generating images that cover ±180° in Galactic longitude and ±20° in Galactic latitude, to the extent permitted by the spatial coverage of each dataset. Each image will be 5°x5° in size (including an overlap of 1° with neighboring tiles), resulting in an atlas of 1,001 images. The final size will be about 50 TBs. This paper will focus on the computational challenges, solutions, and lessons learned in producing the Atlas. To manage the computation we are using the Pegasus Workflow Management System, a mature, highly fault-tolerant system now in release 4.2.2 that has found wide applicability across many science disciplines. A scientific workflow describes the dependencies between the tasks and in most cases the workflow is described as a directed acyclic graph, where the nodes are tasks and the edges denote the task dependencies. A defining property for a scientific workflow is that it manages data flow between tasks. Applied to the galactic plane project, each 5 by 5 mosaic is a Pegasus workflow. Pegasus is used to fetch the source images, execute the image mosaicking steps of Montage, and store the final outputs in a storage system. As these workflows are very I/O intensive, care has to be taken when choosing what infrastructure to execute the workflow on. In our setup, we choose to use dynamically provisioned compute clusters running on the Amazon Elastic Compute Cloud (EC2). All our instances are using the same base image, which is configured to come up as a master node by default. The master node is a central instance from where the workflow can be managed. Additional worker instances are provisioned and configured to accept work assignments from the master node. The system allows for adding/removing workers in an ad hoc fashion, and could be run in large configurations. To-date we have performed 245,000 CPU hours of computing and generated 7,029 images and totaling 30 TB. With the current set up our runtime would be 340,000 CPU hours for the whole project. Using spot m2.4xlarge instances, the cost would be approximately $5,950. Using faster AWS instances, such as cc2.8xlarge could potentially decrease the total CPU hours and further reduce the compute costs. The paper will explore these tradeoffs.
Implementation of Cyberinfrastructure and Data Management Workflow for a Large-Scale Sensor Network

NASA Astrophysics Data System (ADS)

Jones, A. S.; Horsburgh, J. S.

2014-12-01

Monitoring with in situ environmental sensors and other forms of field-based observation presents many challenges for data management, particularly for large-scale networks consisting of multiple sites, sensors, and personnel. The availability and utility of these data in addressing scientific questions relies on effective cyberinfrastructure that facilitates transformation of raw sensor data into functional data products. It also depends on the ability of researchers to share and access the data in useable formats. In addition to addressing the challenges presented by the quantity of data, monitoring networks need practices to ensure high data quality, including procedures and tools for post processing. Data quality is further enhanced if practitioners are able to track equipment, deployments, calibrations, and other events related to site maintenance and associate these details with observational data. In this presentation we will describe the overall workflow that we have developed for research groups and sites conducting long term monitoring using in situ sensors. Features of the workflow include: software tools to automate the transfer of data from field sites to databases, a Python-based program for data quality control post-processing, a web-based application for online discovery and visualization of data, and a data model and web interface for managing physical infrastructure. By automating the data management workflow, the time from collection to analysis is reduced and sharing and publication is facilitated. The incorporation of metadata standards and descriptions and the use of open-source tools enhances the sustainability and reusability of the data. We will describe the workflow and tools that we have developed in the context of the iUTAH (innovative Urban Transitions and Aridregion Hydrosustainability) monitoring network. The iUTAH network consists of aquatic and climate sensors deployed in three watersheds to monitor Gradients Along Mountain to Urban Transitions (GAMUT). The variety of environmental sensors and the multi-watershed, multi-institutional nature of the network necessitate a well-planned and efficient workflow for acquiring, managing, and sharing sensor data, which should be useful for similar large-scale and long-term networks.
CMS Connect

NASA Astrophysics Data System (ADS)

Balcas, J.; Bockelman, B.; Gardner, R., Jr.; Hurtado Anampa, K.; Jayatilaka, B.; Aftab Khan, F.; Lannon, K.; Larson, K.; Letts, J.; Marra Da Silva, J.; Mascheroni, M.; Mason, D.; Perez-Calero Yzquierdo, A.; Tiradani, A.

2017-10-01

The CMS experiment collects and analyzes large amounts of data coming from high energy particle collisions produced by the Large Hadron Collider (LHC) at CERN. This involves a huge amount of real and simulated data processing that needs to be handled in batch-oriented platforms. The CMS Global Pool of computing resources provide +100K dedicated CPU cores and another 50K to 100K CPU cores from opportunistic resources for these kind of tasks and even though production and event processing analysis workflows are already managed by existing tools, there is still a lack of support to submit final stage condor-like analysis jobs familiar to Tier-3 or local Computing Facilities users into these distributed resources in an integrated (with other CMS services) and friendly way. CMS Connect is a set of computing tools and services designed to augment existing services in the CMS Physics community focusing on these kind of condor analysis jobs. It is based on the CI-Connect platform developed by the Open Science Grid and uses the CMS GlideInWMS infrastructure to transparently plug CMS global grid resources into a virtual pool accessed via a single submission machine. This paper describes the specific developments and deployment of CMS Connect beyond the CI-Connect platform in order to integrate the service with CMS specific needs, including specific Site submission, accounting of jobs and automated reporting to standard CMS monitoring resources in an effortless way to their users.
End-to-end interoperability and workflows from building architecture design to one or more simulations

DOEpatents

Chao, Tian-Jy; Kim, Younghun

2015-02-10

An end-to-end interoperability and workflows from building architecture design to one or more simulations, in one aspect, may comprise establishing a BIM enablement platform architecture. A data model defines data entities and entity relationships for enabling the interoperability and workflows. A data definition language may be implemented that defines and creates a table schema of a database associated with the data model. Data management services and/or application programming interfaces may be implemented for interacting with the data model. Web services may also be provided for interacting with the data model via the Web. A user interface may be implemented that communicates with users and uses the BIM enablement platform architecture, the data model, the data definition language, data management services and application programming interfaces to provide functions to the users to perform work related to building information management.
Identification and Management of Information Problems by Emergency Department Staff

PubMed Central

Murphy, Alison R.; Reddy, Madhu C.

2014-01-01

Patient-care teams frequently encounter information problems during their daily activities. These information problems include wrong, outdated, conflicting, incomplete, or missing information. Information problems can negatively impact the patient-care workflow, lead to misunderstandings about patient information, and potentially lead to medical errors. Existing research focuses on understanding the cause of these information problems and the impact that they can have on the hospital’s workflow. However, there is limited research on how patient-care teams currently identify and manage information problems that they encounter during their work. Through qualitative observations and interviews in an emergency department (ED), we identified the types of information problems encountered by ED staff, and examined how they identified and managed the information problems. We also discuss the impact that these information problems can have on the patient-care teams, including the cascading effects of information problems on workflow and the ambiguous accountability for fixing information problems within collaborative teams. PMID:25954457
ATLAS Distributed Computing Experience and Performance During the LHC Run-2

NASA Astrophysics Data System (ADS)

Filipčič, A.; ATLAS Collaboration

2017-10-01

ATLAS Distributed Computing during LHC Run-1 was challenged by steadily increasing computing, storage and network requirements. In addition, the complexity of processing task workflows and their associated data management requirements led to a new paradigm in the ATLAS computing model for Run-2, accompanied by extensive evolution and redesign of the workflow and data management systems. The new systems were put into production at the end of 2014, and gained robustness and maturity during 2015 data taking. ProdSys2, the new request and task interface; JEDI, the dynamic job execution engine developed as an extension to PanDA; and Rucio, the new data management system, form the core of Run-2 ATLAS distributed computing engine. One of the big changes for Run-2 was the adoption of the Derivation Framework, which moves the chaotic CPU and data intensive part of the user analysis into the centrally organized train production, delivering derived AOD datasets to user groups for final analysis. The effectiveness of the new model was demonstrated through the delivery of analysis datasets to users just one week after data taking, by completing the calibration loop, Tier-0 processing and train production steps promptly. The great flexibility of the new system also makes it possible to execute part of the Tier-0 processing on the grid when Tier-0 resources experience a backlog during high data-taking periods. The introduction of the data lifetime model, where each dataset is assigned a finite lifetime (with extensions possible for frequently accessed data), was made possible by Rucio. Thanks to this the storage crises experienced in Run-1 have not reappeared during Run-2. In addition, the distinction between Tier-1 and Tier-2 disk storage, now largely artificial given the quality of Tier-2 resources and their networking, has been removed through the introduction of dynamic ATLAS clouds that group the storage endpoint nucleus and its close-by execution satellite sites. All stable ATLAS sites are now able to store unique or primary copies of the datasets. ATLAS Distributed Computing is further evolving to speed up request processing by introducing network awareness, using machine learning and optimisation of the latencies during the execution of the full chain of tasks. The Event Service, a new workflow and job execution engine, is designed around check-pointing at the level of event processing to use opportunistic resources more efficiently. ATLAS has been extensively exploring possibilities of using computing resources extending beyond conventional grid sites in the WLCG fabric to deliver as many computing cycles as possible and thereby enhance the significance of the Monte-Carlo samples to deliver better physics results. The exploitation of opportunistic resources was at an early stage throughout 2015, at the level of 10% of the total ATLAS computing power, but in the next few years it is expected to deliver much more. In addition, demonstrating the ability to use an opportunistic resource can lead to securing ATLAS allocations on the facility, hence the importance of this work goes beyond merely the initial CPU cycles gained. In this paper, we give an overview and compare the performance, development effort, flexibility and robustness of the various approaches.
Implementation of a SOA-Based Service Deployment Platform with Portal

NASA Astrophysics Data System (ADS)

Yang, Chao-Tung; Yu, Shih-Chi; Lai, Chung-Che; Liu, Jung-Chun; Chu, William C.

In this paper we propose a Service Oriented Architecture to provide a flexible and serviceable environment. SOA comes up with commercial requirements; it integrates many techniques over ten years to find the solution in different platforms, programming languages and users. SOA provides the connection with a protocol between service providers and service users. After this, the performance and the reliability problems are reviewed. Finally we apply SOA into our Grid and Hadoop platform. Service acts as an interface in front of the Resource Broker in the Grid, and the Resource Broker is middleware that provides functions for developers. The Hadoop has a file replication feature to ensure file reliability. Services provided on the Grid and Hadoop are centralized. We design a portal, in which users can use services on it directly or register service through the service provider. The portal also offers a service workflow function so that users can customize services according to the need of their jobs.
Coupling of a continuum ice sheet model and a discrete element calving model using a scientific workflow system

NASA Astrophysics Data System (ADS)

Memon, Shahbaz; Vallot, Dorothée; Zwinger, Thomas; Neukirchen, Helmut

2017-04-01

Scientific communities generate complex simulations through orchestration of semi-structured analysis pipelines which involves execution of large workflows on multiple, distributed and heterogeneous computing and data resources. Modeling ice dynamics of glaciers requires workflows consisting of many non-trivial, computationally expensive processing tasks which are coupled to each other. From this domain, we present an e-Science use case, a workflow, which requires the execution of a continuum ice flow model and a discrete element based calving model in an iterative manner. Apart from the execution, this workflow also contains data format conversion tasks that support the execution of ice flow and calving by means of transition through sequential, nested and iterative steps. Thus, the management and monitoring of all the processing tasks including data management and transfer of the workflow model becomes more complex. From the implementation perspective, this workflow model was initially developed on a set of scripts using static data input and output references. In the course of application usage when more scripts or modifications introduced as per user requirements, the debugging and validation of results were more cumbersome to achieve. To address these problems, we identified a need to have a high-level scientific workflow tool through which all the above mentioned processes can be achieved in an efficient and usable manner. We decided to make use of the e-Science middleware UNICORE (Uniform Interface to Computing Resources) that allows seamless and automated access to different heterogenous and distributed resources which is supported by a scientific workflow engine. Based on this, we developed a high-level scientific workflow model for coupling of massively parallel High-Performance Computing (HPC) jobs: a continuum ice sheet model (Elmer/Ice) and a discrete element calving and crevassing model (HiDEM). In our talk we present how the use of a high-level scientific workflow middleware enables reproducibility of results more convenient and also provides a reusable and portable workflow template that can be deployed across different computing infrastructures. Acknowledgements This work was kindly supported by NordForsk as part of the Nordic Center of Excellence (NCoE) eSTICC (eScience Tools for Investigating Climate Change at High Northern Latitudes) and the Top-level Research Initiative NCoE SVALI (Stability and Variation of Arctic Land Ice).
A lightweight, flow-based toolkit for parallel and distributed bioinformatics pipelines

PubMed Central

2011-01-01

Background Bioinformatic analyses typically proceed as chains of data-processing tasks. A pipeline, or 'workflow', is a well-defined protocol, with a specific structure defined by the topology of data-flow interdependencies, and a particular functionality arising from the data transformations applied at each step. In computer science, the dataflow programming (DFP) paradigm defines software systems constructed in this manner, as networks of message-passing components. Thus, bioinformatic workflows can be naturally mapped onto DFP concepts. Results To enable the flexible creation and execution of bioinformatics dataflows, we have written a modular framework for parallel pipelines in Python ('PaPy'). A PaPy workflow is created from re-usable components connected by data-pipes into a directed acyclic graph, which together define nested higher-order map functions. The successive functional transformations of input data are evaluated on flexibly pooled compute resources, either local or remote. Input items are processed in batches of adjustable size, all flowing one to tune the trade-off between parallelism and lazy-evaluation (memory consumption). An add-on module ('NuBio') facilitates the creation of bioinformatics workflows by providing domain specific data-containers (e.g., for biomolecular sequences, alignments, structures) and functionality (e.g., to parse/write standard file formats). Conclusions PaPy offers a modular framework for the creation and deployment of parallel and distributed data-processing workflows. Pipelines derive their functionality from user-written, data-coupled components, so PaPy also can be viewed as a lightweight toolkit for extensible, flow-based bioinformatics data-processing. The simplicity and flexibility of distributed PaPy pipelines may help users bridge the gap between traditional desktop/workstation and grid computing. PaPy is freely distributed as open-source Python code at http://muralab.org/PaPy, and includes extensive documentation and annotated usage examples. PMID:21352538
A lightweight, flow-based toolkit for parallel and distributed bioinformatics pipelines.

PubMed

Cieślik, Marcin; Mura, Cameron

2011-02-25

Bioinformatic analyses typically proceed as chains of data-processing tasks. A pipeline, or 'workflow', is a well-defined protocol, with a specific structure defined by the topology of data-flow interdependencies, and a particular functionality arising from the data transformations applied at each step. In computer science, the dataflow programming (DFP) paradigm defines software systems constructed in this manner, as networks of message-passing components. Thus, bioinformatic workflows can be naturally mapped onto DFP concepts. To enable the flexible creation and execution of bioinformatics dataflows, we have written a modular framework for parallel pipelines in Python ('PaPy'). A PaPy workflow is created from re-usable components connected by data-pipes into a directed acyclic graph, which together define nested higher-order map functions. The successive functional transformations of input data are evaluated on flexibly pooled compute resources, either local or remote. Input items are processed in batches of adjustable size, all flowing one to tune the trade-off between parallelism and lazy-evaluation (memory consumption). An add-on module ('NuBio') facilitates the creation of bioinformatics workflows by providing domain specific data-containers (e.g., for biomolecular sequences, alignments, structures) and functionality (e.g., to parse/write standard file formats). PaPy offers a modular framework for the creation and deployment of parallel and distributed data-processing workflows. Pipelines derive their functionality from user-written, data-coupled components, so PaPy also can be viewed as a lightweight toolkit for extensible, flow-based bioinformatics data-processing. The simplicity and flexibility of distributed PaPy pipelines may help users bridge the gap between traditional desktop/workstation and grid computing. PaPy is freely distributed as open-source Python code at http://muralab.org/PaPy, and includes extensive documentation and annotated usage examples.
Context-aware access control for pervasive access to process-based healthcare systems.

PubMed

Koufi, Vassiliki; Vassilacopoulos, George

2008-01-01

Healthcare is an increasingly collaborative enterprise involving a broad range of healthcare services provided by many individuals and organizations. Grid technology has been widely recognized as a means for integrating disparate computing resources in the healthcare field. Moreover, Grid portal applications can be developed on a wireless and mobile infrastructure to execute healthcare processes which, in turn, can provide remote access to Grid database services. Such an environment provides ubiquitous and pervasive access to integrated healthcare services at the point of care, thus improving healthcare quality. In such environments, the ability to provide an effective access control mechanism that meets the requirement of the least privilege principle is essential. Adherence to the least privilege principle requires continuous adjustments of user permissions in order to adapt to the current situation. This paper presents a context-aware access control mechanism for HDGPortal, a Grid portal application which provides access to workflow-based healthcare processes using wireless Personal Digital Assistants. The proposed mechanism builds upon and enhances security mechanisms provided by the Grid Security Infrastructure. It provides tight, just-in-time permissions so that authorized users get access to specific objects according to the current context. These permissions are subject to continuous adjustments triggered by the changing context. Thus, the risk of compromising information integrity during task executions is reduced.
SHIWA Services for Workflow Creation and Sharing in Hydrometeorolog

NASA Astrophysics Data System (ADS)

Terstyanszky, Gabor; Kiss, Tamas; Kacsuk, Peter; Sipos, Gergely

2014-05-01

Researchers want to run scientific experiments on Distributed Computing Infrastructures (DCI) to access large pools of resources and services. To run these experiments requires specific expertise that they may not have. Workflows can hide resources and services as a virtualisation layer providing a user interface that researchers can use. There are many scientific workflow systems but they are not interoperable. To learn a workflow system and create workflows may require significant efforts. Considering these efforts it is not reasonable to expect that researchers will learn new workflow systems if they want to run workflows developed in other workflow systems. To overcome it requires creating workflow interoperability solutions to allow workflow sharing. The FP7 'Sharing Interoperable Workflow for Large-Scale Scientific Simulation on Available DCIs' (SHIWA) project developed the Coarse-Grained Interoperability concept (CGI). It enables recycling and sharing workflows of different workflow systems and executing them on different DCIs. SHIWA developed the SHIWA Simulation Platform (SSP) to implement the CGI concept integrating three major components: the SHIWA Science Gateway, the workflow engines supported by the CGI concept and DCI resources where workflows are executed. The science gateway contains a portal, a submission service, a workflow repository and a proxy server to support the whole workflow life-cycle. The SHIWA Portal allows workflow creation, configuration, execution and monitoring through a Graphical User Interface using the WS-PGRADE workflow system as the host workflow system. The SHIWA Repository stores the formal description of workflows and workflow engines plus executables and data needed to execute them. It offers a wide-range of browse and search operations. To support non-native workflow execution the SHIWA Submission Service imports the workflow and workflow engine from the SHIWA Repository. This service either invokes locally or remotely pre-deployed workflow engines or submits workflow engines with the workflow to local or remote resources to execute workflows. The SHIWA Proxy Server manages certificates needed to execute the workflows on different DCIs. Currently SSP supports sharing of ASKALON, Galaxy, GWES, Kepler, LONI Pipeline, MOTEUR, Pegasus, P-GRADE, ProActive, Triana, Taverna and WS-PGRADE workflows. Further workflow systems can be added to the simulation platform as required by research communities. The FP7 'Building a European Research Community through Interoperable Workflows and Data' (ER-flow) project disseminates the achievements of the SHIWA project to build workflow user communities across Europe. ER-flow provides application supports to research communities within (Astrophysics, Computational Chemistry, Heliophysics and Life Sciences) and beyond (Hydrometeorology and Seismology) to develop, share and run workflows through the simulation platform. The simulation platform supports four usage scenarios: creating and publishing workflows in the repository, searching and selecting workflows in the repository, executing non-native workflows and creating and running meta-workflows. The presentation will outline the CGI concept, the SHIWA Simulation Platform, the ER-flow usage scenarios and how the Hydrometeorology research community runs simulations on SSP.
Optimization of tomographic reconstruction workflows on geographically distributed resources

DOE PAGES

Bicer, Tekin; Gursoy, Doga; Kettimuthu, Rajkumar; ...

2016-01-01

New technological advancements in synchrotron light sources enable data acquisitions at unprecedented levels. This emergent trend affects not only the size of the generated data but also the need for larger computational resources. Although beamline scientists and users have access to local computational resources, these are typically limited and can result in extended execution times. Applications that are based on iterative processing as in tomographic reconstruction methods require high-performance compute clusters for timely analysis of data. Here, time-sensitive analysis and processing of Advanced Photon Source data on geographically distributed resources are focused on. Two main challenges are considered: (i) modelingmore » of the performance of tomographic reconstruction workflows and (ii) transparent execution of these workflows on distributed resources. For the former, three main stages are considered: (i) data transfer between storage and computational resources, (i) wait/queue time of reconstruction jobs at compute resources, and (iii) computation of reconstruction tasks. These performance models allow evaluation and estimation of the execution time of any given iterative tomographic reconstruction workflow that runs on geographically distributed resources. For the latter challenge, a workflow management system is built, which can automate the execution of workflows and minimize the user interaction with the underlying infrastructure. The system utilizes Globus to perform secure and efficient data transfer operations. The proposed models and the workflow management system are evaluated by using three high-performance computing and two storage resources, all of which are geographically distributed. Workflows were created with different computational requirements using two compute-intensive tomographic reconstruction algorithms. Experimental evaluation shows that the proposed models and system can be used for selecting the optimum resources, which in turn can provide up to 3.13× speedup (on experimented resources). Furthermore, the error rates of the models range between 2.1 and 23.3% (considering workflow execution times), where the accuracy of the model estimations increases with higher computational demands in reconstruction tasks.« less
Optimization of tomographic reconstruction workflows on geographically distributed resources

PubMed Central

Bicer, Tekin; Gürsoy, Doǧa; Kettimuthu, Rajkumar; De Carlo, Francesco; Foster, Ian T.

2016-01-01

New technological advancements in synchrotron light sources enable data acquisitions at unprecedented levels. This emergent trend affects not only the size of the generated data but also the need for larger computational resources. Although beamline scientists and users have access to local computational resources, these are typically limited and can result in extended execution times. Applications that are based on iterative processing as in tomographic reconstruction methods require high-performance compute clusters for timely analysis of data. Here, time-sensitive analysis and processing of Advanced Photon Source data on geographically distributed resources are focused on. Two main challenges are considered: (i) modeling of the performance of tomographic reconstruction workflows and (ii) transparent execution of these workflows on distributed resources. For the former, three main stages are considered: (i) data transfer between storage and computational resources, (i) wait/queue time of reconstruction jobs at compute resources, and (iii) computation of reconstruction tasks. These performance models allow evaluation and estimation of the execution time of any given iterative tomographic reconstruction workflow that runs on geographically distributed resources. For the latter challenge, a workflow management system is built, which can automate the execution of workflows and minimize the user interaction with the underlying infrastructure. The system utilizes Globus to perform secure and efficient data transfer operations. The proposed models and the workflow management system are evaluated by using three high-performance computing and two storage resources, all of which are geographically distributed. Workflows were created with different computational requirements using two compute-intensive tomographic reconstruction algorithms. Experimental evaluation shows that the proposed models and system can be used for selecting the optimum resources, which in turn can provide up to 3.13× speedup (on experimented resources). Moreover, the error rates of the models range between 2.1 and 23.3% (considering workflow execution times), where the accuracy of the model estimations increases with higher computational demands in reconstruction tasks. PMID:27359149
Optimization of tomographic reconstruction workflows on geographically distributed resources

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bicer, Tekin; Gursoy, Doga; Kettimuthu, Rajkumar

New technological advancements in synchrotron light sources enable data acquisitions at unprecedented levels. This emergent trend affects not only the size of the generated data but also the need for larger computational resources. Although beamline scientists and users have access to local computational resources, these are typically limited and can result in extended execution times. Applications that are based on iterative processing as in tomographic reconstruction methods require high-performance compute clusters for timely analysis of data. Here, time-sensitive analysis and processing of Advanced Photon Source data on geographically distributed resources are focused on. Two main challenges are considered: (i) modelingmore » of the performance of tomographic reconstruction workflows and (ii) transparent execution of these workflows on distributed resources. For the former, three main stages are considered: (i) data transfer between storage and computational resources, (i) wait/queue time of reconstruction jobs at compute resources, and (iii) computation of reconstruction tasks. These performance models allow evaluation and estimation of the execution time of any given iterative tomographic reconstruction workflow that runs on geographically distributed resources. For the latter challenge, a workflow management system is built, which can automate the execution of workflows and minimize the user interaction with the underlying infrastructure. The system utilizes Globus to perform secure and efficient data transfer operations. The proposed models and the workflow management system are evaluated by using three high-performance computing and two storage resources, all of which are geographically distributed. Workflows were created with different computational requirements using two compute-intensive tomographic reconstruction algorithms. Experimental evaluation shows that the proposed models and system can be used for selecting the optimum resources, which in turn can provide up to 3.13× speedup (on experimented resources). Furthermore, the error rates of the models range between 2.1 and 23.3% (considering workflow execution times), where the accuracy of the model estimations increases with higher computational demands in reconstruction tasks.« less
Experimental evaluation of a flexible I/O architecture for accelerating workflow engines in ultrascale environments

DOE PAGES

Duro, Francisco Rodrigo; Blas, Javier Garcia; Isaila, Florin; ...

2016-10-06

The increasing volume of scientific data and the limited scalability and performance of storage systems are currently presenting a significant limitation for the productivity of the scientific workflows running on both high-performance computing (HPC) and cloud platforms. Clearly needed is better integration of storage systems and workflow engines to address this problem. This paper presents and evaluates a novel solution that leverages codesign principles for integrating Hercules—an in-memory data store—with a workflow management system. We consider four main aspects: workflow representation, task scheduling, task placement, and task termination. As a result, the experimental evaluation on both cloud and HPC systemsmore » demonstrates significant performance and scalability improvements over existing state-of-the-art approaches.« less
Data handling with SAM and art at the NO vA experiment

DOE PAGES

Aurisano, A.; Backhouse, C.; Davies, G. S.; ...

2015-12-23

During operations, NOvA produces between 5,000 and 7,000 raw files per day with peaks in excess of 12,000. These files must be processed in several stages to produce fully calibrated and reconstructed analysis files. In addition, many simulated neutrino interactions must be produced and processed through the same stages as data. To accommodate the large volume of data and Monte Carlo, production must be possible both on the Fermilab grid and on off-site farms, such as the ones accessible through the Open Science Grid. To handle the challenge of cataloging these files and to facilitate their off-line processing, we havemore » adopted the SAM system developed at Fermilab. SAM indexes files according to metadata, keeps track of each file's physical locations, provides dataset management facilities, and facilitates data transfer to off-site grids. To integrate SAM with Fermilab's art software framework and the NOvA production workflow, we have developed methods to embed metadata into our configuration files, art files, and standalone ROOT files. A module in the art framework propagates the embedded information from configuration files into art files, and from input art files to output art files, allowing us to maintain a complete processing history within our files. Embedding metadata in configuration files also allows configuration files indexed in SAM to be used as inputs to Monte Carlo production jobs. Further, SAM keeps track of the input files used to create each output file. Parentage information enables the construction of self-draining datasets which have become the primary production paradigm used at NOvA. In this study we will present an overview of SAM at NOvA and how it has transformed the file production framework used by the experiment.« less
Lessons from Implementing a Combined Workflow–Informatics System for Diabetes Management

PubMed Central

Zai, Adrian H.; Grant, Richard W.; Estey, Greg; Lester, William T.; Andrews, Carl T.; Yee, Ronnie; Mort, Elizabeth; Chueh, Henry C.

2008-01-01

Shortcomings surrounding the care of patients with diabetes have been attributed largely to a fragmented, disorganized, and duplicative health care system that focuses more on acute conditions and complications than on managing chronic disease. To address these shortcomings, we developed a diabetes registry population management application to change the way our staff manages patients with diabetes. Use of this new application has helped us coordinate the responsibilities for intervening and monitoring patients in the registry among different users. Our experiences using this combined workflow-informatics intervention system suggest that integrating a chronic disease registry into clinical workflow for the treatment of chronic conditions creates a useful and efficient tool for managing disease. PMID:18436907

Expanding the user base beyond HEP for the Ganga distributed analysis user interface

NASA Astrophysics Data System (ADS)

Currie, R.; Egede, U.; Richards, A.; Slater, M.; Williams, M.

2017-10-01

This document presents the result of recent developments within Ganga[1] project to support users from new communities outside of HEP. In particular I will examine the case of users from the Large Scale Survey Telescope (LSST) group looking to use resources provided by the UK based GridPP[2][3] DIRAC[4][5] instance. An example use case is work performed with users from the LSST Virtual Organisation (VO) to distribute the workflow used for galaxy shape identification analyses. This work highlighted some LSST specific challenges which could be well solved by common tools within the HEP community. As a result of this work the LSST community was able to take advantage of GridPP[2][3] resources to perform large computing tasks within the UK.
A pattern-based analysis of clinical computer-interpretable guideline modeling languages.

PubMed

Mulyar, Nataliya; van der Aalst, Wil M P; Peleg, Mor

2007-01-01

Languages used to specify computer-interpretable guidelines (CIGs) differ in their approaches to addressing particular modeling challenges. The main goals of this article are: (1) to examine the expressive power of CIG modeling languages, and (2) to define the differences, from the control-flow perspective, between process languages in workflow management systems and modeling languages used to design clinical guidelines. The pattern-based analysis was applied to guideline modeling languages Asbru, EON, GLIF, and PROforma. We focused on control-flow and left other perspectives out of consideration. We evaluated the selected CIG modeling languages and identified their degree of support of 43 control-flow patterns. We used a set of explicitly defined evaluation criteria to determine whether each pattern is supported directly, indirectly, or not at all. PROforma offers direct support for 22 of 43 patterns, Asbru 20, GLIF 17, and EON 11. All four directly support basic control-flow patterns, cancellation patterns, and some advance branching and synchronization patterns. None support multiple instances patterns. They offer varying levels of support for synchronizing merge patterns and state-based patterns. Some support a few scenarios not covered by the 43 control-flow patterns. CIG modeling languages are remarkably close to traditional workflow languages from the control-flow perspective, but cover many fewer workflow patterns. CIG languages offer some flexibility that supports modeling of complex decisions and provide ways for modeling some decisions not covered by workflow management systems. Workflow management systems may be suitable for clinical guideline applications.
Integrate Data into Scientific Workflows for Terrestrial Biosphere Model Evaluation through Brokers

NASA Astrophysics Data System (ADS)

Wei, Y.; Cook, R. B.; Du, F.; Dasgupta, A.; Poco, J.; Huntzinger, D. N.; Schwalm, C. R.; Boldrini, E.; Santoro, M.; Pearlman, J.; Pearlman, F.; Nativi, S.; Khalsa, S.

2013-12-01

Terrestrial biosphere models (TBMs) have become integral tools for extrapolating local observations and process-level understanding of land-atmosphere carbon exchange to larger regions. Model-model and model-observation intercomparisons are critical to understand the uncertainties within model outputs, to improve model skill, and to improve our understanding of land-atmosphere carbon exchange. The DataONE Exploration, Visualization, and Analysis (EVA) working group is evaluating TBMs using scientific workflows in UV-CDAT/VisTrails. This workflow-based approach promotes collaboration and improved tracking of evaluation provenance. But challenges still remain. The multi-scale and multi-discipline nature of TBMs makes it necessary to include diverse and distributed data resources in model evaluation. These include, among others, remote sensing data from NASA, flux tower observations from various organizations including DOE, and inventory data from US Forest Service. A key challenge is to make heterogeneous data from different organizations and disciplines discoverable and readily integrated for use in scientific workflows. This presentation introduces the brokering approach taken by the DataONE EVA to fill the gap between TBMs' evaluation scientific workflows and cross-organization and cross-discipline data resources. The DataONE EVA started the development of an Integrated Model Intercomparison Framework (IMIF) that leverages standards-based discovery and access brokers to dynamically discover, access, and transform (e.g. subset and resampling) diverse data products from DataONE, Earth System Grid (ESG), and other data repositories into a format that can be readily used by scientific workflows in UV-CDAT/VisTrails. The discovery and access brokers serve as an independent middleware that bridge existing data repositories and TBMs evaluation scientific workflows but introduce little overhead to either component. In the initial work, an OpenSearch-based discovery broker is leveraged to provide a consistent mechanism for data discovery. Standards-based data services, including Open Geospatial Consortium (OGC) Web Coverage Service (WCS) and THREDDS are leveraged to provide on-demand data access and transformations through the data access broker. To ease the adoption of broker services, a package of broker client VisTrails modules have been developed to be easily plugged into scientific workflows. The initial IMIF has been successfully tested in selected model evaluation scenarios involved in the NASA-funded Multi-scale Synthesis and Terrestrial Model Intercomparison Project (MsTMIP).
Asterism: an integrated, complete, and open-source approach for running seismologist continuous data-intensive analysis on heterogeneous systems

NASA Astrophysics Data System (ADS)

Ferreira da Silva, R.; Filgueira, R.; Deelman, E.; Atkinson, M.

2016-12-01

We present Asterism, an open source data-intensive framework, which combines the Pegasus and dispel4py workflow systems. Asterism aims to simplify the effort required to develop data-intensive applications that run across multiple heterogeneous resources, without users having to: re-formulate their methods according to different enactment systems; manage the data distribution across systems; parallelize their methods; co-place and schedule their methods with computing resources; and store and transfer large/small volumes of data. Asterism's key element is to leverage the strengths of each workflow system: dispel4py allows developing scientific applications locally and then automatically parallelize and scale them on a wide range of HPC infrastructures with no changes to the application's code; Pegasus orchestrates the distributed execution of applications while providing portability, automated data management, recovery, debugging, and monitoring, without users needing to worry about the particulars of the target execution systems. Asterism leverages the level of abstractions provided by each workflow system to describe hybrid workflows where no information about the underlying infrastructure is required beforehand. The feasibility of Asterism has been evaluated using the seismic ambient noise cross-correlation application, a common data-intensive analysis pattern used by many seismologists. The application preprocesses (Phase1) and cross-correlates (Phase2) traces from several seismic stations. The Asterism workflow is implemented as a Pegasus workflow composed of two tasks (Phase1 and Phase2), where each phase represents a dispel4py workflow. Pegasus tasks describe the in/output data at a logical level, the data dependency between tasks, and the e-Infrastructures and the execution engine to run each dispel4py workflow. We have instantiated the workflow using data from 1000 stations from the IRIS services, and run it across two heterogeneous resources described as Docker containers: MPI (Container2) and Storm (Container3) clusters (Figure 1). Each dispel4py workflow is mapped to a particular execution engine, and data transfers between resources are automatically handled by Pegasus. Asterism is freely available online at http://github.com/dispel4py/pegasus_dispel4py.
Streamling the Change Management with Business Rules

NASA Technical Reports Server (NTRS)

Savela, Christopher

2015-01-01

Will discuss how their organization is trying to streamline workflows and the change management process with business rules. In looking for ways to make things more efficient and save money one way is to reduce the work the workflow task approvers have to do when reviewing affected items. Will share the technical details of the business rules, how to implement them, how to speed up the development process by using the API to demonstrate the rules in action.
Data Grid Management Systems

NASA Technical Reports Server (NTRS)

Moore, Reagan W.; Jagatheesan, Arun; Rajasekar, Arcot; Wan, Michael; Schroeder, Wayne

2004-01-01

The "Grid" is an emerging infrastructure for coordinating access across autonomous organizations to distributed, heterogeneous computation and data resources. Data grids are being built around the world as the next generation data handling systems for sharing, publishing, and preserving data residing on storage systems located in multiple administrative domains. A data grid provides logical namespaces for users, digital entities and storage resources to create persistent identifiers for controlling access, enabling discovery, and managing wide area latencies. This paper introduces data grids and describes data grid use cases. The relevance of data grids to digital libraries and persistent archives is demonstrated, and research issues in data grids and grid dataflow management systems are discussed.
Performance Management of High Performance Computing for Medical Image Processing in Amazon Web Services.

PubMed

Bao, Shunxing; Damon, Stephen M; Landman, Bennett A; Gokhale, Aniruddha

2016-02-27

Adopting high performance cloud computing for medical image processing is a popular trend given the pressing needs of large studies. Amazon Web Services (AWS) provide reliable, on-demand, and inexpensive cloud computing services. Our research objective is to implement an affordable, scalable and easy-to-use AWS framework for the Java Image Science Toolkit (JIST). JIST is a plugin for Medical-Image Processing, Analysis, and Visualization (MIPAV) that provides a graphical pipeline implementation allowing users to quickly test and develop pipelines. JIST is DRMAA-compliant allowing it to run on portable batch system grids. However, as new processing methods are implemented and developed, memory may often be a bottleneck for not only lab computers, but also possibly some local grids. Integrating JIST with the AWS cloud alleviates these possible restrictions and does not require users to have deep knowledge of programming in Java. Workflow definition/management and cloud configurations are two key challenges in this research. Using a simple unified control panel, users have the ability to set the numbers of nodes and select from a variety of pre-configured AWS EC2 nodes with different numbers of processors and memory storage. Intuitively, we configured Amazon S3 storage to be mounted by pay-for-use Amazon EC2 instances. Hence, S3 storage is recognized as a shared cloud resource. The Amazon EC2 instances provide pre-installs of all necessary packages to run JIST. This work presents an implementation that facilitates the integration of JIST with AWS. We describe the theoretical cost/benefit formulae to decide between local serial execution versus cloud computing and apply this analysis to an empirical diffusion tensor imaging pipeline.
Performance management of high performance computing for medical image processing in Amazon Web Services

NASA Astrophysics Data System (ADS)

Bao, Shunxing; Damon, Stephen M.; Landman, Bennett A.; Gokhale, Aniruddha

2016-03-01

Adopting high performance cloud computing for medical image processing is a popular trend given the pressing needs of large studies. Amazon Web Services (AWS) provide reliable, on-demand, and inexpensive cloud computing services. Our research objective is to implement an affordable, scalable and easy-to-use AWS framework for the Java Image Science Toolkit (JIST). JIST is a plugin for Medical- Image Processing, Analysis, and Visualization (MIPAV) that provides a graphical pipeline implementation allowing users to quickly test and develop pipelines. JIST is DRMAA-compliant allowing it to run on portable batch system grids. However, as new processing methods are implemented and developed, memory may often be a bottleneck for not only lab computers, but also possibly some local grids. Integrating JIST with the AWS cloud alleviates these possible restrictions and does not require users to have deep knowledge of programming in Java. Workflow definition/management and cloud configurations are two key challenges in this research. Using a simple unified control panel, users have the ability to set the numbers of nodes and select from a variety of pre-configured AWS EC2 nodes with different numbers of processors and memory storage. Intuitively, we configured Amazon S3 storage to be mounted by pay-for- use Amazon EC2 instances. Hence, S3 storage is recognized as a shared cloud resource. The Amazon EC2 instances provide pre-installs of all necessary packages to run JIST. This work presents an implementation that facilitates the integration of JIST with AWS. We describe the theoretical cost/benefit formulae to decide between local serial execution versus cloud computing and apply this analysis to an empirical diffusion tensor imaging pipeline.
Performance Management of High Performance Computing for Medical Image Processing in Amazon Web Services

PubMed Central

Bao, Shunxing; Damon, Stephen M.; Landman, Bennett A.; Gokhale, Aniruddha

2016-01-01

Adopting high performance cloud computing for medical image processing is a popular trend given the pressing needs of large studies. Amazon Web Services (AWS) provide reliable, on-demand, and inexpensive cloud computing services. Our research objective is to implement an affordable, scalable and easy-to-use AWS framework for the Java Image Science Toolkit (JIST). JIST is a plugin for Medical-Image Processing, Analysis, and Visualization (MIPAV) that provides a graphical pipeline implementation allowing users to quickly test and develop pipelines. JIST is DRMAA-compliant allowing it to run on portable batch system grids. However, as new processing methods are implemented and developed, memory may often be a bottleneck for not only lab computers, but also possibly some local grids. Integrating JIST with the AWS cloud alleviates these possible restrictions and does not require users to have deep knowledge of programming in Java. Workflow definition/management and cloud configurations are two key challenges in this research. Using a simple unified control panel, users have the ability to set the numbers of nodes and select from a variety of pre-configured AWS EC2 nodes with different numbers of processors and memory storage. Intuitively, we configured Amazon S3 storage to be mounted by pay-for-use Amazon EC2 instances. Hence, S3 storage is recognized as a shared cloud resource. The Amazon EC2 instances provide pre-installs of all necessary packages to run JIST. This work presents an implementation that facilitates the integration of JIST with AWS. We describe the theoretical cost/benefit formulae to decide between local serial execution versus cloud computing and apply this analysis to an empirical diffusion tensor imaging pipeline. PMID:27127335
SuperB Simulation Production System

NASA Astrophysics Data System (ADS)

Tomassetti, L.; Bianchi, F.; Ciaschini, V.; Corvo, M.; Del Prete, D.; Di Simone, A.; Donvito, G.; Fella, A.; Franchini, P.; Giacomini, F.; Gianoli, A.; Longo, S.; Luitz, S.; Luppi, E.; Manzali, M.; Pardi, S.; Paolini, A.; Perez, A.; Rama, M.; Russo, G.; Santeramo, B.; Stroili, R.

2012-12-01

The SuperB asymmetric e+e- collider and detector to be built at the newly founded Nicola Cabibbo Lab will provide a uniquely sensitive probe of New Physics in the flavor sector of the Standard Model. Studying minute effects in the heavy quark and heavy lepton sectors requires a data sample of 75 ab-1 and a peak luminosity of 1036 cm-2 s-1. The SuperB Computing group is working on developing a simulation production framework capable to satisfy the experiment needs. It provides access to distributed resources in order to support both the detector design definition and its performance evaluation studies. During last year the framework has evolved from the point of view of job workflow, Grid services interfaces and technologies adoption. A complete code refactoring and sub-component language porting now permits the framework to sustain distributed production involving resources from two continents and Grid Flavors. In this paper we will report a complete description of the production system status of the art, its evolution and its integration with Grid services; in particular, we will focus on the utilization of new Grid component features as in LB and WMS version 3. Results from the last official SuperB production cycle will be reported.
Construction and application research of Three-dimensional digital power grid in Southwest China

NASA Astrophysics Data System (ADS)

Zhou, Yang; Zhou, Hong; You, Chuan; Jiang, Li; Xin, Weidong

2018-01-01

With the rapid development of Three-dimensional (3D) digital design technology in the field of power grid construction, the data foundation and technical means of 3D digital power grid construction approaches perfection. 3D digital power grid has gradually developed into an important part of power grid construction and management. In view of the complicated geological conditions in Southwest China and the difficulty in power grid construction and management, this paper is based on the data assets of Southwest power grid, and it aims at establishing a 3D digital power grid in Southwest China to provide effective support for power grid construction and operation management. This paper discusses the data architecture, technical architecture and system design and implementation process of the 3D digital power grid construction through teasing the key technology of 3D digital power grid. The application of power grid data assets management, transmission line corridor planning, geological hazards risk assessment, environmental impact assessment in 3D digital power grid are also discussed and analysed.
Additional Security Considerations for Grid Management

NASA Technical Reports Server (NTRS)

Eidson, Thomas M.

2003-01-01

The use of Grid computing environments is growing in popularity. A Grid computing environment is primarily a wide area network that encompasses multiple local area networks, where some of the local area networks are managed by different organizations. A Grid computing environment also includes common interfaces for distributed computing software so that the heterogeneous set of machines that make up the Grid can be used more easily. The other key feature of a Grid is that the distributed computing software includes appropriate security technology. The focus of most Grid software is on the security involved with application execution, file transfers, and other remote computing procedures. However, there are other important security issues related to the management of a Grid and the users who use that Grid. This note discusses these additional security issues and makes several suggestions as how they can be managed.
COSMOS: Python library for massively parallel workflows

PubMed Central

Gafni, Erik; Luquette, Lovelace J.; Lancaster, Alex K.; Hawkins, Jared B.; Jung, Jae-Yoon; Souilmi, Yassine; Wall, Dennis P.; Tonellato, Peter J.

2014-01-01

Summary: Efficient workflows to shepherd clinically generated genomic data through the multiple stages of a next-generation sequencing pipeline are of critical importance in translational biomedical science. Here we present COSMOS, a Python library for workflow management that allows formal description of pipelines and partitioning of jobs. In addition, it includes a user interface for tracking the progress of jobs, abstraction of the queuing system and fine-grained control over the workflow. Workflows can be created on traditional computing clusters as well as cloud-based services. Availability and implementation: Source code is available for academic non-commercial research purposes. Links to code and documentation are provided at http://lpm.hms.harvard.edu and http://wall-lab.stanford.edu. Contact: dpwall@stanford.edu or peter_tonellato@hms.harvard.edu. Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24982428
COSMOS: Python library for massively parallel workflows.

PubMed

Gafni, Erik; Luquette, Lovelace J; Lancaster, Alex K; Hawkins, Jared B; Jung, Jae-Yoon; Souilmi, Yassine; Wall, Dennis P; Tonellato, Peter J

2014-10-15

Efficient workflows to shepherd clinically generated genomic data through the multiple stages of a next-generation sequencing pipeline are of critical importance in translational biomedical science. Here we present COSMOS, a Python library for workflow management that allows formal description of pipelines and partitioning of jobs. In addition, it includes a user interface for tracking the progress of jobs, abstraction of the queuing system and fine-grained control over the workflow. Workflows can be created on traditional computing clusters as well as cloud-based services. Source code is available for academic non-commercial research purposes. Links to code and documentation are provided at http://lpm.hms.harvard.edu and http://wall-lab.stanford.edu. dpwall@stanford.edu or peter_tonellato@hms.harvard.edu. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
ERM Ideas and Innovations

ERIC Educational Resources Information Center

Schmidt, Kari

2012-01-01

In this column, the author discusses how the management of e-books has introduced, at many libraries and in varying degrees, the challenges of maintaining effective technical services workflows. Four different e-book workflows are identified and explored, and the author takes a closer look at how particular variables for each are affected, such as…
DOE Office of Scientific and Technical Information (OSTI.GOV)

Duro, Francisco Rodrigo; Blas, Javier Garcia; Isaila, Florin

The increasing volume of scientific data and the limited scalability and performance of storage systems are currently presenting a significant limitation for the productivity of the scientific workflows running on both high-performance computing (HPC) and cloud platforms. Clearly needed is better integration of storage systems and workflow engines to address this problem. This paper presents and evaluates a novel solution that leverages codesign principles for integrating Hercules—an in-memory data store—with a workflow management system. We consider four main aspects: workflow representation, task scheduling, task placement, and task termination. As a result, the experimental evaluation on both cloud and HPC systemsmore » demonstrates significant performance and scalability improvements over existing state-of-the-art approaches.« less
Tools for automated acoustic monitoring within the R package monitoR

USGS Publications Warehouse

Katz, Jonathan; Hafner, Sasha D.; Donovan, Therese

2016-01-01

The R package monitoR contains tools for managing an acoustic-monitoring program including survey metadata, template creation and manipulation, automated detection and results management. These tools are scalable for use with small projects as well as larger long-term projects and those with expansive spatial extents. Here, we describe typical workflow when using the tools in monitoR. Typical workflow utilizes a generic sequence of functions, with the option for either binary point matching or spectrogram cross-correlation detectors.
7 CFR 1710.102 - Borrower eligibility for different types of loans.

Code of Federal Regulations, 2014 CFR

2014-01-01

... implementation of demand side management, energy conservation programs, and on grid and off grid renewable energy... management, energy conservation programs, and on grid and off grid renewable energy systems. (c) One hundred..., energy conservation programs, and on grid and off grid renewable energy systems. (See 7 CFR part 1712...
Grid Technology as a Cyberinfrastructure for Delivering High-End Services to the Earth and Space Science Community

NASA Technical Reports Server (NTRS)

Hinke, Thomas H.

2004-01-01

Grid technology consists of middleware that permits distributed computations, data and sensors to be seamlessly integrated into a secure, single-sign-on processing environment. In &is environment, a user has to identify and authenticate himself once to the grid middleware, and then can utilize any of the distributed resources to which he has been,panted access. Grid technology allows resources that exist in enterprises that are under different administrative control to be securely integrated into a single processing environment The grid community has adopted commercial web services technology as a means for implementing persistent, re-usable grid services that sit on top of the basic distributed processing environment that grids provide. These grid services can then form building blocks for even more complex grid services. Each grid service is characterized using the Web Service Description Language, which provides a description of the interface and how other applications can access it. The emerging Semantic grid work seeks to associates sufficient semantic information with each grid service such that applications wii1 he able to automatically select, compose and if necessary substitute available equivalent services in order to assemble collections of services that are most appropriate for a particular application. Grid technology has been used to provide limited support to various Earth and space science applications. Looking to the future, this emerging grid service technology can provide a cyberinfrastructures for both the Earth and space science communities. Groups within these communities could transform those applications that have community-wide applicability into persistent grid services that are made widely available to their respective communities. In concert with grid-enabled data archives, users could easily create complex workflows that extract desired data from one or more archives and process it though an appropriate set of widely distributed grid services discovered using semantic grid technology. As required, high-end computational resources could be drawn from available grid resource pools. Using grid technology, this confluence of data, services and computational resources could easily be harnessed to transform data from many different sources into a desired product that is delivered to a user's workstation or to a web portal though which it could be accessed by its intended audience.
A big data approach for climate change indicators processing in the CLIP-C project

NASA Astrophysics Data System (ADS)

D'Anca, Alessandro; Conte, Laura; Palazzo, Cosimo; Fiore, Sandro; Aloisio, Giovanni

2016-04-01

Defining and implementing processing chains with multiple (e.g. tens or hundreds of) data analytics operators can be a real challenge in many practical scientific use cases such as climate change indicators. This is usually done via scripts (e.g. bash) on the client side and requires climate scientists to take care of, implement and replicate workflow-like control logic aspects (which may be error-prone too) in their scripts, along with the expected application-level part. Moreover, the big amount of data and the strong I/O demand pose additional challenges related to the performance. In this regard, production-level tools for climate data analysis are mostly sequential and there is a lack of big data analytics solutions implementing fine-grain data parallelism or adopting stronger parallel I/O strategies, data locality, workflow optimization, etc. High-level solutions leveraging on workflow-enabled big data analytics frameworks for eScience could help scientists in defining and implementing the workflows related to their experiments by exploiting a more declarative, efficient and powerful approach. This talk will start introducing the main needs and challenges regarding big data analytics workflow management for eScience and will then provide some insights about the implementation of some real use cases related to some climate change indicators on large datasets produced in the context of the CLIP-C project - a EU FP7 project aiming at providing access to climate information of direct relevance to a wide variety of users, from scientists to policy makers and private sector decision makers. All the proposed use cases have been implemented exploiting the Ophidia big data analytics framework. The software stack includes an internal workflow management system, which coordinates, orchestrates, and optimises the execution of multiple scientific data analytics and visualization tasks. Real-time workflow monitoring execution is also supported through a graphical user interface. In order to address the challenges of the use cases, the implemented data analytics workflows include parallel data analysis, metadata management, virtual file system tasks, maps generation, rolling of datasets, and import/export of datasets in NetCDF format. The use cases have been implemented on a HPC cluster of 8-nodes (16-cores/node) of the Athena Cluster available at the CMCC Supercomputing Centre. Benchmark results will be also presented during the talk.

The Need of Nested Grids for Aerial and Satellite Images and Digital Elevation Models

NASA Astrophysics Data System (ADS)

Villa, G.; Mas, S.; Fernández-Villarino, X.; Martínez-Luceño, J.; Ojeda, J. C.; Pérez-Martín, B.; Tejeiro, J. A.; García-González, C.; López-Romero, E.; Soteres, C.

2016-06-01

Usual workflows for production, archiving, dissemination and use of Earth observation images (both aerial and from remote sensing satellites) pose big interoperability problems, as for example: non-alignment of pixels at the different levels of the pyramids that makes it impossible to overlay, compare and mosaic different orthoimages, without resampling them and the need to apply multiple resamplings and compression-decompression cycles. These problems cause great inefficiencies in production, dissemination through web services and processing in "Big Data" environments. Most of them can be avoided, or at least greatly reduced, with the use of a common "nested grid" for mutiresolution production, archiving, dissemination and exploitation of orthoimagery, digital elevation models and other raster data. "Nested grids" are space allocation schemas that organize image footprints, pixel sizes and pixel positions at all pyramid levels, in order to achieve coherent and consistent multiresolution coverage of a whole working area. A "nested grid" must be complemented by an appropriate "tiling schema", ideally based on the "quad-tree" concept. In the last years a "de facto standard" grid and Tiling Schema has emerged and has been adopted by virtually all major geospatial data providers. It has also been adopted by OGC in its "WMTS Simple Profile" standard. In this paper we explain how the adequate use of this tiling schema as common nested grid for orthoimagery, DEMs and other types of raster data constitutes the most practical solution to most of the interoperability problems of these types of data.
Design and Execution of make-like, distributed Analyses based on Spotify’s Pipelining Package Luigi

NASA Astrophysics Data System (ADS)

Erdmann, M.; Fischer, B.; Fischer, R.; Rieger, M.

2017-10-01

In high-energy particle physics, workflow management systems are primarily used as tailored solutions in dedicated areas such as Monte Carlo production. However, physicists performing data analyses are usually required to steer their individual workflows manually which is time-consuming and often leads to undocumented relations between particular workloads. We present a generic analysis design pattern that copes with the sophisticated demands of end-to-end HEP analyses and provides a make-like execution system. It is based on the open-source pipelining package Luigi which was developed at Spotify and enables the definition of arbitrary workloads, so-called Tasks, and the dependencies between them in a lightweight and scalable structure. Further features are multi-user support, automated dependency resolution and error handling, central scheduling, and status visualization in the web. In addition to already built-in features for remote jobs and file systems like Hadoop and HDFS, we added support for WLCG infrastructure such as LSF and CREAM job submission, as well as remote file access through the Grid File Access Library. Furthermore, we implemented automated resubmission functionality, software sandboxing, and a command line interface with auto-completion for a convenient working environment. For the implementation of a t \\overline{{{t}}} H cross section measurement, we created a generic Python interface that provides programmatic access to all external information such as datasets, physics processes, statistical models, and additional files and values. In summary, the setup enables the execution of the entire analysis in a parallelized and distributed fashion with a single command.
A microseismic workflow for managing induced seismicity risk as CO 2 storage projects

DOE Office of Scientific and Technical Information (OSTI.GOV)

Matzel, E.; Morency, C.; Pyle, M.

2015-10-27

It is well established that fluid injection has the potential to induce earthquakes—from microseismicity to large, damaging events—by altering state-of-stress conditions in the subsurface. While induced seismicity has not been a major operational issue for carbon storage projects to date, a seismicity hazard exists and must be carefully addressed. Two essential components of effective seismic risk management are (1) sensitive microseismic monitoring and (2) robust data interpretation tools. This report describes a novel workflow, based on advanced processing algorithms applied to microseismic data, to help improve management of seismic risk. This workflow has three main goals: (1) to improve themore » resolution and reliability of passive seismic monitoring, (2) to extract additional, valuable information from continuous waveform data that is often ignored in standard processing, and (3) to minimize the turn-around time between data collection, interpretation, and decision-making. These three objectives can allow for a better-informed and rapid response to changing subsurface conditions.« less
Implementing CORAL: An Electronic Resource Management System

ERIC Educational Resources Information Center

Whitfield, Sharon

2011-01-01

A 2010 electronic resource management survey conducted by Maria Collins of North Carolina State University and Jill E. Grogg of University of Alabama Libraries found that the top six electronic resources management priorities included workflow management, communications management, license management, statistics management, administrative…
A framework for service enterprise workflow simulation with multi-agents cooperation

NASA Astrophysics Data System (ADS)

Tan, Wenan; Xu, Wei; Yang, Fujun; Xu, Lida; Jiang, Chuanqun

2013-11-01

Process dynamic modelling for service business is the key technique for Service-Oriented information systems and service business management, and the workflow model of business processes is the core part of service systems. Service business workflow simulation is the prevalent approach to be used for analysis of service business process dynamically. Generic method for service business workflow simulation is based on the discrete event queuing theory, which is lack of flexibility and scalability. In this paper, we propose a service workflow-oriented framework for the process simulation of service businesses using multi-agent cooperation to address the above issues. Social rationality of agent is introduced into the proposed framework. Adopting rationality as one social factor for decision-making strategies, a flexible scheduling for activity instances has been implemented. A system prototype has been developed to validate the proposed simulation framework through a business case study.
Improving data collection, documentation, and workflow in a dementia screening study.

PubMed

Read, Kevin B; LaPolla, Fred Willie Zametkin; Tolea, Magdalena I; Galvin, James E; Surkis, Alisa

2017-04-01

A clinical study team performing three multicultural dementia screening studies identified the need to improve data management practices and facilitate data sharing. A collaboration was initiated with librarians as part of the National Library of Medicine (NLM) informationist supplement program. The librarians identified areas for improvement in the studies' data collection, entry, and processing workflows. The librarians' role in this project was to meet needs expressed by the study team around improving data collection and processing workflows to increase study efficiency and ensure data quality. The librarians addressed the data collection, entry, and processing weaknesses through standardizing and renaming variables, creating an electronic data capture system using REDCap, and developing well-documented, reproducible data processing workflows. NLM informationist supplements provide librarians with valuable experience in collaborating with study teams to address their data needs. For this project, the librarians gained skills in project management, REDCap, and understanding of the challenges and specifics of a clinical research study. However, the time and effort required to provide targeted and intensive support for one study team was not scalable to the library's broader user community.
Scalable and cost-effective NGS genotyping in the cloud.

PubMed

Souilmi, Yassine; Lancaster, Alex K; Jung, Jae-Yoon; Rizzo, Ettore; Hawkins, Jared B; Powles, Ryan; Amzazi, Saaïd; Ghazal, Hassan; Tonellato, Peter J; Wall, Dennis P

2015-10-15

While next-generation sequencing (NGS) costs have plummeted in recent years, cost and complexity of computation remain substantial barriers to the use of NGS in routine clinical care. The clinical potential of NGS will not be realized until robust and routine whole genome sequencing data can be accurately rendered to medically actionable reports within a time window of hours and at scales of economy in the 10's of dollars. We take a step towards addressing this challenge, by using COSMOS, a cloud-enabled workflow management system, to develop GenomeKey, an NGS whole genome analysis workflow. COSMOS implements complex workflows making optimal use of high-performance compute clusters. Here we show that the Amazon Web Service (AWS) implementation of GenomeKey via COSMOS provides a fast, scalable, and cost-effective analysis of both public benchmarking and large-scale heterogeneous clinical NGS datasets. Our systematic benchmarking reveals important new insights and considerations to produce clinical turn-around of whole genome analysis optimization and workflow management including strategic batching of individual genomes and efficient cluster resource configuration.
PGen: large-scale genomic variations analysis workflow and browser in SoyKB.

PubMed

Liu, Yang; Khan, Saad M; Wang, Juexin; Rynge, Mats; Zhang, Yuanxun; Zeng, Shuai; Chen, Shiyuan; Maldonado Dos Santos, Joao V; Valliyodan, Babu; Calyam, Prasad P; Merchant, Nirav; Nguyen, Henry T; Xu, Dong; Joshi, Trupti

2016-10-06

With the advances in next-generation sequencing (NGS) technology and significant reductions in sequencing costs, it is now possible to sequence large collections of germplasm in crops for detecting genome-scale genetic variations and to apply the knowledge towards improvements in traits. To efficiently facilitate large-scale NGS resequencing data analysis of genomic variations, we have developed "PGen", an integrated and optimized workflow using the Extreme Science and Engineering Discovery Environment (XSEDE) high-performance computing (HPC) virtual system, iPlant cloud data storage resources and Pegasus workflow management system (Pegasus-WMS). The workflow allows users to identify single nucleotide polymorphisms (SNPs) and insertion-deletions (indels), perform SNP annotations and conduct copy number variation analyses on multiple resequencing datasets in a user-friendly and seamless way. We have developed both a Linux version in GitHub ( https://github.com/pegasus-isi/PGen-GenomicVariations-Workflow ) and a web-based implementation of the PGen workflow integrated within the Soybean Knowledge Base (SoyKB), ( http://soykb.org/Pegasus/index.php ). Using PGen, we identified 10,218,140 single-nucleotide polymorphisms (SNPs) and 1,398,982 indels from analysis of 106 soybean lines sequenced at 15X coverage. 297,245 non-synonymous SNPs and 3330 copy number variation (CNV) regions were identified from this analysis. SNPs identified using PGen from additional soybean resequencing projects adding to 500+ soybean germplasm lines in total have been integrated. These SNPs are being utilized for trait improvement using genotype to phenotype prediction approaches developed in-house. In order to browse and access NGS data easily, we have also developed an NGS resequencing data browser ( http://soykb.org/NGS_Resequence/NGS_index.php ) within SoyKB to provide easy access to SNP and downstream analysis results for soybean researchers. PGen workflow has been optimized for the most efficient analysis of soybean data using thorough testing and validation. This research serves as an example of best practices for development of genomics data analysis workflows by integrating remote HPC resources and efficient data management with ease of use for biological users. PGen workflow can also be easily customized for analysis of data in other species.
Talkoot Portals: Discover, Tag, Share, and Reuse Collaborative Science Workflows

NASA Astrophysics Data System (ADS)

Wilson, B. D.; Ramachandran, R.; Lynnes, C.

2009-05-01

A small but growing number of scientists are beginning to harness Web 2.0 technologies, such as wikis, blogs, and social tagging, as a transformative way of doing science. These technologies provide researchers easy mechanisms to critique, suggest and share ideas, data and algorithms. At the same time, large suites of algorithms for science analysis are being made available as remotely-invokable Web Services, which can be chained together to create analysis workflows. This provides the research community an unprecedented opportunity to collaborate by sharing their workflows with one another, reproducing and analyzing research results, and leveraging colleagues' expertise to expedite the process of scientific discovery. However, wikis and similar technologies are limited to text, static images and hyperlinks, providing little support for collaborative data analysis. A team of information technology and Earth science researchers from multiple institutions have come together to improve community collaboration in science analysis by developing a customizable "software appliance" to build collaborative portals for Earth Science services and analysis workflows. The critical requirement is that researchers (not just information technologists) be able to build collaborative sites around service workflows within a few hours. We envision online communities coming together, much like Finnish "talkoot" (a barn raising), to build a shared research space. Talkoot extends a freely available, open source content management framework with a series of modules specific to Earth Science for registering, creating, managing, discovering, tagging and sharing Earth Science web services and workflows for science data processing, analysis and visualization. Users will be able to author a "science story" in shareable web notebooks, including plots or animations, backed up by an executable workflow that directly reproduces the science analysis. New services and workflows of interest will be discoverable using tag search, and advertised using "service casts" and "interest casts" (Atom feeds). Multiple science workflow systems will be plugged into the system, with initial support for UAH's Mining Workflow Composer and the open-source Active BPEL engine, and JPL's SciFlo engine and the VizFlow visual programming interface. With the ability to share and execute analysis workflows, Talkoot portals can be used to do collaborative science in addition to communicate ideas and results. It will be useful for different science domains, mission teams, research projects and organizations. Thus, it will help to solve the "sociological" problem of bringing together disparate groups of researchers, and the technical problem of advertising, discovering, developing, documenting, and maintaining inter-agency science workflows. The presentation will discuss the goals of and barriers to Science 2.0, the social web technologies employed in the Talkoot software appliance (e.g. CMS, social tagging, personal presence, advertising by feeds, etc.), illustrate the resulting collaborative capabilities, and show early prototypes of the web interfaces (e.g. embedded workflows).
Towards Exascale Seismic Imaging and Inversion

NASA Astrophysics Data System (ADS)

Tromp, J.; Bozdag, E.; Lefebvre, M. P.; Smith, J. A.; Lei, W.; Ruan, Y.

2015-12-01

Post-petascale supercomputers are now available to solve complex scientific problems that were thought unreachable a few decades ago. They also bring a cohort of concerns tied to obtaining optimum performance. Several issues are currently being investigated by the HPC community. These include energy consumption, fault resilience, scalability of the current parallel paradigms, workflow management, I/O performance and feature extraction with large datasets. In this presentation, we focus on the last three issues. In the context of seismic imaging and inversion, in particular for simulations based on adjoint methods, workflows are well defined.They consist of a few collective steps (e.g., mesh generation or model updates) and of a large number of independent steps (e.g., forward and adjoint simulations of each seismic event, pre- and postprocessing of seismic traces). The greater goal is to reduce the time to solution, that is, obtaining a more precise representation of the subsurface as fast as possible. This brings us to consider both the workflow in its entirety and the parts comprising it. The usual approach is to speedup the purely computational parts based on code optimization in order to reach higher FLOPS and better memory management. This still remains an important concern, but larger scale experiments show that the imaging workflow suffers from severe I/O bottlenecks. Such limitations occur both for purely computational data and seismic time series. The latter are dealt with by the introduction of a new Adaptable Seismic Data Format (ASDF). Parallel I/O libraries, namely HDF5 and ADIOS, are used to drastically reduce the cost of disk access. Parallel visualization tools, such as VisIt, are able to take advantage of ADIOS metadata to extract features and display massive datasets. Because large parts of the workflow are embarrassingly parallel, we are investigating the possibility of automating the imaging process with the integration of scientific workflow management tools, specifically Pegasus.
First benchmark of the Unstructured Grid Adaptation Working Group

NASA Technical Reports Server (NTRS)

Ibanez, Daniel; Barral, Nicolas; Krakos, Joshua; Loseille, Adrien; Michal, Todd; Park, Mike

2017-01-01

Unstructured grid adaptation is a technology that holds the potential to improve the automation and accuracy of computational fluid dynamics and other computational disciplines. Difficulty producing the highly anisotropic elements necessary for simulation on complex curved geometries that satisfies a resolution request has limited this technology's widespread adoption. The Unstructured Grid Adaptation Working Group is an open gathering of researchers working on adapting simplicial meshes to conform to a metric field. Current members span a wide range of institutions including academia, industry, and national laboratories. The purpose of this group is to create a common basis for understanding and improving mesh adaptation. We present our first major contribution: a common set of benchmark cases, including input meshes and analytic metric specifications, that are publicly available to be used for evaluating any mesh adaptation code. We also present the results of several existing codes on these benchmark cases, to illustrate their utility in identifying key challenges common to all codes and important differences between available codes. Future directions are defined to expand this benchmark to mature the technology necessary to impact practical simulation workflows.
Development of an Excel-based laboratory information management system for improving workflow efficiencies in early ADME screening.

PubMed

Lu, Xinyan

2016-01-01

There is a clear requirement for enhancing laboratory information management during early absorption, distribution, metabolism and excretion (ADME) screening. The application of a commercial laboratory information management system (LIMS) is limited by complexity, insufficient flexibility, high costs and extended timelines. An improved custom in-house LIMS for ADME screening was developed using Excel. All Excel templates were generated through macros and formulae, and information flow was streamlined as much as possible. This system has been successfully applied in task generation, process control and data management, with a reduction in both labor time and human error rates. An Excel-based LIMS can provide a simple, flexible and cost/time-saving solution for improving workflow efficiencies in early ADME screening.
Integrated Automatic Workflow for Phylogenetic Tree Analysis Using Public Access and Local Web Services.

PubMed

Damkliang, Kasikrit; Tandayya, Pichaya; Sangket, Unitsa; Pasomsub, Ekawat

2016-11-28

At the present, coding sequence (CDS) has been discovered and larger CDS is being revealed frequently. Approaches and related tools have also been developed and upgraded concurrently, especially for phylogenetic tree analysis. This paper proposes an integrated automatic Taverna workflow for the phylogenetic tree inferring analysis using public access web services at European Bioinformatics Institute (EMBL-EBI) and Swiss Institute of Bioinformatics (SIB), and our own deployed local web services. The workflow input is a set of CDS in the Fasta format. The workflow supports 1,000 to 20,000 numbers in bootstrapping replication. The workflow performs the tree inferring such as Parsimony (PARS), Distance Matrix - Neighbor Joining (DIST-NJ), and Maximum Likelihood (ML) algorithms of EMBOSS PHYLIPNEW package based on our proposed Multiple Sequence Alignment (MSA) similarity score. The local web services are implemented and deployed into two types using the Soaplab2 and Apache Axis2 deployment. There are SOAP and Java Web Service (JWS) providing WSDL endpoints to Taverna Workbench, a workflow manager. The workflow has been validated, the performance has been measured, and its results have been verified. Our workflow's execution time is less than ten minutes for inferring a tree with 10,000 replicates of the bootstrapping numbers. This paper proposes a new integrated automatic workflow which will be beneficial to the bioinformaticians with an intermediate level of knowledge and experiences. All local services have been deployed at our portal http://bioservices.sci.psu.ac.th.
Integrated Automatic Workflow for Phylogenetic Tree Analysis Using Public Access and Local Web Services.

PubMed

Damkliang, Kasikrit; Tandayya, Pichaya; Sangket, Unitsa; Pasomsub, Ekawat

2016-03-01

At the present, coding sequence (CDS) has been discovered and larger CDS is being revealed frequently. Approaches and related tools have also been developed and upgraded concurrently, especially for phylogenetic tree analysis. This paper proposes an integrated automatic Taverna workflow for the phylogenetic tree inferring analysis using public access web services at European Bioinformatics Institute (EMBL-EBI) and Swiss Institute of Bioinformatics (SIB), and our own deployed local web services. The workflow input is a set of CDS in the Fasta format. The workflow supports 1,000 to 20,000 numbers in bootstrapping replication. The workflow performs the tree inferring such as Parsimony (PARS), Distance Matrix - Neighbor Joining (DIST-NJ), and Maximum Likelihood (ML) algorithms of EMBOSS PHYLIPNEW package based on our proposed Multiple Sequence Alignment (MSA) similarity score. The local web services are implemented and deployed into two types using the Soaplab2 and Apache Axis2 deployment. There are SOAP and Java Web Service (JWS) providing WSDL endpoints to Taverna Workbench, a workflow manager. The workflow has been validated, the performance has been measured, and its results have been verified. Our workflow's execution time is less than ten minutes for inferring a tree with 10,000 replicates of the bootstrapping numbers. This paper proposes a new integrated automatic workflow which will be beneficial to the bioinformaticians with an intermediate level of knowledge and experiences. The all local services have been deployed at our portal http://bioservices.sci.psu.ac.th.
What's the Point of a Raster ? Advantages of 3D Point Cloud Processing over Raster Based Methods for Accurate Geomorphic Analysis of High Resolution Topography.

NASA Astrophysics Data System (ADS)

Lague, D.

2014-12-01

High Resolution Topographic (HRT) datasets are predominantly stored and analyzed as 2D raster grids of elevations (i.e., Digital Elevation Models). Raster grid processing is common in GIS software and benefits from a large library of fast algorithms dedicated to geometrical analysis, drainage network computation and topographic change measurement. Yet, all instruments or methods currently generating HRT datasets (e.g., ALS, TLS, SFM, stereo satellite imagery) output natively 3D unstructured point clouds that are (i) non-regularly sampled, (ii) incomplete (e.g., submerged parts of river channels are rarely measured), and (iii) include 3D elements (e.g., vegetation, vertical features such as river banks or cliffs) that cannot be accurately described in a DEM. Interpolating the raw point cloud onto a 2D grid generally results in a loss of position accuracy, spatial resolution and in more or less controlled interpolation. Here I demonstrate how studying earth surface topography and processes directly on native 3D point cloud datasets offers several advantages over raster based methods: point cloud methods preserve the accuracy of the original data, can better handle the evaluation of uncertainty associated to topographic change measurements and are more suitable to study vegetation characteristics and steep features of the landscape. In this presentation, I will illustrate and compare Point Cloud based and Raster based workflows with various examples involving ALS, TLS and SFM for the analysis of bank erosion processes in bedrock and alluvial rivers, rockfall statistics (including rockfall volume estimate directly from point clouds) and the interaction of vegetation/hydraulics and sedimentation in salt marshes. These workflows use 2 recently published algorithms for point cloud classification (CANUPO) and point cloud comparison (M3C2) now implemented in the open source software CloudCompare.
Interplay between Clinical Guidelines and Organizational Workflow Systems. Experience from the MobiGuide Project.

PubMed

Shabo, Amnon; Peleg, Mor; Parimbelli, Enea; Quaglini, Silvana; Napolitano, Carlo

2016-12-07

Implementing a decision-support system within a healthcare organization requires integration of clinical domain knowledge with resource constraints. Computer-interpretable guidelines (CIG) are excellent instruments for addressing clinical aspects while business process management (BPM) languages and Workflow (Wf) engines manage the logistic organizational constraints. Our objective is the orchestration of all the relevant factors needed for a successful execution of patient's care pathways, especially when spanning the continuum of care, from acute to community or home care. We considered three strategies for integrating CIGs with organizational workflows: extending the CIG or BPM languages and their engines, or creating an interplay between them. We used the interplay approach to implement a set of use cases arising from a CIG implementation in the domain of Atrial Fibrillation. To provide a more scalable and standards-based solution, we explored the use of Cross-Enterprise Document Workflow Integration Profile. We describe our proof-of-concept implementation of five use cases. We utilized the Personal Health Record of the MobiGuide project to implement a loosely-coupled approach between the Activiti BPM engine and the Picard CIG engine. Changes in the PHR were detected by polling. IHE profiles were used to develop workflow documents that orchestrate cross-enterprise execution of cardioversion. Interplay between CIG and BPM engines can support orchestration of care flows within organizational settings.
Using Semantic Components to Represent Dynamics of an Interdisciplinary Healthcare Team in a Multi-Agent Decision Support System.

PubMed

Wilk, Szymon; Kezadri-Hamiaz, Mounira; Rosu, Daniela; Kuziemsky, Craig; Michalowski, Wojtek; Amyot, Daniel; Carrier, Marc

2016-02-01

In healthcare organizations, clinical workflows are executed by interdisciplinary healthcare teams (IHTs) that operate in ways that are difficult to manage. Responding to a need to support such teams, we designed and developed the MET4 multi-agent system that allows IHTs to manage patients according to presentation-specific clinical workflows. In this paper, we describe a significant extension of the MET4 system that allows for supporting rich team dynamics (understood as team formation, management and task-practitioner allocation), including selection and maintenance of the most responsible physician and more complex rules of selecting practitioners for the workflow tasks. In order to develop this extension, we introduced three semantic components: (1) a revised ontology describing concepts and relations pertinent to IHTs, workflows, and managed patients, (2) a set of behavioral rules describing the team dynamics, and (3) an instance base that stores facts corresponding to instances of concepts from the ontology and to relations between these instances. The semantic components are represented in first-order logic and they can be automatically processed using theorem proving and model finding techniques. We employ these techniques to find models that correspond to specific decisions controlling the dynamics of IHT. In the paper, we present the design of extended MET4 with a special focus on the new semantic components. We then describe its proof-of-concept implementation using the WADE multi-agent platform and the Z3 solver (theorem prover/model finder). We illustrate the main ideas discussed in the paper with a clinical scenario of an IHT managing a patient with chronic kidney disease.
Design of investment management optimization system for power grid companies under new electricity reform

NASA Astrophysics Data System (ADS)

Yang, Chunhui; Su, Zhixiong; Wang, Xin; Liu, Yang; Qi, Yongwei

2017-03-01

The new normalization of the economic situation and the implementation of a new round of electric power system reform put forward higher requirements to the daily operation of power grid companies. As an important day-to-day operation of power grid companies, investment management is directly related to the promotion of the company's operating efficiency and management level. In this context, the establishment of power grid company investment management optimization system will help to improve the level of investment management and control the company, which is of great significance for power gird companies to adapt to market environment changing as soon as possible and meet the policy environment requirements. Therefore, the purpose of this paper is to construct the investment management optimization system of power grid companies, which includes investment management system, investment process control system, investment structure optimization system, and investment project evaluation system and investment management information platform support system.
An ontological knowledge framework for adaptive medical workflow.

PubMed

Dang, Jiangbo; Hedayati, Amir; Hampel, Ken; Toklu, Candemir

2008-10-01

As emerging technologies, semantic Web and SOA (Service-Oriented Architecture) allow BPMS (Business Process Management System) to automate business processes that can be described as services, which in turn can be used to wrap existing enterprise applications. BPMS provides tools and methodologies to compose Web services that can be executed as business processes and monitored by BPM (Business Process Management) consoles. Ontologies are a formal declarative knowledge representation model. It provides a foundation upon which machine understandable knowledge can be obtained, and as a result, it makes machine intelligence possible. Healthcare systems can adopt these technologies to make them ubiquitous, adaptive, and intelligent, and then serve patients better. This paper presents an ontological knowledge framework that covers healthcare domains that a hospital encompasses-from the medical or administrative tasks, to hospital assets, medical insurances, patient records, drugs, and regulations. Therefore, our ontology makes our vision of personalized healthcare possible by capturing all necessary knowledge for a complex personalized healthcare scenario involving patient care, insurance policies, and drug prescriptions, and compliances. For example, our ontology facilitates a workflow management system to allow users, from physicians to administrative assistants, to manage, even create context-aware new medical workflows and execute them on-the-fly.
Optimizing insulin pump therapy: the potential advantages of using a structured diabetes management program.

PubMed

Lange, Karin; Ziegler, Ralph; Neu, Andreas; Reinehr, Thomas; Daab, Iris; Walz, Marion; Maraun, Michael; Schnell, Oliver; Kulzer, Bernhard; Reichel, Andreas; Heinemann, Lutz; Parkin, Christopher G; Haak, Thomas

2015-03-01

Use of continuous subcutaneous insulin infusion (CSII) therapy improves glycemic control, reduces hypoglycemia and increases treatment satisfaction in individuals with diabetes. As a number of patient- and clinician-related factors can hinder the effectiveness and optimal usage of CSII therapy, new approaches are needed to address these obstacles. Ceriello and colleagues recently proposed a model of care that incorporates the collaborative use of structured SMBG into a formal approach to personalized diabetes management within all diabetes populations. We adapted this model for use in CSII-treated patients in order to enable the implementation of a workflow structure that enhances patient-physician communication and supports patients' diabetes self-management skills. We recognize that time constraints and current reimbursement policies pose significant challenges to healthcare providers integrating the Personalised Diabetes Management (PDM) process into clinical practice. We believe, however, that the time invested in modifying practice workflow and learning to apply the various steps of the PDM process will be offset by improved workflow and more effective patient consultations. This article describes how to implement PDM into clinical practice as a systematic, standardized process that can optimize CSII therapy.

Performance Studies on Distributed Virtual Screening

PubMed Central

Krüger, Jens; de la Garza, Luis; Kohlbacher, Oliver; Nagel, Wolfgang E.

2014-01-01

Virtual high-throughput screening (vHTS) is an invaluable method in modern drug discovery. It permits screening large datasets or databases of chemical structures for those structures binding possibly to a drug target. Virtual screening is typically performed by docking code, which often runs sequentially. Processing of huge vHTS datasets can be parallelized by chunking the data because individual docking runs are independent of each other. The goal of this work is to find an optimal splitting maximizing the speedup while considering overhead and available cores on Distributed Computing Infrastructures (DCIs). We have conducted thorough performance studies accounting not only for the runtime of the docking itself, but also for structure preparation. Performance studies were conducted via the workflow-enabled science gateway MoSGrid (Molecular Simulation Grid). As input we used benchmark datasets for protein kinases. Our performance studies show that docking workflows can be made to scale almost linearly up to 500 concurrent processes distributed even over large DCIs, thus accelerating vHTS campaigns significantly. PMID:25032219
Stability and Scalability of the CMS Global Pool: Pushing HTCondor and GlideinWMS to New Limits

DOE Office of Scientific and Technical Information (OSTI.GOV)

Balcas, J.; Bockelman, B.; Hufnagel, D.

The CMS Global Pool, based on HTCondor and glideinWMS, is the main computing resource provisioning system for all CMS workflows, including analysis, Monte Carlo production, and detector data reprocessing activities. The total resources at Tier-1 and Tier-2 grid sites pledged to CMS exceed 100,000 CPU cores, while another 50,000 to 100,000 CPU cores are available opportunistically, pushing the needs of the Global Pool to higher scales each year. These resources are becoming more diverse in their accessibility and configuration over time. Furthermore, the challenge of stably running at higher and higher scales while introducing new modes of operation such asmore » multi-core pilots, as well as the chaotic nature of physics analysis workflows, places huge strains on the submission infrastructure. This paper details some of the most important challenges to scalability and stability that the CMS Global Pool has faced since the beginning of the LHC Run II and how they were overcome.« less
HEPCloud, a New Paradigm for HEP Facilities: CMS Amazon Web Services Investigation

DOE PAGES

Holzman, Burt; Bauerdick, Lothar A. T.; Bockelman, Brian; ...

2017-09-29

Historically, high energy physics computing has been performed on large purpose-built computing systems. These began as single-site compute facilities, but have evolved into the distributed computing grids used today. Recently, there has been an exponential increase in the capacity and capability of commercial clouds. Cloud resources are highly virtualized and intended to be able to be flexibly deployed for a variety of computing tasks. There is a growing interest among the cloud providers to demonstrate the capability to perform large-scale scientific computing. In this paper, we discuss results from the CMS experiment using the Fermilab HEPCloud facility, which utilized bothmore » local Fermilab resources and virtual machines in the Amazon Web Services Elastic Compute Cloud. We discuss the planning, technical challenges, and lessons learned involved in performing physics workflows on a large-scale set of virtualized resources. Additionally, we will discuss the economics and operational efficiencies when executing workflows both in the cloud and on dedicated resources.« less
HEPCloud, a New Paradigm for HEP Facilities: CMS Amazon Web Services Investigation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Holzman, Burt; Bauerdick, Lothar A. T.; Bockelman, Brian

Historically, high energy physics computing has been performed on large purpose-built computing systems. These began as single-site compute facilities, but have evolved into the distributed computing grids used today. Recently, there has been an exponential increase in the capacity and capability of commercial clouds. Cloud resources are highly virtualized and intended to be able to be flexibly deployed for a variety of computing tasks. There is a growing interest among the cloud providers to demonstrate the capability to perform large-scale scientific computing. In this paper, we discuss results from the CMS experiment using the Fermilab HEPCloud facility, which utilized bothmore » local Fermilab resources and virtual machines in the Amazon Web Services Elastic Compute Cloud. We discuss the planning, technical challenges, and lessons learned involved in performing physics workflows on a large-scale set of virtualized resources. Additionally, we will discuss the economics and operational efficiencies when executing workflows both in the cloud and on dedicated resources.« less
Stability and scalability of the CMS Global Pool: Pushing HTCondor and glideinWMS to new limits

NASA Astrophysics Data System (ADS)

Balcas, J.; Bockelman, B.; Hufnagel, D.; Hurtado Anampa, K.; Aftab Khan, F.; Larson, K.; Letts, J.; Marra da Silva, J.; Mascheroni, M.; Mason, D.; Perez-Calero Yzquierdo, A.; Tiradani, A.

2017-10-01

The CMS Global Pool, based on HTCondor and glideinWMS, is the main computing resource provisioning system for all CMS workflows, including analysis, Monte Carlo production, and detector data reprocessing activities. The total resources at Tier-1 and Tier-2 grid sites pledged to CMS exceed 100,000 CPU cores, while another 50,000 to 100,000 CPU cores are available opportunistically, pushing the needs of the Global Pool to higher scales each year. These resources are becoming more diverse in their accessibility and configuration over time. Furthermore, the challenge of stably running at higher and higher scales while introducing new modes of operation such as multi-core pilots, as well as the chaotic nature of physics analysis workflows, places huge strains on the submission infrastructure. This paper details some of the most important challenges to scalability and stability that the CMS Global Pool has faced since the beginning of the LHC Run II and how they were overcome.
Using NERSC High-Performance Computing (HPC) systems for high-energy nuclear physics applications with ALICE

NASA Astrophysics Data System (ADS)

Fasel, Markus

2016-10-01

High-Performance Computing Systems are powerful tools tailored to support large- scale applications that rely on low-latency inter-process communications to run efficiently. By design, these systems often impose constraints on application workflows, such as limited external network connectivity and whole node scheduling, that make more general-purpose computing tasks, such as those commonly found in high-energy nuclear physics applications, more difficult to carry out. In this work, we present a tool designed to simplify access to such complicated environments by handling the common tasks of job submission, software management, and local data management, in a framework that is easily adaptable to the specific requirements of various computing systems. The tool, initially constructed to process stand-alone ALICE simulations for detector and software development, was successfully deployed on the NERSC computing systems, Carver, Hopper and Edison, and is being configured to provide access to the next generation NERSC system, Cori. In this report, we describe the tool and discuss our experience running ALICE applications on NERSC HPC systems. The discussion will include our initial benchmarks of Cori compared to other systems and our attempts to leverage the new capabilities offered with Cori to support data-intensive applications, with a future goal of full integration of such systems into ALICE grid operations.
An ontology-based framework for bioinformatics workflows.

PubMed

Digiampietri, Luciano A; Perez-Alcazar, Jose de J; Medeiros, Claudia Bauzer

2007-01-01

The proliferation of bioinformatics activities brings new challenges - how to understand and organise these resources, how to exchange and reuse successful experimental procedures, and to provide interoperability among data and tools. This paper describes an effort toward these directions. It is based on combining research on ontology management, AI and scientific workflows to design, reuse and annotate bioinformatics experiments. The resulting framework supports automatic or interactive composition of tasks based on AI planning techniques and takes advantage of ontologies to support the specification and annotation of bioinformatics workflows. We validate our proposal with a prototype running on real data.
Structural development and web service based sensitivity analysis of the Biome-BGC MuSo model

NASA Astrophysics Data System (ADS)

Hidy, Dóra; Balogh, János; Churkina, Galina; Haszpra, László; Horváth, Ferenc; Ittzés, Péter; Ittzés, Dóra; Ma, Shaoxiu; Nagy, Zoltán; Pintér, Krisztina; Barcza, Zoltán

2014-05-01

Studying the greenhouse gas exchange, mainly the carbon dioxide sink and source character of ecosystems is still a highly relevant research topic in biogeochemistry. During the past few years research focused on managed ecosystems, because human intervention has an important role in the formation of the land surface through agricultural management, land use change, and other practices. In spite of considerable developments current biogeochemical models still have uncertainties to adequately quantify greenhouse gas exchange processes of managed ecosystem. Therefore, it is an important task to develop and test process-based biogeochemical models. Biome-BGC is a widely used, popular biogeochemical model that simulates the storage and flux of water, carbon, and nitrogen between the ecosystem and the atmosphere, and within the components of the terrestrial ecosystems. Biome-BGC was originally developed by the Numerical Terradynamic Simulation Group (NTSG) of University of Montana (http://www.ntsg.umt.edu/project/biome-bgc), and several other researchers used and modified it in the past. Our research group developed Biome-BGC version 4.1.1 to improve essentially the ability of the model to simulate carbon and water cycle in real managed ecosystems. The modifications included structural improvements of the model (e.g., implementation of multilayer soil module and drought related plant senescence; improved model phenology). Beside these improvements management modules and annually varying options were introduced and implemented (simulate mowing, grazing, planting, harvest, ploughing, application of fertilizers, forest thinning). Dynamic (annually varying) whole plant mortality was also enabled in the model to support more realistic simulation of forest stand development and natural disturbances. In the most recent model version separate pools have been defined for fruit. The model version which contains every former and new development is referred as Biome-BGC MuSo (Biome-BGC with multi-soil layer). Within the frame of the BioVeL project (http://www.biovel.eu) an open source and domain independent scientific workflow management system (http://www.taverna.org.uk) are used to support 'in silico' experimentation and easy applicability of different models including Biome-BGC MuSo. Workflows can be built upon functionally linked sets of web services like retrieval of meteorological dataset and other parameters; preparation of single run or spatial run model simulation; desk top grid technology based Monte Carlo experiment with parallel processing; model sensitivity analysis, etc. The newly developed, Monte Carlo experiment based sensitivity analysis is described in this study and results are presented about differences in the sensitivity of the original and the developed Biome-BGC model.
European grid services for global earth science

NASA Astrophysics Data System (ADS)

Brewer, S.; Sipos, G.

2012-04-01

This presentation will provide an overview of the distributed computing services that the European Grid Infrastructure (EGI) offers to the Earth Sciences community and also explain the processes whereby Earth Science users can engage with the infrastructure. One of the main overarching goals for EGI over the coming year is to diversify its user-base. EGI therefore - through the National Grid Initiatives (NGIs) that provide the bulk of resources that make up the infrastructure - offers a number of routes whereby users, either individually or as communities, can make use of its services. At one level there are two approaches to working with EGI: either users can make use of existing resources and contribute to their evolution and configuration; or alternatively they can work with EGI, and hence the NGIs, to incorporate their own resources into the infrastructure to take advantage of EGI's monitoring, networking and managing services. Adopting this approach does not imply a loss of ownership of the resources. Both of these approaches are entirely applicable to the Earth Sciences community. The former because researchers within this field have been involved with EGI (and previously EGEE) as a Heavy User Community and the latter because they have very specific needs, such as incorporating HPC services into their workflows, and these will require multi-skilled interventions to fully provide such services. In addition to the technical support services that EGI has been offering for the last year or so - the applications database, the training marketplace and the Virtual Organisation services - there now exists a dynamic short-term project framework that can be utilised to establish and operate services for Earth Science users. During this talk we will present a summary of various on-going projects that will be of interest to Earth Science users with the intention that suggestions for future projects will emerge from the subsequent discussions: • The Federated Cloud Task Force is already providing a cloud infrastructure through a few committed NGIs. This is being made available to research communities participating in the Task Force and the long-term aim is to integrate these national clouds into a pan-European infrastructure for scientific communities. • The MPI group provides support for application developers to port and scale up parallel applications to the global European Grid Infrastructure. • A lively portal developer and provider community that is able to setup and operate custom, application and/or community specific portals for members of the Earth Science community to interact with EGI. • A project to assess the possibilities for federated identity management in EGI and the readiness of EGI member states for federated authentication and authorisation mechanisms. • Operating resources and user support services to process data with new types of services and infrastructures, such as desktop grids, map-reduce frameworks, GPU clusters.
Teaching Workflow Analysis and Lean Thinking via Simulation: A Formative Evaluation

PubMed Central

Campbell, Robert James; Gantt, Laura; Congdon, Tamara

2009-01-01

This article presents the rationale for the design and development of a video simulation used to teach lean thinking and workflow analysis to health services and health information management students enrolled in a course on the management of health information. The discussion includes a description of the design process, a brief history of the use of simulation in healthcare, and an explanation of how video simulation can be used to generate experiential learning environments. Based on the results of a survey given to 75 students as part of a formative evaluation, the video simulation was judged effective because it allowed students to visualize a real-world process (concrete experience), contemplate the scenes depicted in the video along with the concepts presented in class in a risk-free environment (reflection), develop hypotheses about why problems occurred in the workflow process (abstract conceptualization), and develop solutions to redesign a selected process (active experimentation). PMID:19412533
Jflow: a workflow management system for web applications.

PubMed

Mariette, Jérôme; Escudié, Frédéric; Bardou, Philippe; Nabihoudine, Ibouniyamine; Noirot, Céline; Trotard, Marie-Stéphane; Gaspin, Christine; Klopp, Christophe

2016-02-01

Biologists produce large data sets and are in demand of rich and simple web portals in which they can upload and analyze their files. Providing such tools requires to mask the complexity induced by the needed High Performance Computing (HPC) environment. The connection between interface and computing infrastructure is usually specific to each portal. With Jflow, we introduce a Workflow Management System (WMS), composed of jQuery plug-ins which can easily be embedded in any web application and a Python library providing all requested features to setup, run and monitor workflows. Jflow is available under the GNU General Public License (GPL) at http://bioinfo.genotoul.fr/jflow. The package is coming with full documentation, quick start and a running test portal. Jerome.Mariette@toulouse.inra.fr. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Virtual Sensor Web Architecture

NASA Astrophysics Data System (ADS)

Bose, P.; Zimdars, A.; Hurlburt, N.; Doug, S.

2006-12-01

NASA envisions the development of smart sensor webs, intelligent and integrated observation network that harness distributed sensing assets, their associated continuous and complex data sets, and predictive observation processing mechanisms for timely, collaborative hazard mitigation and enhanced science productivity and reliability. This paper presents Virtual Sensor Web Infrastructure for Collaborative Science (VSICS) Architecture for sustained coordination of (numerical and distributed) model-based processing, closed-loop resource allocation, and observation planning. VSICS's key ideas include i) rich descriptions of sensors as services based on semantic markup languages like OWL and SensorML; ii) service-oriented workflow composition and repair for simple and ensemble models; event-driven workflow execution based on event-based and distributed workflow management mechanisms; and iii) development of autonomous model interaction management capabilities providing closed-loop control of collection resources driven by competing targeted observation needs. We present results from initial work on collaborative science processing involving distributed services (COSEC framework) that is being extended to create VSICS.
Creating a comprehensive customer service program to help convey critical and acute results of radiology studies.

PubMed

Towbin, Alexander J; Hall, Seth; Moskovitz, Jay; Johnson, Neil D; Donnelly, Lane F

2011-01-01

Communication of acute or critical results between the radiology department and referring clinicians has been a deficiency of many radiology departments. The failure to perform or document these communications can lead to poor patient care, patient safety issues, medical-legal issues, and complaints from referring clinicians. To mitigate these factors, a communication and documentation tool was created and incorporated into our departmental customer service program. This article will describe the implementation of a comprehensive customer service program in a hospital-based radiology department. A comprehensive customer service program was created in the radiology department. Customer service representatives were hired to answer the telephone calls to the radiology reading rooms and to help convey radiology results. The radiologists, referring clinicians, and customer service representatives were then linked via a novel workflow management system. This workflow management system provided tools to help facilitate the communication needs of each group. The number of studies with results conveyed was recorded from the implementation of the workflow management system. Between the implementation of the workflow management system on August 1, 2005, and June 1, 2009, 116,844 radiology results were conveyed to the referring clinicians and documented in the system. This accounts for more than 14% of the 828,516 radiology cases performed in this time frame. We have been successful in creating a comprehensive customer service program to convey and document communication of radiology results. This program has been widely used by the ordering clinicians as well as radiologists since its inception.
Multi-core processing and scheduling performance in CMS

NASA Astrophysics Data System (ADS)

Hernández, J. M.; Evans, D.; Foulkes, S.

2012-12-01

Commodity hardware is going many-core. We might soon not be able to satisfy the job memory needs per core in the current single-core processing model in High Energy Physics. In addition, an ever increasing number of independent and incoherent jobs running on the same physical hardware not sharing resources might significantly affect processing performance. It will be essential to effectively utilize the multi-core architecture. CMS has incorporated support for multi-core processing in the event processing framework and the workload management system. Multi-core processing jobs share common data in memory, such us the code libraries, detector geometry and conditions data, resulting in a much lower memory usage than standard single-core independent jobs. Exploiting this new processing model requires a new model in computing resource allocation, departing from the standard single-core allocation for a job. The experiment job management system needs to have control over a larger quantum of resource since multi-core aware jobs require the scheduling of multiples cores simultaneously. CMS is exploring the approach of using whole nodes as unit in the workload management system where all cores of a node are allocated to a multi-core job. Whole-node scheduling allows for optimization of the data/workflow management (e.g. I/O caching, local merging) but efficient utilization of all scheduled cores is challenging. Dedicated whole-node queues have been setup at all Tier-1 centers for exploring multi-core processing workflows in CMS. We present the evaluation of the performance scheduling and executing multi-core workflows in whole-node queues compared to the standard single-core processing workflows.
Merging the Machines of Modern Science

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wolf, Laura; Collins, Jim

Two recent projects have harnessed supercomputing resources at the US Department of Energy’s Argonne National Laboratory in a novel way to support major fusion science and particle collider experiments. Using leadership computing resources, one team ran fine-grid analysis of real-time data to make near-real-time adjustments to an ongoing experiment, while a second team is working to integrate Argonne’s supercomputers into the Large Hadron Collider/ATLAS workflow. Together these efforts represent a new paradigm of the high-performance computing center as a partner in experimental science.
Talkoot Portals: Discover, Tag, Share, and Reuse Collaborative Science Workflows (Invited)

NASA Astrophysics Data System (ADS)

Wilson, B. D.; Ramachandran, R.; Lynnes, C.

2009-12-01

A small but growing number of scientists are beginning to harness Web 2.0 technologies, such as wikis, blogs, and social tagging, as a transformative way of doing science. These technologies provide researchers easy mechanisms to critique, suggest and share ideas, data and algorithms. At the same time, large suites of algorithms for science analysis are being made available as remotely-invokable Web Services, which can be chained together to create analysis workflows. This provides the research community an unprecedented opportunity to collaborate by sharing their workflows with one another, reproducing and analyzing research results, and leveraging colleagues’ expertise to expedite the process of scientific discovery. However, wikis and similar technologies are limited to text, static images and hyperlinks, providing little support for collaborative data analysis. A team of information technology and Earth science researchers from multiple institutions have come together to improve community collaboration in science analysis by developing a customizable “software appliance” to build collaborative portals for Earth Science services and analysis workflows. The critical requirement is that researchers (not just information technologists) be able to build collaborative sites around service workflows within a few hours. We envision online communities coming together, much like Finnish “talkoot” (a barn raising), to build a shared research space. Talkoot extends a freely available, open source content management framework with a series of modules specific to Earth Science for registering, creating, managing, discovering, tagging and sharing Earth Science web services and workflows for science data processing, analysis and visualization. Users will be able to author a “science story” in shareable web notebooks, including plots or animations, backed up by an executable workflow that directly reproduces the science analysis. New services and workflows of interest will be discoverable using tag search, and advertised using “service casts” and “interest casts” (Atom feeds). Multiple science workflow systems will be plugged into the system, with initial support for UAH’s Mining Workflow Composer and the open-source Active BPEL engine, and JPL’s SciFlo engine and the VizFlow visual programming interface. With the ability to share and execute analysis workflows, Talkoot portals can be used to do collaborative science in addition to communicate ideas and results. It will be useful for different science domains, mission teams, research projects and organizations. Thus, it will help to solve the “sociological” problem of bringing together disparate groups of researchers, and the technical problem of advertising, discovering, developing, documenting, and maintaining inter-agency science workflows. The presentation will discuss the goals of and barriers to Science 2.0, the social web technologies employed in the Talkoot software appliance (e.g. CMS, social tagging, personal presence, advertising by feeds, etc.), illustrate the resulting collaborative capabilities, and show early prototypes of the web interfaces (e.g. embedded workflows).
GENESIS SciFlo: Choreographing Interoperable Web Services on the Grid using a Semantically-Enabled Dataflow Execution Environment

NASA Astrophysics Data System (ADS)

Wilson, B. D.; Manipon, G.; Xing, Z.

2007-12-01

The General Earth Science Investigation Suite (GENESIS) project is a NASA-sponsored partnership between the Jet Propulsion Laboratory, academia, and NASA data centers to develop a new suite of Web Services tools to facilitate multi-sensor investigations in Earth System Science. The goal of GENESIS is to enable large-scale, multi-instrument atmospheric science using combined datasets from the AIRS, MODIS, MISR, and GPS sensors. Investigations include cross-comparison of spaceborne climate sensors, cloud spectral analysis, study of upper troposphere-stratosphere water transport, study of the aerosol indirect cloud effect, and global climate model validation. The challenges are to bring together very large datasets, reformat and understand the individual instrument retrievals, co-register or re-grid the retrieved physical parameters, perform computationally-intensive data fusion and data mining operations, and accumulate complex statistics over months to years of data. To meet these challenges, we have developed a Grid computing and dataflow framework, named SciFlo, in which we are deploying a set of versatile and reusable operators for data access, subsetting, registration, mining, fusion, compression, and advanced statistical analysis. SciFlo leverages remote Web Services, called via Simple Object Access Protocol (SOAP) or REST (one-line) URLs, and the Grid Computing standards (WS-* & Globus Alliance toolkits), and enables scientists to do multi- instrument Earth Science by assembling reusable Web Services and native executables into a distributed computing flow (tree of operators). The SciFlo client & server engines optimize the execution of such distributed data flows and allow the user to transparently find and use datasets and operators without worrying about the actual location of the Grid resources. In particular, SciFlo exploits the wealth of datasets accessible by OpenGIS Consortium (OGC) Web Mapping Servers & Web Coverage Servers (WMS/WCS), and by Open Data Access Protocol (OpenDAP) servers. SciFlo also publishes its own SOAP services for space/time query and subsetting of Earth Science datasets, and automated access to large datasets via lists of (FTP, HTTP, or DAP) URLs which point to on-line HDF or netCDF files. Typical distributed workflows obtain datasets by calling standard WMS/WCS servers or discovering and fetching data granules from ftp sites; invoke remote analysis operators available as SOAP services (interface described by a WSDL document); and merge results into binary containers (netCDF or HDF files) for further analysis using local executable operators. Naming conventions (HDFEOS and CF-1.0 for netCDF) are exploited to automatically understand and read on-line datasets. More interoperable conventions, and broader adoption of existing converntions, are vital if we are to "scale up" automated choreography of Web Services beyond toy applications. Recently, the ESIP Federation sponsored a collaborative activity in which several ESIP members developed some collaborative science scenarios for atmospheric and aerosol science, and then choreographed services from multiple groups into demonstration workflows using the SciFlo engine and a Business Process Execution Language (BPEL) workflow engine. We will discuss the lessons learned from this activity, the need for standardized interfaces (like WMS/WCS), the difficulty in agreeing on even simple XML formats and interfaces, the benefits of doing collaborative science analysis at the "touch of a button" once services are connected, and further collaborations that are being pursued.
Ctrl "C"-Ctrl "V"; Using Gaming Peripherals to Improve Library Workflows and Enhance Staff Efficiency

ERIC Educational Resources Information Center

Litsey, Ryan; Harris, Rea; London, Jessie

2018-01-01

Library workflows are an area where repetitive stress can potentially reduce staff efficiency. Day to day activities that require a repetitive motion can bring about physical and psychological fatigue. For library managers, it is important to seek ways in which this type of repetitive stress can be alleviated while having the added benefit of…
Improving data collection, documentation, and workflow in a dementia screening study

PubMed Central

Read, Kevin B.; LaPolla, Fred Willie Zametkin; Tolea, Magdalena I.; Galvin, James E.; Surkis, Alisa

2017-01-01

Background A clinical study team performing three multicultural dementia screening studies identified the need to improve data management practices and facilitate data sharing. A collaboration was initiated with librarians as part of the National Library of Medicine (NLM) informationist supplement program. The librarians identified areas for improvement in the studies’ data collection, entry, and processing workflows. Case Presentation The librarians’ role in this project was to meet needs expressed by the study team around improving data collection and processing workflows to increase study efficiency and ensure data quality. The librarians addressed the data collection, entry, and processing weaknesses through standardizing and renaming variables, creating an electronic data capture system using REDCap, and developing well-documented, reproducible data processing workflows. Conclusions NLM informationist supplements provide librarians with valuable experience in collaborating with study teams to address their data needs. For this project, the librarians gained skills in project management, REDCap, and understanding of the challenges and specifics of a clinical research study. However, the time and effort required to provide targeted and intensive support for one study team was not scalable to the library’s broader user community. PMID:28377680
The Protein Information Management System (PiMS): a generic tool for any structural biology research laboratory

PubMed Central

Morris, Chris; Pajon, Anne; Griffiths, Susanne L.; Daniel, Ed; Savitsky, Marc; Lin, Bill; Diprose, Jonathan M.; Wilter da Silva, Alan; Pilicheva, Katya; Troshin, Peter; van Niekerk, Johannes; Isaacs, Neil; Naismith, James; Nave, Colin; Blake, Richard; Wilson, Keith S.; Stuart, David I.; Henrick, Kim; Esnouf, Robert M.

2011-01-01

The techniques used in protein production and structural biology have been developing rapidly, but techniques for recording the laboratory information produced have not kept pace. One approach is the development of laboratory information-management systems (LIMS), which typically use a relational database schema to model and store results from a laboratory workflow. The underlying philosophy and implementation of the Protein Information Management System (PiMS), a LIMS development specifically targeted at the flexible and unpredictable workflows of protein-production research laboratories of all scales, is described. PiMS is a web-based Java application that uses either Postgres or Oracle as the underlying relational database-management system. PiMS is available under a free licence to all academic laboratories either for local installation or for use as a managed service. PMID:21460443

The Protein Information Management System (PiMS): a generic tool for any structural biology research laboratory.

PubMed

Morris, Chris; Pajon, Anne; Griffiths, Susanne L; Daniel, Ed; Savitsky, Marc; Lin, Bill; Diprose, Jonathan M; da Silva, Alan Wilter; Pilicheva, Katya; Troshin, Peter; van Niekerk, Johannes; Isaacs, Neil; Naismith, James; Nave, Colin; Blake, Richard; Wilson, Keith S; Stuart, David I; Henrick, Kim; Esnouf, Robert M

2011-04-01

The techniques used in protein production and structural biology have been developing rapidly, but techniques for recording the laboratory information produced have not kept pace. One approach is the development of laboratory information-management systems (LIMS), which typically use a relational database schema to model and store results from a laboratory workflow. The underlying philosophy and implementation of the Protein Information Management System (PiMS), a LIMS development specifically targeted at the flexible and unpredictable workflows of protein-production research laboratories of all scales, is described. PiMS is a web-based Java application that uses either Postgres or Oracle as the underlying relational database-management system. PiMS is available under a free licence to all academic laboratories either for local installation or for use as a managed service.
Adaptive Workflows for Diabetes Management: Self-Management Assistant and Remote Treatment for Diabetes.

PubMed

Contreras, Iván; Kiefer, Stephan; Vehi, Josep

2017-01-01

Diabetes self-management is a crucial element for all people with diabetes and those at risk for developing the disease. Diabetic patients should be empowered to increase their self-management skills in order to prevent or delay the complications of diabetes. This work presents the proposal and first development stages of a smartphone application focused on the empowerment of the patients with diabetes. The concept of this interventional tool is based on the personalization of the user experience from an adaptive and dynamic perspective. The segmentation of the population and the dynamical treatment of user profiles among the different experience levels is the main challenge of the implementation. The self-management assistant and remote treatment for diabetes aims to develop a platform to integrate a series of innovative models and tools rigorously tested and supported by the research literature in diabetes together the use of a proved engine to manage workflows for healthcare.
Workflows in bioinformatics: meta-analysis and prototype implementation of a workflow generator.

PubMed

Garcia Castro, Alexander; Thoraval, Samuel; Garcia, Leyla J; Ragan, Mark A

2005-04-07

Computational methods for problem solving need to interleave information access and algorithm execution in a problem-specific workflow. The structures of these workflows are defined by a scaffold of syntactic, semantic and algebraic objects capable of representing them. Despite the proliferation of GUIs (Graphic User Interfaces) in bioinformatics, only some of them provide workflow capabilities; surprisingly, no meta-analysis of workflow operators and components in bioinformatics has been reported. We present a set of syntactic components and algebraic operators capable of representing analytical workflows in bioinformatics. Iteration, recursion, the use of conditional statements, and management of suspend/resume tasks have traditionally been implemented on an ad hoc basis and hard-coded; by having these operators properly defined it is possible to use and parameterize them as generic re-usable components. To illustrate how these operations can be orchestrated, we present GPIPE, a prototype graphic pipeline generator for PISE that allows the definition of a pipeline, parameterization of its component methods, and storage of metadata in XML formats. This implementation goes beyond the macro capacities currently in PISE. As the entire analysis protocol is defined in XML, a complete bioinformatic experiment (linked sets of methods, parameters and results) can be reproduced or shared among users. http://if-web1.imb.uq.edu.au/Pise/5.a/gpipe.html (interactive), ftp://ftp.pasteur.fr/pub/GenSoft/unix/misc/Pise/ (download). From our meta-analysis we have identified syntactic structures and algebraic operators common to many workflows in bioinformatics. The workflow components and algebraic operators can be assimilated into re-usable software components. GPIPE, a prototype implementation of this framework, provides a GUI builder to facilitate the generation of workflows and integration of heterogeneous analytical tools.
Support for Taverna workflows in the VPH-Share cloud platform.

PubMed

Kasztelnik, Marek; Coto, Ernesto; Bubak, Marian; Malawski, Maciej; Nowakowski, Piotr; Arenas, Juan; Saglimbeni, Alfredo; Testi, Debora; Frangi, Alejandro F

2017-07-01

To address the increasing need for collaborative endeavours within the Virtual Physiological Human (VPH) community, the VPH-Share collaborative cloud platform allows researchers to expose and share sequences of complex biomedical processing tasks in the form of computational workflows. The Taverna Workflow System is a very popular tool for orchestrating complex biomedical & bioinformatics processing tasks in the VPH community. This paper describes the VPH-Share components that support the building and execution of Taverna workflows, and explains how they interact with other VPH-Share components to improve the capabilities of the VPH-Share platform. Taverna workflow support is delivered by the Atmosphere cloud management platform and the VPH-Share Taverna plugin. These components are explained in detail, along with the two main procedures that were developed to enable this seamless integration: workflow composition and execution. 1) Seamless integration of VPH-Share with other components and systems. 2) Extended range of different tools for workflows. 3) Successful integration of scientific workflows from other VPH projects. 4) Execution speed improvement for medical applications. The presented workflow integration provides VPH-Share users with a wide range of different possibilities to compose and execute workflows, such as desktop or online composition, online batch execution, multithreading, remote execution, etc. The specific advantages of each supported tool are presented, as are the roles of Atmosphere and the VPH-Share plugin within the VPH-Share project. The combination of the VPH-Share plugin and Atmosphere engenders the VPH-Share infrastructure with far more flexible, powerful and usable capabilities for the VPH-Share community. As both components can continue to evolve and improve independently, we acknowledge that further improvements are still to be developed and will be described. Copyright © 2017 Elsevier B.V. All rights reserved.
PIMS-Universal Payload Information Management

NASA Technical Reports Server (NTRS)

Elmore, Ralph; McNair, Ann R. (Technical Monitor)

2002-01-01

As the overall manager and integrator of International Space Station (ISS) science payloads and experiments, the Payload Operations Integration Center (POIC) at Marshall Space Flight Center had a critical need to provide an information management system for exchange and management of ISS payload files as well as to coordinate ISS payload related operational changes. The POIC's information management system has a fundamental requirement to provide secure operational access not only to users physically located at the POIC, but also to provide collaborative access to remote experimenters and International Partners. The Payload Information Management System (PIMS) is a ground based electronic document configuration management and workflow system that was built to service that need. Functionally, PIMS provides the following document management related capabilities: 1. File access control, storage and retrieval from a central repository vault. 2. Collect supplemental data about files in the vault. 3. File exchange with a PMS GUI client, or any FTP connection. 4. Files placement into an FTP accessible dropbox for pickup by interfacing facilities, included files transmitted for spacecraft uplink. 5. Transmission of email messages to users notifying them of new version availability. 6. Polling of intermediate facility dropboxes for files that will automatically be processed by PIMS. 7. Provide an API that allows other POIC applications to access PIMS information. Functionally, PIMS provides the following Change Request processing capabilities: 1. Ability to create, view, manipulate, and query information about Operations Change Requests (OCRs). 2. Provides an adaptable workflow approval of OCRs with routing through developers, facility leads, POIC leads, reviewers, and implementers. Email messages can be sent to users either involving them in the workflow process or simply notifying them of OCR approval progress. All PIMS document management and OCR workflow controls are coordinated through and routed to individual user's "to do" list tasks. A user is given a task when it is their turn to perform some action relating to the approval of the Document or OCR. The user's available actions are restricted to only functions available for the assigned task. Certain actions, such as review or action implementation by non-PIMS users, can also be coordinated through automated emails.
DIaaS: Data-Intensive workflows as a service - Enabling easy composition and deployment of data-intensive workflows on Virtual Research Environments

NASA Astrophysics Data System (ADS)

Filgueira, R.; Ferreira da Silva, R.; Deelman, E.; Atkinson, M.

2016-12-01

We present the Data-Intensive workflows as a Service (DIaaS) model for enabling easy data-intensive workflow composition and deployment on clouds using containers. DIaaS model backbone is Asterism, an integrated solution for running data-intensive stream-based applications on heterogeneous systems, which combines the benefits of dispel4py with Pegasus workflow systems. The stream-based executions of an Asterism workflow are managed by dispel4py, while the data movement between different e-Infrastructures, and the coordination of the application execution are automatically managed by Pegasus. DIaaS combines Asterism framework with Docker containers to provide an integrated, complete, easy-to-use, portable approach to run data-intensive workflows on distributed platforms. Three containers integrate the DIaaS model: a Pegasus node, and an MPI and an Apache Storm clusters. Container images are described as Dockerfiles (available online at http://github.com/dispel4py/pegasus_dispel4py), linked to Docker Hub for providing continuous integration (automated image builds), and image storing and sharing. In this model, all required software (workflow systems and execution engines) for running scientific applications are packed into the containers, which significantly reduces the effort (and possible human errors) required by scientists or VRE administrators to build such systems. The most common use of DIaaS will be to act as a backend of VREs or Scientific Gateways to run data-intensive applications, deploying cloud resources upon request. We have demonstrated the feasibility of DIaaS using the data-intensive seismic ambient noise cross-correlation application (Figure 1). The application preprocesses (Phase1) and cross-correlates (Phase2) traces from several seismic stations. The application is submitted via Pegasus (Container1), and Phase1 and Phase2 are executed in the MPI (Container2) and Storm (Container3) clusters respectively. Although both phases could be executed within the same environment, this setup demonstrates the flexibility of DIaaS to run applications across e-Infrastructures. In summary, DIaaS delivers specialized software to execute data-intensive applications in a scalable, efficient, and robust manner reducing the engineering time and computational cost.
Mapping PetaSHA Applications to TeraGrid Architectures

NASA Astrophysics Data System (ADS)

Cui, Y.; Moore, R.; Olsen, K.; Zhu, J.; Dalguer, L. A.; Day, S.; Cruz-Atienza, V.; Maechling, P.; Jordan, T.

2007-12-01

The Southern California Earthquake Center (SCEC) has a science program in developing an integrated cyberfacility - PetaSHA - for executing physics-based seismic hazard analysis (SHA) computations. The NSF has awarded PetaSHA 15 million allocation service units this year on the fastest supercomputers available within the NSF TeraGrid. However, one size does not fit all, a range of systems are needed to support this effort at different stages of the simulations. Enabling PetaSHA simulations on those TeraGrid architectures to solve both dynamic rupture and seismic wave propagation have been a challenge from both hardware and software levels. This is an adaptation procedure to meet specific requirements of each architecture. It is important to determine how fundamental system attributes affect application performance. We present an adaptive approach in our PetaSHA application that enables the simultaneous optimization of both computation and communication at run-time using flexible settings. These techniques optimize initialization, source/media partition and MPI-IO output in different ways to achieve optimal performance on the target machines. The resulting code is a factor of four faster than the orignial version. New MPI-I/O capabilities have been added for the accurate Staggered-Grid Split-Node (SGSN) method for dynamic rupture propagation in the velocity-stress staggered-grid finite difference scheme (Dalguer and Day, JGR, 2007), We use execution workflow across TeraGrid sites for managing the resulting data volumes. Our lessons learned indicate that minimizing time to solution is most critical, in particular when scheduling large scale simulations across supercomputer sites. The TeraShake platform has been ported to multiple architectures including TACC Dell lonestar and Abe, Cray XT3 Bigben and Blue Gene/L. Parallel efficiency of 96% with the PetaSHA application Olsen-AWM has been demonstrated on 40,960 Blue Gene/L processors at IBM TJ Watson Center. Notable accomplishments using the optimized code include the M7.8 ShakeOut rupture scenario, as part of the southern San Andreas Fault evaluation SoSAFE. The ShakeOut simulation domain is the same as used for the SCEC TeraShake simulations (600 km by 300 km by 80 km). However, the higher resolution of 100 m with frequency content up to 1 Hz required 14.4 billion grid points, eight times more than the TeraShake scenarios. The simulation used 2000 TACC Dell linux Lonestar processors and took 56 hours to compute 240 seconds of wave propagation. The pre-processing input partition, as well as post-processing analysis has been performed on the SDSC IBM Datastar p655 and p690. In addition, as part of the SCEC DynaShake computational platform, the SGSN capability was used to model dynamic rupture propagation for the ShakeOut scenario that match the proposed surface slip and size of the event. Mapping applications to different architectures require coordination of many areas of expertise in hardware and application level, an outstanding challenge faced on the current petascale computing effort. We believe our techniques as well as distributed data management through data grids have provided a practical example of how to effectively use multiple compute resources, and our results will benefit other geoscience disciplines as well.
[Application of information management system about medical equipment].

PubMed

Hang, Jianjin; Zhang, Chaoqun; Wu, Xiang-Yang

2011-05-01

Based on the practice of workflow, information management system about medical equipment was developed and its functions such as gathering, browsing, inquiring and counting were introduced. With dynamic and complete case management of medical equipment, the system improved the management of medical equipment.
On a learning curve for shared decision making: Interviews with clinicians using the knee osteoarthritis Option Grid.

PubMed

Elwyn, Glyn; Rasmussen, Julie; Kinsey, Katharine; Firth, Jill; Marrin, Katy; Edwards, Adrian; Wood, Fiona

2018-02-01

Tools used in clinical encounters to illustrate to patients the risks and benefits of treatment options have been shown to increase shared decision making. However, we do not have good information about how these tools are viewed by clinicians and how clinicians think patients would react to their use. Our aim was to examine clinicians' views about the possible and actual use of tools designed to support patients and clinicians to collaborate and deliberate about treatment options, namely, Option Grid decision aids. We conducted a thematic analysis of qualitative interviews embedded in the intervention phase of a trial of an Option Grid decision aid for osteoarthritis of the knee. Interviews were conducted with 6 participating clinicians before they used the tool and again after clinicians had used the tool with 6 patients. In the first interview, clinicians voiced concerns that the tool would lead to an increase in encounter duration, patient resistance regarding involvement in decision making, and potential information overload. At the second interview, after minimal training, the clinicians reported that the tool had changed their usual way of communicating, and it was generally acceptable and helpful to integrate it into practice. After experiencing the use of Option Grids, clinicians became more willing to use the tools in their clinical encounters with patients. How best to introduce Option Grids to clinicians and adopt their use into practice will need careful consideration of context, workflow, and clinical pathways. © 2016 John Wiley & Sons, Ltd.
VisTrails SAHM: visualization and workflow management for species habitat modeling

USGS Publications Warehouse

Morisette, Jeffrey T.; Jarnevich, Catherine S.; Holcombe, Tracy R.; Talbert, Colin B.; Ignizio, Drew A.; Talbert, Marian; Silva, Claudio; Koop, David; Swanson, Alan; Young, Nicholas E.

2013-01-01

The Software for Assisted Habitat Modeling (SAHM) has been created to both expedite habitat modeling and help maintain a record of the various input data, pre- and post-processing steps and modeling options incorporated in the construction of a species distribution model through the established workflow management and visualization VisTrails software. This paper provides an overview of the VisTrails:SAHM software including a link to the open source code, a table detailing the current SAHM modules, and a simple example modeling an invasive weed species in Rocky Mountain National Park, USA.
Scientific Workflows + Provenance = Better (Meta-)Data Management

NASA Astrophysics Data System (ADS)

Ludaescher, B.; Cuevas-Vicenttín, V.; Missier, P.; Dey, S.; Kianmajd, P.; Wei, Y.; Koop, D.; Chirigati, F.; Altintas, I.; Belhajjame, K.; Bowers, S.

2013-12-01

The origin and processing history of an artifact is known as its provenance. Data provenance is an important form of metadata that explains how a particular data product came about, e.g., how and when it was derived in a computational process, which parameter settings and input data were used, etc. Provenance information provides transparency and helps to explain and interpret data products. Other common uses and applications of provenance include quality control, data curation, result debugging, and more generally, 'reproducible science'. Scientific workflow systems (e.g. Kepler, Taverna, VisTrails, and others) provide controlled environments for developing computational pipelines with built-in provenance support. Workflow results can then be explained in terms of workflow steps, parameter settings, input data, etc. using provenance that is automatically captured by the system. Scientific workflows themselves provide a user-friendly abstraction of the computational process and are thus a form of ('prospective') provenance in their own right. The full potential of provenance information is realized when combining workflow-level information (prospective provenance) with trace-level information (retrospective provenance). To this end, the DataONE Provenance Working Group (ProvWG) has developed an extension of the W3C PROV standard, called D-PROV. Whereas PROV provides a 'least common denominator' for exchanging and integrating provenance information, D-PROV adds new 'observables' that described workflow-level information (e.g., the functional steps in a pipeline), as well as workflow-specific trace-level information ( timestamps for each workflow step executed, the inputs and outputs used, etc.) Using examples, we will demonstrate how the combination of prospective and retrospective provenance provides added value in managing scientific data. The DataONE ProvWG is also developing tools based on D-PROV that allow scientists to get more mileage from provenance metadata. DataONE is a federation of member nodes that store data and metadata for discovery and access. By enriching metadata with provenance information, search and reuse of data is enhanced, and the 'social life' of data (being the product of many workflow runs, different people, etc.) is revealed. We are currently prototyping a provenance repository (PBase) to demonstrate what can be achieved with advanced provenance queries. The ProvExplorer and ProPub tools support advanced ad-hoc querying and visualization of provenance as well as customized provenance publications (e.g., to address privacy issues, or to focus provenance to relevant details). In a parallel line of work, we are exploring ways to add provenance support to widely-used scripting platforms (e.g. R and Python) and then expose that information via D-PROV.
A midas plugin to enable construction of reproducible web-based image processing pipelines

PubMed Central

Grauer, Michael; Reynolds, Patrick; Hoogstoel, Marion; Budin, Francois; Styner, Martin A.; Oguz, Ipek

2013-01-01

Image processing is an important quantitative technique for neuroscience researchers, but difficult for those who lack experience in the field. In this paper we present a web-based platform that allows an expert to create a brain image processing pipeline, enabling execution of that pipeline even by those biomedical researchers with limited image processing knowledge. These tools are implemented as a plugin for Midas, an open-source toolkit for creating web based scientific data storage and processing platforms. Using this plugin, an image processing expert can construct a pipeline, create a web-based User Interface, manage jobs, and visualize intermediate results. Pipelines are executed on a grid computing platform using BatchMake and HTCondor. This represents a new capability for biomedical researchers and offers an innovative platform for scientific collaboration. Current tools work well, but can be inaccessible for those lacking image processing expertise. Using this plugin, researchers in collaboration with image processing experts can create workflows with reasonable default settings and streamlined user interfaces, and data can be processed easily from a lab environment without the need for a powerful desktop computer. This platform allows simplified troubleshooting, centralized maintenance, and easy data sharing with collaborators. These capabilities enable reproducible science by sharing datasets and processing pipelines between collaborators. In this paper, we present a description of this innovative Midas plugin, along with results obtained from building and executing several ITK based image processing workflows for diffusion weighted MRI (DW MRI) of rodent brain images, as well as recommendations for building automated image processing pipelines. Although the particular image processing pipelines developed were focused on rodent brain MRI, the presented plugin can be used to support any executable or script-based pipeline. PMID:24416016
A midas plugin to enable construction of reproducible web-based image processing pipelines.

PubMed

Grauer, Michael; Reynolds, Patrick; Hoogstoel, Marion; Budin, Francois; Styner, Martin A; Oguz, Ipek

2013-01-01

Image processing is an important quantitative technique for neuroscience researchers, but difficult for those who lack experience in the field. In this paper we present a web-based platform that allows an expert to create a brain image processing pipeline, enabling execution of that pipeline even by those biomedical researchers with limited image processing knowledge. These tools are implemented as a plugin for Midas, an open-source toolkit for creating web based scientific data storage and processing platforms. Using this plugin, an image processing expert can construct a pipeline, create a web-based User Interface, manage jobs, and visualize intermediate results. Pipelines are executed on a grid computing platform using BatchMake and HTCondor. This represents a new capability for biomedical researchers and offers an innovative platform for scientific collaboration. Current tools work well, but can be inaccessible for those lacking image processing expertise. Using this plugin, researchers in collaboration with image processing experts can create workflows with reasonable default settings and streamlined user interfaces, and data can be processed easily from a lab environment without the need for a powerful desktop computer. This platform allows simplified troubleshooting, centralized maintenance, and easy data sharing with collaborators. These capabilities enable reproducible science by sharing datasets and processing pipelines between collaborators. In this paper, we present a description of this innovative Midas plugin, along with results obtained from building and executing several ITK based image processing workflows for diffusion weighted MRI (DW MRI) of rodent brain images, as well as recommendations for building automated image processing pipelines. Although the particular image processing pipelines developed were focused on rodent brain MRI, the presented plugin can be used to support any executable or script-based pipeline.
HydroShare: An online, collaborative environment for the sharing of hydrologic data and models (Invited)

NASA Astrophysics Data System (ADS)

Tarboton, D. G.; Idaszak, R.; Horsburgh, J. S.; Ames, D.; Goodall, J. L.; Band, L. E.; Merwade, V.; Couch, A.; Arrigo, J.; Hooper, R. P.; Valentine, D. W.; Maidment, D. R.

2013-12-01

HydroShare is an online, collaborative system being developed for sharing hydrologic data and models. The goal of HydroShare is to enable scientists to easily discover and access data and models, retrieve them to their desktop or perform analyses in a distributed computing environment that may include grid, cloud or high performance computing model instances as necessary. Scientists may also publish outcomes (data, results or models) into HydroShare, using the system as a collaboration platform for sharing data, models and analyses. HydroShare is expanding the data sharing capability of the CUAHSI Hydrologic Information System by broadening the classes of data accommodated, creating new capability to share models and model components, and taking advantage of emerging social media functionality to enhance information about and collaboration around hydrologic data and models. One of the fundamental concepts in HydroShare is that of a Resource. All content is represented using a Resource Data Model that separates system and science metadata and has elements common to all resources as well as elements specific to the types of resources HydroShare will support. These will include different data types used in the hydrology community and models and workflows that require metadata on execution functionality. HydroShare will use the integrated Rule-Oriented Data System (iRODS) to manage federated data content and perform rule-based background actions on data and model resources, including parsing to generate metadata catalog information and the execution of models and workflows. This presentation will introduce the HydroShare functionality developed to date, describe key elements of the Resource Data Model and outline the roadmap for future development.
Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses

PubMed Central

Liu, Bo; Madduri, Ravi K; Sotomayor, Borja; Chard, Kyle; Lacinski, Lukasz; Dave, Utpal J; Li, Jianqiang; Liu, Chunchen; Foster, Ian T

2014-01-01

Due to the upcoming data deluge of genome data, the need for storing and processing large-scale genome data, easy access to biomedical analyses tools, efficient data sharing and retrieval has presented significant challenges. The variability in data volume results in variable computing and storage requirements, therefore biomedical researchers are pursuing more reliable, dynamic and convenient methods for conducting sequencing analyses. This paper proposes a Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses, which enables reliable and highly scalable execution of sequencing analyses workflows in a fully automated manner. Our platform extends the existing Galaxy workflow system by adding data management capabilities for transferring large quantities of data efficiently and reliably (via Globus Transfer), domain-specific analyses tools preconfigured for immediate use by researchers (via user-specific tools integration), automatic deployment on Cloud for on-demand resource allocation and pay-as-you-go pricing (via Globus Provision), a Cloud provisioning tool for auto-scaling (via HTCondor scheduler), and the support for validating the correctness of workflows (via semantic verification tools). Two bioinformatics workflow use cases as well as performance evaluation are presented to validate the feasibility of the proposed approach. PMID:24462600
Development of the workflow kine systems for support on KAIZEN.

PubMed

Mizuno, Yuki; Ito, Toshihiko; Yoshikawa, Toru; Yomogida, Satoshi; Morio, Koji; Sakai, Kazuhiro

2012-01-01

In this paper, we introduce the new workflow line system consisted of the location and image recording, which led to the acquisition of workflow information and the analysis display. From the results of workflow line investigation, we considered the anticipated effects and the problems on KAIZEN. Workflow line information included the location information and action contents information. These technologies suggest the viewpoints to help improvement, for example, exclusion of useless movement, the redesign of layout and the review of work procedure. Manufacturing factory, it was clear that there was much movement from the standard operation place and accumulation residence time. The following was shown as a result of this investigation, to be concrete, the efficient layout was suggested by this system. In the case of the hospital, similarly, it is pointed out that the workflow has the problem of layout and setup operations based on the effective movement pattern of the experts. This system could adapt to routine work, including as well as non-routine work. By the development of this system which can fit and adapt to industrial diversification, more effective "visual management" (visualization of work) is expected in the future.
Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses.

PubMed

Liu, Bo; Madduri, Ravi K; Sotomayor, Borja; Chard, Kyle; Lacinski, Lukasz; Dave, Utpal J; Li, Jianqiang; Liu, Chunchen; Foster, Ian T

2014-06-01

Due to the upcoming data deluge of genome data, the need for storing and processing large-scale genome data, easy access to biomedical analyses tools, efficient data sharing and retrieval has presented significant challenges. The variability in data volume results in variable computing and storage requirements, therefore biomedical researchers are pursuing more reliable, dynamic and convenient methods for conducting sequencing analyses. This paper proposes a Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses, which enables reliable and highly scalable execution of sequencing analyses workflows in a fully automated manner. Our platform extends the existing Galaxy workflow system by adding data management capabilities for transferring large quantities of data efficiently and reliably (via Globus Transfer), domain-specific analyses tools preconfigured for immediate use by researchers (via user-specific tools integration), automatic deployment on Cloud for on-demand resource allocation and pay-as-you-go pricing (via Globus Provision), a Cloud provisioning tool for auto-scaling (via HTCondor scheduler), and the support for validating the correctness of workflows (via semantic verification tools). Two bioinformatics workflow use cases as well as performance evaluation are presented to validate the feasibility of the proposed approach. Copyright © 2014 Elsevier Inc. All rights reserved.
A case study on the impacts of computerized provider order entry (CPOE) system on hospital clinical workflow.

PubMed

Mominah, Maher; Yunus, Faisel; Househ, Mowafa S

2013-01-01

Computerized provider order entry (CPOE) is a health informatics system that helps health care providers create and manage orders for medications and other health care services. Through the automation of the ordering process, CPOE has improved the overall efficiency of hospital processes and workflow. In Saudi Arabia, CPOE has been used for years, with only a few studies evaluating the impacts of CPOE on clinical workflow. In this paper, we discuss the experience of a local hospital with the use of CPOE and its impacts on clinical workflow. Results show that there are many issues related to the implementation and use of CPOE within Saudi Arabia that must be addressed, including design, training, medication errors, alert fatigue, and system dep Recommendations for improving CPOE use within Saudi Arabia are also discussed.
A Hybrid Task Graph Scheduler for High Performance Image Processing Workflows.

PubMed

Blattner, Timothy; Keyrouz, Walid; Bhattacharyya, Shuvra S; Halem, Milton; Brady, Mary

2017-12-01

Designing applications for scalability is key to improving their performance in hybrid and cluster computing. Scheduling code to utilize parallelism is difficult, particularly when dealing with data dependencies, memory management, data motion, and processor occupancy. The Hybrid Task Graph Scheduler (HTGS) improves programmer productivity when implementing hybrid workflows for multi-core and multi-GPU systems. The Hybrid Task Graph Scheduler (HTGS) is an abstract execution model, framework, and API that increases programmer productivity when implementing hybrid workflows for such systems. HTGS manages dependencies between tasks, represents CPU and GPU memories independently, overlaps computations with disk I/O and memory transfers, keeps multiple GPUs occupied, and uses all available compute resources. Through these abstractions, data motion and memory are explicit; this makes data locality decisions more accessible. To demonstrate the HTGS application program interface (API), we present implementations of two example algorithms: (1) a matrix multiplication that shows how easily task graphs can be used; and (2) a hybrid implementation of microscopy image stitching that reduces code size by ≈ 43% compared to a manually coded hybrid workflow implementation and showcases the minimal overhead of task graphs in HTGS. Both of the HTGS-based implementations show good performance. In image stitching the HTGS implementation achieves similar performance to the hybrid workflow implementation. Matrix multiplication with HTGS achieves 1.3× and 1.8× speedup over the multi-threaded OpenBLAS library for 16k × 16k and 32k × 32k size matrices, respectively.
Storage element performance optimization for CMS analysis jobs

NASA Astrophysics Data System (ADS)

Behrmann, G.; Dahlblom, J.; Guldmyr, J.; Happonen, K.; Lindén, T.

2012-12-01

Tier-2 computing sites in the Worldwide Large Hadron Collider Computing Grid (WLCG) host CPU-resources (Compute Element, CE) and storage resources (Storage Element, SE). The vast amount of data that needs to processed from the Large Hadron Collider (LHC) experiments requires good and efficient use of the available resources. Having a good CPU efficiency for the end users analysis jobs requires that the performance of the storage system is able to scale with I/O requests from hundreds or even thousands of simultaneous jobs. In this presentation we report on the work on improving the SE performance at the Helsinki Institute of Physics (HIP) Tier-2 used for the Compact Muon Experiment (CMS) at the LHC. Statistics from CMS grid jobs are collected and stored in the CMS Dashboard for further analysis, which allows for easy performance monitoring by the sites and by the CMS collaboration. As part of the monitoring framework CMS uses the JobRobot which sends every four hours 100 analysis jobs to each site. CMS also uses the HammerCloud tool for site monitoring and stress testing and it has replaced the JobRobot. The performance of the analysis workflow submitted with JobRobot or HammerCloud can be used to track the performance due to site configuration changes, since the analysis workflow is kept the same for all sites and for months in time. The CPU efficiency of the JobRobot jobs at HIP was increased approximately by 50 % to more than 90 %, by tuning the SE and by improvements in the CMSSW and dCache software. The performance of the CMS analysis jobs improved significantly too. Similar work has been done on other CMS Tier-sites, since on average the CPU efficiency for CMSSW jobs has increased during 2011. Better monitoring of the SE allows faster detection of problems, so that the performance level can be kept high. The next storage upgrade at HIP consists of SAS disk enclosures which can be stress tested on demand with HammerCloud workflows, to make sure that the I/O-performance is good.

Privacy protection in HealthGrid: distributing encryption management over the VO.

PubMed

Torres, Erik; de Alfonso, Carlos; Blanquer, Ignacio; Hernández, Vicente

2006-01-01

Grid technologies have proven to be very successful in tackling challenging problems in which data access and processing is a bottleneck. Notwithstanding the benefits that Grid technologies could have in Health applications, privacy leakages of current DataGrid technologies due to the sharing of data in VOs and the use of remote resources, compromise its widespreading. Privacy control for Grid technology has become a key requirement for the adoption of Grids in the Healthcare sector. Encrypted storage of confidential data effectively reduces the risk of disclosure. A self-enforcing scheme for encrypted data storage can be achieved by combining Grid security systems with distributed key management and classical cryptography techniques. Virtual Organizations, as the main unit of user management in Grid, can provide a way to organize key sharing, access control lists and secure encryption management. This paper provides programming models and discusses the value, costs and behavior of such a system implemented on top of one of the latest Grid middlewares. This work is partially funded by the Spanish Ministry of Science and Technology in the frame of the project Investigación y Desarrollo de Servicios GRID: Aplicación a Modelos Cliente-Servidor, Colaborativos y de Alta Productividad, with reference TIC2003-01318.
A Scalable, Open Source Platform for Data Processing, Archiving and Dissemination

DTIC Science & Technology

2016-01-01

Object Oriented Data Technology (OODT) big data toolkit developed by NASA and the Work-flow INstance Generation and Selection (WINGS) scientific work...to several challenge big data problems and demonstrated the utility of OODT-WINGS in addressing them. Specific demonstrated analyses address i...source software, Apache, Object Oriented Data Technology, OODT, semantic work-flows, WINGS, big data , work- flow management 16. SECURITY CLASSIFICATION OF
Grid computing enhances standards-compatible geospatial catalogue service

NASA Astrophysics Data System (ADS)

Chen, Aijun; Di, Liping; Bai, Yuqi; Wei, Yaxing; Liu, Yang

2010-04-01

A catalogue service facilitates sharing, discovery, retrieval, management of, and access to large volumes of distributed geospatial resources, for example data, services, applications, and their replicas on the Internet. Grid computing provides an infrastructure for effective use of computing, storage, and other resources available online. The Open Geospatial Consortium has proposed a catalogue service specification and a series of profiles for promoting the interoperability of geospatial resources. By referring to the profile of the catalogue service for Web, an innovative information model of a catalogue service is proposed to offer Grid-enabled registry, management, retrieval of and access to geospatial resources and their replicas. This information model extends the e-business registry information model by adopting several geospatial data and service metadata standards—the International Organization for Standardization (ISO)'s 19115/19119 standards and the US Federal Geographic Data Committee (FGDC) and US National Aeronautics and Space Administration (NASA) metadata standards for describing and indexing geospatial resources. In order to select the optimal geospatial resources and their replicas managed by the Grid, the Grid data management service and information service from the Globus Toolkits are closely integrated with the extended catalogue information model. Based on this new model, a catalogue service is implemented first as a Web service. Then, the catalogue service is further developed as a Grid service conforming to Grid service specifications. The catalogue service can be deployed in both the Web and Grid environments and accessed by standard Web services or authorized Grid services, respectively. The catalogue service has been implemented at the George Mason University/Center for Spatial Information Science and Systems (GMU/CSISS), managing more than 17 TB of geospatial data and geospatial Grid services. This service makes it easy to share and interoperate geospatial resources by using Grid technology and extends Grid technology into the geoscience communities.
Fluid Simulation in the Movies: Navier and Stokes Must Be Circulating in Their Graves

NASA Astrophysics Data System (ADS)

Tessendorf, Jerry

2010-11-01

Fluid simulations based on the Incompressible Navier-Stokes equations are commonplace computer graphics tools in the visual effects industry. These simulations mostly come from custom C++ code written by the visual effects companies. Their significant impact in films was recognized in 2008 with Academy Awards to four visual effects companies for their technical achievement. However artists are not fluid dynamicists, and fluid dynamics simulations are expensive to use in a deadline-driven production environment. As a result, the simulation algorithms are modified to limit the computational resources, adapt them to production workflow, and to respect the client's vision of the film plot. Eulerian solvers on fixed rectangular grids use a mix of momentum solvers, including Semi-Lagrangian, FLIP, and QUICK. Incompressibility is enforced with FFT, Conjugate Gradient, and Multigrid methods. For liquids, a levelset field tracks the free surface. Smooth Particle Hydrodynamics is also used, and is part of a hybrid Eulerian-SPH liquid simulator. Artists use all of them in a mix and match fashion to control the appearance of the simulation. Specially designed forces and boundary conditions control the flow. The simulation can be an input to artistically driven procedural particle simulations that enhance the flow with more detail and drama. Post-simulation processing increases the visual detail beyond the grid resolution. Ultimately, iterative simulation methods that fit naturally in the production workflow are extremely desirable but not yet successful. Results from some efforts for iterative methods are shown, and other approaches motivated by the history of production are proposed.
The Academic Administrator Grid. A Guide to Developing Effective Management Teams.

ERIC Educational Resources Information Center

Blake, Robert R.; And Others

The use of the "management grid" method of organizational development in college and university administration is described in this adaptation of the 1964 book by the same authors, "The Management Grid." Five major administrative styles are identified: (1) caretaker, (2) comfortable and pleasant, (3) constituency-centered, (4) team, and (5)…
FermiGrid - experience and future plans

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chadwick, K.; Berman, E.; Canal, P.

2007-09-01

Fermilab supports a scientific program that includes experiments and scientists located across the globe. In order to better serve this community, Fermilab has placed its production computer resources in a Campus Grid infrastructure called 'FermiGrid'. The FermiGrid infrastructure allows the large experiments at Fermilab to have priority access to their own resources, enables sharing of these resources in an opportunistic fashion, and movement of work (jobs, data) between the Campus Grid and National Grids such as Open Science Grid and the WLCG. FermiGrid resources support multiple Virtual Organizations (VOs), including VOs from the Open Science Grid (OSG), EGEE and themore » Worldwide LHC Computing Grid Collaboration (WLCG). Fermilab also makes leading contributions to the Open Science Grid in the areas of accounting, batch computing, grid security, job management, resource selection, site infrastructure, storage management, and VO services. Through the FermiGrid interfaces, authenticated and authorized VOs and individuals may access our core grid services, the 10,000+ Fermilab resident CPUs, near-petabyte (including CMS) online disk pools and the multi-petabyte Fermilab Mass Storage System. These core grid services include a site wide Globus gatekeeper, VO management services for several VOs, Fermilab site authorization services, grid user mapping services, as well as job accounting and monitoring, resource selection and data movement services. Access to these services is via standard and well-supported grid interfaces. We will report on the user experience of using the FermiGrid campus infrastructure interfaced to a national cyberinfrastructure--the successes and the problems.« less
Multi-core processing and scheduling performance in CMS

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hernandez, J. M.; Evans, D.; Foulkes, S.

2012-01-01

Commodity hardware is going many-core. We might soon not be able to satisfy the job memory needs per core in the current single-core processing model in High Energy Physics. In addition, an ever increasing number of independent and incoherent jobs running on the same physical hardware not sharing resources might significantly affect processing performance. It will be essential to effectively utilize the multi-core architecture. CMS has incorporated support for multi-core processing in the event processing framework and the workload management system. Multi-core processing jobs share common data in memory, such us the code libraries, detector geometry and conditions data, resultingmore » in a much lower memory usage than standard single-core independent jobs. Exploiting this new processing model requires a new model in computing resource allocation, departing from the standard single-core allocation for a job. The experiment job management system needs to have control over a larger quantum of resource since multi-core aware jobs require the scheduling of multiples cores simultaneously. CMS is exploring the approach of using whole nodes as unit in the workload management system where all cores of a node are allocated to a multi-core job. Whole-node scheduling allows for optimization of the data/workflow management (e.g. I/O caching, local merging) but efficient utilization of all scheduled cores is challenging. Dedicated whole-node queues have been setup at all Tier-1 centers for exploring multi-core processing workflows in CMS. We present the evaluation of the performance scheduling and executing multi-core workflows in whole-node queues compared to the standard single-core processing workflows.« less
Distributed data analysis in ATLAS

NASA Astrophysics Data System (ADS)

Nilsson, Paul; Atlas Collaboration

2012-12-01

Data analysis using grid resources is one of the fundamental challenges to be addressed before the start of LHC data taking. The ATLAS detector will produce petabytes of data per year, and roughly one thousand users will need to run physics analyses on this data. Appropriate user interfaces and helper applications have been made available to ensure that the grid resources can be used without requiring expertise in grid technology. These tools enlarge the number of grid users from a few production administrators to potentially all participating physicists. ATLAS makes use of three grid infrastructures for the distributed analysis: the EGEE sites, the Open Science Grid, and Nordu Grid. These grids are managed by the gLite workload management system, the PanDA workload management system, and ARC middleware; many sites can be accessed via both the gLite WMS and PanDA. Users can choose between two front-end tools to access the distributed resources. Ganga is a tool co-developed with LHCb to provide a common interface to the multitude of execution backends (local, batch, and grid). The PanDA workload management system provides a set of utilities called PanDA Client; with these tools users can easily submit Athena analysis jobs to the PanDA-managed resources. Distributed data is managed by Don Quixote 2, a system developed by ATLAS; DQ2 is used to replicate datasets according to the data distribution policies and maintains a central catalog of file locations. The operation of the grid resources is continually monitored by the Ganga Robot functional testing system, and infrequent site stress tests are performed using the Hammer Cloud system. In addition, the DAST shift team is a group of power users who take shifts to provide distributed analysis user support; this team has effectively relieved the burden of support from the developers.
Akuna: An Open Source User Environment for Managing Subsurface Simulation Workflows

NASA Astrophysics Data System (ADS)

Freedman, V. L.; Agarwal, D.; Bensema, K.; Finsterle, S.; Gable, C. W.; Keating, E. H.; Krishnan, H.; Lansing, C.; Moeglein, W.; Pau, G. S. H.; Porter, E.; Scheibe, T. D.

2014-12-01

The U.S. Department of Energy (DOE) is investing in development of a numerical modeling toolset called ASCEM (Advanced Simulation Capability for Environmental Management) to support modeling analyses at legacy waste sites. ASCEM is an open source and modular computing framework that incorporates new advances and tools for predicting contaminant fate and transport in natural and engineered systems. The ASCEM toolset includes both a Platform with Integrated Toolsets (called Akuna) and a High-Performance Computing multi-process simulator (called Amanzi). The focus of this presentation is on Akuna, an open-source user environment that manages subsurface simulation workflows and associated data and metadata. In this presentation, key elements of Akuna are demonstrated, which includes toolsets for model setup, database management, sensitivity analysis, parameter estimation, uncertainty quantification, and visualization of both model setup and simulation results. A key component of the workflow is in the automated job launching and monitoring capabilities, which allow a user to submit and monitor simulation runs on high-performance, parallel computers. Visualization of large outputs can also be performed without moving data back to local resources. These capabilities make high-performance computing accessible to the users who might not be familiar with batch queue systems and usage protocols on different supercomputers and clusters.
Design of Energy Storage Management System Based on FPGA in Micro-Grid

NASA Astrophysics Data System (ADS)

Liang, Yafeng; Wang, Yanping; Han, Dexiao

2018-01-01

Energy storage system is the core to maintain the stable operation of smart micro-grid. Aiming at the existing problems of the energy storage management system in the micro-grid such as Low fault tolerance, easy to cause fluctuations in micro-grid, a new intelligent battery management system based on field programmable gate array is proposed : taking advantage of FPGA to combine the battery management system with the intelligent micro-grid control strategy. Finally, aiming at the problem that during estimation of battery charge State by neural network, initialization of weights and thresholds are not accurate leading to large errors in prediction results, the genetic algorithm is proposed to optimize the neural network method, and the experimental simulation is carried out. The experimental results show that the algorithm has high precision and provides guarantee for the stable operation of micro-grid.
Seamless Provenance Representation and Use in Collaborative Science Scenarios

NASA Astrophysics Data System (ADS)

Missier, P.; Ludaescher, B.; Bowers, S.; Altintas, I.; Anand, M. K.; Dey, S.; Sarkar, A.; Shrestha, B.; Goble, C.

2010-12-01

The notion of sharing scientific data has only recently begun to gain ground in science, where data is still considered a private asset. There is growing evidence, however, that the benefits of scientific collaboration through early data sharing during the course of a science project may outgrow the risk of losing exclusive ownership of the data. As exemplar success stories are making the headlines[1], principles of effective information sharing have become the subject of e-science research. In particular, any piece of published data should be self-describing, to the extent necessary for consumers to determine its suitability for reuse in their own projects. This is accomplished by associating a body of formally specified and machine-processable metadata to the data. When data is produced and reused by independent groups, however, metadata interoperability issues emerge. This is the case for provenance, a form of metadata that describes the history of a data product, Y. Provenance is typically expressed as a graph-structured set of dependencies that account for the sequence of computational or interactive steps that led to Y, often starting from some primary, observational data. Traversing dependency graphs is one of the mechanisms used to answer questions on data reliability. In the context of the NSF DataONE project[2], we have been studying issues of provenance interoperability in scientific collaboration scenarios. Consider a first scientist, Alice, who publishes a data product X along with its provenance, and a second scientist who further transforms X into a new product Y, also along with its provenance. A third scientist, who is interested in Y, expects to be able to trace Y's history up to the inputs used by Alice. This is only possible, however, if provenance accumulates into a single, uniform graph that can be seamlessly traversed. This becomes problematic when provenance is captured using different tools and computational models (i.e. workflow systems), as well as when data is published and reused using mechanisms that are not provenance-aware. In this presentation we discuss requirements for ensuring provenance-aware data publishing and reuse, and describe the design and implementation of a prototype toolkit that involves two specific, and broadly used, workflow models, Kepler [3] and Taverna [4]. The implementation is expected to be adopted as part of DataONE's investigators' toolkit, in support of its mission of large-scale data preservation. Refs. [1]Sharing of Data Leads to Progress on Alzheimer’s, G. Kolata, NYT, 8/12/2010 [2]http://www.dataone.org [3]Ludaescher B., Altintas I. et al. Scientific Workflow Management and the Kepler System. Special Issue: Workflow in Grid Systems. Concurrency and Computation: Practice & Experience 18(10): 1039-1065, 2006 [4]D. Hull, K. Wolstencroft, R. Stevens, C. Goble, M. R. Pocock, P. Li, T. Oinn. Taverna: a tool for building and running workflows of services. Nucl. Acids Res. 34: W729-W732, 2006
The Einstein Genome Gateway using WASP - a high throughput multi-layered life sciences portal for XSEDE.

PubMed

Golden, Aaron; McLellan, Andrew S; Dubin, Robert A; Jing, Qiang; O Broin, Pilib; Moskowitz, David; Zhang, Zhengdong; Suzuki, Masako; Hargitai, Joseph; Calder, R Brent; Greally, John M

2012-01-01

Massively-parallel sequencing (MPS) technologies and their diverse applications in genomics and epigenomics research have yielded enormous new insights into the physiology and pathophysiology of the human genome. The biggest hurdle remains the magnitude and diversity of the datasets generated, compromising our ability to manage, organize, process and ultimately analyse data. The Wiki-based Automated Sequence Processor (WASP), developed at the Albert Einstein College of Medicine (hereafter Einstein), uniquely manages to tightly couple the sequencing platform, the sequencing assay, sample metadata and the automated workflows deployed on a heterogeneous high performance computing cluster infrastructure that yield sequenced, quality-controlled and 'mapped' sequence data, all within the one operating environment accessible by a web-based GUI interface. WASP at Einstein processes 4-6 TB of data per week and since its production cycle commenced it has processed ~ 1 PB of data overall and has revolutionized user interactivity with these new genomic technologies, who remain blissfully unaware of the data storage, management and most importantly processing services they request. The abstraction of such computational complexity for the user in effect makes WASP an ideal middleware solution, and an appropriate basis for the development of a grid-enabled resource - the Einstein Genome Gateway - as part of the Extreme Science and Engineering Discovery Environment (XSEDE) program. In this paper we discuss the existing WASP system, its proposed middleware role, and its planned interaction with XSEDE to form the Einstein Genome Gateway.
Smarter Grid Solutions Works with NREL to Enhance Grid-Hosting Capacity |

Science.gov Websites

autonomously manages, coordinates, and controls distributed energy resources in real time to maintain the coordination and real-time management of an entire distribution grid, subsuming the smart home and smart campus
Delivery of femtolitre droplets using surface acoustic wave based atomisation for cryo-EM grid preparation.

PubMed

Ashtiani, Dariush; Venugopal, Hari; Belousoff, Matthew; Spicer, Bradley; Mak, Johnson; Neild, Adrian; de Marco, Alex

2018-04-06

Cryo-Electron Microscopy (cryo-EM) has become an invaluable tool for structural biology. Over the past decade, the advent of direct electron detectors and automated data acquisition has established cryo-EM as a central method in structural biology. However, challenges remain in the reliable and efficient preparation of samples in a manner which is compatible with high time resolution. The delivery of sample onto the grid is recognized as a critical step in the workflow as it is a source of variability and loss of material due to the blotting which is usually required. Here, we present a method for sample delivery and plunge freezing based on the use of Surface Acoustic Waves to deploy 6-8 µm droplets to the EM grid. This method minimises the sample dead volume and ensures vitrification within 52.6 ms from the moment the sample leaves the microfluidics chip. We demonstrate a working protocol to minimize the atomised volume and apply it to plunge freeze three different samples and provide proof that no damage occurs due to the interaction between the sample and the acoustic waves. Copyright © 2018 Elsevier Inc. All rights reserved.
Implementing standards for the interoperability among healthcare providers in the public regionalized Healthcare Information System of the Lombardy Region.

PubMed

Barbarito, Fulvio; Pinciroli, Francesco; Mason, John; Marceglia, Sara; Mazzola, Luca; Bonacina, Stefano

2012-08-01

Information technologies (ITs) have now entered the everyday workflow in a variety of healthcare providers with a certain degree of independence. This independence may be the cause of difficulty in interoperability between information systems and it can be overcome through the implementation and adoption of standards. Here we present the case of the Lombardy Region, in Italy, that has been able, in the last 10 years, to set up the Regional Social and Healthcare Information System, connecting all the healthcare providers within the region, and providing full access to clinical and health-related documents independently from the healthcare organization that generated the document itself. This goal, in a region with almost 10 millions citizens, was achieved through a twofold approach: first, the political and operative push towards the adoption of the Health Level 7 (HL7) standard within single hospitals and, second, providing a technological infrastructure for data sharing based on interoperability specifications recognized at the regional level for messages transmitted from healthcare providers to the central domain. The adoption of such regional interoperability specifications enabled the communication among heterogeneous systems placed in different hospitals in Lombardy. Integrating the Healthcare Enterprise (IHE) integration profiles which refer to HL7 standards are adopted within hospitals for message exchange and for the definition of integration scenarios. The IHE patient administration management (PAM) profile with its different workflows is adopted for patient management, whereas the Scheduled Workflow (SWF), the Laboratory Testing Workflow (LTW), and the Ambulatory Testing Workflow (ATW) are adopted for order management. At present, the system manages 4,700,000 pharmacological e-prescriptions, and 1,700,000 e-prescriptions for laboratory exams per month. It produces, monthly, 490,000 laboratory medical reports, 180,000 radiology medical reports, 180,000 first aid medical reports, and 58,000 discharge summaries. Hence, despite there being still work in progress, the Lombardy Region healthcare system is a fully interoperable social healthcare system connecting patients, healthcare providers, healthcare organizations, and healthcare professionals in a large and heterogeneous territory through the implementation of international health standards. Copyright © 2012 Elsevier Inc. All rights reserved.
Parametric Workflow (BIM) for the Repair Construction of Traditional Historic Architecture in Taiwan

NASA Astrophysics Data System (ADS)

Ma, Y.-P.; Hsu, C. C.; Lin, M.-C.; Tsai, Z.-W.; Chen, J.-Y.

2015-08-01

In Taiwan, numerous existing traditional buildings are constructed with wooden structures, brick structures, and stone structures. This paper will focus on the Taiwan traditional historic architecture and target the traditional wooden structure buildings as the design proposition and process the BIM workflow for modeling complex wooden combination geometry, integrating with more traditional 2D documents and for visualizing repair construction assumptions within the 3D model representation. The goal of this article is to explore the current problems to overcome in wooden historic building conservation, and introduce the BIM technology in the case of conserving, documenting, managing, and creating full engineering drawings and information for effectively support historic conservation. Although BIM is mostly oriented to current construction praxis, there have been some attempts to investigate its applicability in historic conservation projects. This article also illustrates the importance and advantages of using BIM workflow in repair construction process, when comparing with generic workflow.
Task Delegation Based Access Control Models for Workflow Systems

NASA Astrophysics Data System (ADS)

Gaaloul, Khaled; Charoy, François

e-Government organisations are facilitated and conducted using workflow management systems. Role-based access control (RBAC) is recognised as an efficient access control model for large organisations. The application of RBAC in workflow systems cannot, however, grant permissions to users dynamically while business processes are being executed. We currently observe a move away from predefined strict workflow modelling towards approaches supporting flexibility on the organisational level. One specific approach is that of task delegation. Task delegation is a mechanism that supports organisational flexibility, and ensures delegation of authority in access control systems. In this paper, we propose a Task-oriented Access Control (TAC) model based on RBAC to address these requirements. We aim to reason about task from organisational perspectives and resources perspectives to analyse and specify authorisation constraints. Moreover, we present a fine grained access control protocol to support delegation based on the TAC model.
FermiGrid—experience and future plans

NASA Astrophysics Data System (ADS)

Chadwick, K.; Berman, E.; Canal, P.; Hesselroth, T.; Garzoglio, G.; Levshina, T.; Sergeev, V.; Sfiligoi, I.; Sharma, N.; Timm, S.; Yocum, D. R.

2008-07-01

Fermilab supports a scientific program that includes experiments and scientists located across the globe. In order to better serve this community, Fermilab has placed its production computer resources in a Campus Grid infrastructure called 'FermiGrid'. The FermiGrid infrastructure allows the large experiments at Fermilab to have priority access to their own resources, enables sharing of these resources in an opportunistic fashion, and movement of work (jobs, data) between the Campus Grid and National Grids such as Open Science Grid (OSG) and the Worldwide LHC Computing Grid Collaboration (WLCG). FermiGrid resources support multiple Virtual Organizations (VOs), including VOs from the OSG, EGEE, and the WLCG. Fermilab also makes leading contributions to the Open Science Grid in the areas of accounting, batch computing, grid security, job management, resource selection, site infrastructure, storage management, and VO services. Through the FermiGrid interfaces, authenticated and authorized VOs and individuals may access our core grid services, the 10,000+ Fermilab resident CPUs, near-petabyte (including CMS) online disk pools and the multi-petabyte Fermilab Mass Storage System. These core grid services include a site wide Globus gatekeeper, VO management services for several VOs, Fermilab site authorization services, grid user mapping services, as well as job accounting and monitoring, resource selection and data movement services. Access to these services is via standard and well-supported grid interfaces. We will report on the user experience of using the FermiGrid campus infrastructure interfaced to a national cyberinfrastructure - the successes and the problems.
A Web application for the management of clinical workflow in image-guided and adaptive proton therapy for prostate cancer treatments.

PubMed

Yeung, Daniel; Boes, Peter; Ho, Meng Wei; Li, Zuofeng

2015-05-08

Image-guided radiotherapy (IGRT), based on radiopaque markers placed in the prostate gland, was used for proton therapy of prostate patients. Orthogonal X-rays and the IBA Digital Image Positioning System (DIPS) were used for setup correction prior to treatment and were repeated after treatment delivery. Following a rationale for margin estimates similar to that of van Herk,(1) the daily post-treatment DIPS data were analyzed to determine if an adaptive radiotherapy plan was necessary. A Web application using ASP.NET MVC5, Entity Framework, and an SQL database was designed to automate this process. The designed features included state-of-the-art Web technologies, a domain model closely matching the workflow, a database-supporting concurrency and data mining, access to the DIPS database, secured user access and roles management, and graphing and analysis tools. The Model-View-Controller (MVC) paradigm allowed clean domain logic, unit testing, and extensibility. Client-side technologies, such as jQuery, jQuery Plug-ins, and Ajax, were adopted to achieve a rich user environment and fast response. Data models included patients, staff, treatment fields and records, correction vectors, DIPS images, and association logics. Data entry, analysis, workflow logics, and notifications were implemented. The system effectively modeled the clinical workflow and IGRT process.
An access control model with high security for distributed workflow and real-time application

NASA Astrophysics Data System (ADS)

Han, Ruo-Fei; Wang, Hou-Xiang

2007-11-01

The traditional mandatory access control policy (MAC) is regarded as a policy with strict regulation and poor flexibility. The security policy of MAC is so compelling that few information systems would adopt it at the cost of facility, except some particular cases with high security requirement as military or government application. However, with the increasing requirement for flexibility, even some access control systems in military application have switched to role-based access control (RBAC) which is well known as flexible. Though RBAC can meet the demands for flexibility but it is weak in dynamic authorization and consequently can not fit well in the workflow management systems. The task-role-based access control (T-RBAC) is then introduced to solve the problem. It combines both the advantages of RBAC and task-based access control (TBAC) which uses task to manage permissions dynamically. To satisfy the requirement of system which is distributed, well defined with workflow process and critically for time accuracy, this paper will analyze the spirit of MAC, introduce it into the improved T&RBAC model which is based on T-RBAC. At last, a conceptual task-role-based access control model with high security for distributed workflow and real-time application (A_T&RBAC) is built, and its performance is simply analyzed.

The ACGT Master Ontology and its applications – Towards an ontology-driven cancer research and management system

PubMed Central

Brochhausen, Mathias; Spear, Andrew D.; Cocos, Cristian; Weiler, Gabriele; Martín, Luis; Anguita, Alberto; Stenzhorn, Holger; Daskalaki, Evangelia; Schera, Fatima; Schwarz, Ulf; Sfakianakis, Stelios; Kiefer, Stephan; Dörr, Martin; Graf, Norbert; Tsiknakis, Manolis

2017-01-01

Objective This paper introduces the objectives, methods and results of ontology development in the EU co-funded project Advancing Clinico-genomic Trials on Cancer – Open Grid Services for Improving Medical Knowledge Discovery (ACGT). While the available data in the life sciences has recently grown both in amount and quality, the full exploitation of it is being hindered by the use of different underlying technologies, coding systems, category schemes and reporting methods on the part of different research groups. The goal of the ACGT project is to contribute to the resolution of these problems by developing an ontology-driven, semantic grid services infrastructure that will enable efficient execution of discovery-driven scientific workflows in the context of multi-centric, post-genomic clinical trials. The focus of the present paper is the ACGT Master Ontology (MO). Methods ACGT project researchers undertook a systematic review of existing domain and upper-level ontologies, as well as of existing ontology design software, implementation methods, and end-user interfaces. This included the careful study of best practices, design principles and evaluation methods for ontology design, maintenance, implementation, and versioning, as well as for use on the part of domain experts and clinicians. Results To date, the results of the ACGT project include (i) the development of a master ontology (the ACGT-MO) based on clearly defined principles of ontology development and evaluation; (ii) the development of a technical infra-structure (the ACGT Platform) that implements the ACGT-MO utilizing independent tools, components and resources that have been developed based on open architectural standards, and which includes an application updating and evolving the ontology efficiently in response to end-user needs; and (iii) the development of an Ontology-based Trial Management Application (ObTiMA) that integrates the ACGT-MO into the design process of clinical trials in order to guarantee automatic semantic integration without the need to perform a separate mapping process. PMID:20438862
Experiences and lessons learned from creating a generalized workflow for data publication of field campaign datasets

NASA Astrophysics Data System (ADS)

Santhana Vannan, S. K.; Ramachandran, R.; Deb, D.; Beaty, T.; Wright, D.

2017-12-01

This paper summarizes the workflow challenges of curating and publishing data produced from disparate data sources and provides a generalized workflow solution to efficiently archive data generated by researchers. The Oak Ridge National Laboratory Distributed Active Archive Center (ORNL DAAC) for biogeochemical dynamics and the Global Hydrology Resource Center (GHRC) DAAC have been collaborating on the development of a generalized workflow solution to efficiently manage the data publication process. The generalized workflow presented here are built on lessons learned from implementations of the workflow system. Data publication consists of the following steps: Accepting the data package from the data providers, ensuring the full integrity of the data files. Identifying and addressing data quality issues Assembling standardized, detailed metadata and documentation, including file level details, processing methodology, and characteristics of data files Setting up data access mechanisms Setup of the data in data tools and services for improved data dissemination and user experience Registering the dataset in online search and discovery catalogues Preserving the data location through Digital Object Identifiers (DOI) We will describe the steps taken to automate, and realize efficiencies to the above process. The goals of the workflow system are to reduce the time taken to publish a dataset, to increase the quality of documentation and metadata, and to track individual datasets through the data curation process. Utilities developed to achieve these goal will be described. We will also share metrics driven value of the workflow system and discuss the future steps towards creation of a common software framework.
Cloud computing for energy management in smart grid - an application survey

NASA Astrophysics Data System (ADS)

Naveen, P.; Kiing Ing, Wong; Kobina Danquah, Michael; Sidhu, Amandeep S.; Abu-Siada, Ahmed

2016-03-01

The smart grid is the emerging energy system wherein the application of information technology, tools and techniques that make the grid run more efficiently. It possesses demand response capacity to help balance electrical consumption with supply. The challenges and opportunities of emerging and future smart grids can be addressed by cloud computing. To focus on these requirements, we provide an in-depth survey on different cloud computing applications for energy management in the smart grid architecture. In this survey, we present an outline of the current state of research on smart grid development. We also propose a model of cloud based economic power dispatch for smart grid.
Large-scale, high-performance and cloud-enabled multi-model analytics experiments in the context of the Earth System Grid Federation

NASA Astrophysics Data System (ADS)

Fiore, S.; Płóciennik, M.; Doutriaux, C.; Blanquer, I.; Barbera, R.; Williams, D. N.; Anantharaj, V. G.; Evans, B. J. K.; Salomoni, D.; Aloisio, G.

2017-12-01

The increased models resolution in the development of comprehensive Earth System Models is rapidly leading to very large climate simulations output that pose significant scientific data management challenges in terms of data sharing, processing, analysis, visualization, preservation, curation, and archiving.Large scale global experiments for Climate Model Intercomparison Projects (CMIP) have led to the development of the Earth System Grid Federation (ESGF), a federated data infrastructure which has been serving the CMIP5 experiment, providing access to 2PB of data for the IPCC Assessment Reports. In such a context, running a multi-model data analysis experiment is very challenging, as it requires the availability of a large amount of data related to multiple climate models simulations and scientific data management tools for large-scale data analytics. To address these challenges, a case study on climate models intercomparison data analysis has been defined and implemented in the context of the EU H2020 INDIGO-DataCloud project. The case study has been tested and validated on CMIP5 datasets, in the context of a large scale, international testbed involving several ESGF sites (LLNL, ORNL and CMCC), one orchestrator site (PSNC) and one more hosting INDIGO PaaS services (UPV). Additional ESGF sites, such as NCI (Australia) and a couple more in Europe, are also joining the testbed. The added value of the proposed solution is summarized in the following: it implements a server-side paradigm which limits data movement; it relies on a High-Performance Data Analytics (HPDA) stack to address performance; it exploits the INDIGO PaaS layer to support flexible, dynamic and automated deployment of software components; it provides user-friendly web access based on the INDIGO Future Gateway; and finally it integrates, complements and extends the support currently available through ESGF. Overall it provides a new "tool" for climate scientists to run multi-model experiments. At the time this contribution is being written, the proposed testbed represents the first implementation of a distributed large-scale, multi-model experiment in the ESGF/CMIP context, joining together server-side approaches for scientific data analysis, HPDA frameworks, end-to-end workflow management, and cloud computing.
Distinction of Concept and Discussion on Construction Idea of Smart Water Grid Project

NASA Astrophysics Data System (ADS)

Ye, Y.; Yizi, S., Sr.; Lili, L., Sr.; Sang, X.; Zhai, J.

2016-12-01

Smart water grid project includes construction of water physical grid consisting of various flow regulating infrastructures, construction of water information grid in line with the trend of intelligent technology and construction of water management grid featured by system & mechanism construction and systemization of regulation decision-making. It is the integrated platform and comprehensive carrier for water conservancy practices. Currently, there still is dispute over engineering construction idea of smart water grid which, however, represents the future development trend of water management and is increasingly emphasized. The paper, based on distinction of concept of water grid and water grid engineering, explains the concept of water grid intelligentization, actively probes into construction idea of Smart water grid project in our country and presents scientific problems to be solved as well as core technologies to be mastered for smart water grid construction.
Advanced Distribution Management Systems | Grid Modernization | NREL

Science.gov Websites

Advanced Distribution Management Systems Advanced Distribution Management Systems Electric utilities are investing in updated grid technologies such as advanced distribution management systems to management testbed for cyber security in power systems. The "advanced" elements of advanced
Enabling campus grids with open science grid technology

DOE Office of Scientific and Technical Information (OSTI.GOV)

Weitzel, Derek; Bockelman, Brian; Swanson, David

2011-01-01

The Open Science Grid is a recognized key component of the US national cyber-infrastructure enabling scientific discovery through advanced high throughput computing. The principles and techniques that underlie the Open Science Grid can also be applied to Campus Grids since many of the requirements are the same, even if the implementation technologies differ. We find five requirements for a campus grid: trust relationships, job submission, resource independence, accounting, and data management. The Holland Computing Center's campus grid at the University of Nebraska-Lincoln was designed to fulfill the requirements of a campus grid. A bridging daemon was designed to bring non-Condormore » clusters into a grid managed by Condor. Condor features which make it possible to bridge Condor sites into a multi-campus grid have been exploited at the Holland Computing Center as well.« less
Managing the life cycle of electronic clinical documents.

PubMed

Payne, Thomas H; Graham, Gail

2006-01-01

To develop a model of the life cycle of clinical documents from inception to use in a person's medical record, including workflow requirements from clinical practice, local policy, and regulation. We propose a model for the life cycle of clinical documents as a framework for research on documentation within electronic medical record (EMR) systems. Our proposed model includes three axes: the stages of the document, the roles of those involved with the document, and the actions those involved may take on the document at each stage. The model includes the rules to describe who (in what role) can perform what actions on the document, and at what stages they can perform them. Rules are derived from needs of clinicians, and requirements of hospital bylaws and regulators. Our model encompasses current practices for paper medical records and workflow in some EMR systems. Commercial EMR systems include methods for implementing document workflow rules. Workflow rules that are part of this model mirror functionality in the Department of Veterans Affairs (VA) EMR system where the Authorization/ Subscription Utility permits document life cycle rules to be written in English-like fashion. Creating a model of the life cycle of clinical documents serves as a framework for discussion of document workflow, how rules governing workflow can be implemented in EMR systems, and future research of electronic documentation.
Development and Appraisal of Multiple Accounting Record System (Mars).

PubMed

Yu, H C; Chen, M C

2016-01-01

The aim of the system is to achieve simplification of workflow, reduction of recording time, and increase the income for the study hospital. The project team decided to develop a multiple accounting record system that generates the account records based on the nursing records automatically, reduces the time and effort for nurses to review the procedure and provide another note of material consumption. Three configuration files were identified to demonstrate the relationship of treatments and reimbursement items. The workflow was simplified. The nurses averagely reduced 10 minutes of daily recording time, and the reimbursement points have been increased by 7.49%. The project streamlined the workflow and provides the institute a better way in finical management.
Improving Clinical Workflow in Ambulatory Care: Implemented Recommendations in an Innovation Prototype for the Veteran’s Health Administration

PubMed Central

Patterson, Emily S.; Lowry, Svetlana Z.; Ramaiah, Mala; Gibbons, Michael C.; Brick, David; Calco, Robert; Matton, Greg; Miller, Anne; Makar, Ellen; Ferrer, Jorge A.

2015-01-01

Introduction: Human factors workflow analyses in healthcare settings prior to technology implemented are recommended to improve workflow in ambulatory care settings. In this paper we describe how insights from a workflow analysis conducted by NIST were implemented in a software prototype developed for a Veteran’s Health Administration (VHA) VAi2 innovation project and associated lessons learned. Methods: We organize the original recommendations and associated stages and steps visualized in process maps from NIST and the VA’s lessons learned from implementing the recommendations in the VAi2 prototype according to four stages: 1) before the patient visit, 2) during the visit, 3) discharge, and 4) visit documentation. NIST recommendations to improve workflow in ambulatory care (outpatient) settings and process map representations were based on reflective statements collected during one-hour discussions with three physicians. The development of the VAi2 prototype was conducted initially independently from the NIST recommendations, but at a midpoint in the process development, all of the implementation elements were compared with the NIST recommendations and lessons learned were documented. Findings: Story-based displays and templates with default preliminary order sets were used to support scheduling, time-critical notifications, drafting medication orders, and supporting a diagnosis-based workflow. These templates enabled customization to the level of diagnostic uncertainty. Functionality was designed to support cooperative work across interdisciplinary team members, including shared documentation sessions with tracking of text modifications, medication lists, and patient education features. Displays were customized to the role and included access for consultants and site-defined educator teams. Discussion: Workflow, usability, and patient safety can be enhanced through clinician-centered design of electronic health records. The lessons learned from implementing NIST recommendations to improve workflow in ambulatory care using an EHR provide a first step in moving from a billing-centered perspective on how to maintain accurate, comprehensive, and up-to-date information about a group of patients to a clinician-centered perspective. These recommendations point the way towards a “patient visit management system,” which incorporates broader notions of supporting workload management, supporting flexible flow of patients and tasks, enabling accountable distributed work across members of the clinical team, and supporting dynamic tracking of steps in tasks that have longer time distributions. PMID:26290887
Use of contextual inquiry to understand anatomic pathology workflow: Implications for digital pathology adoption

PubMed Central

Ho, Jonhan; Aridor, Orly; Parwani, Anil V.

2012-01-01

Background: For decades anatomic pathology (AP) workflow have been a highly manual process based on the use of an optical microscope and glass slides. Recent innovations in scanning and digitizing of entire glass slides are accelerating a move toward widespread adoption and implementation of a workflow based on digital slides and their supporting information management software. To support the design of digital pathology systems and ensure their adoption into pathology practice, the needs of the main users within the AP workflow, the pathologists, should be identified. Contextual inquiry is a qualitative, user-centered, social method designed to identify and understand users’ needs and is utilized for collecting, interpreting, and aggregating in-detail aspects of work. Objective: Contextual inquiry was utilized to document current AP workflow, identify processes that may benefit from the introduction of digital pathology systems, and establish design requirements for digital pathology systems that will meet pathologists’ needs. Materials and Methods: Pathologists were observed and interviewed at a large academic medical center according to contextual inquiry guidelines established by Holtzblatt et al. 1998. Notes representing user-provided data were documented during observation sessions. An affinity diagram, a hierarchal organization of the notes based on common themes in the data, was created. Five graphical models were developed to help visualize the data including sequence, flow, artifact, physical, and cultural models. Results: A total of six pathologists were observed by a team of two researchers. A total of 254 affinity notes were documented and organized using a system based on topical hierarchy, including 75 third-level, 24 second-level, and five main-level categories, including technology, communication, synthesis/preparation, organization, and workflow. Current AP workflow was labor intensive and lacked scalability. A large number of processes that may possibly improve following the introduction of digital pathology systems were identified. These work processes included case management, case examination and review, and final case reporting. Furthermore, a digital slide system should integrate with the anatomic pathologic laboratory information system. Conclusions: To our knowledge, this is the first study that utilized the contextual inquiry method to document AP workflow. Findings were used to establish key requirements for the design of digital pathology systems. PMID:23243553
Advanced technologies for scalable ATLAS conditions database access on the grid

NASA Astrophysics Data System (ADS)

Basset, R.; Canali, L.; Dimitrov, G.; Girone, M.; Hawkings, R.; Nevski, P.; Valassi, A.; Vaniachine, A.; Viegas, F.; Walker, R.; Wong, A.

2010-04-01

During massive data reprocessing operations an ATLAS Conditions Database application must support concurrent access from numerous ATLAS data processing jobs running on the Grid. By simulating realistic work-flow, ATLAS database scalability tests provided feedback for Conditions Db software optimization and allowed precise determination of required distributed database resources. In distributed data processing one must take into account the chaotic nature of Grid computing characterized by peak loads, which can be much higher than average access rates. To validate database performance at peak loads, we tested database scalability at very high concurrent jobs rates. This has been achieved through coordinated database stress tests performed in series of ATLAS reprocessing exercises at the Tier-1 sites. The goal of database stress tests is to detect scalability limits of the hardware deployed at the Tier-1 sites, so that the server overload conditions can be safely avoided in a production environment. Our analysis of server performance under stress tests indicates that Conditions Db data access is limited by the disk I/O throughput. An unacceptable side-effect of the disk I/O saturation is a degradation of the WLCG 3D Services that update Conditions Db data at all ten ATLAS Tier-1 sites using the technology of Oracle Streams. To avoid such bottlenecks we prototyped and tested a novel approach for database peak load avoidance in Grid computing. Our approach is based upon the proven idea of pilot job submission on the Grid: instead of the actual query, an ATLAS utility library sends to the database server a pilot query first.
Bayesian calibration of coarse-grained forces: Efficiently addressing transferability

NASA Astrophysics Data System (ADS)

Patrone, Paul N.; Rosch, Thomas W.; Phelan, Frederick R.

2016-04-01

Generating and calibrating forces that are transferable across a range of state-points remains a challenging task in coarse-grained (CG) molecular dynamics. In this work, we present a coarse-graining workflow, inspired by ideas from uncertainty quantification and numerical analysis, to address this problem. The key idea behind our approach is to introduce a Bayesian correction algorithm that uses functional derivatives of CG simulations to rapidly and inexpensively recalibrate initial estimates f0 of forces anchored by standard methods such as force-matching. Taking density-temperature relationships as a running example, we demonstrate that this algorithm, in concert with various interpolation schemes, can be used to efficiently compute physically reasonable force curves on a fine grid of state-points. Importantly, we show that our workflow is robust to several choices available to the modeler, including the interpolation schemes and tools used to construct f0. In a related vein, we also demonstrate that our approach can speed up coarse-graining by reducing the number of atomistic simulations needed as inputs to standard methods for generating CG forces.
Cloud services for the Fermilab scientific stakeholders

DOE PAGES

Timm, S.; Garzoglio, G.; Mhashilkar, P.; ...

2015-12-23

As part of the Fermilab/KISTI cooperative research project, Fermilab has successfully run an experimental simulation workflow at scale on a federation of Amazon Web Services (AWS), FermiCloud, and local FermiGrid resources. We used the CernVM-FS (CVMFS) file system to deliver the application software. We established Squid caching servers in AWS as well, using the Shoal system to let each individual virtual machine find the closest squid server. We also developed an automatic virtual machine conversion system so that we could transition virtual machines made on FermiCloud to Amazon Web Services. We used this system to successfully run a cosmic raymore » simulation of the NOvA detector at Fermilab, making use of both AWS spot pricing and network bandwidth discounts to minimize the cost. On FermiCloud we also were able to run the workflow at the scale of 1000 virtual machines, using a private network routable inside of Fermilab. As a result, we present in detail the technological improvements that were used to make this work a reality.« less
Cloud services for the Fermilab scientific stakeholders

DOE Office of Scientific and Technical Information (OSTI.GOV)

Timm, S.; Garzoglio, G.; Mhashilkar, P.

As part of the Fermilab/KISTI cooperative research project, Fermilab has successfully run an experimental simulation workflow at scale on a federation of Amazon Web Services (AWS), FermiCloud, and local FermiGrid resources. We used the CernVM-FS (CVMFS) file system to deliver the application software. We established Squid caching servers in AWS as well, using the Shoal system to let each individual virtual machine find the closest squid server. We also developed an automatic virtual machine conversion system so that we could transition virtual machines made on FermiCloud to Amazon Web Services. We used this system to successfully run a cosmic raymore » simulation of the NOvA detector at Fermilab, making use of both AWS spot pricing and network bandwidth discounts to minimize the cost. On FermiCloud we also were able to run the workflow at the scale of 1000 virtual machines, using a private network routable inside of Fermilab. As a result, we present in detail the technological improvements that were used to make this work a reality.« less
Microseismic event location using global optimization algorithms: An integrated and automated workflow

NASA Astrophysics Data System (ADS)

Lagos, Soledad R.; Velis, Danilo R.

2018-02-01

We perform the location of microseismic events generated in hydraulic fracturing monitoring scenarios using two global optimization techniques: Very Fast Simulated Annealing (VFSA) and Particle Swarm Optimization (PSO), and compare them against the classical grid search (GS). To this end, we present an integrated and optimized workflow that concatenates into an automated bash script the different steps that lead to the microseismic events location from raw 3C data. First, we carry out the automatic detection, denoising and identification of the P- and S-waves. Secondly, we estimate their corresponding backazimuths using polarization information, and propose a simple energy-based criterion to automatically decide which is the most reliable estimate. Finally, after taking proper care of the size of the search space using the backazimuth information, we perform the location using the aforementioned algorithms for 2D and 3D usual scenarios of hydraulic fracturing processes. We assess the impact of restricting the search space and show the advantages of using either VFSA or PSO over GS to attain significant speed-ups.
78 FR 20087 - Privacy Act of 1974; Proposed New System of Records

Federal Register 2010, 2011, 2012, 2013, 2014

2013-04-03

... is comprised of two components--Enterprise Content Management (ECM) and the Account Management System (AMS). The heart of the system is the ECM component, which manages the workflows that were developed..., digital media, and/or CD-ROM. PAS is a customized module within USDA's Enterprise Content Management (ECM...
Inventory-based landscape-scale simulation of management effectiveness and economic feasibility with BioSum

Treesearch

Jeremy S. Fried; Larry D. Potts; Sara M. Loreno; Glenn A. Christensen; R. Jamie Barbour

2017-01-01

The Forest Inventory and Analysis (FIA)-based BioSum (Bioregional Inventory Originated Simulation Under Management) is a free policy analysis framework and workflow management software solution. It addresses complex management questions concerning forest health and vulnerability for large, multimillion acre, multiowner landscapes using FIA plot data as the initial...
Impact of Considering 110 kV Grid Structures on the Congestion Management in the German Transmission Grid

NASA Astrophysics Data System (ADS)

Hoffrichter, André; Barrios, Hans; Massmann, Janek; Venkataramanachar, Bhavasagar; Schnettler, Armin

2018-02-01

The structural changes in the European energy system lead to an increase of renewable energy sources that are primarily connected to the distribution grid. Hence the stationary analysis of the transmission grid and the regionalization of generation capacities are strongly influenced by subordinate grid structures. To quantify the impact on the congestion management in the German transmission grid, a 110 kV grid model is derived using publicly available data delivered by Open Street Map and integrated into an existing model of the European transmission grid. Power flow and redispatch simulations are performed for three different regionalization methods and grid configurations. The results show a significant impact of the 110 kV system and prove an overestimation of power flows in the transmission grid when neglecting subordinate grids. Thus, the redispatch volume in Germany to dissolve bottlenecks in case of N-1 contingencies decreases by 38 % when considering the 110 kV grid.
Risk management frameworks: supporting the next generation of Murray-Darling Basin water sharing plans

NASA Astrophysics Data System (ADS)

Podger, G. M.; Cuddy, S. M.; Peeters, L.; Smith, T.; Bark, R. H.; Black, D. C.; Wallbrink, P.

2014-09-01

Water jurisdictions in Australia are required to prepare and implement water resource plans. In developing these plans the common goal is realising the best possible use of the water resources - maximising outcomes while minimising negative impacts. This requires managing the risks associated with assessing and balancing cultural, industrial, agricultural, social and environmental demands for water within a competitive and resource-limited environment. Recognising this, conformance to international risk management principles (ISO 31000:2009) have been embedded within the Murray-Darling Basin Plan. Yet, to date, there has been little strategic investment by water jurisdictions in bridging the gap between principle and practice. The ISO 31000 principles and the risk management framework that embodies them align well with an adaptive management paradigm within which to conduct water resource planning. They also provide an integrative framework for the development of workflows that link risk analysis with risk evaluation and mitigation (adaptation) scenarios, providing a transparent, repeatable and robust platform. This study, through a demonstration use case and a series of workflows, demonstrates to policy makers how these principles can be used to support the development of the next generation of water sharing plans in 2019. The workflows consider the uncertainty associated with climate and flow inputs, and model parameters on irrigation and hydropower production, meeting environmental flow objectives and recreational use of the water resource. The results provide insights to the risks associated with meeting a range of different objectives.

Using R in Taverna: RShell v1.2

PubMed Central

Wassink, Ingo; Rauwerda, Han; Neerincx, Pieter BT; Vet, Paul E van der; Breit, Timo M; Leunissen, Jack AM; Nijholt, Anton

2009-01-01

Background R is the statistical language commonly used by many life scientists in (omics) data analysis. At the same time, these complex analyses benefit from a workflow approach, such as used by the open source workflow management system Taverna. However, Taverna had limited support for R, because it supported just a few data types and only a single output. Also, there was no support for graphical output and persistent sessions. Altogether this made using R in Taverna impractical. Findings We have developed an R plugin for Taverna: RShell, which provides R functionality within workflows designed in Taverna. In order to fully support the R language, our RShell plugin directly uses the R interpreter. The RShell plugin consists of a Taverna processor for R scripts and an RShell Session Manager that communicates with the R server. We made the RShell processor highly configurable allowing the user to define multiple inputs and outputs. Also, various data types are supported, such as strings, numeric data and images. To limit data transport between multiple RShell processors, the RShell plugin also supports persistent sessions. Here, we will describe the architecture of RShell and the new features that are introduced in version 1.2, i.e.: i) Support for R up to and including R version 2.9; ii) Support for persistent sessions to limit data transfer; iii) Support for vector graphics output through PDF; iv)Syntax highlighting of the R code; v) Improved usability through fewer port types. Our new RShell processor is backwards compatible with workflows that use older versions of the RShell processor. We demonstrate the value of the RShell processor by a use-case workflow that maps oligonucleotide probes designed with DNA sequence information from Vega onto the Ensembl genome assembly. Conclusion Our RShell plugin enables Taverna users to employ R scripts within their workflows in a highly configurable way. PMID:19607662
Workflow in clinical trial sites & its association with near miss events for data quality: ethnographic, workflow & systems simulation.

PubMed

de Carvalho, Elias Cesar Araujo; Batilana, Adelia Portero; Claudino, Wederson; Reis, Luiz Fernando Lima; Schmerling, Rafael A; Shah, Jatin; Pietrobon, Ricardo

2012-01-01

With the exponential expansion of clinical trials conducted in (Brazil, Russia, India, and China) and VISTA (Vietnam, Indonesia, South Africa, Turkey, and Argentina) countries, corresponding gains in cost and enrolment efficiency quickly outpace the consonant metrics in traditional countries in North America and European Union. However, questions still remain regarding the quality of data being collected in these countries. We used ethnographic, mapping and computer simulation studies to identify/address areas of threat to near miss events for data quality in two cancer trial sites in Brazil. Two sites in Sao Paolo and Rio Janeiro were evaluated using ethnographic observations of workflow during subject enrolment and data collection. Emerging themes related to threats to near miss events for data quality were derived from observations. They were then transformed into workflows using UML-AD and modeled using System Dynamics. 139 tasks were observed and mapped through the ethnographic study. The UML-AD detected four major activities in the workflow evaluation of potential research subjects prior to signature of informed consent, visit to obtain subject́s informed consent, regular data collection sessions following study protocol and closure of study protocol for a given project. Field observations pointed to three major emerging themes: (a) lack of standardized process for data registration at source document, (b) multiplicity of data repositories and (c) scarcity of decision support systems at the point of research intervention. Simulation with policy model demonstrates a reduction of the rework problem. Patterns of threats to data quality at the two sites were similar to the threats reported in the literature for American sites. The clinical trial site managers need to reorganize staff workflow by using information technology more efficiently, establish new standard procedures and manage professionals to reduce near miss events and save time/cost. Clinical trial sponsors should improve relevant support systems.
Workflow in Clinical Trial Sites & Its Association with Near Miss Events for Data Quality: Ethnographic, Workflow & Systems Simulation

PubMed Central

Araujo de Carvalho, Elias Cesar; Batilana, Adelia Portero; Claudino, Wederson; Lima Reis, Luiz Fernando; Schmerling, Rafael A.; Shah, Jatin; Pietrobon, Ricardo

2012-01-01

Background With the exponential expansion of clinical trials conducted in (Brazil, Russia, India, and China) and VISTA (Vietnam, Indonesia, South Africa, Turkey, and Argentina) countries, corresponding gains in cost and enrolment efficiency quickly outpace the consonant metrics in traditional countries in North America and European Union. However, questions still remain regarding the quality of data being collected in these countries. We used ethnographic, mapping and computer simulation studies to identify/address areas of threat to near miss events for data quality in two cancer trial sites in Brazil. Methodology/Principal Findings Two sites in Sao Paolo and Rio Janeiro were evaluated using ethnographic observations of workflow during subject enrolment and data collection. Emerging themes related to threats to near miss events for data quality were derived from observations. They were then transformed into workflows using UML-AD and modeled using System Dynamics. 139 tasks were observed and mapped through the ethnographic study. The UML-AD detected four major activities in the workflow evaluation of potential research subjects prior to signature of informed consent, visit to obtain subject́s informed consent, regular data collection sessions following study protocol and closure of study protocol for a given project. Field observations pointed to three major emerging themes: (a) lack of standardized process for data registration at source document, (b) multiplicity of data repositories and (c) scarcity of decision support systems at the point of research intervention. Simulation with policy model demonstrates a reduction of the rework problem. Conclusions/Significance Patterns of threats to data quality at the two sites were similar to the threats reported in the literature for American sites. The clinical trial site managers need to reorganize staff workflow by using information technology more efficiently, establish new standard procedures and manage professionals to reduce near miss events and save time/cost. Clinical trial sponsors should improve relevant support systems. PMID:22768105
Implementation of Epic Beaker Anatomic Pathology at an Academic Medical Center.

PubMed

Blau, John Larry; Wilford, Joseph D; Dane, Susan K; Karandikar, Nitin J; Fuller, Emily S; Jacobsmeier, Debbie J; Jans, Melissa A; Horning, Elisabeth A; Krasowski, Matthew D; Ford, Bradley A; Becker, Kent R; Beranek, Jeanine M; Robinson, Robert A

2017-01-01

Beaker is a relatively new laboratory information system (LIS) offered by Epic Systems Corporation as part of its suite of health-care software and bundled with its electronic medical record, EpicCare. It is divided into two modules, Beaker anatomic pathology (Beaker AP) and Beaker Clinical Pathology. In this report, we describe our experience implementing Beaker AP version 2014 at an academic medical center with a go-live date of October 2015. This report covers preimplementation preparations and challenges beginning in September 2014, issues discovered soon after go-live in October 2015, and some post go-live optimizations using data from meetings, debriefings, and the project closure document. We share specific issues that we encountered during implementation, including difficulties with the proposed frozen section workflow, developing a shared specimen source dictionary, and implementation of the standard Beaker workflow in large institution with trainees. We share specific strategies that we used to overcome these issues for a successful Beaker AP implementation. Several areas of the laboratory-required adaptation of the default Beaker build parameters to meet the needs of the workflow in a busy academic medical center. In a few areas, our laboratory was unable to use the Beaker functionality to support our workflow, and we have continued to use paper or have altered our workflow. In spite of several difficulties that required creative solutions before go-live, the implementation has been successful based on satisfaction surveys completed by pathologists and others who use the software. However, optimization of Beaker workflows has continued to be an ongoing process after go-live to the present time. The Beaker AP LIS can be successfully implemented at an academic medical center but requires significant forethought, creative adaptation, and continued shared management of the ongoing product by institutional and departmental information technology staff as well as laboratory managers to meet the needs of the laboratory.
DE-FG02-04ER25606 Identity Federation and Policy Management Guide: Final Report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Humphrey, Marty, A

The goal of this 3-year project was to facilitate a more productive dynamic matching between resource providers and resource consumers in Grid environments by explicitly specifying policies. There were broadly two problems being addressed by this project. First, there was a lack of an Open Grid Services Architecture (OGSA)-compliant mechanism for expressing, storing and retrieving user policies and Virtual Organization (VO) policies. Second, there was a lack of tools to resolve and enforce policies in the Open Services Grid Architecture. To address these problems, our overall approach in this project was to make all policies explicit (e.g., virtual organization policies,more » resource provider policies, resource consumer policies), thereby facilitating policy matching and policy negotiation. Policies defined on a per-user basis were created, held, and updated in MyPolMan, thereby providing a Grid user to centralize (where appropriate) and manage his/her policies. Organizationally, the corresponding service was VOPolMan, in which the policies of the Virtual Organization are expressed, managed, and dynamically consulted. Overall, we successfully defined, prototyped, and evaluated policy-based resource management and access control for OGSA-based Grids. This DOE project partially supported 17 peer-reviewed publications on a number of different topics: General security for Grids, credential management, Web services/OGSA/OGSI, policy-based grid authorization (for remote execution and for access to information), policy-directed Grid data movement/placement, policies for large-scale virtual organizations, and large-scale policy-aware grid architectures. In addition to supporting the PI, this project partially supported the training of 5 PhD students.« less
SU-E-J-106: The Use of Deformable Image Registration with Cone-Beam CT for a Better Evaluation of Cumulative Dose to Organs

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fillion, O; Gingras, L; Archambault, L

2015-06-15

Purpose: The knowledge of dose accumulation in the patient tissues in radiotherapy helps in determining the treatment outcomes. This project aims at providing a workflow to map cumulative doses that takes into account interfraction organ motion without the need for manual re-contouring. Methods: Five prostate cancer patients were studied. Each patient had a planning CT (pCT) and 5 to 13 CBCT scans. On each series, a physician contoured the prostate, rectum, bladder, seminal vesicles and the intestine. First, a deformable image registration (DIR) of the pCTs onto the daily CBCTs yielded registered CTs (rCT) . This rCT combined the accuratemore » CT numbers of the pCT with the daily anatomy of the CBCT. Second, the original plans (220 cGy per fraction for 25 fractions) were copied on the rCT for dose re-calculation. Third, the DIR software Elastix was used to find the inverse transform from the rCT to the pCT. This transformation was then applied to the rCT dose grid to map the dose voxels back to their pCT location. Finally, the sum of these deformed dose grids for each patient was applied on the pCT to calculate the actual dose delivered to organs. Results: The discrepancy between the planned D98 and D2 and these indices re-calculated on the rCT, are, on average, of −1 ± 1 cGy and 1 ± 2 cGy per fraction, respectively. For fractions with large anatomical motion, the D98 discrepancy on the re-calculated dose grid mapped onto the pCT can raise to −17 ± 4 cGy. The obtained cumulative dose distributions illustrate the same behavior. Conclusion: This approach allowed the evaluation of cumulative doses to organs with the help of uncontoured daily CBCT scans. With this workflow, the easy evaluation of doses delivered for EBRT treatments could ultimately lead to a better follow-up of prostate cancer patients.« less
Grid-based Meteorological and Crisis Applications

NASA Astrophysics Data System (ADS)

Hluchy, Ladislav; Bartok, Juraj; Tran, Viet; Lucny, Andrej; Gazak, Martin

2010-05-01

We present several applications from domain of meteorology and crisis management we developed and/or plan to develop. Particularly, we present IMS Model Suite - a complex software system designed to address the needs of accurate forecast of weather and hazardous weather phenomena, environmental pollution assessment, prediction of consequences of nuclear accident and radiological emergency. We discuss requirements on computational means and our experiences how to meet them by grid computing. The process of a pollution assessment and prediction of the consequences in case of radiological emergence results in complex data-flows and work-flows among databases, models and simulation tools (geographical databases, meteorological and dispersion models, etc.). A pollution assessment and prediction requires running of 3D meteorological model (4 nests with resolution from 50 km to 1.8 km centered on nuclear power plant site, 38 vertical levels) as well as running of the dispersion model performing the simulation of the release transport and deposition of the pollutant with respect to the numeric weather prediction data, released material description, topography, land use description and user defined simulation scenario. Several post-processing options can be selected according to particular situation (e.g. doses calculation). Another example is a forecasting of fog as one of the meteorological phenomena hazardous to the aviation as well as road traffic. It requires complicated physical model and high resolution meteorological modeling due to its dependence on local conditions (precise topography, shorelines and land use classes). An installed fog modeling system requires a 4 time nested parallelized 3D meteorological model with 1.8 km horizontal resolution and 42 levels vertically (approx. 1 million points in 3D space) to be run four times daily. The 3D model outputs and multitude of local measurements are utilized by SPMD-parallelized 1D fog model run every hour. The fog forecast model is a subject of the parameterization and parameter optimization before its real deployment. The parameter optimization requires tens of evaluations of the parameterized model accuracy and each evaluation of the model parameters requires re-running of the hundreds of meteorological situations collected over the years and comparison of the model output with the observed data. The architecture and inherent heterogeneity of both examples and their computational complexity and their interfaces to other systems and services make them well suited for decomposition into a set of web and grid services. Such decomposition has been performed within several projects we participated or participate in cooperation with academic sphere, namely int.eu.grid (dispersion model deployed as a pilot application to an interactive grid), SEMCO-WS (semantic composition of the web and grid services), DMM (development of a significant meteorological phenomena prediction system based on the data mining), VEGA 2009-2011 and EGEE III. We present useful and practical applications of technologies of high performance computing. The use of grid technology provides access to much higher computation power not only for modeling and simulation, but also for the model parameterization and validation. This results in the model parameters optimization and more accurate simulation outputs. Having taken into account that the simulations are used for the aviation, road traffic and crisis management, even small improvement in accuracy of predictions may result in significant improvement of safety as well as cost reduction. We found grid computing useful for our applications. We are satisfied with this technology and our experience encourages us to extend its use. Within an ongoing project (DMM) we plan to include processing of satellite images which extends our requirement on computation very rapidly. We believe that thanks to grid computing we are able to handle the job almost in real time.
Techniques for Interventional MRI Guidance in Closed-Bore Systems.

PubMed

Busse, Harald; Kahn, Thomas; Moche, Michael

2018-02-01

Efficient image guidance is the basis for minimally invasive interventions. In comparison with X-ray, computed tomography (CT), or ultrasound imaging, magnetic resonance imaging (MRI) provides the best soft tissue contrast without ionizing radiation and is therefore predestined for procedural control. But MRI is also characterized by spatial constraints, electromagnetic interactions, long imaging times, and resulting workflow issues. Although many technical requirements have been met over the years-most notably magnetic resonance (MR) compatibility of tools, interventional pulse sequences, and powerful processing hardware and software-there is still a large variety of stand-alone devices and systems for specific procedures only.Stereotactic guidance with the table outside the magnet is common and relies on proper registration of the guiding grids or manipulators to the MR images. Instrument tracking, often by optical sensing, can be added to provide the physicians with proper eye-hand coordination during their navigated approach. Only in very short wide-bore systems, needles can be advanced at the extended arm under near real-time imaging. In standard magnets, control and workflow may be improved by remote operation using robotic or manual driving elements.This work highlights a number of devices and techniques for different interventional settings with a focus on percutaneous, interstitial procedures in different organ regions. The goal is to identify technical and procedural elements that might be relevant for interventional guidance in a broader context, independent of the clinical application given here. Key challenges remain the seamless integration into the interventional workflow, safe clinical translation, and proper cost effectiveness.
Towards seamless workflows in agile data science

NASA Astrophysics Data System (ADS)

Klump, J. F.; Robertson, J.

2017-12-01

Agile workflows are a response to projects with requirements that may change over time. They prioritise rapid and flexible responses to change, preferring to adapt to changes in requirements rather than predict them before a project starts. This suits the needs of research very well because research is inherently agile in its methodology. The adoption of agile methods has made collaborative data analysis much easier in a research environment fragmented across institutional data stores, HPC, personal and lab computers and more recently cloud environments. Agile workflows use tools that share a common worldview: in an agile environment, there may be more that one valid version of data, code or environment in play at any given time. All of these versions need references and identifiers. For example, a team of developers following the git-flow conventions (github.com/nvie/gitflow) may have several active branches, one for each strand of development. These workflows allow rapid and parallel iteration while maintaining identifiers pointing to individual snapshots of data and code and allowing rapid switching between strands. In contrast, the current focus of versioning in research data management is geared towards managing data for reproducibility and long-term preservation of the record of science. While both are important goals in the persistent curation domain of the institutional research data infrastructure, current tools emphasise planning over adaptation and can introduce unwanted rigidity by insisting on a single valid version or point of truth. In the collaborative curation domain of a research project, things are more fluid. However, there is no equivalent to the "versioning iso-surface" of the git protocol for the management and versioning of research data. At CSIRO we are developing concepts and tools for the agile management of software code and research data for virtual research environments, based on our experiences of actual data analytics projects in the geosciences. We use code management that allows researchers to interact with the code through tools like Jupyter Notebooks while data are held in an object store. Our aim is an architecture allowing seamless integration of code development, data management, and data processing in virtual research environments.
Distribution Management System Volt/VAR Evaluation | Grid Modernization |

Science.gov Websites

NREL Distribution Management System Volt/VAR Evaluation Distribution Management System Volt/VAR Evaluation This project involves building a prototype distribution management system testbed that links a GE Grid Solutions distribution management system to power hardware-in-the-loop testing. This setup is
Barriers to effective, safe communication and workflow between nurses and non-consultant hospital doctors during out-of-hours.

PubMed

Brady, Anne-Marie; Byrne, Gobnait; Quirke, Mary Brigid; Lynch, Aine; Ennis, Shauna; Bhangu, Jaspreet; Prendergast, Meabh

2017-11-01

This study aimed to evaluate the nature and type of communication and workflow arrangements between nurses and doctors out-of-hours (OOH). Effective communication and workflow arrangements between nurses and doctors are essential to minimize risk in hospital settings, particularly in the out-of-hour's period. Timely patient flow is a priority for all healthcare organizations and the quality of communication and workflow arrangements influences patient safety. Qualitative descriptive design and data collection methods included focus groups and individual interviews. A 500 bed tertiary referral acute hospital in Ireland. Junior and senior Non-Consultant Hospital Doctors, staff nurses and nurse managers. Both nurses and doctors acknowledged the importance of good interdisciplinary communication and collaborative working, in sustaining effective workflow and enabling a supportive working environment and patient safety. Indeed, issues of safety and missed care OOH were found to be primarily due to difficulties of communication and workflow. Medical workflow OOH is often dependent on cues and communication to/from nursing. However, communication systems and, in particular the bleep system, considered central to the process of communication between doctors and nurses OOH, can contribute to workflow challenges and increased staff stress. It was reported as commonplace for routine work, that should be completed during normal hours, to fall into OOH when resources were most limited, further compounding risk to patient safety. Enhancement of communication strategies between nurses and doctors has the potential to remove barriers to effective decision-making and patient flow. © The Author 2017. Published by Oxford University Press in association with the International Society for Quality in Health Care. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com
Widening the adoption of workflows to include human and human-machine scientific processes

NASA Astrophysics Data System (ADS)

Salayandia, L.; Pinheiro da Silva, P.; Gates, A. Q.

2010-12-01

Scientific workflows capture knowledge in the form of technical recipes to access and manipulate data that help scientists manage and reuse established expertise to conduct their work. Libraries of scientific workflows are being created in particular fields, e.g., Bioinformatics, where combined with cyber-infrastructure environments that provide on-demand access to data and tools, result in powerful workbenches for scientists of those communities. The focus in these particular fields, however, has been more on automating rather than documenting scientific processes. As a result, technical barriers have impeded a wider adoption of scientific workflows by scientific communities that do not rely as heavily on cyber-infrastructure and computing environments. Semantic Abstract Workflows (SAWs) are introduced to widen the applicability of workflows as a tool to document scientific recipes or processes. SAWs intend to capture a scientists’ perspective about the process of how she or he would collect, filter, curate, and manipulate data to create the artifacts that are relevant to her/his work. In contrast, scientific workflows describe the process from the point of view of how technical methods and tools are used to conduct the work. By focusing on a higher level of abstraction that is closer to a scientist’s understanding, SAWs effectively capture the controlled vocabularies that reflect a particular scientific community, as well as the types of datasets and methods used in a particular domain. From there on, SAWs provide the flexibility to adapt to different environments to carry out the recipes or processes. These environments range from manual fieldwork to highly technical cyber-infrastructure environments, i.e., such as those already supported by scientific workflows. Two cases, one from Environmental Science and another from Geophysics, are presented as illustrative examples.
Process improvement for the safe delivery of multidisciplinary-executed treatments-A case in Y-90 microspheres therapy.

PubMed

Cai, Bin; Altman, Michael B; Garcia-Ramirez, Jose; LaBrash, Jason; Goddu, S Murty; Mutic, Sasa; Parikh, Parag J; Olsen, Jeffrey R; Saad, Nael; Zoberi, Jacqueline E

To develop a safe and robust workflow for yttrium-90 (Y-90) radioembolization procedures in a multidisciplinary team environment. A generalized Define-Measure-Analyze-Improve-Control (DMAIC)-based approach to process improvement was applied to a Y-90 radioembolization workflow. In the first DMAIC cycle, events with the Y-90 workflow were defined and analyzed. To improve the workflow, a web-based interactive electronic white board (EWB) system was adopted as the central communication platform and information processing hub. The EWB-based Y-90 workflow then underwent a second DMAIC cycle. Out of 245 treatments, three misses that went undetected until treatment initiation were recorded over a period of 21 months, and root-cause-analysis was performed to determine causes of each incident and opportunities for improvement. The EWB-based Y-90 process was further improved via new rules to define reliable sources of information as inputs into the planning process, as well as new check points to ensure this information was communicated correctly throughout the process flow. After implementation of the revised EWB-based Y-90 workflow, after two DMAIC-like cycles, there were zero misses out of 153 patient treatments in 1 year. The DMAIC-based approach adopted here allowed the iterative development of a robust workflow to achieve an adaptable, event-minimizing planning process despite a complex setting which requires the participation of multiple teams for Y-90 microspheres therapy. Implementation of such a workflow using the EWB or similar platform with a DMAIC-based process improvement approach could be expanded to other treatment procedures, especially those requiring multidisciplinary management. Copyright © 2016 American Brachytherapy Society. Published by Elsevier Inc. All rights reserved.
Smart Grid | Climate Neutral Research Campuses | NREL

Science.gov Websites

begun to build smart grids. Most operate electricity grids that include power generation; load control plant managers use these communications for energy management and load shedding, which are among the top familiar with equipment interoperability, central dispatch, and load shedding. These are common in smart
Web-based workflows to produce ocean climatologies using DIVA (Data-Interpolating Variational Analysis) and Jupyter notebooks

NASA Astrophysics Data System (ADS)

Barth, Alexander; Troupin, Charles; Watelet, Sylvain; Alvera-Azcarate, Aida; Beckers, Jean-Marie

2017-04-01

The analysis tool DIVA (Data-Interpolating Variational Analysis) is designed to generate gridded fields or climatologies from in situ observations. The tool DIVA minimizes a cost function to ensure that the analysed field is relatively close to the observations and conforms at the same time to a set of dynamical constraints. In particular, DIVA naturally decouples water bodies which are not directly connected and it uses a (potentially spatial varying) correlation length to describe over which length-scale the analysed variable is correlated. In addition, DIVA can also take ocean currents into account to introduce a preferential direction for the correlation. The SeaDataCloud project aims to facilitate the access and use of ocean in situ data from 45 national oceanographic data centres and marine data centres from 35 countries riparian to all European seas. A central aspect is to provide web-based virtual research environment, where scientists can easily access and explore the data sets through the SeaDataCloud infrastructure. For users familiar with programming languages like Julia and Python, Jupyter (acronym for Julia, Python and R) notebooks provide an exciting way to analyse and to interact with ocean data. Jupyter notebooks are made up of cells that can be run individually and can contain text, formulas or code fragment. A complete notebook explains how to go from input data and parameters to a result, in this case a gridded field obtained executing DIVA. This presentation discusses this new web-based workflow for generating climatologies using DIVA. It explores its new possibilities in particular, in terms of improved ease of use and reproducibility of the results. The integration in the infrastructure of EUDAT is also addressed.
From the CMS Computing Experience in the WLCG STEP'09 Challenge to the First Data Taking of the LHC Era

NASA Astrophysics Data System (ADS)

Bonacorsi, D.; Gutsche, O.

The Worldwide LHC Computing Grid (WLCG) project decided in March 2009 to perform scale tests of parts of its overall Grid infrastructure before the start of the LHC data taking. The "Scale Test for the Experiment Program" (STEP'09) was performed mainly in June 2009 -with more selected tests in September- October 2009 -and emphasized the simultaneous test of the computing systems of all 4 LHC experiments. CMS tested its Tier-0 tape writing and processing capabilities. The Tier-1 tape systems were stress tested using the complete range of Tier-1 work-flows: transfer from Tier-0 and custody of data on tape, processing and subsequent archival, redistribution of datasets amongst all Tier-1 sites as well as burst transfers of datasets to Tier-2 sites. The Tier-2 analysis capacity was tested using bulk analysis job submissions to backfill normal user activity. In this talk, we will report on the different performed tests and present their post-mortem analysis.
The Kiel data management infrastructure - arising from a generic data model

NASA Astrophysics Data System (ADS)

Fleischer, D.; Mehrtens, H.; Schirnick, C.; Springer, P.

2010-12-01

The Kiel Data Management Infrastructure (KDMI) started from a cooperation of three large-scale projects (SFB574, SFB754 and Cluster of Excellence The Future Ocean) and the Leibniz Institute of Marine Sciences (IFM-GEOMAR). The common strategy for project data management is a single person collecting and transforming data according to the requirements of the targeted data center(s). The intention of the KDMI cooperation is to avoid redundant and potentially incompatible data management efforts for scientists and data managers and to create a single sustainable infrastructure. An increased level of complexity in the conceptual planing arose from the diversity of marine disciplines and approximately 1000 scientists involved. KDMI key features focus on the data provenance which we consider to comprise the entire workflow from field sampling thru labwork to data calculation and evaluation. Managing the data of each individual project participant in this way yields the data management for the entire project and warrants the reusability of (meta)data. Accordingly scientists provide a workflow definition of their data creation procedures resulting in their target variables. The central idea in the development of the KDMI presented here is based on the object oriented programming concept which allows to have one object definition (workflow) and infinite numbers of object instances (data). Each definition is created by a graphical user interface and produces XML output stored in a database using a generic data model. On creation of a data instance the KDMI translates the definition into web forms for the scientist, the generic data model then accepts all information input following the given data provenance definition. An important aspect of the implementation phase is the possibility of a successive transition from daily measurement routines resulting in single spreadsheet files with well known points of failure and limited reuseability to a central infrastructure as a single point of truth. The data provenance approach has the following positive side effects: (1) the scientist designs the extend and timing of data and metadata prompts by workflow definitions himself while (2) consistency and completeness (mandatory information) of metadata in the resulting XML document can be checked by XML validation. (3) Storage of the entire data creation process (including raw data and processing steps) provides a multidimensional quality history accessible by all researchers in addition to the commonly applied one dimensional quality flag system. (4) The KDMI can be extended to other scientific disciplines by adding new workflows and domain specific outputs assisted by the KDMI-Team. The KDMI is a social network inspired system but instead of sharing privacy it is a sharing platform for daily scientific work, data and their provenance.
CERES AuTomAted job Loading SYSTem (CATALYST): An automated workflow manager for satellite data production

NASA Astrophysics Data System (ADS)

Gleason, J. L.; Hillyer, T. N.; Wilkins, J.

2012-12-01

The CERES Science Team integrates data from 5 CERES instruments onboard the Terra, Aqua and NPP missions. The processing chain fuses CERES observations with data from 19 other unique sources. The addition of CERES Flight Model 5 (FM5) onboard NPP, coupled with ground processing system upgrades further emphasizes the need for an automated job-submission utility to manage multiple processing streams concurrently. The operator-driven, legacy-processing approach relied on manually staging data from magnetic tape to limited spinning disk attached to a shared memory architecture system. The migration of CERES production code to a distributed, cluster computing environment with approximately one petabyte of spinning disk containing all precursor input data products facilitates the development of a CERES-specific, automated workflow manager. In the cluster environment, I/O is the primary system resource in contention across jobs. Therefore, system load can be maximized with a throttling workload manager. This poster discusses a Java and Perl implementation of an automated job management tool tailored for CERES processing.
Workflow Challenges of Enterprise Imaging: HIMSS-SIIM Collaborative White Paper.

PubMed

Towbin, Alexander J; Roth, Christopher J; Bronkalla, Mark; Cram, Dawn

2016-10-01

With the advent of digital cameras, there has been an explosion in the number of medical specialties using images to diagnose or document disease and guide interventions. In many specialties, these images are not added to the patient's electronic medical record and are not distributed so that other providers caring for the patient can view them. As hospitals begin to develop enterprise imaging strategies, they have found that there are multiple challenges preventing the implementation of systems to manage image capture, image upload, and image management. This HIMSS-SIIM white paper will describe the key workflow challenges related to enterprise imaging and offer suggestions for potential solutions to these challenges.
Case Report: Activity Diagrams for Integrating Electronic Prescribing Tools into Clinical Workflow

PubMed Central

Johnson, Kevin B.; FitzHenry, Fern

2006-01-01

To facilitate the future implementation of an electronic prescribing system, this case study modeled prescription management processes in various primary care settings. The Vanderbilt e-prescribing design team conducted initial interviews with clinic managers, physicians and nurses, and then represented the sequences of steps carried out to complete prescriptions in activity diagrams. The diagrams covered outpatient prescribing for patients during a clinic visit and between clinic visits. Practice size, practice setting, and practice specialty type influenced the prescribing processes used. The model developed may be useful to others engaged in building or tailoring an e-prescribing system to meet the specific workflows of various clinic settings. PMID:16622168

A software tool to analyze clinical workflows from direct observations.

PubMed

Schweitzer, Marco; Lasierra, Nelia; Hoerbst, Alexander

2015-01-01

Observational data of clinical processes need to be managed in a convenient way, so that process information is reliable, valid and viable for further analysis. However, existing tools for allocating observations fail in systematic data collection of specific workflow recordings. We present a software tool which was developed to facilitate the analysis of clinical process observations. The tool was successfully used in the project OntoHealth, to build, store and analyze observations of diabetes routine consultations.
A Web application for the management of clinical workflow in image‐guided and adaptive proton therapy for prostate cancer treatments

PubMed Central

Boes, Peter; Ho, Meng Wei; Li, Zuofeng

2015-01-01

Image‐guided radiotherapy (IGRT), based on radiopaque markers placed in the prostate gland, was used for proton therapy of prostate patients. Orthogonal X‐rays and the IBA Digital Image Positioning System (DIPS) were used for setup correction prior to treatment and were repeated after treatment delivery. Following a rationale for margin estimates similar to that of van Herk,(1) the daily post‐treatment DIPS data were analyzed to determine if an adaptive radiotherapy plan was necessary. A Web application using ASP.NET MVC5, Entity Framework, and an SQL database was designed to automate this process. The designed features included state‐of‐the‐art Web technologies, a domain model closely matching the workflow, a database‐supporting concurrency and data mining, access to the DIPS database, secured user access and roles management, and graphing and analysis tools. The Model‐View‐Controller (MVC) paradigm allowed clean domain logic, unit testing, and extensibility. Client‐side technologies, such as jQuery, jQuery Plug‐ins, and Ajax, were adopted to achieve a rich user environment and fast response. Data models included patients, staff, treatment fields and records, correction vectors, DIPS images, and association logics. Data entry, analysis, workflow logics, and notifications were implemented. The system effectively modeled the clinical workflow and IGRT process. PACS number: 87 PMID:26103504
SARS Grid--an AG-based disease management and collaborative platform.

PubMed

Hung, Shu-Hui; Hung, Tsung-Chieh; Juang, Jer-Nan

2006-01-01

This paper describes the development of the NCHC's Severe Acute Respiratory Syndrome (SARS) Grid project-An Access Grid (AG)-based disease management and collaborative platform that allowed for SARS patient's medical data to be dynamically shared and discussed between hospitals and doctors using AG's video teleconferencing (VTC) capabilities. During the height of the SARS epidemic in Asia, SARS Grid and the SARShope website significantly curved the spread of SARS by helping doctors manage the in-hospital and in-home care of quarantined SARS patients through medical data exchange and the monitoring of the patient's symptoms. Now that the SARS epidemic has ended, the primary function of the SARS Grid project is that of a web-based informatics tool to increase pubic awareness of SARS and other epidemic diseases. Additionally, the SARS Grid project can be viewed and further studied as an outstanding model of epidemic disease prevention and/or containment.
Reduction of peak energy demand based on smart appliances energy consumption adjustment

NASA Astrophysics Data System (ADS)

Powroźnik, P.; Szulim, R.

2017-08-01

In the paper the concept of elastic model of energy management for smart grid and micro smart grid is presented. For the proposed model a method for reducing peak demand in micro smart grid has been defined. The idea of peak demand reduction in elastic model of energy management is to introduce a balance between demand and supply of current power for the given Micro Smart Grid in the given moment. The results of the simulations studies were presented. They were carried out on real household data available on UCI Machine Learning Repository. The results may have practical application in the smart grid networks, where there is a need for smart appliances energy consumption adjustment. The article presents a proposal to implement the elastic model of energy management as the cloud computing solution. This approach of peak demand reduction might have application particularly in a large smart grid.
Using conceptual work products of health care to design health IT.

PubMed

Berry, Andrew B L; Butler, Keith A; Harrington, Craig; Braxton, Melissa O; Walker, Amy J; Pete, Nikki; Johnson, Trevor; Oberle, Mark W; Haselkorn, Jodie; Paul Nichol, W; Haselkorn, Mark

2016-02-01

This paper introduces a new, model-based design method for interactive health information technology (IT) systems. This method extends workflow models with models of conceptual work products. When the health care work being modeled is substantially cognitive, tacit, and complex in nature, graphical workflow models can become too complex to be useful to designers. Conceptual models complement and simplify workflows by providing an explicit specification for the information product they must produce. We illustrate how conceptual work products can be modeled using standard software modeling language, which allows them to provide fundamental requirements for what the workflow must accomplish and the information that a new system should provide. Developers can use these specifications to envision how health IT could enable an effective cognitive strategy as a workflow with precise information requirements. We illustrate the new method with a study conducted in an outpatient multiple sclerosis (MS) clinic. This study shows specifically how the different phases of the method can be carried out, how the method allows for iteration across phases, and how the method generated a health IT design for case management of MS that is efficient and easy to use. Copyright © 2015 Elsevier Inc. All rights reserved.
Using CREAM and CEMonitor for job submission and management in the gLite middleware

NASA Astrophysics Data System (ADS)

Aiftimiei, C.; Andreetto, P.; Bertocco, S.; Dalla Fina, S.; Dorigo, A.; Frizziero, E.; Gianelle, A.; Marzolla, M.; Mazzucato, M.; Mendez Lorenzo, P.; Miccio, V.; Sgaravatto, M.; Traldi, S.; Zangrando, L.

2010-04-01

In this paper we describe the use of CREAM and CEMonitor services for job submission and management within the gLite Grid middleware. Both CREAM and CEMonitor address one of the most fundamental operations of a Grid middleware, that is job submission and management. Specifically, CREAM is a job management service used for submitting, managing and monitoring computational jobs. CEMonitor is an event notification framework, which can be coupled with CREAM to provide the users with asynchronous job status change notifications. Both components have been integrated in the gLite Workload Management System by means of ICE (Interface to CREAM Environment). These software components have been released for production in the EGEE Grid infrastructure and, for what concerns the CEMonitor service, also in the OSG Grid. In this paper we report the current status of these services, the achieved results, and the issues that still have to be addressed.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Chao, Tian-Jy; Kim, Younghun

An end-to-end interoperability and workflows from building architecture design to one or more simulations, in one aspect, may comprise establishing a BIM enablement platform architecture. A data model defines data entities and entity relationships for enabling the interoperability and workflows. A data definition language may be implemented that defines and creates a table schema of a database associated with the data model. Data management services and/or application programming interfaces may be implemented for interacting with the data model. Web services may also be provided for interacting with the data model via the Web. A user interface may be implemented thatmore » communicates with users and uses the BIM enablement platform architecture, the data model, the data definition language, data management services and application programming interfaces to provide functions to the users to perform work related to building information management.« less
Vel-IO 3D: A tool for 3D velocity model construction, optimization and time-depth conversion in 3D geological modeling workflow

NASA Astrophysics Data System (ADS)

Maesano, Francesco E.; D'Ambrogi, Chiara

2017-02-01

We present Vel-IO 3D, a tool for 3D velocity model creation and time-depth conversion, as part of a workflow for 3D model building. The workflow addresses the management of large subsurface dataset, mainly seismic lines and well logs, and the construction of a 3D velocity model able to describe the variation of the velocity parameters related to strong facies and thickness variability and to high structural complexity. Although it is applicable in many geological contexts (e.g. foreland basins, large intermountain basins), it is particularly suitable in wide flat regions, where subsurface structures have no surface expression. The Vel-IO 3D tool is composed by three scripts, written in Python 2.7.11, that automate i) the 3D instantaneous velocity model building, ii) the velocity model optimization, iii) the time-depth conversion. They determine a 3D geological model that is consistent with the primary geological constraints (e.g. depth of the markers on wells). The proposed workflow and the Vel-IO 3D tool have been tested, during the EU funded Project GeoMol, by the construction of the 3D geological model of a flat region, 5700 km2 in area, located in the central part of the Po Plain. The final 3D model showed the efficiency of the workflow and Vel-IO 3D tool in the management of large amount of data both in time and depth domain. A 4 layer-cake velocity model has been applied to a several thousand (5000-13,000 m) thick succession, with 15 horizons from Triassic up to Pleistocene, complicated by a Mesozoic extensional tectonics and by buried thrusts related to Southern Alps and Northern Apennines.
A tutorial of diverse genome analysis tools found in the CoGe web-platform using Plasmodium spp. as a model

PubMed Central

Castillo, Andreina I; Nelson, Andrew D L; Haug-Baltzell, Asher K; Lyons, Eric

2018-01-01

Abstract Integrated platforms for storage, management, analysis and sharing of large quantities of omics data have become fundamental to comparative genomics. CoGe (https://genomevolution.org/coge/) is an online platform designed to manage and study genomic data, enabling both data- and hypothesis-driven comparative genomics. CoGe’s tools and resources can be used to organize and analyse both publicly available and private genomic data from any species. Here, we demonstrate the capabilities of CoGe through three example workflows using 17 Plasmodium genomes as a model. Plasmodium genomes present unique challenges for comparative genomics due to their rapidly evolving and highly variable genomic AT/GC content. These example workflows are intended to serve as templates to help guide researchers who would like to use CoGe to examine diverse aspects of genome evolution. In the first workflow, trends in genome composition and amino acid usage are explored. In the second, changes in genome structure and the distribution of synonymous (Ks) and non-synonymous (Kn) substitution values are evaluated across species with different levels of evolutionary relatedness. In the third workflow, microsyntenic analyses of multigene families’ genomic organization are conducted using two Plasmodium-specific gene families—serine repeat antigen, and cytoadherence-linked asexual gene—as models. In general, these example workflows show how to achieve quick, reproducible and shareable results using the CoGe platform. We were able to replicate previously published results, as well as leverage CoGe’s tools and resources to gain additional insight into various aspects of Plasmodium genome evolution. Our results highlight the usefulness of the CoGe platform, particularly in understanding complex features of genome evolution. Database URL: https://genomevolution.org/coge/
Task-technology fit of video telehealth for nurses in an outpatient clinic setting.

PubMed

Cady, Rhonda G; Finkelstein, Stanley M

2014-07-01

Incorporating telehealth into outpatient care delivery supports management of consumer health between clinic visits. Task-technology fit is a framework for understanding how technology helps and/or hinders a person during work processes. Evaluating the task-technology fit of video telehealth for personnel working in a pediatric outpatient clinic and providing care between clinic visits ensures the information provided matches the information needed to support work processes. The workflow of advanced practice registered nurse (APRN) care coordination provided via telephone and video telehealth was described and measured using a mixed-methods workflow analysis protocol that incorporated cognitive ethnography and time-motion study. Qualitative and quantitative results were merged and analyzed within the task-technology fit framework to determine the workflow fit of video telehealth for APRN care coordination. Incorporating video telehealth into APRN care coordination workflow provided visual information unavailable during telephone interactions. Despite additional tasks and interactions needed to obtain the visual information, APRN workflow efficiency, as measured by time, was not significantly changed. Analyzed within the task-technology fit framework, the increased visual information afforded by video telehealth supported the assessment and diagnostic information needs of the APRN. Telehealth must provide the right information to the right clinician at the right time. Evaluating task-technology fit using a mixed-methods protocol ensured rigorous analysis of fit within work processes and identified workflows that benefit most from the technology.
GWM-2005 - A Groundwater-Management Process for MODFLOW-2005 with Local Grid Refinement (LGR) Capability

USGS Publications Warehouse

Ahlfeld, David P.; Baker, Kristine M.; Barlow, Paul M.

2009-01-01

This report describes the Groundwater-Management (GWM) Process for MODFLOW-2005, the 2005 version of the U.S. Geological Survey modular three-dimensional groundwater model. GWM can solve a broad range of groundwater-management problems by combined use of simulation- and optimization-modeling techniques. These problems include limiting groundwater-level declines or streamflow depletions, managing groundwater withdrawals, and conjunctively using groundwater and surface-water resources. GWM was initially released for the 2000 version of MODFLOW. Several modifications and enhancements have been made to GWM since its initial release to increase the scope of the program's capabilities and to improve its operation and reporting of results. The new code, which is called GWM-2005, also was designed to support the local grid refinement capability of MODFLOW-2005. Local grid refinement allows for the simulation of one or more higher resolution local grids (referred to as child models) within a coarser grid parent model. Local grid refinement is often needed to improve simulation accuracy in regions where hydraulic gradients change substantially over short distances or in areas requiring detailed representation of aquifer heterogeneity. GWM-2005 can be used to formulate and solve groundwater-management problems that include components in both parent and child models. Although local grid refinement increases simulation accuracy, it can also substantially increase simulation run times.
Work Flow Analysis Report Consisting of Work Management - Preventive Maintenance - Materials and Equipment

DOE Office of Scientific and Technical Information (OSTI.GOV)

JENNINGS, T.L.

The Work Flow analysis Report will be used to facilitate the requirements for implementing the Work Control module of Passport. The report consists of workflow integration processes for Work Management, Preventative Maintenance, Materials and Equipment
Achieving production-level use of HEP software at the Argonne Leadership Computing Facility

NASA Astrophysics Data System (ADS)

Uram, T. D.; Childers, J. T.; LeCompte, T. J.; Papka, M. E.; Benjamin, D.

2015-12-01

HEP's demand for computing resources has grown beyond the capacity of the Grid, and these demands will accelerate with the higher energy and luminosity planned for Run II. Mira, the ten petaFLOPs supercomputer at the Argonne Leadership Computing Facility, is a potentially significant compute resource for HEP research. Through an award of fifty million hours on Mira, we have delivered millions of events to LHC experiments by establishing the means of marshaling jobs through serial stages on local clusters, and parallel stages on Mira. We are running several HEP applications, including Alpgen, Pythia, Sherpa, and Geant4. Event generators, such as Sherpa, typically have a split workload: a small scale integration phase, and a second, more scalable, event-generation phase. To accommodate this workload on Mira we have developed two Python-based Django applications, Balsam and ARGO. Balsam is a generalized scheduler interface which uses a plugin system for interacting with scheduler software such as HTCondor, Cobalt, and TORQUE. ARGO is a workflow manager that submits jobs to instances of Balsam. Through these mechanisms, the serial and parallel tasks within jobs are executed on the appropriate resources. This approach and its integration with the PanDA production system will be discussed.
ATLAS WORLD-cloud and networking in PanDA

NASA Astrophysics Data System (ADS)

Barreiro Megino, F.; De, K.; Di Girolamo, A.; Maeno, T.; Walker, R.; ATLAS Collaboration

2017-10-01

The ATLAS computing model was originally designed as static clouds (usually national or geographical groupings of sites) around the Tier 1 centres, which confined tasks and most of the data traffic. Since those early days, the sites’ network bandwidth has increased at 0(1000) and the difference in functionalities between Tier 1s and Tier 2s has reduced. After years of manual, intermediate solutions, we have now ramped up to full usage of World-cloud, the latest step in the PanDA Workload Management System to increase resource utilization on the ATLAS Grid, for all workflows (MC production, data (re)processing, etc.). We have based the development on two new site concepts. Nuclei sites are the Tier 1s and large Tier 2s, where tasks will be assigned and the output aggregated, and satellites are the sites that will execute the jobs and send the output to their nucleus. PanDA dynamically pairs nuclei and satellite sites for each task based on the input data availability, capability matching, site load and network connectivity. This contribution will introduce the conceptual changes for World-cloud, the development necessary in PanDA, an insight into the network model and the first half-year of operational experience.
Exploratory Climate Data Visualization and Analysis Using DV3D and UVCDAT

NASA Technical Reports Server (NTRS)

Maxwell, Thomas

2012-01-01

Earth system scientists are being inundated by an explosion of data generated by ever-increasing resolution in both global models and remote sensors. Advanced tools for accessing, analyzing, and visualizing very large and complex climate data are required to maintain rapid progress in Earth system research. To meet this need, NASA, in collaboration with the Ultra-scale Visualization Climate Data Analysis Tools (UVCOAT) consortium, is developing exploratory climate data analysis and visualization tools which provide data analysis capabilities for the Earth System Grid (ESG). This paper describes DV3D, a UV-COAT package that enables exploratory analysis of climate simulation and observation datasets. OV3D provides user-friendly interfaces for visualization and analysis of climate data at a level appropriate for scientists. It features workflow inte rfaces, interactive 40 data exploration, hyperwall and stereo visualization, automated provenance generation, and parallel task execution. DV30's integration with CDAT's climate data management system (COMS) and other climate data analysis tools provides a wide range of high performance climate data analysis operations. DV3D expands the scientists' toolbox by incorporating a suite of rich new exploratory visualization and analysis methods for addressing the complexity of climate datasets.
Hbim Methodology as a Bridge Between Italy and Argentina

NASA Astrophysics Data System (ADS)

Moreira, A.; Quattrini, R.; Maggiolo, G.; Mammoli, R.

2018-05-01

The availability of efficient HBIM workflows could represent a very important change towards a more efficient management of the historical real estate. The present work shows how to obtain accurate and reliable information of heritage buildings through reality capture and 3D modelling to support restoration purposes or knowledge-based applications. Two cases studies metaphorically joint Italy with Argentina. The research article explains the workflows applied at the Palazzo Ferretti at Ancona and the Manzana Histórica de la Universidad National del Litoral, providing a constructive comparison and blending technological and theoretical approaches. In a bottom-up process, the assessment of two cases study validates a workflow allowing the achievement of a useful and proper data enrichment of each HBIM model. Another key aspect is the Level of Development (LOD) evaluation of both models: different ranges and scales are defined in America (100-500) and in Italy (A-G), nevertheless is possible to obtain standard shared procedures, enabling facilitation of HBIM development and diffusion in operating workflows.
Review of Strategies and Technologies for Demand-Side Management on Isolated Mini-Grids

DOE Office of Scientific and Technical Information (OSTI.GOV)

Harper, Meg

This review provides an overview of strategies and currently available technologies used for demandside management (DSM) on mini-grids throughout the world. For the purposes of this review, mini-grids are defined as village-scale electricity distribution systems powered by small local generation sources and not connected to a main grid.1 Mini-grids range in size from less than 1 kW to several hundred kW of installed generation capacity and may utilize different generation technologies, such as micro-hydro, biomass gasification, solar, wind, diesel generators, or a hybrid combination of any of these. This review will primarily refer to AC mini-grids, though much of themore » discussion could apply to DC grids as well. Many mini-grids include energy storage, though some rely solely on real-time generation.« less
Daily hydro- and morphodynamic simulations at Duck, NC, USA using Delft3D

NASA Astrophysics Data System (ADS)

Penko, Allison; Veeramony, Jay; Palmsten, Margaret; Bak, Spicer; Brodie, Katherine; Hesser, Tyler

2017-04-01

Operational forecasting of the coastal nearshore has wide ranging societal and humanitarian benefits, specifically for the prediction of natural hazards due to extreme storm events. However, understanding the model limitations and uncertainty is as equally important as the predictions themselves. By comparing and contrasting the predictions of multiple high-resolution models in a location with near real-time collection of observations, we are able to perform a vigorous analysis of the model results in order to achieve more robust and certain predictions. In collaboration with the U.S. Army Corps of Engineers Field Research Facility (USACE FRF) as part of the Coastal Model Test Bed (CMTB) project, we have set up Delft3D at Duck, NC, USA to run in near-real time, driven by measured wave data at the boundary. The CMTB at the USACE FRF allows for the unique integration of operational wave, circulation, and morphology models with real-time observations. The FRF has an extensive array of in-situ and remotely sensed oceanographic, bathymetric, and meteorological data that is broadcast in near-real time onto a publically accessible server. Wave, current, and bed elevation instruments are permanently installed across the model domain including 2 waverider buoys in 17-m and 26-m water depths at 3.5-km and 17-km offshore, respectively, that record directional wave data every 30-min. Here, we present the workflow and output of the Delft3D hydro- and morphodynamic simulations at Duck, and show the tactical benefits and operational potential of such a system. A nested Delft3D simulation runs a parent grid that extends 12-km in the along-shore and 3.5-km in the cross-shore with 50-m resolution and a maximum depth of approximately 17-m. The bathymetry for the parent grid was obtained from a regional digital elevation model (DEM) generated by the Federal Emergency Management Agency (FEMA). The inner nested grid extends 1.8-km in the along-shore and 1-km in the cross-shore with 5-m resolution and a maximum depth of approximately 8-m. The inner nested grid initial model bathymetry is set to either the predicted bathymetry from the previous day's simulation or a survey, whichever is more recent. Delft3D-WAVE runs in the parent grid and is driven with the real-time spectral wave measurements from the waverider buoy in 17-m depth. The spectral output from Delft3D-WAVE in the parent grid is then used as the boundary condition for the inner nested high-resolution grid, in which the coupled Delft3D wave-flow-morphology model is run. The model results are then compared to the wave, current, and bathymetry observations collected at the FRF as well as other models that are run in the CMTB.
The impact of automation on organizational changes in a community hospital clinical microbiology laboratory.

PubMed

Camporese, Alessandro

2004-06-01

The diagnosis of infectious diseases and the role of the microbiology laboratory are currently undergoing a process of change. The need for overall efficiency in providing results is now given the same importance as accuracy. This means that laboratories must be able to produce quality results in less time with the capacity to interpret the results clinically. To improve the clinical impact of microbiology results, the new challenge facing the microbiologist has become one of process management instead of pure analysis. A proper project management process designed to improve workflow, reduce analytical time, and provide the same high quality results without losing valuable time treating the patient, has become essential. Our objective was to study the impact of introducing automation and computerization into the microbiology laboratory, and the reorganization of the laboratory workflow, i.e. scheduling personnel to work shifts covering both the entire day and the entire week. In our laboratory, the introduction of automation and computerization, as well as the reorganization of personnel, thus the workflow itself, has resulted in an improvement in response time and greater efficiency in diagnostic procedures.
KNIME4NGS: a comprehensive toolbox for next generation sequencing analysis.

PubMed

Hastreiter, Maximilian; Jeske, Tim; Hoser, Jonathan; Kluge, Michael; Ahomaa, Kaarin; Friedl, Marie-Sophie; Kopetzky, Sebastian J; Quell, Jan-Dominik; Mewes, H Werner; Küffner, Robert

2017-05-15

Analysis of Next Generation Sequencing (NGS) data requires the processing of large datasets by chaining various tools with complex input and output formats. In order to automate data analysis, we propose to standardize NGS tasks into modular workflows. This simplifies reliable handling and processing of NGS data, and corresponding solutions become substantially more reproducible and easier to maintain. Here, we present a documented, linux-based, toolbox of 42 processing modules that are combined to construct workflows facilitating a variety of tasks such as DNAseq and RNAseq analysis. We also describe important technical extensions. The high throughput executor (HTE) helps to increase the reliability and to reduce manual interventions when processing complex datasets. We also provide a dedicated binary manager that assists users in obtaining the modules' executables and keeping them up to date. As basis for this actively developed toolbox we use the workflow management software KNIME. See http://ibisngs.github.io/knime4ngs for nodes and user manual (GPLv3 license). robert.kueffner@helmholtz-muenchen.de. Supplementary data are available at Bioinformatics online.

Implementation of a 'lean' cytopathology service: towards routine same-day reporting.

PubMed

Hewer, Ekkehard; Hammer, Caroline; Fricke-Vetsch, Daniela; Baumann, Cinzia; Perren, Aurel; Schmitt, Anja M

2018-05-01

To systematically assess the effects of a Lean management intervention in an academic cytopathology service. We monitored outcomes including specimen turnaround times during stepwise implementation of a lean cytopathology workflow for gynaecological and non-gynaecological cytology. The intervention resulted in a major reduction of turnaround times for both gynaecological (3rd quartile 4.1 vs 2.3 working days) and non-gynaecological cytology (3rd quartile 1.9 vs. 1.2 working days). Introduction of fully electronic reporting had additional effect over continuous staining of slides alone. The rate of non-gynaecological specimens reported the same day increased from 4.5% to 56.5% of specimens received before noon. Lean management principles provide a useful framework for organization of a cytopathology workflow. Stepwise implementation beginning with a simplified gynaecological cytology workflow allowed involved staff to monitor the effects of individual changes and allowed for a smooth transition. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
The impact of missing sensor information on surgical workflow management.

PubMed

Liebmann, Philipp; Meixensberger, Jürgen; Wiedemann, Peter; Neumuth, Thomas

2013-09-01

Sensor systems in the operating room may encounter intermittent data losses that reduce the performance of surgical workflow management systems (SWFMS). Sensor data loss could impact SWFMS-based decision support, device parameterization, and information presentation. The purpose of this study was to understand the robustness of surgical process models when sensor information is partially missing. SWFMS changes caused by wrong or no data from the sensor system which tracks the progress of a surgical intervention were tested. The individual surgical process models (iSPMs) from 100 different cataract procedures of 3 ophthalmologic surgeons were used to select a randomized subset and create a generalized surgical process model (gSPM). A disjoint subset was selected from the iSPMs and used to simulate the surgical process against the gSPM. The loss of sensor data was simulated by removing some information from one task in the iSPM. The effect of missing sensor data was measured using several metrics: (a) successful relocation of the path in the gSPM, (b) the number of steps to find the converging point, and (c) the perspective with the highest occurrence of unsuccessful path findings. A gSPM built using 30% of the iSPMs successfully found the correct path in 90% of the cases. The most critical sensor data were the information regarding the instrument used by the surgeon. We found that use of a gSPM to provide input data for a SWFMS is robust and can be accurate despite missing sensor data. A surgical workflow management system can provide the surgeon with workflow guidance in the OR for most cases. Sensor systems for surgical process tracking can be evaluated based on the stability and accuracy of functional and spatial operative results.
The development and implementation of MOSAIQ Integration Platform (MIP) based on the radiotherapy workflow

NASA Astrophysics Data System (ADS)

Yang, Xin; He, Zhen-yu; Jiang, Xiao-bo; Lin, Mao-sheng; Zhong, Ning-shan; Hu, Jiang; Qi, Zhen-yu; Bao, Yong; Li, Qiao-qiao; Li, Bao-yue; Hu, Lian-ying; Lin, Cheng-guang; Gao, Yuan-hong; Liu, Hui; Huang, Xiao-yan; Deng, Xiao-wu; Xia, Yun-fei; Liu, Meng-zhong; Sun, Ying

2017-03-01

To meet the special demands in China and the particular needs for the radiotherapy department, a MOSAIQ Integration Platform CHN (MIP) based on the workflow of radiation therapy (RT) has been developed, as a supplement system to the Elekta MOSAIQ. The MIP adopts C/S (client-server) structure mode, and its database is based on the Treatment Planning System (TPS) and MOSAIQ SQL Server 2008, running on the hospital local network. Five network servers, as a core hardware, supply data storage and network service based on the cloud services. The core software, using C# programming language, is developed based on Microsoft Visual Studio Platform. The MIP server could offer network service, including entry, query, statistics and print information for about 200 workstations at the same time. The MIP was implemented in the past one and a half years, and some practical patient-oriented functions were developed. And now the MIP is almost covering the whole workflow of radiation therapy. There are 15 function modules, such as: Notice, Appointment, Billing, Document Management (application/execution), System Management, and so on. By June of 2016, recorded data in the MIP are as following: 13546 patients, 13533 plan application, 15475 RT records, 14656 RT summaries, 567048 billing records and 506612 workload records, etc. The MIP based on the RT workflow has been successfully developed and clinically implemented with real-time performance, data security, stable operation. And it is demonstrated to be user-friendly and is proven to significantly improve the efficiency of the department. It is a key to facilitate the information sharing and department management. More functions can be added or modified for further enhancement its potentials in research and clinical practice.
Electronic health records and patient safety: co-occurrence of early EHR implementation with patient safety practices in primary care settings.

PubMed

Tanner, C; Gans, D; White, J; Nath, R; Pohl, J

2015-01-01

The role of electronic health records (EHR) in enhancing patient safety, while substantiated in many studies, is still debated. This paper examines early EHR adopters in primary care to understand the extent to which EHR implementation is associated with the workflows, policies and practices that promote patient safety, as compared to practices with paper records. Early adoption is defined as those who were using EHR prior to implementation of the Meaningful Use program. We utilized the Physician Practice Patient Safety Assessment (PPPSA) to compare primary care practices with fully implemented EHR to those utilizing paper records. The PPPSA measures the extent of adoption of patient safety practices in the domains: medication management, handoffs and transition, personnel qualifications and competencies, practice management and culture, and patient communication. Data from 209 primary care practices responding between 2006-2010 were included in the analysis: 117 practices used paper medical records and 92 used an EHR. Results showed that, within all domains, EHR settings showed significantly higher rates of having workflows, policies and practices that promote patient safety than paper record settings. While these results were expected in the area of medication management, EHR use was also associated with adoption of patient safety practices in areas in which the researchers had no a priori expectations of association. Sociotechnical models of EHR use point to complex interactions between technology and other aspects of the environment related to human resources, workflow, policy, culture, among others. This study identifies that among primary care practices in the national PPPSA database, having an EHR was strongly empirically associated with the workflow, policy, communication and cultural practices recommended for safe patient care in ambulatory settings.
Leadership characteristics and business management in modern academic surgery.

PubMed

Büchler, Peter; Martin, David; Knaebel, Hanns-Peter; Büchler, Markus W

2006-04-01

Management skills are necessary to successfully lead a surgical department in future. This article focuses on practical aspects of surgical management, leadership and training. It demonstrates how the implementation of business management concepts changes workflow management and surgical training. A systematic Medline search was performed and business management publications were analysed. Neither management nor leadership skills are inborn but acquired. Management is about planning, controlling and putting appropriate structures in place. Leadership is anticipating and coping with change and people, and adopting a visionary stance. More change requires more leadership. Changes in surgery occur with unprecedented speed because of a growing demand for surgical procedures with limited financial resources. Modern leadership and management theories have to be tailored to surgery. It is clear that not all of them are applicable but some of them are essential for surgeons. In business management, common traits of successful leaders include team orientation and communication skills. As the most important character, however, appears to be the emotional intelligence. Novel training concepts for surgeons include on-the-job training and introduction of improved workflow management systems, e.g. the central case management. The need for surgeons with advanced skills in business, finance and organisational management is evident and will require systematic and tailored training.
DietPal: A Web-Based Dietary Menu-Generating and Management System

PubMed Central

Abdullah, Siti Norulhuda; Shahar, Suzana; Abdul-Hamid, Helmi; Khairudin, Nurkahirizan; Yusoff, Mohamed; Ghazali, Rafidah; Mohd-Yusoff, Nooraini; Shafii, Nik Shanita; Abdul-Manaf, Zaharah

2004-01-01

Background Attempts in current health care practice to make health care more accessible, effective, and efficient through the use of information technology could include implementation of computer-based dietary menu generation. While several of such systems already exist, their focus is mainly to assist healthy individuals calculate their calorie intake and to help monitor the selection of menus based upon a prespecified calorie value. Although these prove to be helpful in some ways, they are not suitable for monitoring, planning, and managing patients' dietary needs and requirements. This paper presents a Web-based application that simulates the process of menu suggestions according to a standard practice employed by dietitians. Objective To model the workflow of dietitians and to develop, based on this workflow, a Web-based system for dietary menu generation and management. The system is aimed to be used by dietitians or by medical professionals of health centers in rural areas where there are no designated qualified dietitians. Methods First, a user-needs study was conducted among dietitians in Malaysia. The first survey of 93 dietitians (with 52 responding) was an assessment of information needed for dietary management and evaluation of compliance towards a dietary regime. The second study consisted of ethnographic observation and semi-structured interviews with 14 dietitians in order to identify the workflow of a menu-suggestion process. We subsequently designed and developed a Web-based dietary menu generation and management system called DietPal. DietPal has the capability of automatically calculating the nutrient and calorie intake of each patient based on the dietary recall as well as generating suitable diet and menu plans according to the calorie and nutrient requirement of the patient, calculated from anthropometric measurements. The system also allows reusing stored or predefined menus for other patients with similar health and nutrient requirements. Results We modeled the workflow of menu-suggestion activity currently adhered to by dietitians in Malaysia. Based on this workflow, a Web-based system was developed. Initial post evaluation among 10 dietitians indicates that they are comfortable with the organization of the modules and information. Conclusions The system has the potential of enhancing the quality of services with the provision of standard and healthy menu plans and at the same time increasing outreach, particularly to rural areas. With its potential capability of optimizing the time spent by dietitians to plan suitable menus, more quality time could be spent delivering nutrition education to the patients. PMID:15111270
DietPal: a Web-based dietary menu-generating and management system.

PubMed

Noah, Shahrul A; Abdullah, Siti Norulhuda; Shahar, Suzana; Abdul-Hamid, Helmi; Khairudin, Nurkahirizan; Yusoff, Mohamed; Ghazali, Rafidah; Mohd-Yusoff, Nooraini; Shafii, Nik Shanita; Abdul-Manaf, Zaharah

2004-01-30

Attempts in current health care practice to make health care more accessible, effective, and efficient through the use of information technology could include implementation of computer-based dietary menu generation. While several of such systems already exist, their focus is mainly to assist healthy individuals calculate their calorie intake and to help monitor the selection of menus based upon a prespecified calorie value. Although these prove to be helpful in some ways, they are not suitable for monitoring, planning, and managing patients' dietary needs and requirements. This paper presents a Web-based application that simulates the process of menu suggestions according to a standard practice employed by dietitians. To model the workflow of dietitians and to develop, based on this workflow, a Web-based system for dietary menu generation and management. The system is aimed to be used by dietitians or by medical professionals of health centers in rural areas where there are no designated qualified dietitians. First, a user-needs study was conducted among dietitians in Malaysia. The first survey of 93 dietitians (with 52 responding) was an assessment of information needed for dietary management and evaluation of compliance towards a dietary regime. The second study consisted of ethnographic observation and semi-structured interviews with 14 dietitians in order to identify the workflow of a menu-suggestion process. We subsequently designed and developed a Web-based dietary menu generation and management system called DietPal. DietPal has the capability of automatically calculating the nutrient and calorie intake of each patient based on the dietary recall as well as generating suitable diet and menu plans according to the calorie and nutrient requirement of the patient, calculated from anthropometric measurements. The system also allows reusing stored or predefined menus for other patients with similar health and nutrient requirements. We modeled the workflow of menu-suggestion activity currently adhered to by dietitians in Malaysia. Based on this workflow, a Web-based system was developed. Initial post evaluation among 10 dietitians indicates that they are comfortable with the organization of the modules and information. The system has the potential of enhancing the quality of services with the provision of standard and healthy menu plans and at the same time increasing outreach, particularly to rural areas. With its potential capability of optimizing the time spent by dietitians to plan suitable menus, more quality time could be spent delivering nutrition education to the patients.
New procedure to reduce the time and cost of broncho-pulmonary specimen management using the Previ Isola® automated inoculation system.

PubMed

Nebbad-Lechani, Biba; Emirian, Aurélie; Maillebuau, Fabienne; Mahjoub, Nadia; Fihman, Vincent; Legrand, Patrick; Decousser, Jean-Winoc

2013-12-01

The microbiological diagnosis of respiratory tract infections requires serial manual dilutions of the clinical specimen before agar plate inoculation, disrupting the workflow in bacteriology clinical laboratories. Automated plating instrument systems have been designed to increase the speed, reproducibility and safety of this inoculating step; nevertheless, data concerning respiratory specimens are lacking. We tested a specific procedure that uses the Previ Isola® (bioMérieux, Craponne, France) to inoculate with broncho-pulmonary specimens (BPS). A total of 350 BPS from a university-affiliated hospital were managed in parallel using the manual reference and the automated methods (expectoration: 75; broncho-alveolar lavage: 68; tracheal aspiration: 17; protected distal sample: 190). A specific enumeration reading grid, a pre-liquefaction step and a fluidity test, performed before the inoculation, were designed for the automated method. The qualitative (i.e., the number of specimens yielding a bacterial count greater than the clinical threshold) and quantitative (i.e., the discrepancy within a 0.5 log value) concordances were 100% and 98.2%, respectively. The slimmest subgroup of expectorations could not be managed by the automated method (8%, 6/75). The technical time and cost savings (i.e., number of consumed plates) reached 50%. Additional studies are required for specific populations, such as cystic fibrosis specimens and associated bacterial variants. An automated decapper should be implemented to increase the biosafety of the process. The PREVI Isola® adapted procedure is a time- and cost-saving method for broncho-pulmonary specimen processing. © 2013.
Applications of process improvement techniques to improve workflow in abdominal imaging.

PubMed

Tamm, Eric Peter

2016-03-01

Major changes in the management and funding of healthcare are underway that will markedly change the way radiology studies will be reimbursed. The result will be the need to deliver radiology services in a highly efficient manner while maintaining quality. The science of process improvement provides a practical approach to improve the processes utilized in radiology. This article will address in a step-by-step manner how to implement process improvement techniques to improve workflow in abdominal imaging.
Home and Building Energy Management Systems | Grid Modernization | NREL

Science.gov Websites

Home and Building Energy Management Systems Home and Building Energy Management Systems NREL building assets and energy management systems can provide value to the grid. Photo of a pair of NREL researchers who received a record of invention for a home energy management system in a smart home laboratory
Automation in an addiction treatment research clinic: computerised contingency management, ecological momentary assessment and a protocol workflow system.

PubMed

Vahabzadeh, Massoud; Lin, Jia-Ling; Mezghanni, Mustapha; Epstein, David H; Preston, Kenzie L

2009-01-01

A challenge in treatment research is the necessity of adhering to protocol and regulatory strictures while maintaining flexibility to meet patients' treatment needs and to accommodate variations among protocols. Another challenge is the acquisition of large amounts of data in an occasionally hectic environment, along with the provision of seamless methods for exporting, mining and querying the data. We have automated several major functions of our outpatient treatment research clinic for studies in drug abuse and dependence. Here we describe three such specialised applications: the Automated Contingency Management (ACM) system for the delivery of behavioural interventions, the transactional electronic diary (TED) system for the management of behavioural assessments and the Protocol Workflow System (PWS) for computerised workflow automation and guidance of each participant's daily clinic activities. These modules are integrated into our larger information system to enable data sharing in real time among authorised staff. ACM and the TED have each permitted us to conduct research that was not previously possible. In addition, the time to data analysis at the end of each study is substantially shorter. With the implementation of the PWS, we have been able to manage a research clinic with an 80 patient capacity, having an annual average of 18,000 patient visits and 7300 urine collections with a research staff of five. Finally, automated data management has considerably enhanced our ability to monitor and summarise participant safety data for research oversight. When developed in consultation with end users, automation in treatment research clinics can enable more efficient operations, better communication among staff and expansions in research methods.
Delineating Beach and Dune Morphology from Massive Terrestrial Laser Scanning Data Using the Generic Mapping Tools

NASA Astrophysics Data System (ADS)

Zhou, X.; Wang, G.; Yan, B.; Kearns, T.

2016-12-01

Terrestrial laser scanning (TLS) techniques have been proven to be efficient tools to collect three-dimensional high-density and high-accuracy point clouds for coastal research and resource management. However, the processing and presenting of massive TLS data is always a challenge for research when targeting a large area with high-resolution. This article introduces a workflow using shell-scripting techniques to chain together tools from the Generic Mapping Tools (GMT), Geographic Resources Analysis Support System (GRASS), and other command-based open-source utilities for automating TLS data processing. TLS point clouds acquired in the beach and dune area near Freeport, Texas in May 2015 were used for the case study. Shell scripts for rotating the coordinate system, removing anomalous points, assessing data quality, generating high-accuracy bare-earth DEMs, and quantifying beach and sand dune features (shoreline, cross-dune section, dune ridge, toe, and volume) are presented in this article. According to this investigation, the accuracy of the laser measurements (distance from the scanner to the targets) is within a couple of centimeters. However, the positional accuracy of TLS points with respect to a global coordinate system is about 5 cm, which is dominated by the accuracy of GPS solutions for obtaining the positions of the scanner and reflector. The accuracy of TLS-derived bare-earth DEM is primarily determined by the size of grid cells and roughness of the terrain surface for the case study. A DEM with grid cells of 4m x 1m (shoreline by cross-shore) provides a suitable spatial resolution and accuracy for deriving major beach and dune features.
Topobathymetric elevation model development using a new methodology: Coastal National Elevation Database

USGS Publications Warehouse

Danielson, Jeffrey J.; Poppenga, Sandra K.; Brock, John C.; Evans, Gayla A.; Tyler, Dean; Gesch, Dean B.; Thatcher, Cindy A.; Barras, John

2016-01-01

During the coming decades, coastlines will respond to widely predicted sea-level rise, storm surge, and coastalinundation flooding from disastrous events. Because physical processes in coastal environments are controlled by the geomorphology of over-the-land topography and underwater bathymetry, many applications of geospatial data in coastal environments require detailed knowledge of the near-shore topography and bathymetry. In this paper, an updated methodology used by the U.S. Geological Survey Coastal National Elevation Database (CoNED) Applications Project is presented for developing coastal topobathymetric elevation models (TBDEMs) from multiple topographic data sources with adjacent intertidal topobathymetric and offshore bathymetric sources to generate seamlessly integrated TBDEMs. This repeatable, updatable, and logically consistent methodology assimilates topographic data (land elevation) and bathymetry (water depth) into a seamless coastal elevation model. Within the overarching framework, vertical datum transformations are standardized in a workflow that interweaves spatially consistent interpolation (gridding) techniques with a land/water boundary mask delineation approach. Output gridded raster TBDEMs are stacked into a file storage system of mosaic datasets within an Esri ArcGIS geodatabase for efficient updating while maintaining current and updated spatially referenced metadata. Topobathymetric data provide a required seamless elevation product for several science application studies, such as shoreline delineation, coastal inundation mapping, sediment-transport, sea-level rise, storm surge models, and tsunami impact assessment. These detailed coastal elevation data are critical to depict regions prone to climate change impacts and are essential to planners and managers responsible for mitigating the associated risks and costs to both human communities and ecosystems. The CoNED methodology approach has been used to construct integrated TBDEM models in Mobile Bay, the northern Gulf of Mexico, San Francisco Bay, the Hurricane Sandy region, and southern California.
The Particle Physics Data Grid. Final Report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Livny, Miron

2002-08-16

The main objective of the Particle Physics Data Grid (PPDG) project has been to implement and evaluate distributed (Grid-enabled) data access and management technology for current and future particle and nuclear physics experiments. The specific goals of PPDG have been to design, implement, and deploy a Grid-based software infrastructure capable of supporting the data generation, processing and analysis needs common to the physics experiments represented by the participants, and to adapt experiment-specific software to operate in the Grid environment and to exploit this infrastructure. To accomplish these goals, the PPDG focused on the implementation and deployment of several critical services:more » reliable and efficient file replication service, high-speed data transfer services, multisite file caching and staging service, and reliable and recoverable job management services. The focus of the activity was the job management services and the interplay between these services and distributed data access in a Grid environment. Software was developed to study the interaction between HENP applications and distributed data storage fabric. One key conclusion was the need for a reliable and recoverable tool for managing large collections of interdependent jobs. An attached document provides an overview of the current status of the Directed Acyclic Graph Manager (DAGMan) with its main features and capabilities.« less
Optimal management of stationary lithium-ion battery system in electricity distribution grids

NASA Astrophysics Data System (ADS)

Purvins, Arturs; Sumner, Mark

2013-11-01

The present article proposes an optimal battery system management model in distribution grids for stationary applications. The main purpose of the management model is to maximise the utilisation of distributed renewable energy resources in distribution grids, preventing situations of reverse power flow in the distribution transformer. Secondly, battery management ensures efficient battery utilisation: charging at off-peak prices and discharging at peak prices when possible. This gives the battery system a shorter payback time. Management of the system requires predictions of residual distribution grid demand (i.e. demand minus renewable energy generation) and electricity price curves (e.g. for 24 h in advance). Results of a hypothetical study in Great Britain in 2020 show that the battery can contribute significantly to storing renewable energy surplus in distribution grids while being highly utilised. In a distribution grid with 25 households and an installed 8.9 kW wind turbine, a battery system with rated power of 8.9 kW and battery capacity of 100 kWh can store 7 MWh of 8 MWh wind energy surplus annually. Annual battery utilisation reaches 235 cycles in per unit values, where one unit is a full charge-depleting cycle depth of a new battery (80% of 100 kWh).
Strategic Planning for Electronic Resources Management: A Case Study at Gustavus Adolphus College

ERIC Educational Resources Information Center

Hulseberg, Anna; Monson, Sarah

2009-01-01

Electronic resources, the tools we use to manage them, and the needs and expectations of our users are constantly evolving; at the same time, the roles, responsibilities, and workflow of the library staff who manage e-resources are also in flux. Recognizing a need to be more intentional and proactive about how we manage e-resources, the…
Big data analytics workflow management for eScience

NASA Astrophysics Data System (ADS)

Fiore, Sandro; D'Anca, Alessandro; Palazzo, Cosimo; Elia, Donatello; Mariello, Andrea; Nassisi, Paola; Aloisio, Giovanni

2015-04-01

In many domains such as climate and astrophysics, scientific data is often n-dimensional and requires tools that support specialized data types and primitives if it is to be properly stored, accessed, analysed and visualized. Currently, scientific data analytics relies on domain-specific software and libraries providing a huge set of operators and functionalities. However, most of these software fail at large scale since they: (i) are desktop based, rely on local computing capabilities and need the data locally; (ii) cannot benefit from available multicore/parallel machines since they are based on sequential codes; (iii) do not provide declarative languages to express scientific data analysis tasks, and (iv) do not provide newer or more scalable storage models to better support the data multidimensionality. Additionally, most of them: (v) are domain-specific, which also means they support a limited set of data formats, and (vi) do not provide a workflow support, to enable the construction, execution and monitoring of more complex "experiments". The Ophidia project aims at facing most of the challenges highlighted above by providing a big data analytics framework for eScience. Ophidia provides several parallel operators to manipulate large datasets. Some relevant examples include: (i) data sub-setting (slicing and dicing), (ii) data aggregation, (iii) array-based primitives (the same operator applies to all the implemented UDF extensions), (iv) data cube duplication, (v) data cube pivoting, (vi) NetCDF-import and export. Metadata operators are available too. Additionally, the Ophidia framework provides array-based primitives to perform data sub-setting, data aggregation (i.e. max, min, avg), array concatenation, algebraic expressions and predicate evaluation on large arrays of scientific data. Bit-oriented plugins have also been implemented to manage binary data cubes. Defining processing chains and workflows with tens, hundreds of data analytics operators is the real challenge in many practical scientific use cases. This talk will specifically address the main needs, requirements and challenges regarding data analytics workflow management applied to large scientific datasets. Three real use cases concerning analytics workflows for sea situational awareness, fire danger prevention, climate change and biodiversity will be discussed in detail.
SMITH: a LIMS for handling next-generation sequencing workflows

PubMed Central

2014-01-01

Background Life-science laboratories make increasing use of Next Generation Sequencing (NGS) for studying bio-macromolecules and their interactions. Array-based methods for measuring gene expression or protein-DNA interactions are being replaced by RNA-Seq and ChIP-Seq. Sequencing is generally performed by specialized facilities that have to keep track of sequencing requests, trace samples, ensure quality and make data available according to predefined privileges. An integrated tool helps to troubleshoot problems, to maintain a high quality standard, to reduce time and costs. Commercial and non-commercial tools called LIMS (Laboratory Information Management Systems) are available for this purpose. However, they often come at prohibitive cost and/or lack the flexibility and scalability needed to adjust seamlessly to the frequently changing protocols employed. In order to manage the flow of sequencing data produced at the Genomic Unit of the Italian Institute of Technology (IIT), we developed SMITH (Sequencing Machine Information Tracking and Handling). Methods SMITH is a web application with a MySQL server at the backend. Wet-lab scientists of the Centre for Genomic Science and database experts from the Politecnico of Milan in the context of a Genomic Data Model Project developed SMITH. The data base schema stores all the information of an NGS experiment, including the descriptions of all protocols and algorithms used in the process. Notably, an attribute-value table allows associating an unconstrained textual description to each sample and all the data produced afterwards. This method permits the creation of metadata that can be used to search the database for specific files as well as for statistical analyses. Results SMITH runs automatically and limits direct human interaction mainly to administrative tasks. SMITH data-delivery procedures were standardized making it easier for biologists and analysts to navigate the data. Automation also helps saving time. The workflows are available through an API provided by the workflow management system. The parameters and input data are passed to the workflow engine that performs de-multiplexing, quality control, alignments, etc. Conclusions SMITH standardizes, automates, and speeds up sequencing workflows. Annotation of data with key-value pairs facilitates meta-analysis. PMID:25471934
SMITH: a LIMS for handling next-generation sequencing workflows.

PubMed

Venco, Francesco; Vaskin, Yuriy; Ceol, Arnaud; Muller, Heiko

2014-01-01

Life-science laboratories make increasing use of Next Generation Sequencing (NGS) for studying bio-macromolecules and their interactions. Array-based methods for measuring gene expression or protein-DNA interactions are being replaced by RNA-Seq and ChIP-Seq. Sequencing is generally performed by specialized facilities that have to keep track of sequencing requests, trace samples, ensure quality and make data available according to predefined privileges. An integrated tool helps to troubleshoot problems, to maintain a high quality standard, to reduce time and costs. Commercial and non-commercial tools called LIMS (Laboratory Information Management Systems) are available for this purpose. However, they often come at prohibitive cost and/or lack the flexibility and scalability needed to adjust seamlessly to the frequently changing protocols employed. In order to manage the flow of sequencing data produced at the Genomic Unit of the Italian Institute of Technology (IIT), we developed SMITH (Sequencing Machine Information Tracking and Handling). SMITH is a web application with a MySQL server at the backend. Wet-lab scientists of the Centre for Genomic Science and database experts from the Politecnico of Milan in the context of a Genomic Data Model Project developed SMITH. The data base schema stores all the information of an NGS experiment, including the descriptions of all protocols and algorithms used in the process. Notably, an attribute-value table allows associating an unconstrained textual description to each sample and all the data produced afterwards. This method permits the creation of metadata that can be used to search the database for specific files as well as for statistical analyses. SMITH runs automatically and limits direct human interaction mainly to administrative tasks. SMITH data-delivery procedures were standardized making it easier for biologists and analysts to navigate the data. Automation also helps saving time. The workflows are available through an API provided by the workflow management system. The parameters and input data are passed to the workflow engine that performs de-multiplexing, quality control, alignments, etc. SMITH standardizes, automates, and speeds up sequencing workflows. Annotation of data with key-value pairs facilitates meta-analysis.
Toward a geoinformatics framework for understanding the social and biophysical influences on urban nutrient pollution due to residential impervious service connectivity

NASA Astrophysics Data System (ADS)

Miles, B.; Band, L. E.

2012-12-01

Water sustainability has been recognized as a fundamental problem of science whose solution relies in part on high-performance computing. Stormwater management is a major concern of urban sustainability. Understanding interactions between urban landcover and stormwater nutrient pollution requires consideration of fine-scale residential stormwater management, which in turn requires high-resolution LIDAR and landcover data not provided through national spatial data infrastructure, as well as field observation at the household scale. The objectives of my research are twofold: (1) advance understanding of the relationship between residential stormwater management practices and the export of nutrient pollution from stormwater in urbanized ecosystems; and (2) improve the informatics workflows used in community ecohydrology modeling as applied to heterogeneous urbanized ecosystems. In support of these objectives, I present preliminary results from initial work to: (1) develop an ecohydrology workflow platform that automates data preparation while maintaining data provenance and model metadata to yield reproducible workflows and support model benchmarking; (2) perform field observation of existing patterns of residential rooftop impervious surface connectivity to stormwater networks; and (3) develop Regional Hydro-Ecological Simulation System (RHESSys) models for watersheds in Baltimore, MD (as part of the Baltimore Ecosystem Study (BES) NSF Long-Term Ecological Research (LTER) site) and Durham, NC (as part of the NSF Urban Long-Term Research Area (ULTRA) program); these models will be used to simulate nitrogen loading resulting from both baseline residential rooftop impervious connectivity and for disconnection scenarios (e.g. roof drainage to lawn v. engineered rain garden, upslope v. riparian). This research builds on work done as part of the NSF EarthCube Layered Architecture Concept Award where a RHESSys workflow is being implemented in an iRODS (integrated Rule-Oriented Data System) environment. Modeling the ecohydrology of urban ecosystems in a reliable and reproducible manner requires a flexible scientific workflow platform that allows rapid prototyping with large-scale spatial datasets and model refinement integrating expert knowledge with local datasets and household surveys.

A data management and publication workflow for a large-scale, heterogeneous sensor network.

PubMed

Jones, Amber Spackman; Horsburgh, Jeffery S; Reeder, Stephanie L; Ramírez, Maurier; Caraballo, Juan

2015-06-01

It is common for hydrology researchers to collect data using in situ sensors at high frequencies, for extended durations, and with spatial distributions that produce data volumes requiring infrastructure for data storage, management, and sharing. The availability and utility of these data in addressing scientific questions related to water availability, water quality, and natural disasters relies on effective cyberinfrastructure that facilitates transformation of raw sensor data into usable data products. It also depends on the ability of researchers to share and access the data in useable formats. In this paper, we describe a data management and publication workflow and software tools for research groups and sites conducting long-term monitoring using in situ sensors. Functionality includes the ability to track monitoring equipment inventory and events related to field maintenance. Linking this information to the observational data is imperative in ensuring the quality of sensor-based data products. We present these tools in the context of a case study for the innovative Urban Transitions and Aridregion Hydrosustainability (iUTAH) sensor network. The iUTAH monitoring network includes sensors at aquatic and terrestrial sites for continuous monitoring of common meteorological variables, snow accumulation and melt, soil moisture, surface water flow, and surface water quality. We present the overall workflow we have developed for effectively transferring data from field monitoring sites to ultimate end-users and describe the software tools we have deployed for storing, managing, and sharing the sensor data. These tools are all open source and available for others to use.
Purdue ionomics information management system. An integrated functional genomics platform.

PubMed

Baxter, Ivan; Ouzzani, Mourad; Orcun, Seza; Kennedy, Brad; Jandhyala, Shrinivas S; Salt, David E

2007-02-01

The advent of high-throughput phenotyping technologies has created a deluge of information that is difficult to deal with without the appropriate data management tools. These data management tools should integrate defined workflow controls for genomic-scale data acquisition and validation, data storage and retrieval, and data analysis, indexed around the genomic information of the organism of interest. To maximize the impact of these large datasets, it is critical that they are rapidly disseminated to the broader research community, allowing open access for data mining and discovery. We describe here a system that incorporates such functionalities developed around the Purdue University high-throughput ionomics phenotyping platform. The Purdue Ionomics Information Management System (PiiMS) provides integrated workflow control, data storage, and analysis to facilitate high-throughput data acquisition, along with integrated tools for data search, retrieval, and visualization for hypothesis development. PiiMS is deployed as a World Wide Web-enabled system, allowing for integration of distributed workflow processes and open access to raw data for analysis by numerous laboratories. PiiMS currently contains data on shoot concentrations of P, Ca, K, Mg, Cu, Fe, Zn, Mn, Co, Ni, B, Se, Mo, Na, As, and Cd in over 60,000 shoot tissue samples of Arabidopsis (Arabidopsis thaliana), including ethyl methanesulfonate, fast-neutron and defined T-DNA mutants, and natural accession and populations of recombinant inbred lines from over 800 separate experiments, representing over 1,000,000 fully quantitative elemental concentrations. PiiMS is accessible at www.purdue.edu/dp/ionomics.
A Foundation for Enterprise Imaging: HIMSS-SIIM Collaborative White Paper.

PubMed

Roth, Christopher J; Lannum, Louis M; Persons, Kenneth R

2016-10-01

Care providers today routinely obtain valuable clinical multimedia with mobile devices, scope cameras, ultrasound, and many other modalities at the point of care. Image capture and storage workflows may be heterogeneous across an enterprise, and as a result, they often are not well incorporated in the electronic health record. Enterprise Imaging refers to a set of strategies, initiatives, and workflows implemented across a healthcare enterprise to consistently and optimally capture, index, manage, store, distribute, view, exchange, and analyze all clinical imaging and multimedia content to enhance the electronic health record. This paper is intended to introduce Enterprise Imaging as an important initiative to clinical and informatics leadership, and outline its key elements of governance, strategy, infrastructure, common multimedia content, acquisition workflows, enterprise image viewers, and image exchange services.
Advanced Micro Grid Energy Management Coupled with Integrated Volt/VAR Control for Improved Energy Efficiency, Energy Security, and Power Quality at DoD Installations

DTIC Science & Technology

2016-10-28

assumptions. List of Assumptions: Price of electrical energy : $0.07/kWh flat rate for energy at the base Price of peak power: $15/MW peak power...EW-201147) Advanced Micro-Grid Energy Management Coupled with Integrated Volt/VAR Control for Improved Energy Efficiency, Energy Security, and...12-C-0002 5b. GRANT NUMBER Advanced Micro-Grid Energy Management Coupled with Integrated Volt/VAR Control for Improved Energy Efficiency, Energy
Systematic Redaction for Neuroimage Data

PubMed Central

Matlock, Matt; Schimke, Nakeisha; Kong, Liang; Macke, Stephen; Hale, John

2013-01-01

In neuroscience, collaboration and data sharing are undermined by concerns over the management of protected health information (PHI) and personal identifying information (PII) in neuroimage datasets. The HIPAA Privacy Rule mandates measures for the preservation of subject privacy in neuroimaging studies. Unfortunately for the researcher, the management of information privacy is a burdensome task. Wide scale data sharing of neuroimages is challenging for three primary reasons: (i) A dearth of tools to systematically expunge PHI/PII from neuroimage data sets, (ii) a facility for tracking patient identities in redacted datasets has not been produced, and (iii) a sanitization workflow remains conspicuously absent. This article describes the XNAT Redaction Toolkit—an integrated redaction workflow which extends a popular neuroimage data management toolkit to remove PHI/PII from neuroimages. Quickshear defacing is also presented as a complementary technique for deidentifying the image data itself. Together, these tools improve subject privacy through systematic removal of PII/PHI. PMID:24179597
Software Project Management and Measurement on the World-Wide-Web (WWW)

NASA Technical Reports Server (NTRS)

Callahan, John; Ramakrishnan, Sudhaka

1996-01-01

We briefly describe a system for forms-based, work-flow management that helps members of a software development team overcome geographical barriers to collaboration. Our system, called the Web Integrated Software Environment (WISE), is implemented as a World-Wide-Web service that allows for management and measurement of software development projects based on dynamic analysis of change activity in the workflow. WISE tracks issues in a software development process, provides informal communication between the users with different roles, supports to-do lists, and helps in software process improvement. WISE minimizes the time devoted to metrics collection and analysis by providing implicit delivery of messages between users based on the content of project documents. The use of a database in WISE is hidden from the users who view WISE as maintaining a personal 'to-do list' of tasks related to the many projects on which they may play different roles.
Grid commerce, market-driven G-negotiation, and Grid resource management.

PubMed

Sim, Kwang Mong

2006-12-01

Although the management of resources is essential for realizing a computational grid, providing an efficient resource allocation mechanism is a complex undertaking. Since Grid providers and consumers may be independent bodies, negotiation among them is necessary. The contribution of this paper is showing that market-driven agents (MDAs) are appropriate tools for Grid resource negotiation. MDAs are e-negotiation agents designed with the flexibility of: 1) making adjustable amounts of concession taking into account market rivalry, outside options, and time preferences and 2) relaxing bargaining terms in the face of intense pressure. A heterogeneous testbed consisting of several types of e-negotiation agents to simulate a Grid computing environment was developed. It compares the performance of MDAs against other e-negotiation agents (e.g., Kasbah) in a Grid-commerce environment. Empirical results show that MDAs generally achieve: 1) higher budget efficiencies in many market situations than other e-negotiation agents in the testbed and 2) higher success rates in acquiring Grid resources under high Grid loadings.
[Implementation of modern operating room management -- experiences made at an university hospital].

PubMed

Hensel, M; Wauer, H; Bloch, A; Volk, T; Kox, W J; Spies, C

2005-07-01

Caused by structural changes in health care the general need for cost control is evident for all hospitals. As operating room is one of the most cost-intensive sectors in a hospital, optimisation of workflow processes in this area is of particular interest for health care providers. While modern operating room management is established in several clinics yet, others are less prepared for economic challenges. Therefore, the operating room statute of the Charité university hospital useful for other hospitals to develop an own concept is presented. In addition, experiences made with implementation of new management structures are described and results obtained over the last 5 years are reported. Whereas the total number of operation procedures increased by 15 %, the operating room utilization increased more markedly in terms of time and cases. Summarizing the results, central operating room management has been proved to be an effective tool to increase the efficiency of workflow processes in the operating room.
Smart Grid Constraint Violation Management for Balancing and Regulating Purposes

DOE PAGES

Bhattarai, Bishnu; Kouzelis, Konstantinos; Mendaza, Iker; ...

2017-03-29

The gradual active load penetration in low voltage distribution grids is expected to challenge their network capacity in the near future. Distribution system operators should for this reason resort to either costly grid reinforcements or to demand side management mechanisms. Since demand side management implementation is usually cheaper, it is also the favorable solution. To this end, this article presents a framework for handling grid limit violations, both voltage and current, to ensure a secure and qualitative operation of the distribution grid. This framework consists of two steps, namely a proactive centralized and subsequently a reactive decentralized control scheme. Themore » former is employed to balance the one hour ahead load while the latter aims at regulating the consumption in real-time. In both cases, the importance of fair use of electricity demand flexibility is emphasized. Thus, it is demonstrated that this methodology aids in keeping the grid status within preset limits while utilizing flexibility from all flexibility participants.« less
Investment risk evaluation and countermeasures suggestions of Power Grid Corp under the background of electric power reform

NASA Astrophysics Data System (ADS)

Yang, Chunhui; Su, Zhixiong; Wang, Yuqing; Liu, Yiqun; Qi, Yongwei

2017-03-01

Investment management is an important part of Power Grid Corp. The new electricity reform put forward the general idea of "three release, three strengthening, one independence", which brings new risks to the investment management of the Power Grid Corp. First, the paper analyzes the new risks faced by the Power Grid Corp investment under the background of the electricity reform. Second, the AHP-Fuzzy evaluation model of investment risk of Power Grid Corp is established, and taking Shenzhen Power Supply Bureau as an example, the paper evaluated its risk level of investment plan in 2017. Finally, in the context of the electricity reform, the strategy of the Power Grid Corp's investment risk is proposed.
The SCEC Community Modeling Environment(SCEC/CME): A Collaboratory for Seismic Hazard Analysis

NASA Astrophysics Data System (ADS)

Maechling, P. J.; Jordan, T. H.; Minster, J. B.; Moore, R.; Kesselman, C.

2005-12-01

The SCEC Community Modeling Environment (SCEC/CME) Project is an NSF-supported Geosciences/IT partnership that is actively developing an advanced information infrastructure for system-level earthquake science in Southern California. This partnership includes SCEC, USC's Information Sciences Institute (ISI), the San Diego Supercomputer Center (SDSC), the Incorporated Institutions for Research in Seismology (IRIS), and the U.S. Geological Survey. The goal of the SCEC/CME is to develop seismological applications and information technology (IT) infrastructure to support the development of Seismic Hazard Analysis (SHA) programs and other geophysical simulations. The SHA application programs developed on the Project include a Probabilistic Seismic Hazard Analysis system called OpenSHA. OpenSHA computational elements that are currently available include a collection of attenuation relationships, and several Earthquake Rupture Forecasts (ERFs). Geophysicists in the collaboration have also developed Anelastic Wave Models (AWMs) using both finite-difference and finite-element approaches. Earthquake simulations using these codes have been run for a variety of earthquake sources. Rupture Dynamic Model (RDM) codes have also been developed that simulate friction-based fault slip. The SCEC/CME collaboration has also developed IT software and hardware infrastructure to support the development, execution, and analysis of these SHA programs. To support computationally expensive simulations, we have constructed a grid-based scientific workflow system. Using the SCEC grid, project collaborators can submit computations from the SCEC/CME servers to High Performance Computers at USC and TeraGrid High Performance Computing Centers. Data generated and archived by the SCEC/CME is stored in a digital library system, the Storage Resource Broker (SRB). This system provides a robust and secure system for maintaining the association between the data seta and their metadata. To provide an easy-to-use system for constructing SHA computations, a browser-based workflow assembly web portal has been developed. Users can compose complex SHA calculations, specifying SCEC/CME data sets as inputs to calculations, and calling SCEC/CME computational programs to process the data and the output. Knowledge-based software tools have been implemented that utilize ontological descriptions of SHA software and data can validate workflows created with this pathway assembly tool. Data visualization software developed by the collaboration supports analysis and validation of data sets. Several programs have been developed to visualize SCEC/CME data including GMT-based map making software for PSHA codes, 4D wavefield propagation visualization software based on OpenGL, and 3D Geowall-based visualization of earthquakes, faults, and seismic wave propagation. The SCEC/CME Project also helps to sponsor the SCEC UseIT Intern program. The UseIT Intern Program provides research opportunities in both Geosciences and Information Technology to undergraduate students in a variety of fields. The UseIT group has developed a 3D data visualization tool, called SCEC-VDO, as a part of this undergraduate research program.
Task–Technology Fit of Video Telehealth for Nurses in an Outpatient Clinic Setting

PubMed Central

Finkelstein, Stanley M.

2014-01-01

Abstract Background: Incorporating telehealth into outpatient care delivery supports management of consumer health between clinic visits. Task–technology fit is a framework for understanding how technology helps and/or hinders a person during work processes. Evaluating the task–technology fit of video telehealth for personnel working in a pediatric outpatient clinic and providing care between clinic visits ensures the information provided matches the information needed to support work processes. Materials and Methods: The workflow of advanced practice registered nurse (APRN) care coordination provided via telephone and video telehealth was described and measured using a mixed-methods workflow analysis protocol that incorporated cognitive ethnography and time–motion study. Qualitative and quantitative results were merged and analyzed within the task–technology fit framework to determine the workflow fit of video telehealth for APRN care coordination. Results: Incorporating video telehealth into APRN care coordination workflow provided visual information unavailable during telephone interactions. Despite additional tasks and interactions needed to obtain the visual information, APRN workflow efficiency, as measured by time, was not significantly changed. Analyzed within the task–technology fit framework, the increased visual information afforded by video telehealth supported the assessment and diagnostic information needs of the APRN. Conclusions: Telehealth must provide the right information to the right clinician at the right time. Evaluating task–technology fit using a mixed-methods protocol ensured rigorous analysis of fit within work processes and identified workflows that benefit most from the technology. PMID:24841219
A Mixed-Methods Research Framework for Healthcare Process Improvement.

PubMed

Bastian, Nathaniel D; Munoz, David; Ventura, Marta

2016-01-01

The healthcare system in the United States is spiraling out of control due to ever-increasing costs without significant improvements in quality, access to care, satisfaction, and efficiency. Efficient workflow is paramount to improving healthcare value while maintaining the utmost standards of patient care and provider satisfaction in high stress environments. This article provides healthcare managers and quality engineers with a practical healthcare process improvement framework to assess, measure and improve clinical workflow processes. The proposed mixed-methods research framework integrates qualitative and quantitative tools to foster the improvement of processes and workflow in a systematic way. The framework consists of three distinct phases: 1) stakeholder analysis, 2a) survey design, 2b) time-motion study, and 3) process improvement. The proposed framework is applied to the pediatric intensive care unit of the Penn State Hershey Children's Hospital. The implementation of this methodology led to identification and categorization of different workflow tasks and activities into both value-added and non-value added in an effort to provide more valuable and higher quality patient care. Based upon the lessons learned from the case study, the three-phase methodology provides a better, broader, leaner, and holistic assessment of clinical workflow. The proposed framework can be implemented in various healthcare settings to support continuous improvement efforts in which complexity is a daily element that impacts workflow. We proffer a general methodology for process improvement in a healthcare setting, providing decision makers and stakeholders with a useful framework to help their organizations improve efficiency. Published by Elsevier Inc.
Semi-automted analysis of high-resolution aerial images to quantify docks in Upper Midwest glacial lakes

USGS Publications Warehouse

Beck, Marcus W.; Vondracek, Bruce C.; Hatch, Lorin K.; Vinje, Jason

2013-01-01

Lake resources can be negatively affected by environmental stressors originating from multiple sources and different spatial scales. Shoreline development, in particular, can negatively affect lake resources through decline in habitat quality, physical disturbance, and impacts on fisheries. The development of remote sensing techniques that efficiently characterize shoreline development in a regional context could greatly improve management approaches for protecting and restoring lake resources. The goal of this study was to develop an approach using high-resolution aerial photographs to quantify and assess docks as indicators of shoreline development. First, we describe a dock analysis workflow that can be used to quantify the spatial extent of docks using aerial images. Our approach incorporates pixel-based classifiers with object-based techniques to effectively analyze high-resolution digital imagery. Second, we apply the analysis workflow to quantify docks for 4261 lakes managed by the Minnesota Department of Natural Resources. Overall accuracy of the analysis results was 98.4% (87.7% based on ) after manual post-processing. The analysis workflow was also 74% more efficient than the time required for manual digitization of docks. These analyses have immediate relevance for resource planning in Minnesota, whereas the dock analysis workflow could be used to quantify shoreline development in other regions with comparable imagery. These data can also be used to better understand the effects of shoreline development on aquatic resources and to evaluate the effects of shoreline development relative to other stressors.
[Applications of the hospital statistics management system].

PubMed

Zhai, Hong; Ren, Yong; Liu, Jing; Li, You-Zhang; Ma, Xiao-Long; Jiao, Tao-Tao

2008-01-01

The Hospital Statistics Management System is built on an Office Automation Platform of Shandong provincial hospital system. Its workflow, role and popedom technologies are used to standardize and optimize the management program of statistics in the total quality control of hospital statistics. The system's applications have combined the office automation platform with the statistics management in a hospital and this provides a practical example of a modern hospital statistics management model.
A knowledge-based decision support system in bioinformatics: an application to protein complex extraction

PubMed Central

2013-01-01

Background We introduce a Knowledge-based Decision Support System (KDSS) in order to face the Protein Complex Extraction issue. Using a Knowledge Base (KB) coding the expertise about the proposed scenario, our KDSS is able to suggest both strategies and tools, according to the features of input dataset. Our system provides a navigable workflow for the current experiment and furthermore it offers support in the configuration and running of every processing component of that workflow. This last feature makes our system a crossover between classical DSS and Workflow Management Systems. Results We briefly present the KDSS' architecture and basic concepts used in the design of the knowledge base and the reasoning component. The system is then tested using a subset of Saccharomyces cerevisiae Protein-Protein interaction dataset. We used this subset because it has been well studied in literature by several research groups in the field of complex extraction: in this way we could easily compare the results obtained through our KDSS with theirs. Our system suggests both a preprocessing and a clustering strategy, and for each of them it proposes and eventually runs suited algorithms. Our system's final results are then composed of a workflow of tasks, that can be reused for other experiments, and the specific numerical results for that particular trial. Conclusions The proposed approach, using the KDSS' knowledge base, provides a novel workflow that gives the best results with regard to the other workflows produced by the system. This workflow and its numeric results have been compared with other approaches about PPI network analysis found in literature, offering similar results. PMID:23368995
Leveraging workflow control patterns in the domain of clinical practice guidelines.

PubMed

Kaiser, Katharina; Marcos, Mar

2016-02-10

Clinical practice guidelines (CPGs) include recommendations describing appropriate care for the management of patients with a specific clinical condition. A number of representation languages have been developed to support executable CPGs, with associated authoring/editing tools. Even with tool assistance, authoring of CPG models is a labor-intensive task. We aim at facilitating the early stages of CPG modeling task. In this context, we propose to support the authoring of CPG models based on a set of suitable procedural patterns described in an implementation-independent notation that can be then semi-automatically transformed into one of the alternative executable CPG languages. We have started with the workflow control patterns which have been identified in the fields of workflow systems and business process management. We have analyzed the suitability of these patterns by means of a qualitative analysis of CPG texts. Following our analysis we have implemented a selection of workflow patterns in the Asbru and PROforma CPG languages. As implementation-independent notation for the description of patterns we have chosen BPMN 2.0. Finally, we have developed XSLT transformations to convert the BPMN 2.0 version of the patterns into the Asbru and PROforma languages. We showed that although a significant number of workflow control patterns are suitable to describe CPG procedural knowledge, not all of them are applicable in the context of CPGs due to their focus on single-patient care. Moreover, CPGs may require additional patterns not included in the set of workflow control patterns. We also showed that nearly all the CPG-suitable patterns can be conveniently implemented in the Asbru and PROforma languages. Finally, we demonstrated that individual patterns can be semi-automatically transformed from a process specification in BPMN 2.0 to executable implementations in these languages. We propose a pattern and transformation-based approach for the development of CPG models. Such an approach can form the basis of a valid framework for the authoring of CPG models. The identification of adequate patterns and the implementation of transformations to convert patterns from a process specification into different executable implementations are the first necessary steps for our approach.
OGC and Grid Interoperability in enviroGRIDS Project

NASA Astrophysics Data System (ADS)

Gorgan, Dorian; Rodila, Denisa; Bacu, Victor; Giuliani, Gregory; Ray, Nicolas

2010-05-01

EnviroGRIDS (Black Sea Catchment Observation and Assessment System supporting Sustainable Development) [1] is a 4-years FP7 Project aiming to address the subjects of ecologically unsustainable development and inadequate resource management. The project develops a Spatial Data Infrastructure of the Black Sea Catchment region. The geospatial technologies offer very specialized functionality for Earth Science oriented applications as well as the Grid oriented technology that is able to support distributed and parallel processing. One challenge of the enviroGRIDS project is the interoperability between geospatial and Grid infrastructures by providing the basic and the extended features of the both technologies. The geospatial interoperability technology has been promoted as a way of dealing with large volumes of geospatial data in distributed environments through the development of interoperable Web service specifications proposed by the Open Geospatial Consortium (OGC), with applications spread across multiple fields but especially in Earth observation research. Due to the huge volumes of data available in the geospatial domain and the additional introduced issues (data management, secure data transfer, data distribution and data computation), the need for an infrastructure capable to manage all those problems becomes an important aspect. The Grid promotes and facilitates the secure interoperations of geospatial heterogeneous distributed data within a distributed environment, the creation and management of large distributed computational jobs and assures a security level for communication and transfer of messages based on certificates. This presentation analysis and discusses the most significant use cases for enabling the OGC Web services interoperability with the Grid environment and focuses on the description and implementation of the most promising one. In these use cases we give a special attention to issues such as: the relations between computational grid and the OGC Web service protocols, the advantages offered by the Grid technology - such as providing a secure interoperability between the distributed geospatial resource -and the issues introduced by the integration of distributed geospatial data in a secure environment: data and service discovery, management, access and computation. enviroGRIDS project proposes a new architecture which allows a flexible and scalable approach for integrating the geospatial domain represented by the OGC Web services with the Grid domain represented by the gLite middleware. The parallelism offered by the Grid technology is discussed and explored at the data level, management level and computation level. The analysis is carried out for OGC Web service interoperability in general but specific details are emphasized for Web Map Service (WMS), Web Feature Service (WFS), Web Coverage Service (WCS), Web Processing Service (WPS) and Catalog Service for Web (CSW). Issues regarding the mapping and the interoperability between the OGC and the Grid standards and protocols are analyzed as they are the base in solving the communication problems between the two environments: grid and geospatial. The presetation mainly highlights how the Grid environment and Grid applications capabilities can be extended and utilized in geospatial interoperability. Interoperability between geospatial and Grid infrastructures provides features such as the specific geospatial complex functionality and the high power computation and security of the Grid, high spatial model resolution and geographical area covering, flexible combination and interoperability of the geographical models. According with the Service Oriented Architecture concepts and requirements of interoperability between geospatial and Grid infrastructures each of the main functionality is visible from enviroGRIDS Portal and consequently, by the end user applications such as Decision Maker/Citizen oriented Applications. The enviroGRIDS portal is the single way of the user to get into the system and the portal faces a unique style of the graphical user interface. Main reference for further information: [1] enviroGRIDS Project, http://www.envirogrids.net/
Model medication management process in Australian nursing homes using business process modeling.

PubMed

Qian, Siyu; Yu, Ping

2013-01-01

One of the reasons for end user avoidance or rejection to use health information systems is poor alignment of the system with healthcare workflow, likely causing by system designers' lack of thorough understanding about healthcare process. Therefore, understanding the healthcare workflow is the essential first step for the design of optimal technologies that will enable care staff to complete the intended tasks faster and better. The often use of multiple or "high risk" medicines by older people in nursing homes has the potential to increase medication error rate. To facilitate the design of information systems with most potential to improve patient safety, this study aims to understand medication management process in nursing homes using business process modeling method. The paper presents study design and preliminary findings from interviewing two registered nurses, who were team leaders in two nursing homes. Although there were subtle differences in medication management between the two homes, major medication management activities were similar. Further field observation will be conducted. Based on the data collected from observations, an as-is process model for medication management will be developed.
AGILE/GRID Science Alert Monitoring System: The Workflow and the Crab Flare Case

NASA Astrophysics Data System (ADS)

Bulgarelli, A.; Trifoglio, M.; Gianotti, F.; Tavani, M.; Conforti, V.; Parmiggiani, N.

2013-10-01

During the first five years of the AGILE mission we have observed many gamma-ray transients of Galactic and extragalactic origin. A fast reaction to unexpected transient events is a crucial part of the AGILE monitoring program, because the follow-up of astrophysical transients is a key point for this space mission. We present the workflow and the software developed by the AGILE Team to perform the automatic analysis for the detection of gamma-ray transients. In addition, an App for iPhone will be released enabling the Team to access the monitoring system through mobile phones. In 2010 September the science alert monitoring system presented in this paper recorded a transient phenomena from the Crab Nebula, generating an automated alert sent via email and SMS two hours after the end of an AGILE satellite orbit, i.e. two hours after the Crab flare itself: for this discovery AGILE won the 2012 Bruno Rossi prize. The design of this alert system is maximized to reach the maximum speed, and in this, as in many other cases, AGILE has demonstrated that the reaction speed of the monitoring system is crucial for the scientific return of the mission.

ASCEM Data Brower (ASCEMDB) v0.8

DOE Office of Scientific and Technical Information (OSTI.GOV)

ROMOSAN, ALEXANDRU

Data management tool designed for the Advanced Simulation Capability for Environmental Management (ASCEM) framework. Distinguishing features of this gateway include: (1) handling of complex geometry data, (2) advance selection mechanism, (3) state of art rendering of spatiotemporal data records, and (4) seamless integration with a distributed workflow engine.
Electronic Resource Management 2.0: Using Web 2.0 Technologies as Cost-Effective Alternatives to an Electronic Resource Management System

ERIC Educational Resources Information Center

Murray, Adam

2008-01-01

Designed to assist with the management of e-resources, electronic resource management (ERM) systems are time- and fund-consuming to purchase and maintain. Questions of system compatibility, data population, and workflow design/redesign can be difficult to answer; sometimes those answers are not what we'd prefer to hear. The two primary functions…
The Internet of things and Smart Grid

NASA Astrophysics Data System (ADS)

Li, Biao; Lv, Sen; Pan, Qing

2018-02-01

The Internet of things and smart grid are the frontier of information and Industry. The combination of Internet of things and smart grid will greatly enhance the ability of smart grid information and communication support. The key technologies of the Internet of things will be applied to the smart grid, and the grid operation and management information perception service centre will be built to support the commanding heights of the world’s smart grid.
Grids for Dummies: Featuring Earth Science Data Mining Application

NASA Technical Reports Server (NTRS)

Hinke, Thomas H.

2002-01-01

This viewgraph presentation discusses the concept and advantages of linking computers together into data grids, an emerging technology for managing information across institutions, and potential users of data grids. The logistics of access to a grid, including the use of the World Wide Web to access grids, and security concerns are also discussed. The potential usefulness of data grids to the earth science community is also discussed, as well as the Global Grid Forum, and other efforts to establish standards for data grids.
Sharing Data and Analytical Resources Securely in a Biomedical Research Grid Environment

PubMed Central

Langella, Stephen; Hastings, Shannon; Oster, Scott; Pan, Tony; Sharma, Ashish; Permar, Justin; Ervin, David; Cambazoglu, B. Barla; Kurc, Tahsin; Saltz, Joel

2008-01-01

Objectives To develop a security infrastructure to support controlled and secure access to data and analytical resources in a biomedical research Grid environment, while facilitating resource sharing among collaborators. Design A Grid security infrastructure, called Grid Authentication and Authorization with Reliably Distributed Services (GAARDS), is developed as a key architecture component of the NCI-funded cancer Biomedical Informatics Grid (caBIG™). The GAARDS is designed to support in a distributed environment 1) efficient provisioning and federation of user identities and credentials; 2) group-based access control support with which resource providers can enforce policies based on community accepted groups and local groups; and 3) management of a trust fabric so that policies can be enforced based on required levels of assurance. Measurements GAARDS is implemented as a suite of Grid services and administrative tools. It provides three core services: Dorian for management and federation of user identities, Grid Trust Service for maintaining and provisioning a federated trust fabric within the Grid environment, and Grid Grouper for enforcing authorization policies based on both local and Grid-level groups. Results The GAARDS infrastructure is available as a stand-alone system and as a component of the caGrid infrastructure. More information about GAARDS can be accessed at http://www.cagrid.org. Conclusions GAARDS provides a comprehensive system to address the security challenges associated with environments in which resources may be located at different sites, requests to access the resources may cross institutional boundaries, and user credentials are created, managed, revoked dynamically in a de-centralized manner. PMID:18308979
Potential of Laboratory Execution Systems (LESs) to Simplify the Application of Business Process Management Systems (BPMSs) in Laboratory Automation.

PubMed

Neubert, Sebastian; Göde, Bernd; Gu, Xiangyu; Stoll, Norbert; Thurow, Kerstin

2017-04-01

Modern business process management (BPM) is increasingly interesting for laboratory automation. End-to-end workflow automation and improved top-level systems integration for information technology (IT) and automation systems are especially prominent objectives. With the ISO Standard Business Process Model and Notation (BPMN) 2.X, a system-independent and interdisciplinary accepted graphical process control notation is provided, allowing process analysis, while also being executable. The transfer of BPM solutions to structured laboratory automation places novel demands, for example, concerning the real-time-critical process and systems integration. The article discusses the potential of laboratory execution systems (LESs) for an easier implementation of the business process management system (BPMS) in hierarchical laboratory automation. In particular, complex application scenarios, including long process chains based on, for example, several distributed automation islands and mobile laboratory robots for a material transport, are difficult to handle in BPMSs. The presented approach deals with the displacement of workflow control tasks into life science specialized LESs, the reduction of numerous different interfaces between BPMSs and subsystems, and the simplification of complex process modelings. Thus, the integration effort for complex laboratory workflows can be significantly reduced for strictly structured automation solutions. An example application, consisting of a mixture of manual and automated subprocesses, is demonstrated by the presented BPMS-LES approach.
Integrating the Allen Brain Institute Cell Types Database into Automated Neuroscience Workflow.

PubMed

Stockton, David B; Santamaria, Fidel

2017-10-01

We developed software tools to download, extract features, and organize the Cell Types Database from the Allen Brain Institute (ABI) in order to integrate its whole cell patch clamp characterization data into the automated modeling/data analysis cycle. To expand the potential user base we employed both Python and MATLAB. The basic set of tools downloads selected raw data and extracts cell, sweep, and spike features, using ABI's feature extraction code. To facilitate data manipulation we added a tool to build a local specialized database of raw data plus extracted features. Finally, to maximize automation, we extended our NeuroManager workflow automation suite to include these tools plus a separate investigation database. The extended suite allows the user to integrate ABI experimental and modeling data into an automated workflow deployed on heterogeneous computer infrastructures, from local servers, to high performance computing environments, to the cloud. Since our approach is focused on workflow procedures our tools can be modified to interact with the increasing number of neuroscience databases being developed to cover all scales and properties of the nervous system.
Efficient Workflows for Curation of Heterogeneous Data Supporting Modeling of U-Nb Alloy Aging

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ward, Logan Timothy; Hackenberg, Robert Errol

These are slides from a presentation summarizing a graduate research associate's summer project. The following topics are covered in these slides: data challenges in materials, aging in U-Nb Alloys, Building an Aging Model, Different Phase Trans. in U-Nb, the Challenge, Storing Materials Data, Example Data Source, Organizing Data: What is a Schema?, What does a "XML Schema" look like?, Our Data Schema: Nice and Simple, Storing Data: Materials Data Curation System (MDCS), Problem with MDCS: Slow Data Entry, Getting Literature into MDCS, Staging Data in Excel Document, Final Result: MDCS Records, Analyzing Image Data, Process for Making TTT Diagram, Bottleneckmore » Number 1: Image Analysis, Fitting a TTP Boundary, Fitting a TTP Curve: Comparable Results, How Does it Compare to Our Data?, Image Analysis Workflow, Curating Hardness Records, Hardness Data: Two Key Decisions, Before Peak Age? - Automation, Interactive Viz, Which Transformation?, Microstructure-Informed Model, Tracking the Entire Process, General Problem with Property Models, Pinyon: Toolkit for Managing Model Creation, Tracking Individual Decisions, Jupyter: Docs and Code in One File, Hardness Analysis Workflow, Workflow for Aging Models, and conclusions.« less
Towards a Unified Architecture for Data-Intensive Seismology in VERCE

NASA Astrophysics Data System (ADS)

Klampanos, I.; Spinuso, A.; Trani, L.; Krause, A.; Garcia, C. R.; Atkinson, M.

2013-12-01

Modern seismology involves managing, storing and processing large datasets, typically geographically distributed across organisations. Performing computational experiments using these data generates more data, which in turn have to be managed, further analysed and frequently be made available within or outside the scientific community. As part of the EU-funded project VERCE (http://verce.eu), we research and develop a number of use-cases, interfacing technologies to satisfy the data-intensive requirements of modern seismology. Our solution seeks to support: (1) familiar programming environments to develop and execute experiments, in particular via Python/ObsPy, (2) a unified view of heterogeneous computing resources, public or private, through the adoption of workflows, (3) monitoring the experiments and validating the data products at varying granularities, via a comprehensive provenance system, (4) reproducibility of experiments and consistency in collaboration, via a shared registry of processing units and contextual metadata (computing resources, data, etc.) Here, we provide a brief account of these components and their roles in the proposed architecture. Our design integrates heterogeneous distributed systems, while allowing researchers to retain current practices and control data handling and execution via higher-level abstractions. At the core of our solution lies the workflow language Dispel. While Dispel can be used to express workflows at fine detail, it may also be used as part of meta- or job-submission workflows. User interaction can be provided through a visual editor or through custom applications on top of parameterisable workflows, which is the approach VERCE follows. According to our design, the scientist may use versions of Dispel/workflow processing elements offered by the VERCE library or override them introducing custom scientific code, using ObsPy. This approach has the advantage that, while the scientist uses a familiar tool, the resulting workflow can be executed on a number of underlying stream-processing engines, such as STORM or OGSA-DAI, transparently. While making efficient use of arbitrarily distributed resources and large data-sets is of priority, such processing requires adequate provenance tracking and monitoring. Hiding computation and orchestration details via a workflow system, allows us to embed provenance harvesting where appropriate without impeding the user's regular working patterns. Our provenance model is based on the W3C PROV standard and can provide information of varying granularity regarding execution, systems and data consumption/production. A video demonstrating a prototype provenance exploration tool can be found at http://bit.ly/15t0Fz0. Keeping experimental methodology and results open and accessible, as well as encouraging reproducibility and collaboration, is of central importance to modern science. As our users are expected to be based at different geographical locations, to have access to different computing resources and to employ customised scientific codes, the use of a shared registry of workflow components, implementations, data and computing resources is critical.
Rapid Assessment of Contaminants and Interferences in Mass Spectrometry Data Using Skyline

NASA Astrophysics Data System (ADS)

Rardin, Matthew J.

2018-04-01

Proper sample preparation in proteomic workflows is essential to the success of modern mass spectrometry experiments. Complex workflows often require reagents which are incompatible with MS analysis (e.g., detergents) necessitating a variety of sample cleanup procedures. Efforts to understand and mitigate sample contamination are a continual source of disruption with respect to both time and resources. To improve the ability to rapidly assess sample contamination from a diverse array of sources, I developed a molecular library in Skyline for rapid extraction of contaminant precursor signals using MS1 filtering. This contaminant template library is easily managed and can be modified for a diverse array of mass spectrometry sample preparation workflows. Utilization of this template allows rapid assessment of sample integrity and indicates potential sources of contamination. [Figure not available: see fulltext.
Automation in an Addiction Treatment Research Clinic: Computerized Contingency Management, Ecological Momentary Assessment, and a Protocol Workflow System

PubMed Central

Vahabzadeh, Massoud; Lin, Jia-Ling; Mezghanni, Mustapha; Epstein, David H.; Preston, Kenzie L.

2009-01-01

Issues A challenge in treatment research is the necessity of adhering to protocol and regulatory strictures while maintaining flexibility to meet patients’ treatment needs and accommodate variations among protocols. Another challenge is the acquisition of large amounts of data in an occasionally hectic environment, along with provision of seamless methods for exporting, mining, and querying the data. Approach We have automated several major functions of our outpatient treatment research clinic for studies in drug abuse and dependence. Here we describe three such specialized applications: the Automated Contingency Management (ACM) system for delivery of behavioral interventions, the Transactional Electronic Diary (TED) system for management of behavioral assessments, and the Protocol Workflow System (PWS) for computerized workflow automation and guidance of each participant’s daily clinic activities. These modules are integrated into our larger information system to enable data sharing in real time among authorized staff. Key Findings ACM and TED have each permitted us to conduct research that was not previously possible. In addition, the time to data analysis at the end of each study is substantially shorter. With the implementation of the PWS, we have been able to manage a research clinic with an 80-patient capacity having an annual average of 18,000 patient-visits and 7,300 urine collections with a research staff of five. Finally, automated data management has considerably enhanced our ability to monitor and summarize participant-safety data for research oversight. Implications and conclusion When developed in consultation with end users, automation in treatment-research clinics can enable more efficient operations, better communication among staff, and expansions in research methods. PMID:19320669
Workflow-Based Software Development Environment

NASA Technical Reports Server (NTRS)

Izygon, Michel E.

2013-01-01

The Software Developer's Assistant (SDA) helps software teams more efficiently and accurately conduct or execute software processes associated with NASA mission-critical software. SDA is a process enactment platform that guides software teams through project-specific standards, processes, and procedures. Software projects are decomposed into all of their required process steps or tasks, and each task is assigned to project personnel. SDA orchestrates the performance of work required to complete all process tasks in the correct sequence. The software then notifies team members when they may begin work on their assigned tasks and provides the tools, instructions, reference materials, and supportive artifacts that allow users to compliantly perform the work. A combination of technology components captures and enacts any software process use to support the software lifecycle. It creates an adaptive workflow environment that can be modified as needed. SDA achieves software process automation through a Business Process Management (BPM) approach to managing the software lifecycle for mission-critical projects. It contains five main parts: TieFlow (workflow engine), Business Rules (rules to alter process flow), Common Repository (storage for project artifacts, versions, history, schedules, etc.), SOA (interface to allow internal, GFE, or COTS tools integration), and the Web Portal Interface (collaborative web environment
[Development of a medical equipment support information system based on PDF portable document].

PubMed

Cheng, Jiangbo; Wang, Weidong

2010-07-01

According to the organizational structure and management system of the hospital medical engineering support, integrate medical engineering support workflow to ensure the medical engineering data effectively, accurately and comprehensively collected and kept in electronic archives. Analyse workflow of the medical, equipment support work and record all work processes by the portable electronic document. Using XML middleware technology and SQL Server database, complete process management, data calculation, submission, storage and other functions. The practical application shows that the medical equipment support information system optimizes the existing work process, standardized and digital, automatic and efficient orderly and controllable. The medical equipment support information system based on portable electronic document can effectively optimize and improve hospital medical engineering support work, improve performance, reduce costs, and provide full and accurate digital data
Digital asset management.

PubMed

Humphrey, Clinton D; Tollefson, Travis T; Kriet, J David

2010-05-01

Facial plastic surgeons are accumulating massive digital image databases with the evolution of photodocumentation and widespread adoption of digital photography. Managing and maximizing the utility of these vast data repositories, or digital asset management (DAM), is a persistent challenge. Developing a DAM workflow that incorporates a file naming algorithm and metadata assignment will increase the utility of a surgeon's digital images. Copyright 2010 Elsevier Inc. All rights reserved.
Digital disruption ?syndromes.

PubMed

Sullivan, Clair; Staib, Andrew

2017-05-18

The digital transformation of hospitals in Australia is occurring rapidly in order to facilitate innovation and improve efficiency. Rapid transformation can cause temporary disruption of hospital workflows and staff as processes are adapted to the new digital workflows. The aim of this paper is to outline various types of digital disruption and some strategies for effective management. A large tertiary university hospital recently underwent a rapid, successful roll-out of an integrated electronic medical record (EMR). We observed this transformation and propose several digital disruption "syndromes" to assist with understanding and management during digital transformation: digital deceleration, digital transparency, digital hypervigilance, data discordance, digital churn and post-digital 'depression'. These 'syndromes' are defined and discussed in detail. Successful management of this temporary digital disruption is important to ensure a successful transition to a digital platform. What is known about this topic? Digital disruption is defined as the changes facilitated by digital technologies that occur at a pace and magnitude that disrupt established ways of value creation, social interactions, doing business and more generally our thinking. Increasing numbers of Australian hospitals are implementing digital solutions to replace traditional paper-based systems for patient care in order to create opportunities for improved care and efficiencies. Such large scale change has the potential to create transient disruption to workflows and staff. Managing this temporary disruption effectively is an important factor in the successful implementation of an EMR. What does this paper add? A large tertiary university hospital recently underwent a successful rapid roll-out of an integrated electronic medical record (EMR) to become Australia's largest digital hospital over a 3-week period. We observed and assisted with the management of several cultural, behavioural and operational forms of digital disruption which lead us to propose some digital disruption 'syndromes'. The definition and management of these 'syndromes' are discussed in detail. What are the implications for practitioners? Minimising the temporary effects of digital disruption in hospitals requires an understanding that these digital 'syndromes' are to be expected and actively managed during large-scale transformation.
Targeting safety improvements through identification of incident origination and detection in a near-miss incident learning system.

PubMed

Novak, Avrey; Nyflot, Matthew J; Ermoian, Ralph P; Jordan, Loucille E; Sponseller, Patricia A; Kane, Gabrielle M; Ford, Eric C; Zeng, Jing

2016-05-01

Radiation treatment planning involves a complex workflow that has multiple potential points of vulnerability. This study utilizes an incident reporting system to identify the origination and detection points of near-miss errors, in order to guide their departmental safety improvement efforts. Previous studies have examined where errors arise, but not where they are detected or applied a near-miss risk index (NMRI) to gauge severity. From 3/2012 to 3/2014, 1897 incidents were analyzed from a departmental incident learning system. All incidents were prospectively reviewed weekly by a multidisciplinary team and assigned a NMRI score ranging from 0 to 4 reflecting potential harm to the patient (no potential harm to potential critical harm). Incidents were classified by point of incident origination and detection based on a 103-step workflow. The individual steps were divided among nine broad workflow categories (patient assessment, imaging for radiation therapy (RT) planning, treatment planning, pretreatment plan review, treatment delivery, on-treatment quality management, post-treatment completion, equipment/software quality management, and other). The average NMRI scores of incidents originating or detected within each broad workflow area were calculated. Additionally, out of 103 individual process steps, 35 were classified as safety barriers, the process steps whose primary function is to catch errors. The safety barriers which most frequently detected incidents were identified and analyzed. Finally, the distance between event origination and detection was explored by grouping events by the number of broad workflow area events passed through before detection, and average NMRI scores were compared. Near-miss incidents most commonly originated within treatment planning (33%). However, the incidents with the highest average NMRI scores originated during imaging for RT planning (NMRI = 2.0, average NMRI of all events = 1.5), specifically during the documentation of patient positioning and localization of the patient. Incidents were most frequently detected during treatment delivery (30%), and incidents identified at this point also had higher severity scores than other workflow areas (NMRI = 1.6). Incidents identified during on-treatment quality management were also more severe (NMRI = 1.7), and the specific process steps of reviewing portal and CBCT images tended to catch highest-severity incidents. On average, safety barriers caught 46% of all incidents, most frequently at physics chart review, therapist's chart check, and the review of portal images; however, most of the incidents that pass through a particular safety barrier are not designed to be capable of being captured at that barrier. Incident learning systems can be used to assess the most common points of error origination and detection in radiation oncology. This can help tailor safety improvement efforts and target the highest impact portions of the workflow. The most severe near-miss events tend to originate during simulation, with the most severe near-miss events detected at the time of patient treatment. Safety barriers can be improved to allow earlier detection of near-miss events.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Novak, Avrey; Nyflot, Matthew J.; Ermoian, Ralph P.

Purpose: Radiation treatment planning involves a complex workflow that has multiple potential points of vulnerability. This study utilizes an incident reporting system to identify the origination and detection points of near-miss errors, in order to guide their departmental safety improvement efforts. Previous studies have examined where errors arise, but not where they are detected or applied a near-miss risk index (NMRI) to gauge severity. Methods: From 3/2012 to 3/2014, 1897 incidents were analyzed from a departmental incident learning system. All incidents were prospectively reviewed weekly by a multidisciplinary team and assigned a NMRI score ranging from 0 to 4 reflectingmore » potential harm to the patient (no potential harm to potential critical harm). Incidents were classified by point of incident origination and detection based on a 103-step workflow. The individual steps were divided among nine broad workflow categories (patient assessment, imaging for radiation therapy (RT) planning, treatment planning, pretreatment plan review, treatment delivery, on-treatment quality management, post-treatment completion, equipment/software quality management, and other). The average NMRI scores of incidents originating or detected within each broad workflow area were calculated. Additionally, out of 103 individual process steps, 35 were classified as safety barriers, the process steps whose primary function is to catch errors. The safety barriers which most frequently detected incidents were identified and analyzed. Finally, the distance between event origination and detection was explored by grouping events by the number of broad workflow area events passed through before detection, and average NMRI scores were compared. Results: Near-miss incidents most commonly originated within treatment planning (33%). However, the incidents with the highest average NMRI scores originated during imaging for RT planning (NMRI = 2.0, average NMRI of all events = 1.5), specifically during the documentation of patient positioning and localization of the patient. Incidents were most frequently detected during treatment delivery (30%), and incidents identified at this point also had higher severity scores than other workflow areas (NMRI = 1.6). Incidents identified during on-treatment quality management were also more severe (NMRI = 1.7), and the specific process steps of reviewing portal and CBCT images tended to catch highest-severity incidents. On average, safety barriers caught 46% of all incidents, most frequently at physics chart review, therapist’s chart check, and the review of portal images; however, most of the incidents that pass through a particular safety barrier are not designed to be capable of being captured at that barrier. Conclusions: Incident learning systems can be used to assess the most common points of error origination and detection in radiation oncology. This can help tailor safety improvement efforts and target the highest impact portions of the workflow. The most severe near-miss events tend to originate during simulation, with the most severe near-miss events detected at the time of patient treatment. Safety barriers can be improved to allow earlier detection of near-miss events.« less
Wireless remote control of clinical image workflow: using a PDA for off-site distribution and disaster recovery.

PubMed

Documet, Jorge; Liu, Brent J; Documet, Luis; Huang, H K

2006-07-01

This paper describes a picture archiving and communication system (PACS) tool based on Web technology that remotely manages medical images between a PACS archive and remote destinations. Successfully implemented in a clinical environment and also demonstrated for the past 3 years at the conferences of various organizations, including the Radiological Society of North America, this tool provides a very practical and simple way to manage a PACS, including off-site image distribution and disaster recovery. The application is robust and flexible and can be used on a standard PC workstation or a Tablet PC, but more important, it can be used with a personal digital assistant (PDA). With a PDA, the Web application becomes a powerful wireless and mobile image management tool. The application's quick and easy-to-use features allow users to perform Digital Imaging and Communications in Medicine (DICOM) queries and retrievals with a single interface, without having to worry about the underlying configuration of DICOM nodes. In addition, this frees up dedicated PACS workstations to perform their specialized roles within the PACS workflow. This tool has been used at Saint John's Health Center in Santa Monica, California, for 2 years. The average number of queries per month is 2,021, with 816 C-MOVE retrieve requests. Clinical staff members can use PDAs to manage image workflow and PACS examination distribution conveniently for off-site consultations by referring physicians and radiologists and for disaster recovery. This solution also improves radiologists' effectiveness and efficiency in health care delivery both within radiology departments and for off-site clinical coverage.
Several considerations with respect to the future of digital photography and photographic printing

NASA Astrophysics Data System (ADS)

Tuijn, Chris; Mahy, Marc F.

2000-12-01

Digital cameras are no longer exotic gadgets being used by a privileged group of early adopters. More and more people realize that there are obvious advantages to the digital solution over the conventional film-based workflow. Claiming that prints on paper are no longer necessary in the digit workflow, however, would be similar to reviving the myth of the paperless office. Often, people still like to share their memories on paper and this for a variety of reasons. There are still some hurdles to be taken in order to make the digital dream com true. In this paper, we will give a survey of the different workflows in digital photography. The local, semi-local and Internet solutions will be discussed as well as the preferred output systems for each of these solutions. When discussing output system, we immediately think of appropriate color management solutions. In the second part of this paper, we will discuss the major color management issues appearing in digital photography. A clear separation between the image acquisition and the image rendering phases will be made. After a quick survey of the different image restoration and enhancement techniques, we will make some reflections on the ideal color exchange space; the enhanced image should be delivered in this exchange space and, from there, the standard color management transformations can be applied to transfer the image from this exchange space to the native color space of the output device. We will also discus some color gamut characteristics and color management problems of different types of photographic printers that can occur during this conversion process.
Purdue Ionomics Information Management System. An Integrated Functional Genomics Platform1[C][W][OA

PubMed Central

Baxter, Ivan; Ouzzani, Mourad; Orcun, Seza; Kennedy, Brad; Jandhyala, Shrinivas S.; Salt, David E.

2007-01-01

The advent of high-throughput phenotyping technologies has created a deluge of information that is difficult to deal with without the appropriate data management tools. These data management tools should integrate defined workflow controls for genomic-scale data acquisition and validation, data storage and retrieval, and data analysis, indexed around the genomic information of the organism of interest. To maximize the impact of these large datasets, it is critical that they are rapidly disseminated to the broader research community, allowing open access for data mining and discovery. We describe here a system that incorporates such functionalities developed around the Purdue University high-throughput ionomics phenotyping platform. The Purdue Ionomics Information Management System (PiiMS) provides integrated workflow control, data storage, and analysis to facilitate high-throughput data acquisition, along with integrated tools for data search, retrieval, and visualization for hypothesis development. PiiMS is deployed as a World Wide Web-enabled system, allowing for integration of distributed workflow processes and open access to raw data for analysis by numerous laboratories. PiiMS currently contains data on shoot concentrations of P, Ca, K, Mg, Cu, Fe, Zn, Mn, Co, Ni, B, Se, Mo, Na, As, and Cd in over 60,000 shoot tissue samples of Arabidopsis (Arabidopsis thaliana), including ethyl methanesulfonate, fast-neutron and defined T-DNA mutants, and natural accession and populations of recombinant inbred lines from over 800 separate experiments, representing over 1,000,000 fully quantitative elemental concentrations. PiiMS is accessible at www.purdue.edu/dp/ionomics. PMID:17189337

CERTS: Consortium for Electric Reliability Technology Solutions - Research Highlights

DOE Office of Scientific and Technical Information (OSTI.GOV)

Eto, Joseph

2003-07-30

Historically, the U.S. electric power industry was vertically integrated, and utilities were responsible for system planning, operations, and reliability management. As the nation moves to a competitive market structure, these functions have been disaggregated, and no single entity is responsible for reliability management. As a result, new tools, technologies, systems, and management processes are needed to manage the reliability of the electricity grid. However, a number of simultaneous trends prevent electricity market participants from pursuing development of these reliability tools: utilities are preoccupied with restructuring their businesses, research funding has declined, and the formation of Independent System Operators (ISOs) andmore » Regional Transmission Organizations (RTOs) to operate the grid means that control of transmission assets is separate from ownership of these assets; at the same time, business uncertainty, and changing regulatory policies have created a climate in which needed investment for transmission infrastructure and tools for reliability management has dried up. To address the resulting emerging gaps in reliability R&D, CERTS has undertaken much-needed public interest research on reliability technologies for the electricity grid. CERTS' vision is to: (1) Transform the electricity grid into an intelligent network that can sense and respond automatically to changing flows of power and emerging problems; (2) Enhance reliability management through market mechanisms, including transparency of real-time information on the status of the grid; (3) Empower customers to manage their energy use and reliability needs in response to real-time market price signals; and (4) Seamlessly integrate distributed technologies--including those for generation, storage, controls, and communications--to support the reliability needs of both the grid and individual customers.« less
Towards a centralized Grid Speedometer

NASA Astrophysics Data System (ADS)

Dzhunov, I.; Andreeva, J.; Fajardo, E.; Gutsche, O.; Luyckx, S.; Saiz, P.

2014-06-01

Given the distributed nature of the Worldwide LHC Computing Grid and the way CPU resources are pledged and shared around the globe, Virtual Organizations (VOs) face the challenge of monitoring the use of these resources. For CMS and the operation of centralized workflows, the monitoring of how many production jobs are running and pending in the Glidein WMS production pools is very important. The Dashboard Site Status Board (SSB) provides a very flexible framework to collect, aggregate and visualize data. The CMS production monitoring team uses the SSB to define the metrics that have to be monitored and the alarms that have to be raised. During the integration of CMS production monitoring into the SSB, several enhancements to the core functionality of the SSB were required; They were implemented in a generic way, so that other VOs using the SSB can exploit them. Alongside these enhancements, there were a number of changes to the core of the SSB framework. This paper presents the details of the implementation and the advantages for current and future usage of the new features in SSB.
Meta-manager: a requirements analysis.

PubMed

Cook, J F; Rozenblit, J W; Chacko, A K; Martinez, R; Timboe, H L

1999-05-01

The digital imaging network-picture-archiving and communications system (DIN-PACS) will be implemented in ten sites within the Great Plains Regional Medical Command (GPRMC). This network of PACS and teleradiology technology over a shared T1 network has opened the door for round the clock radiology coverage of all sites. However, the concept of a virtual radiology environment poses new issues for military medicine. A new workflow management system must be developed. This workflow management system will allow us to efficiently resolve these issues including quality of care, availability, severe capitation, and quality of the workforce. The design process of this management system must employ existing technology, operate over various telecommunication networks and protocols, be independent of platform operating systems, be flexible and scaleable, and involve the end user at the outset in the design process for which it is developed. Using the unified modeling language (UML), the specifications for this new business management system were created in concert between the University of Arizona and the GPRMC. These specifications detail a management system operating through a common object request brokered architecture (CORBA) environment. In this presentation, we characterize the Meta-Manager management system including aspects of intelligence, interfacility routing, fail-safe operations, and expected improvements in patient care and efficiency.
MIEC-SVM: automated pipeline for protein peptide/ligand interaction prediction.

PubMed

Li, Nan; Ainsworth, Richard I; Wu, Meixin; Ding, Bo; Wang, Wei

2016-03-15

MIEC-SVM is a structure-based method for predicting protein recognition specificity. Here, we present an automated MIEC-SVM pipeline providing an integrated and user-friendly workflow for construction and application of the MIEC-SVM models. This pipeline can handle standard amino acids and those with post-translational modifications (PTMs) or small molecules. Moreover, multi-threading and support to Sun Grid Engine (SGE) are implemented to significantly boost the computational efficiency. The program is available at http://wanglab.ucsd.edu/MIEC-SVM CONTACT: : wei-wang@ucsd.edu Supplementary data available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
MO-D-213-01: Workflow Monitoring for a High Volume Radiation Oncology Center

DOE Office of Scientific and Technical Information (OSTI.GOV)

Laub, S; Dunn, M; Galbreath, G

2015-06-15

Purpose: Implement a center wide communication system that increases interdepartmental transparency and accountability while decreasing redundant work and treatment delays by actively monitoring treatment planning workflow. Methods: Intake Management System (IMS), a program developed by ProCure Treatment Centers Inc., is a multi-function database that stores treatment planning process information. It was devised to work with the oncology information system (Mosaiq) to streamline interdepartmental workflow.Each step in the treatment planning process is visually represented and timelines for completion of individual tasks are established within the software. The currently active step of each patient’s planning process is highlighted either red or greenmore » according to whether the initially allocated amount of time has passed for the given process. This information is displayed as a Treatment Planning Process Monitor (TPPM), which is shown on screens in the relevant departments throughout the center. This display also includes the individuals who are responsible for each task.IMS is driven by Mosaiq’s quality checklist (QCL) functionality. Each step in the workflow is initiated by a Mosaiq user sending the responsible party a QCL assignment. IMS is connected to Mosaiq and the sending or completing of a QCL updates the associated field in the TPPM to the appropriate status. Results: Approximately one patient a week is identified during the workflow process as needing to have his/her treatment start date modified or resources re-allocated to address the most urgent cases. Being able to identify a realistic timeline for planning each patient and having multiple departments communicate their limitations and time constraints allows for quality plans to be developed and implemented without overburdening any one department. Conclusion: Monitoring the progression of the treatment planning process has increased transparency between departments, which enables efficient communication. Having built-in timelines allows easy prioritization of tasks and resources and facilitates effective time management.« less
Grid-Scale Energy Storage Demonstration of Ancillary Services Using the UltraBattery Technology

DOE Office of Scientific and Technical Information (OSTI.GOV)

Seasholtz, Jeff

2015-08-20

The collaboration described in this document is being done as part of a cooperative research agreement under the Department of Energy’s Smart Grid Demonstration Program. This document represents the Final Technical Performance Report, from July 2012 through April 2015, for the East Penn Manufacturing Smart Grid Program demonstration project. This Smart Grid Demonstration project demonstrates Distributed Energy Storage for Grid Support, in particular the economic and technical viability of a grid-scale, advanced energy storage system using UltraBattery ® technology for frequency regulation ancillary services and demand management services. This project entailed the construction of a dedicated facility on the Eastmore » Penn campus in Lyon Station, PA that is being used as a working demonstration to provide regulation ancillary services to PJM and demand management services to Metropolitan Edison (Met-Ed).« less
[Change in process management by implementing RIS, PACS and flat-panel detectors].

PubMed

Imhof, H; Dirisamer, A; Fischer, H; Grampp, S; Heiner, L; Kaderk, M; Krestan, C; Kainberger, F

2002-05-01

Implementation of radiological information systems (RIS) and picture archiving and communicating systems (PACS) results in significant changes of workflow in a radiological department. Additional connection with flat-panel detectors leads to a shortening of the work process. RIS and PACS implementation alone reduces the complete workflow by 21-80%. With flatpanel technology the image production process is further shortened by 25-30%. The workflow-steps are changed from original 17-12 with the implementation of RIS and PACS and to 5 with the integrated use of flatpanels. This clearly recognizable advantages in the workflow need an according financial investment. Several studies could show that the capitalisation-factor calculated over eight years is positive, with a gain range between 5-25%. Whether the additional implementation of flatpanel detectors results also in a positive capitalisation over the years, cannot be estimated exactly, at the moment, because the experiences are too short. Particularly critical are the interfaces, which needs a constant quality control. Our flatpanel detector-system is fixed, special images--as we have them in about 3-5% of all cases--need still conventional filmscreen or phosphorplate-systems. Full-spine and long-leg examinations cannot be performed with sufficient exactness. Without any questions implementation of integrated RIS, PACS and flatpanel detector-system needs excellent training of the employees, because of the changes in workflow etc. The main profits of such an integrated implementation are an increase in quality in image and report datas, easier handling--there are almost no more cassettes necessary--and excessive shortening of workflow.
SISYPHUS: A high performance seismic inversion factory

NASA Astrophysics Data System (ADS)

Gokhberg, Alexey; Simutė, Saulė; Boehm, Christian; Fichtner, Andreas

2016-04-01

In the recent years the massively parallel high performance computers became the standard instruments for solving the forward and inverse problems in seismology. The respective software packages dedicated to forward and inverse waveform modelling specially designed for such computers (SPECFEM3D, SES3D) became mature and widely available. These packages achieve significant computational performance and provide researchers with an opportunity to solve problems of bigger size at higher resolution within a shorter time. However, a typical seismic inversion process contains various activities that are beyond the common solver functionality. They include management of information on seismic events and stations, 3D models, observed and synthetic seismograms, pre-processing of the observed signals, computation of misfits and adjoint sources, minimization of misfits, and process workflow management. These activities are time consuming, seldom sufficiently automated, and therefore represent a bottleneck that can substantially offset performance benefits provided by even the most powerful modern supercomputers. Furthermore, a typical system architecture of modern supercomputing platforms is oriented towards the maximum computational performance and provides limited standard facilities for automation of the supporting activities. We present a prototype solution that automates all aspects of the seismic inversion process and is tuned for the modern massively parallel high performance computing systems. We address several major aspects of the solution architecture, which include (1) design of an inversion state database for tracing all relevant aspects of the entire solution process, (2) design of an extensible workflow management framework, (3) integration with wave propagation solvers, (4) integration with optimization packages, (5) computation of misfits and adjoint sources, and (6) process monitoring. The inversion state database represents a hierarchical structure with branches for the static process setup, inversion iterations, and solver runs, each branch specifying information at the event, station and channel levels. The workflow management framework is based on an embedded scripting engine that allows definition of various workflow scenarios using a high-level scripting language and provides access to all available inversion components represented as standard library functions. At present the SES3D wave propagation solver is integrated in the solution; the work is in progress for interfacing with SPECFEM3D. A separate framework is designed for interoperability with an optimization module; the workflow manager and optimization process run in parallel and cooperate by exchanging messages according to a specially designed protocol. A library of high-performance modules implementing signal pre-processing, misfit and adjoint computations according to established good practices is included. Monitoring is based on information stored in the inversion state database and at present implements a command line interface; design of a graphical user interface is in progress. The software design fits well into the common massively parallel system architecture featuring a large number of computational nodes running distributed applications under control of batch-oriented resource managers. The solution prototype has been implemented on the "Piz Daint" supercomputer provided by the Swiss Supercomputing Centre (CSCS).
The Ophidia Stack: Toward Large Scale, Big Data Analytics Experiments for Climate Change

NASA Astrophysics Data System (ADS)

Fiore, S.; Williams, D. N.; D'Anca, A.; Nassisi, P.; Aloisio, G.

2015-12-01

The Ophidia project is a research effort on big data analytics facing scientific data analysis challenges in multiple domains (e.g. climate change). It provides a "datacube-oriented" framework responsible for atomically processing and manipulating scientific datasets, by providing a common way to run distributive tasks on large set of data fragments (chunks). Ophidia provides declarative, server-side, and parallel data analysis, jointly with an internal storage model able to efficiently deal with multidimensional data and a hierarchical data organization to manage large data volumes. The project relies on a strong background on high performance database management and On-Line Analytical Processing (OLAP) systems to manage large scientific datasets. The Ophidia analytics platform provides several data operators to manipulate datacubes (about 50), and array-based primitives (more than 100) to perform data analysis on large scientific data arrays. To address interoperability, Ophidia provides multiple server interfaces (e.g. OGC-WPS). From a client standpoint, a Python interface enables the exploitation of the framework into Python-based eco-systems/applications (e.g. IPython) and the straightforward adoption of a strong set of related libraries (e.g. SciPy, NumPy). The talk will highlight a key feature of the Ophidia framework stack: the "Analytics Workflow Management System" (AWfMS). The Ophidia AWfMS coordinates, orchestrates, optimises and monitors the execution of multiple scientific data analytics and visualization tasks, thus supporting "complex analytics experiments". Some real use cases related to the CMIP5 experiment will be discussed. In particular, with regard to the "Climate models intercomparison data analysis" case study proposed in the EU H2020 INDIGO-DataCloud project, workflows related to (i) anomalies, (ii) trend, and (iii) climate change signal analysis will be presented. Such workflows will be distributed across multiple sites - according to the datasets distribution - and will include intercomparison, ensemble, and outlier analysis. The two-level workflow solution envisioned in INDIGO (coarse grain for distributed tasks orchestration, and fine grain, at the level of a single data analytics cluster instance) will be presented and discussed.
Electric vehicle equipment for grid-integrated vehicles

DOEpatents

Kempton, Willett

2013-08-13

Methods, systems, and apparatus for interfacing an electric vehicle with an electric power grid are disclosed. An exemplary apparatus may include a station communication port for interfacing with electric vehicle station equipment (EVSE), a vehicle communication port for interfacing with a vehicle management system (VMS), and a processor coupled to the station communication port and the vehicle communication port to establish communication with the EVSE via the station communication port, receive EVSE attributes from the EVSE, and issue commands to the VMS to manage power flow between the electric vehicle and the EVSE based on the EVSE attributes. An electric vehicle may interface with the grid by establishing communication with the EVSE, receiving the EVSE attributes, and managing power flow between the EVE and the grid based on the EVSE attributes.
Pathology economic model tool: a novel approach to workflow and budget cost analysis in an anatomic pathology laboratory.

PubMed

Muirhead, David; Aoun, Patricia; Powell, Michael; Juncker, Flemming; Mollerup, Jens

2010-08-01

The need for higher efficiency, maximum quality, and faster turnaround time is a continuous focus for anatomic pathology laboratories and drives changes in work scheduling, instrumentation, and management control systems. To determine the costs of generating routine, special, and immunohistochemical microscopic slides in a large, academic anatomic pathology laboratory using a top-down approach. The Pathology Economic Model Tool was used to analyze workflow processes at The Nebraska Medical Center's anatomic pathology laboratory. Data from the analysis were used to generate complete cost estimates, which included not only materials, consumables, and instrumentation but also specific labor and overhead components for each of the laboratory's subareas. The cost data generated by the Pathology Economic Model Tool were compared with the cost estimates generated using relative value units. Despite the use of automated systems for different processes, the workflow in the laboratory was found to be relatively labor intensive. The effect of labor and overhead on per-slide costs was significantly underestimated by traditional relative-value unit calculations when compared with the Pathology Economic Model Tool. Specific workflow defects with significant contributions to the cost per slide were identified. The cost of providing routine, special, and immunohistochemical slides may be significantly underestimated by traditional methods that rely on relative value units. Furthermore, a comprehensive analysis may identify specific workflow processes requiring improvement.
Dashboard Task Monitor for Managing ATLAS User Analysis on the Grid

NASA Astrophysics Data System (ADS)

Sargsyan, L.; Andreeva, J.; Jha, M.; Karavakis, E.; Kokoszkiewicz, L.; Saiz, P.; Schovancova, J.; Tuckett, D.; Atlas Collaboration

2014-06-01

The organization of the distributed user analysis on the Worldwide LHC Computing Grid (WLCG) infrastructure is one of the most challenging tasks among the computing activities at the Large Hadron Collider. The Experiment Dashboard offers a solution that not only monitors but also manages (kill, resubmit) user tasks and jobs via a web interface. The ATLAS Dashboard Task Monitor provides analysis users with a tool that is independent of the operating system and Grid environment. This contribution describes the functionality of the application and its implementation details, in particular authentication, authorization and audit of the management operations.
The provider perspective: investigating the effect of the Electronic Patient-Reported Outcome (ePRO) mobile application and portal on primary care provider workflow.

PubMed

Hans, Parminder K; Gray, Carolyn Steele; Gill, Ashlinder; Tiessen, James

2018-03-01

Aim This qualitative study investigates how the Electronic Patient-Reported Outcome (ePRO) mobile application and portal system, designed to capture patient-reported measures to support self-management, affected primary care provider workflows. The Canadian health system is facing an ageing population that is living with chronic disease. Disruptive innovations like mobile health technologies can help to support health system transformation needed to better meet the multifaceted needs of the complex care patient. However, there are challenges with implementing these technologies in primary care settings, in particular the effect on primary care provider workflows. Over a six-week period interdisciplinary primary care providers (n=6) and their complex care patients (n=12), used the ePRO mobile application and portal to collaboratively goal-set, manage care plans, and support self-management using patient-reported measures. Secondary thematic analysis of focus groups, training sessions, and issue tracker reports captured user experiences at a Toronto area Family Health Team from October 2014 to January 2015. Findings Key issues raised by providers included: liability concerns associated with remote monitoring, increased documentation activities due to a lack of interoperability between the app and the electronic patient record, increased provider anxiety with regard to the potential for the app to disrupt and infringe upon appointment time, and increased demands for patient engagement. Primary care providers reported the app helped to focus care plans and to begin a collaborative conversation on goal-setting. However, throughout our investigation we found a high level of provider resistance evidenced by consistent attempts to shift the app towards fitting with existing workflows rather than adapting much of their behaviour. As health systems seek innovative and disruptive models to better serve this complex patient population, provider change resistance will need to be addressed. New models and technologies cannot be disruptive in an environment that is resisting change.
[General practitioners' opinion and attitude towards DMPs and the change in practice routines to implement the DMP "diabetes mellitus type 2"].

PubMed

Miksch, Antje; Trieschmann, Johanna; Ose, Dominik; Rölz, Andreas; Heiderhoff, Marc; Szecsenyi, Joachim

2011-01-01

Effective implementation of disease management programmes (DMPs) in primary care practices often requires changes in practice workflows and responsibilities and acceptance by the parties involved. Within the ELSID study (evaluation study of the DMP diabetes mellitus type 2) the physicians' attitudes toward DMPs were obtained and an optimised implementation of DMPs was developed by conducting a quality management cycle with primary care practice teams. The aim was to investigate which practice workflows will have to be changed and what kind of barriers to implement these changes are perceived. In 78 primary care practices of the two German federal states of Rheinland-Pfalz and Sachsen-Anhalt a quality management cycle was conducted using a structured analysis of the current state of DMP workflows and the need for improvement identified. Subsequently, an optimised workflow was developed and targets were agreed upon. After 6 months, the study team called to inquire about the current state of implementation and, if appropriate, actual barriers to change. After 6 months, 71 practices had been interviewed by phone. 64 of them (90.1%) had agreed on at least one target (e.g., to purchase new instrumentation, to regularly discuss feedback reports, to set up a patient registry). On average three targets had been formulated, and 2 out of 3 had been implemented in the meantime. In most cases lack of time was given as the reason for non-implementation. The majority of surveyed practices perceived some need for improvement. But sufficient resources (time, staff and money) are required to ensure efficient implementation of DMPs in primary care practices and their integration with routine processes. A redefinition of responsibilities for DMPs will strengthen the role of medical assistants and promote high-quality implementation of these programmes. Copyright © 2010. Published by Elsevier GmbH.
Implementation of a single sign-on system between practice, research and learning systems.

PubMed

Purkayastha, Saptarshi; Gichoya, Judy W; Addepally, Siva Abhishek

2017-03-29

Multiple specialized electronic medical systems are utilized in the health enterprise. Each of these systems has their own user management, authentication and authorization process, which makes it a complex web for navigation and use without a coherent process workflow. Users often have to remember multiple passwords, login/logout between systems that disrupt their clinical workflow. Challenges exist in managing permissions for various cadres of health care providers. This case report describes our experience of implementing a single sign-on system, used between an electronic medical records system and a learning management system at a large academic institution with an informatics department responsible for student education and a medical school affiliated with a hospital system caring for patients and conducting research. At our institution, we use OpenMRS for research registry tracking of interventional radiology patients as well as to provide access to medical records to students studying health informatics. To provide authentication across different users of the system with different permissions, we developed a Central Authentication Service (CAS) module for OpenMRS, released under the Mozilla Public License and deployed it for single sign-on across the academic enterprise. The module has been in implementation since August 2015 to present, and we assessed usability of the registry and education system before and after implementation of the CAS module. 54 students and 3 researchers were interviewed. The module authenticates users with appropriate privileges in the medical records system, providing secure access with minimal disruption to their workflow. No passwords requests were sent and users reported ease of use, with streamlined workflow. The project demonstrates that enterprise-wide single sign-on systems should be used in healthcare to reduce complexity like "password hell", improve usability and user navigation. We plan to extend this to work with other systems used in the health care enterprise.
Tacking Flood Risk from Watersheds using a Natural Flood Risk Management Toolkit

NASA Astrophysics Data System (ADS)

Reaney, S. M.; Pearson, C.; Barber, N.; Fraser, A.

2017-12-01

In the UK, flood risk management is moving beyond solely mitigating at the point of impact in towns and key infrastructure to tackle problem at source through a range of landscape based intervention measures. This natural flood risk management (NFM) approach has been trailed within a range of catchments in the UK and is moving towards being adopted as a key part of flood risk management. The approach offers advantages including lower cost and co-benefits for water quality and habitat creation. However, for an agency or group wishing to implement NFM within a catchment, there are two key questions that need to be addressed: Where in the catchment to place the measures? And how many measures are needed to be effective? With this toolkit, these questions are assessed with a two-stage workflow. First, SCIMAP-Flood gives a risk based mapping of likely locations that contribute to the flood peak. This tool uses information on land cover, hydrological connectivity, flood generating rainfall patterns and hydrological travel time distributions to impacted communities. The presented example applies the tool to the River Eden catchment, UK, with 5m grid resolution and hence provide sub-field scale information at the landscape extent. SCIMAP-Flood identifies sub-catchments where physically based catchment hydrological simulation models can be applied to test different NFM based mitigation measures. In this example, the CRUM3 catchment hydrological model has been applied within an uncertainty framework to consider the effectiveness of soil compaction reduction and large woody debris dams within a sub-catchment. It was found that large scale soil aeration to reduce soil compaction levels throughout the catchment is probably the most useful natural flood management measure for this catchment. NFM has potential for wide-spread application and these tools help to ensure that the measures are correctly designed and the scheme performance can be quantitatively assessed and predicted.
Mechanics of Flapping Flight: Analytical Formulations of Unsteady Aerodynamics, Kinematic Optimization, Flight Dynamics, and Control

NASA Astrophysics Data System (ADS)

Taneja, Jayant Kumar

Electricity is an indispensable commodity to modern society, yet it is delivered via a grid architecture that remains largely unchanged over the past century. A host of factors are conspiring to topple this dated yet venerated design: developments in renewable electricity generation technology, policies to reduce greenhouse gas emissions, and advances in information technology for managing energy systems. Modern electric grids are emerging as complex distributed systems in which a portfolio of power generation resources, often incorporating fluctuating renewable resources such as wind and solar, must be managed dynamically to meet uncontrolled, time-varying demand. Uncertainty in both supply and demand makes control of modern electric grids fundamentally more challenging, and growing portfolios of renewables exacerbate the challenge. We study three electricity grids: the state of California, the province of Ontario, and the country of Germany. To understand the effects of increasing renewables, we develop a methodology to scale renewables penetration. Analyzing these grids yields key insights about rigid limits to renewables penetration and their implications in meeting long-term emissions targets. We argue that to achieve deep penetration of renewables, the operational model of the grid must be inverted, changing the paradigm from load-following supplies to supply-following loads. To alleviate the challenge of supply-demand matching on deeply renewable grids, we first examine well-known techniques, including altering management of existing supply resources, employing utility-scale energy storage, targeting energy efficiency improvements, and exercising basic demand-side management. Then, we create several instantiations of supply-following loads -- including refrigerators, heating and cooling systems, and laptop computers -- by employing a combination of sensor networks, advanced control techniques, and enhanced energy storage. We examine the capacity of each load for supply-following and study the behaviors of populations of these loads, assessing their potential at various levels of deployment throughout the California electricity grid. Using combinations of supply-following strategies, we can reduce peak natural gas generation by 19% on a model of the California grid with 60% renewables. We then assess remaining variability on this deeply renewable grid incorporating supply-following loads, characterizing additional capabilities needed to ensure supply-demand matching in future sustainable electricity grids.
Bringing Federated Identity to Grid Computing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Teheran, Jeny

The Fermi National Accelerator Laboratory (FNAL) is facing the challenge of providing scientific data access and grid submission to scientific collaborations that span the globe but are hosted at FNAL. Users in these collaborations are currently required to register as an FNAL user and obtain FNAL credentials to access grid resources to perform their scientific computations. These requirements burden researchers with managing additional authentication credentials, and put additional load on FNAL for managing user identities. Our design integrates the existing InCommon federated identity infrastructure, CILogon Basic CA, and MyProxy with the FNAL grid submission system to provide secure access formore » users from diverse experiments and collab orations without requiring each user to have authentication credentials from FNAL. The design automates the handling of certificates so users do not need to manage them manually. Although the initial implementation is for FNAL's grid submission system, the design and the core of the implementation are general and could be applied to other distributed computing systems.« less
The TDR: A Repository for Long Term Storage of Geophysical Data and Metadata

NASA Astrophysics Data System (ADS)

Wilson, A.; Baltzer, T.; Caron, J.

2006-12-01

For many years Unidata has provided easy, low cost data access to universities and research labs. Historically Unidata technology provided access to data in near real time. In recent years Unidata has additionally turned to providing middleware to serve longer term data and associated metadata via its THREDDS technology, the most recent offering being the THREDDS Data Server (TDS). The TDS provides middleware for metadata access and management, OPeNDAP data access, and integration with the Unidata Integrated Data Viewer (IDV), among other benefits. The TDS was designed to support rolling archives of data, that is, data that exist only for a relatively short, predefined time window. Now we are creating an addition to the TDS, called the THREDDS Data Repository (TDR), which allows users to store and retrieve data and other objects for an arbitrarily long time period. Data in the TDR can also be served by the TDS. The TDR performs important functions of locating storage for the data, moving the data to and from the repository, assigning unique identifiers, and generating metadata. The TDR framework supports pluggable components that allow tailoring an implementation for a particular application. The Linked Environments for Atmospheric Discovery (LEAD) project provides an excellent use case for the TDR. LEAD is a multi-institutional Large Information Technology Research project funded by the National Science Foundation (NSF). The goal of LEAD is to create a framework based on Grid and Web Services to support mesoscale meteorology research and education. This includes capabilities such as launching forecast models, mining data for meteorological phenomena, and dynamic workflows that are automatically reconfigurable in response to changing weather. LEAD presents unique challenges in managing and storing large data volumes from real-time observational systems as well as data that are dynamically created during the execution of adaptive workflows. For example, in order to support storage of many large data products, the LEAD implementation of the TDR will provide a variety of data movement options, including gridftp. It will have a web service interface and will be callable programmatically as well as via interactive user requests. Future plans include the use of a mass storage device to provide robust long term storage. This talk will present the current state of the TDR effort.
FermiGrid

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yocum, D.R.; Berman, E.; Canal, P.

2007-05-01

As one of the founding members of the Open Science Grid Consortium (OSG), Fermilab enables coherent access to its production resources through the Grid infrastructure system called FermiGrid. This system successfully provides for centrally managed grid services, opportunistic resource access, development of OSG Interfaces for Fermilab, and an interface to the Fermilab dCache system. FermiGrid supports virtual organizations (VOs) including high energy physics experiments (USCMS, MINOS, D0, CDF, ILC), astrophysics experiments (SDSS, Auger, DES), biology experiments (GADU, Nanohub) and educational activities.

Some links on this page may take you to non-federal websites. Their policies may differ from this site.