specialized workflow software: Topics by Science.gov

Sample records for specialized workflow software

Build and Execute Environment

DOE Office of Scientific and Technical Information (OSTI.GOV)

Guan, Qiang

At exascale, the challenge becomes to develop applications that run at scale and use exascale platforms reliably, efficiently, and flexibly. Workflows become much more complex because they must seamlessly integrate simulation and data analytics. They must include down-sampling, post-processing, feature extraction, and visualization. Power and data transfer limitations require these analysis tasks to be run in-situ or in-transit. We expect successful workflows will comprise multiple linked simulations along with tens of analysis routines. Users will have limited development time at scale and, therefore, must have rich tools to develop, debug, test, and deploy applications. At this scale, successful workflows willmore » compose linked computations from an assortment of reliable, well-defined computation elements, ones that can come and go as required, based on the needs of the workflow over time. We propose a novel framework that utilizes both virtual machines (VMs) and software containers to create a workflow system that establishes a uniform build and execution environment (BEE) beyond the capabilities of current systems. In this environment, applications will run reliably and repeatably across heterogeneous hardware and software. Containers, both commercial (Docker and Rocket) and open-source (LXC and LXD), define a runtime that isolates all software dependencies from the machine operating system. Workflows may contain multiple containers that run different operating systems, different software, and even different versions of the same software. We will run containers in open-source virtual machines (KVM) and emulators (QEMU) so that workflows run on any machine entirely in user-space. On this platform of containers and virtual machines, we will deliver workflow software that provides services, including repeatable execution, provenance, checkpointing, and future proofing. We will capture provenance about how containers were launched and how they interact to annotate workflows for repeatable and partial re-execution. We will coordinate the physical snapshots of virtual machines with parallel programming constructs, such as barriers, to automate checkpoint and restart. We will also integrate with HPC-specific container runtimes to gain access to accelerators and other specialized hardware to preserve native performance. Containers will link development to continuous integration. When application developers check code in, it will automatically be tested on a suite of different software and hardware architectures.« less
A Scientific Software Product Line for the Bioinformatics domain.

PubMed

Costa, Gabriella Castro B; Braga, Regina; David, José Maria N; Campos, Fernanda

2015-08-01

Most specialized users (scientists) that use bioinformatics applications do not have suitable training on software development. Software Product Line (SPL) employs the concept of reuse considering that it is defined as a set of systems that are developed from a common set of base artifacts. In some contexts, such as in bioinformatics applications, it is advantageous to develop a collection of related software products, using SPL approach. If software products are similar enough, there is the possibility of predicting their commonalities, differences and then reuse these common features to support the development of new applications in the bioinformatics area. This paper presents the PL-Science approach which considers the context of SPL and ontology in order to assist scientists to define a scientific experiment, and to specify a workflow that encompasses bioinformatics applications of a given experiment. This paper also focuses on the use of ontologies to enable the use of Software Product Line in biological domains. In the context of this paper, Scientific Software Product Line (SSPL) differs from the Software Product Line due to the fact that SSPL uses an abstract scientific workflow model. This workflow is defined according to a scientific domain and using this abstract workflow model the products (scientific applications/algorithms) are instantiated. Through the use of ontology as a knowledge representation model, we can provide domain restrictions as well as add semantic aspects in order to facilitate the selection and organization of bioinformatics workflows in a Scientific Software Product Line. The use of ontologies enables not only the expression of formal restrictions but also the inferences on these restrictions, considering that a scientific domain needs a formal specification. This paper presents the development of the PL-Science approach, encompassing a methodology and an infrastructure, and also presents an approach evaluation. This evaluation presents case studies in bioinformatics, which were conducted in two renowned research institutions in Brazil. Copyright © 2015 Elsevier Inc. All rights reserved.
Biowep: a workflow enactment portal for bioinformatics applications.

PubMed

Romano, Paolo; Bartocci, Ezio; Bertolini, Guglielmo; De Paoli, Flavio; Marra, Domenico; Mauri, Giancarlo; Merelli, Emanuela; Milanesi, Luciano

2007-03-08

The huge amount of biological information, its distribution over the Internet and the heterogeneity of available software tools makes the adoption of new data integration and analysis network tools a necessity in bioinformatics. ICT standards and tools, like Web Services and Workflow Management Systems (WMS), can support the creation and deployment of such systems. Many Web Services are already available and some WMS have been proposed. They assume that researchers know which bioinformatics resources can be reached through a programmatic interface and that they are skilled in programming and building workflows. Therefore, they are not viable to the majority of unskilled researchers. A portal enabling these to take profit from new technologies is still missing. We designed biowep, a web based client application that allows for the selection and execution of a set of predefined workflows. The system is available on-line. Biowep architecture includes a Workflow Manager, a User Interface and a Workflow Executor. The task of the Workflow Manager is the creation and annotation of workflows. These can be created by using either the Taverna Workbench or BioWMS. Enactment of workflows is carried out by FreeFluo for Taverna workflows and by BioAgent/Hermes, a mobile agent-based middleware, for BioWMS ones. Main workflows' processing steps are annotated on the basis of their input and output, elaboration type and application domain by using a classification of bioinformatics data and tasks. The interface supports users authentication and profiling. Workflows can be selected on the basis of users' profiles and can be searched through their annotations. Results can be saved. We developed a web system that support the selection and execution of predefined workflows, thus simplifying access for all researchers. The implementation of Web Services allowing specialized software to interact with an exhaustive set of biomedical databases and analysis software and the creation of effective workflows can significantly improve automation of in-silico analysis. Biowep is available for interested researchers as a reference portal. They are invited to submit their workflows to the workflow repository. Biowep is further being developed in the sphere of the Laboratory of Interdisciplinary Technologies in Bioinformatics - LITBIO.
Biowep: a workflow enactment portal for bioinformatics applications

PubMed Central

Romano, Paolo; Bartocci, Ezio; Bertolini, Guglielmo; De Paoli, Flavio; Marra, Domenico; Mauri, Giancarlo; Merelli, Emanuela; Milanesi, Luciano

2007-01-01

Background The huge amount of biological information, its distribution over the Internet and the heterogeneity of available software tools makes the adoption of new data integration and analysis network tools a necessity in bioinformatics. ICT standards and tools, like Web Services and Workflow Management Systems (WMS), can support the creation and deployment of such systems. Many Web Services are already available and some WMS have been proposed. They assume that researchers know which bioinformatics resources can be reached through a programmatic interface and that they are skilled in programming and building workflows. Therefore, they are not viable to the majority of unskilled researchers. A portal enabling these to take profit from new technologies is still missing. Results We designed biowep, a web based client application that allows for the selection and execution of a set of predefined workflows. The system is available on-line. Biowep architecture includes a Workflow Manager, a User Interface and a Workflow Executor. The task of the Workflow Manager is the creation and annotation of workflows. These can be created by using either the Taverna Workbench or BioWMS. Enactment of workflows is carried out by FreeFluo for Taverna workflows and by BioAgent/Hermes, a mobile agent-based middleware, for BioWMS ones. Main workflows' processing steps are annotated on the basis of their input and output, elaboration type and application domain by using a classification of bioinformatics data and tasks. The interface supports users authentication and profiling. Workflows can be selected on the basis of users' profiles and can be searched through their annotations. Results can be saved. Conclusion We developed a web system that support the selection and execution of predefined workflows, thus simplifying access for all researchers. The implementation of Web Services allowing specialized software to interact with an exhaustive set of biomedical databases and analysis software and the creation of effective workflows can significantly improve automation of in-silico analysis. Biowep is available for interested researchers as a reference portal. They are invited to submit their workflows to the workflow repository. Biowep is further being developed in the sphere of the Laboratory of Interdisciplinary Technologies in Bioinformatics – LITBIO. PMID:17430563
Metavisitor, a Suite of Galaxy Tools for Simple and Rapid Detection and Discovery of Viruses in Deep Sequence Data

PubMed Central

Vernick, Kenneth D.

2017-01-01

Metavisitor is a software package that allows biologists and clinicians without specialized bioinformatics expertise to detect and assemble viral genomes from deep sequence datasets. The package is composed of a set of modular bioinformatic tools and workflows that are implemented in the Galaxy framework. Using the graphical Galaxy workflow editor, users with minimal computational skills can use existing Metavisitor workflows or adapt them to suit specific needs by adding or modifying analysis modules. Metavisitor works with DNA, RNA or small RNA sequencing data over a range of read lengths and can use a combination of de novo and guided approaches to assemble genomes from sequencing reads. We show that the software has the potential for quick diagnosis as well as discovery of viruses from a vast array of organisms. Importantly, we provide here executable Metavisitor use cases, which increase the accessibility and transparency of the software, ultimately enabling biologists or clinicians to focus on biological or medical questions. PMID:28045932
New hardware and workflows for semi-automated correlative cryo-fluorescence and cryo-electron microscopy/tomography.

PubMed

Schorb, Martin; Gaechter, Leander; Avinoam, Ori; Sieckmann, Frank; Clarke, Mairi; Bebeacua, Cecilia; Bykov, Yury S; Sonnen, Andreas F-P; Lihl, Reinhard; Briggs, John A G

2017-02-01

Correlative light and electron microscopy allows features of interest defined by fluorescence signals to be located in an electron micrograph of the same sample. Rare dynamic events or specific objects can be identified, targeted and imaged by electron microscopy or tomography. To combine it with structural studies using cryo-electron microscopy or tomography, fluorescence microscopy must be performed while maintaining the specimen vitrified at liquid-nitrogen temperatures and in a dry environment during imaging and transfer. Here we present instrumentation, software and an experimental workflow that improves the ease of use, throughput and performance of correlated cryo-fluorescence and cryo-electron microscopy. The new cryo-stage incorporates a specially modified high-numerical aperture objective lens and provides a stable and clean imaging environment. It is combined with a transfer shuttle for contamination-free loading of the specimen. Optimized microscope control software allows automated acquisition of the entire specimen area by cryo-fluorescence microscopy. The software also facilitates direct transfer of the fluorescence image and associated coordinates to the cryo-electron microscope for subsequent fluorescence-guided automated imaging. Here we describe these technological developments and present a detailed workflow, which we applied for automated cryo-electron microscopy and tomography of various specimens. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
Large Scale Software Building with CMake in ATLAS

NASA Astrophysics Data System (ADS)

Elmsheuser, J.; Krasznahorkay, A.; Obreshkov, E.; Undrus, A.; ATLAS Collaboration

2017-10-01

The offline software of the ATLAS experiment at the Large Hadron Collider (LHC) serves as the platform for detector data reconstruction, simulation and analysis. It is also used in the detector’s trigger system to select LHC collision events during data taking. The ATLAS offline software consists of several million lines of C++ and Python code organized in a modular design of more than 2000 specialized packages. Because of different workflows, many stable numbered releases are in parallel production use. To accommodate specific workflow requests, software patches with modified libraries are distributed on top of existing software releases on a daily basis. The different ATLAS software applications also require a flexible build system that strongly supports unit and integration tests. Within the last year this build system was migrated to CMake. A CMake configuration has been developed that allows one to easily set up and build the above mentioned software packages. This also makes it possible to develop and test new and modified packages on top of existing releases. The system also allows one to detect and execute partial rebuilds of the release based on single package changes. The build system makes use of CPack for building RPM packages out of the software releases, and CTest for running unit and integration tests. We report on the migration and integration of the ATLAS software to CMake and show working examples of this large scale project in production.
Introducing students to digital geological mapping: A workflow based on cheap hardware and free software

NASA Astrophysics Data System (ADS)

Vrabec, Marko; Dolžan, Erazem

2016-04-01

The undergraduate field course in Geological Mapping at the University of Ljubljana involves 20-40 students per year, which precludes the use of specialized rugged digital field equipment as the costs would be way beyond the capabilities of the Department. A different mapping area is selected each year with the aim to provide typical conditions that a professional geologist might encounter when doing fieldwork in Slovenia, which includes rugged relief, dense tree cover, and moderately-well- to poorly-exposed bedrock due to vegetation and urbanization. It is therefore mandatory that the digital tools and workflows are combined with classical methods of fieldwork, since, for example, full-time precise GNSS positioning is not viable under such circumstances. Additionally, due to the prevailing combination of complex geological structure with generally poor exposure, students cannot be expected to produce line (vector) maps of geological contacts on the go, so there is no need for such functionality in hardware and software that we use in the field. Our workflow therefore still relies on paper base maps, but is strongly complemented with digital tools to provide robust positioning, track recording, and acquisition of various point-based data. Primary field hardware are students' Android-based smartphones and optionally tablets. For our purposes, the built-in GNSS chips provide adequate positioning precision most of the time, particularly if they are GLONASS-capable. We use Oruxmaps, a powerful free offline map viewer for the Android platform, which facilitates the use of custom-made geopositioned maps. For digital base maps, which we prepare in free Windows QGIS software, we use scanned topographic maps provided by the National Geodetic Authority, but also other maps such as aerial imagery, processed Digital Elevation Models, scans of existing geological maps, etc. Point data, like important outcrop locations or structural measurements, are entered into Oruxmaps as waypoints. Students are also encouraged to directly measure structural data with specialized Android apps such as the MVE FieldMove Clino. Digital field data is exported from Oruxmaps to Windows computers primarily in the ubiquitous GPX data format and then integrated in the QGIS environment. Recorded GPX tracks are also used with the free Geosetter Windows software to geoposition and tag any digital photographs taken in the field. With minimal expenses, our workflow provides the students with basic familiarity and experience in using digital field tools and methods. The workflow is also practical enough for the prevailing field conditions of Slovenia that the faculty staff is using it in geological mapping for scientific research and consultancy work.
Language workbench user interfaces for data analysis

PubMed Central

Benson, Victoria M.

2015-01-01

Biological data analysis is frequently performed with command line software. While this practice provides considerable flexibility for computationally savy individuals, such as investigators trained in bioinformatics, this also creates a barrier to the widespread use of data analysis software by investigators trained as biologists and/or clinicians. Workflow systems such as Galaxy and Taverna have been developed to try and provide generic user interfaces that can wrap command line analysis software. These solutions are useful for problems that can be solved with workflows, and that do not require specialized user interfaces. However, some types of analyses can benefit from custom user interfaces. For instance, developing biomarker models from high-throughput data is a type of analysis that can be expressed more succinctly with specialized user interfaces. Here, we show how Language Workbench (LW) technology can be used to model the biomarker development and validation process. We developed a language that models the concepts of Dataset, Endpoint, Feature Selection Method and Classifier. These high-level language concepts map directly to abstractions that analysts who develop biomarker models are familiar with. We found that user interfaces developed in the Meta-Programming System (MPS) LW provide convenient means to configure a biomarker development project, to train models and view the validation statistics. We discuss several advantages of developing user interfaces for data analysis with a LW, including increased interface consistency, portability and extension by language composition. The language developed during this experiment is distributed as an MPS plugin (available at http://campagnelab.org/software/bdval-for-mps/). PMID:25755929
Identifying impact of software dependencies on replicability of biomedical workflows.

PubMed

Miksa, Tomasz; Rauber, Andreas; Mina, Eleni

2016-12-01

Complex data driven experiments form the basis of biomedical research. Recent findings warn that the context in which the software is run, that is the infrastructure and the third party dependencies, can have a crucial impact on the final results delivered by a computational experiment. This implies that in order to replicate the same result, not only the same data must be used, but also it must be run on an equivalent software stack. In this paper we present the VFramework that enables assessing replicability of workflows. It identifies whether any differences in software dependencies among two executions of the same workflow exist and whether they have impact on the produced results. We also conduct a case study in which we investigate the impact of software dependencies on replicability of Taverna workflows used in biomedical research of Huntington's disease. We re-execute analysed workflows in environments differing in operating system distribution and configuration. The results show that the VFramework can be used to identify the impact of software dependencies on the replicability of biomedical workflows. Furthermore, we observe that despite the fact that the workflows are executed in a controlled environment, they still depend on specific tools installed in the environment. The context model used by the VFramework improves the deficiencies of provenance traces and documents also such tools. Based on our findings we define guidelines for workflow owners that enable them to improve replicability of their workflows. Copyright Â© 2016 Elsevier Inc. All rights reserved.
Modeling Complex Workflow in Molecular Diagnostics

PubMed Central

Gomah, Mohamed E.; Turley, James P.; Lu, Huimin; Jones, Dan

2010-01-01

One of the hurdles to achieving personalized medicine has been implementing the laboratory processes for performing and reporting complex molecular tests. The rapidly changing test rosters and complex analysis platforms in molecular diagnostics have meant that many clinical laboratories still use labor-intensive manual processing and testing without the level of automation seen in high-volume chemistry and hematology testing. We provide here a discussion of design requirements and the results of implementation of a suite of lab management tools that incorporate the many elements required for use of molecular diagnostics in personalized medicine, particularly in cancer. These applications provide the functionality required for sample accessioning and tracking, material generation, and testing that are particular to the evolving needs of individualized molecular diagnostics. On implementation, the applications described here resulted in improvements in the turn-around time for reporting of more complex molecular test sets, and significant changes in the workflow. Therefore, careful mapping of workflow can permit design of software applications that simplify even the complex demands of specialized molecular testing. By incorporating design features for order review, software tools can permit a more personalized approach to sample handling and test selection without compromising efficiency. PMID:20007844
Introducing W.A.T.E.R.S.: a workflow for the alignment, taxonomy, and ecology of ribosomal sequences.

PubMed

Hartman, Amber L; Riddle, Sean; McPhillips, Timothy; Ludäscher, Bertram; Eisen, Jonathan A

2010-06-12

For more than two decades microbiologists have used a highly conserved microbial gene as a phylogenetic marker for bacteria and archaea. The small-subunit ribosomal RNA gene, also known as 16 S rRNA, is encoded by ribosomal DNA, 16 S rDNA, and has provided a powerful comparative tool to microbial ecologists. Over time, the microbial ecology field has matured from small-scale studies in a select number of environments to massive collections of sequence data that are paired with dozens of corresponding collection variables. As the complexity of data and tool sets have grown, the need for flexible automation and maintenance of the core processes of 16 S rDNA sequence analysis has increased correspondingly. We present WATERS, an integrated approach for 16 S rDNA analysis that bundles a suite of publicly available 16 S rDNA analysis software tools into a single software package. The "toolkit" includes sequence alignment, chimera removal, OTU determination, taxonomy assignment, phylogentic tree construction as well as a host of ecological analysis and visualization tools. WATERS employs a flexible, collection-oriented 'workflow' approach using the open-source Kepler system as a platform. By packaging available software tools into a single automated workflow, WATERS simplifies 16 S rDNA analyses, especially for those without specialized bioinformatics, programming expertise. In addition, WATERS, like some of the newer comprehensive rRNA analysis tools, allows researchers to minimize the time dedicated to carrying out tedious informatics steps and to focus their attention instead on the biological interpretation of the results. One advantage of WATERS over other comprehensive tools is that the use of the Kepler workflow system facilitates result interpretation and reproducibility via a data provenance sub-system. Furthermore, new "actors" can be added to the workflow as desired and we see WATERS as an initial seed for a sizeable and growing repository of interoperable, easy-to-combine tools for asking increasingly complex microbial ecology questions.
A practical data processing workflow for multi-OMICS projects.

PubMed

Kohl, Michael; Megger, Dominik A; Trippler, Martin; Meckel, Hagen; Ahrens, Maike; Bracht, Thilo; Weber, Frank; Hoffmann, Andreas-Claudius; Baba, Hideo A; Sitek, Barbara; Schlaak, Jörg F; Meyer, Helmut E; Stephan, Christian; Eisenacher, Martin

2014-01-01

Multi-OMICS approaches aim on the integration of quantitative data obtained for different biological molecules in order to understand their interrelation and the functioning of larger systems. This paper deals with several data integration and data processing issues that frequently occur within this context. To this end, the data processing workflow within the PROFILE project is presented, a multi-OMICS project that aims on identification of novel biomarkers and the development of new therapeutic targets for seven important liver diseases. Furthermore, a software called CrossPlatformCommander is sketched, which facilitates several steps of the proposed workflow in a semi-automatic manner. Application of the software is presented for the detection of novel biomarkers, their ranking and annotation with existing knowledge using the example of corresponding Transcriptomics and Proteomics data sets obtained from patients suffering from hepatocellular carcinoma. Additionally, a linear regression analysis of Transcriptomics vs. Proteomics data is presented and its performance assessed. It was shown, that for capturing profound relations between Transcriptomics and Proteomics data, a simple linear regression analysis is not sufficient and implementation and evaluation of alternative statistical approaches are needed. Additionally, the integration of multivariate variable selection and classification approaches is intended for further development of the software. Although this paper focuses only on the combination of data obtained from quantitative Proteomics and Transcriptomics experiments, several approaches and data integration steps are also applicable for other OMICS technologies. Keeping specific restrictions in mind the suggested workflow (or at least parts of it) may be used as a template for similar projects that make use of different high throughput techniques. This article is part of a Special Issue entitled: Computational Proteomics in the Post-Identification Era. Guest Editors: Martin Eisenacher and Christian Stephan. Copyright © 2013 Elsevier B.V. All rights reserved.
A workflow learning model to improve geovisual analytics utility

PubMed Central

Roth, Robert E; MacEachren, Alan M; McCabe, Craig A

2011-01-01

Introduction This paper describes the design and implementation of the G-EX Portal Learn Module, a web-based, geocollaborative application for organizing and distributing digital learning artifacts. G-EX falls into the broader context of geovisual analytics, a new research area with the goal of supporting visually-mediated reasoning about large, multivariate, spatiotemporal information. Because this information is unprecedented in amount and complexity, GIScientists are tasked with the development of new tools and techniques to make sense of it. Our research addresses the challenge of implementing these geovisual analytics tools and techniques in a useful manner. Objectives The objective of this paper is to develop and implement a method for improving the utility of geovisual analytics software. The success of software is measured by its usability (i.e., how easy the software is to use?) and utility (i.e., how useful the software is). The usability and utility of software can be improved by refining the software, increasing user knowledge about the software, or both. It is difficult to achieve transparent usability (i.e., software that is immediately usable without training) of geovisual analytics software because of the inherent complexity of the included tools and techniques. In these situations, improving user knowledge about the software through the provision of learning artifacts is as important, if not more so, than iterative refinement of the software itself. Therefore, our approach to improving utility is focused on educating the user. Methodology The research reported here was completed in two steps. First, we developed a model for learning about geovisual analytics software. Many existing digital learning models assist only with use of the software to complete a specific task and provide limited assistance with its actual application. To move beyond task-oriented learning about software use, we propose a process-oriented approach to learning based on the concept of scientific workflows. Second, we implemented an interface in the G-EX Portal Learn Module to demonstrate the workflow learning model. The workflow interface allows users to drag learning artifacts uploaded to the G-EX Portal onto a central whiteboard and then annotate the workflow using text and drawing tools. Once completed, users can visit the assembled workflow to get an idea of the kind, number, and scale of analysis steps, view individual learning artifacts associated with each node in the workflow, and ask questions about the overall workflow or individual learning artifacts through the associated forums. An example learning workflow in the domain of epidemiology is provided to demonstrate the effectiveness of the approach. Results/Conclusions In the context of geovisual analytics, GIScientists are not only responsible for developing software to facilitate visually-mediated reasoning about large and complex spatiotemporal information, but also for ensuring that this software works. The workflow learning model discussed in this paper and demonstrated in the G-EX Portal Learn Module is one approach to improving the utility of geovisual analytics software. While development of the G-EX Portal Learn Module is ongoing, we expect to release the G-EX Portal Learn Module by Summer 2009. PMID:21983545
A workflow learning model to improve geovisual analytics utility.

PubMed

Roth, Robert E; Maceachren, Alan M; McCabe, Craig A

2009-01-01

INTRODUCTION: This paper describes the design and implementation of the G-EX Portal Learn Module, a web-based, geocollaborative application for organizing and distributing digital learning artifacts. G-EX falls into the broader context of geovisual analytics, a new research area with the goal of supporting visually-mediated reasoning about large, multivariate, spatiotemporal information. Because this information is unprecedented in amount and complexity, GIScientists are tasked with the development of new tools and techniques to make sense of it. Our research addresses the challenge of implementing these geovisual analytics tools and techniques in a useful manner. OBJECTIVES: The objective of this paper is to develop and implement a method for improving the utility of geovisual analytics software. The success of software is measured by its usability (i.e., how easy the software is to use?) and utility (i.e., how useful the software is). The usability and utility of software can be improved by refining the software, increasing user knowledge about the software, or both. It is difficult to achieve transparent usability (i.e., software that is immediately usable without training) of geovisual analytics software because of the inherent complexity of the included tools and techniques. In these situations, improving user knowledge about the software through the provision of learning artifacts is as important, if not more so, than iterative refinement of the software itself. Therefore, our approach to improving utility is focused on educating the user. METHODOLOGY: The research reported here was completed in two steps. First, we developed a model for learning about geovisual analytics software. Many existing digital learning models assist only with use of the software to complete a specific task and provide limited assistance with its actual application. To move beyond task-oriented learning about software use, we propose a process-oriented approach to learning based on the concept of scientific workflows. Second, we implemented an interface in the G-EX Portal Learn Module to demonstrate the workflow learning model. The workflow interface allows users to drag learning artifacts uploaded to the G-EX Portal onto a central whiteboard and then annotate the workflow using text and drawing tools. Once completed, users can visit the assembled workflow to get an idea of the kind, number, and scale of analysis steps, view individual learning artifacts associated with each node in the workflow, and ask questions about the overall workflow or individual learning artifacts through the associated forums. An example learning workflow in the domain of epidemiology is provided to demonstrate the effectiveness of the approach. RESULTS/CONCLUSIONS: In the context of geovisual analytics, GIScientists are not only responsible for developing software to facilitate visually-mediated reasoning about large and complex spatiotemporal information, but also for ensuring that this software works. The workflow learning model discussed in this paper and demonstrated in the G-EX Portal Learn Module is one approach to improving the utility of geovisual analytics software. While development of the G-EX Portal Learn Module is ongoing, we expect to release the G-EX Portal Learn Module by Summer 2009.
The CASA Software Package

NASA Astrophysics Data System (ADS)

Petry, Dirk

2018-03-01

CASA is the standard science data analysis package for ALMA and VLA but it can also be used for the analysis of data from other observatories. In this talk, I will give an overview of the structure and features of CASA, who develops it, and the present status and plans, and then show typical analysis workflows for ALMA data with special emphasis on the handling of single dish data and its combination with interferometric data.
Integrating the Allen Brain Institute Cell Types Database into Automated Neuroscience Workflow.

PubMed

Stockton, David B; Santamaria, Fidel

2017-10-01

We developed software tools to download, extract features, and organize the Cell Types Database from the Allen Brain Institute (ABI) in order to integrate its whole cell patch clamp characterization data into the automated modeling/data analysis cycle. To expand the potential user base we employed both Python and MATLAB. The basic set of tools downloads selected raw data and extracts cell, sweep, and spike features, using ABI's feature extraction code. To facilitate data manipulation we added a tool to build a local specialized database of raw data plus extracted features. Finally, to maximize automation, we extended our NeuroManager workflow automation suite to include these tools plus a separate investigation database. The extended suite allows the user to integrate ABI experimental and modeling data into an automated workflow deployed on heterogeneous computer infrastructures, from local servers, to high performance computing environments, to the cloud. Since our approach is focused on workflow procedures our tools can be modified to interact with the increasing number of neuroscience databases being developed to cover all scales and properties of the nervous system.
Hermes: Seamless delivery of containerized bioinformatics workflows in hybrid cloud (HTC) environments

NASA Astrophysics Data System (ADS)

Kintsakis, Athanassios M.; Psomopoulos, Fotis E.; Symeonidis, Andreas L.; Mitkas, Pericles A.

Hermes introduces a new "describe once, run anywhere" paradigm for the execution of bioinformatics workflows in hybrid cloud environments. It combines the traditional features of parallelization-enabled workflow management systems and of distributed computing platforms in a container-based approach. It offers seamless deployment, overcoming the burden of setting up and configuring the software and network requirements. Most importantly, Hermes fosters the reproducibility of scientific workflows by supporting standardization of the software execution environment, thus leading to consistent scientific workflow results and accelerating scientific output.
DIaaS: Data-Intensive workflows as a service - Enabling easy composition and deployment of data-intensive workflows on Virtual Research Environments

NASA Astrophysics Data System (ADS)

Filgueira, R.; Ferreira da Silva, R.; Deelman, E.; Atkinson, M.

2016-12-01

We present the Data-Intensive workflows as a Service (DIaaS) model for enabling easy data-intensive workflow composition and deployment on clouds using containers. DIaaS model backbone is Asterism, an integrated solution for running data-intensive stream-based applications on heterogeneous systems, which combines the benefits of dispel4py with Pegasus workflow systems. The stream-based executions of an Asterism workflow are managed by dispel4py, while the data movement between different e-Infrastructures, and the coordination of the application execution are automatically managed by Pegasus. DIaaS combines Asterism framework with Docker containers to provide an integrated, complete, easy-to-use, portable approach to run data-intensive workflows on distributed platforms. Three containers integrate the DIaaS model: a Pegasus node, and an MPI and an Apache Storm clusters. Container images are described as Dockerfiles (available online at http://github.com/dispel4py/pegasus_dispel4py), linked to Docker Hub for providing continuous integration (automated image builds), and image storing and sharing. In this model, all required software (workflow systems and execution engines) for running scientific applications are packed into the containers, which significantly reduces the effort (and possible human errors) required by scientists or VRE administrators to build such systems. The most common use of DIaaS will be to act as a backend of VREs or Scientific Gateways to run data-intensive applications, deploying cloud resources upon request. We have demonstrated the feasibility of DIaaS using the data-intensive seismic ambient noise cross-correlation application (Figure 1). The application preprocesses (Phase1) and cross-correlates (Phase2) traces from several seismic stations. The application is submitted via Pegasus (Container1), and Phase1 and Phase2 are executed in the MPI (Container2) and Storm (Container3) clusters respectively. Although both phases could be executed within the same environment, this setup demonstrates the flexibility of DIaaS to run applications across e-Infrastructures. In summary, DIaaS delivers specialized software to execute data-intensive applications in a scalable, efficient, and robust manner reducing the engineering time and computational cost.
Integrating Visualizations into Modeling NEST Simulations

PubMed Central

Nowke, Christian; Zielasko, Daniel; Weyers, Benjamin; Peyser, Alexander; Hentschel, Bernd; Kuhlen, Torsten W.

2015-01-01

Modeling large-scale spiking neural networks showing realistic biological behavior in their dynamics is a complex and tedious task. Since these networks consist of millions of interconnected neurons, their simulation produces an immense amount of data. In recent years it has become possible to simulate even larger networks. However, solutions to assist researchers in understanding the simulation's complex emergent behavior by means of visualization are still lacking. While developing tools to partially fill this gap, we encountered the challenge to integrate these tools easily into the neuroscientists' daily workflow. To understand what makes this so challenging, we looked into the workflows of our collaborators and analyzed how they use the visualizations to solve their daily problems. We identified two major issues: first, the analysis process can rapidly change focus which requires to switch the visualization tool that assists in the current problem domain. Second, because of the heterogeneous data that results from simulations, researchers want to relate data to investigate these effectively. Since a monolithic application model, processing and visualizing all data modalities and reflecting all combinations of possible workflows in a holistic way, is most likely impossible to develop and to maintain, a software architecture that offers specialized visualization tools that run simultaneously and can be linked together to reflect the current workflow, is a more feasible approach. To this end, we have developed a software architecture that allows neuroscientists to integrate visualization tools more closely into the modeling tasks. In addition, it forms the basis for semantic linking of different visualizations to reflect the current workflow. In this paper, we present this architecture and substantiate the usefulness of our approach by common use cases we encountered in our collaborative work. PMID:26733860

Open source software projects of the caBIG In Vivo Imaging Workspace Software special interest group.

PubMed

Prior, Fred W; Erickson, Bradley J; Tarbox, Lawrence

2007-11-01

The Cancer Bioinformatics Grid (caBIG) program was created by the National Cancer Institute to facilitate sharing of IT infrastructure, data, and applications among the National Cancer Institute-sponsored cancer research centers. The program was launched in February 2004 and now links more than 50 cancer centers. In April 2005, the In Vivo Imaging Workspace was added to promote the use of imaging in cancer clinical trials. At the inaugural meeting, four special interest groups (SIGs) were established. The Software SIG was charged with identifying projects that focus on open-source software for image visualization and analysis. To date, two projects have been defined by the Software SIG. The eXtensible Imaging Platform project has produced a rapid application development environment that researchers may use to create targeted workflows customized for specific research projects. The Algorithm Validation Tools project will provide a set of tools and data structures that will be used to capture measurement information and associated needed to allow a gold standard to be defined for the given database against which change analysis algorithms can be tested. Through these and future efforts, the caBIG In Vivo Imaging Workspace Software SIG endeavors to advance imaging informatics and provide new open-source software tools to advance cancer research.
CamBAfx: Workflow Design, Implementation and Application for Neuroimaging

PubMed Central

Ooi, Cinly; Bullmore, Edward T.; Wink, Alle-Meije; Sendur, Levent; Barnes, Anna; Achard, Sophie; Aspden, John; Abbott, Sanja; Yue, Shigang; Kitzbichler, Manfred; Meunier, David; Maxim, Voichita; Salvador, Raymond; Henty, Julian; Tait, Roger; Subramaniam, Naresh; Suckling, John

2009-01-01

CamBAfx is a workflow application designed for both researchers who use workflows to process data (consumers) and those who design them (designers). It provides a front-end (user interface) optimized for data processing designed in a way familiar to consumers. The back-end uses a pipeline model to represent workflows since this is a common and useful metaphor used by designers and is easy to manipulate compared to other representations like programming scripts. As an Eclipse Rich Client Platform application, CamBAfx's pipelines and functions can be bundled with the software or downloaded post-installation. The user interface contains all the workflow facilities expected by consumers. Using the Eclipse Extension Mechanism designers are encouraged to customize CamBAfx for their own pipelines. CamBAfx wraps a workflow facility around neuroinformatics software without modification. CamBAfx's design, licensing and Eclipse Branding Mechanism allow it to be used as the user interface for other software, facilitating exchange of innovative computational tools between originating labs. PMID:19826470
Uav Photgrammetric Workflows: a best Practice Guideline

NASA Astrophysics Data System (ADS)

Federman, A.; Santana Quintero, M.; Kretz, S.; Gregg, J.; Lengies, M.; Ouimet, C.; Laliberte, J.

2017-08-01

The increasing commercialization of unmanned aerial vehicles (UAVs) has opened the possibility of performing low-cost aerial image acquisition for the documentation of cultural heritage sites through UAV photogrammetry. The flying of UAVs in Canada is regulated through Transport Canada and requires a Special Flight Operations Certificate (SFOC) in order to fly. Various image acquisition techniques have been explored in this review, as well as well software used to register the data. A general workflow procedure has been formulated based off of the literature reviewed. A case study example of using UAV photogrammetry at Prince of Wales Fort is discussed, specifically in relation to the data acquisition and processing. Some gaps in the literature reviewed highlight the need for streamlining the SFOC application process, and incorporating UAVs into cultural heritage documentation courses.
Nipype: a flexible, lightweight and extensible neuroimaging data processing framework in python.

PubMed

Gorgolewski, Krzysztof; Burns, Christopher D; Madison, Cindee; Clark, Dav; Halchenko, Yaroslav O; Waskom, Michael L; Ghosh, Satrajit S

2011-01-01

Current neuroimaging software offer users an incredible opportunity to analyze their data in different ways, with different underlying assumptions. Several sophisticated software packages (e.g., AFNI, BrainVoyager, FSL, FreeSurfer, Nipy, R, SPM) are used to process and analyze large and often diverse (highly multi-dimensional) data. However, this heterogeneous collection of specialized applications creates several issues that hinder replicable, efficient, and optimal use of neuroimaging analysis approaches: (1) No uniform access to neuroimaging analysis software and usage information; (2) No framework for comparative algorithm development and dissemination; (3) Personnel turnover in laboratories often limits methodological continuity and training new personnel takes time; (4) Neuroimaging software packages do not address computational efficiency; and (5) Methods sections in journal articles are inadequate for reproducing results. To address these issues, we present Nipype (Neuroimaging in Python: Pipelines and Interfaces; http://nipy.org/nipype), an open-source, community-developed, software package, and scriptable library. Nipype solves the issues by providing Interfaces to existing neuroimaging software with uniform usage semantics and by facilitating interaction between these packages using Workflows. Nipype provides an environment that encourages interactive exploration of algorithms, eases the design of Workflows within and between packages, allows rapid comparative development of algorithms and reduces the learning curve necessary to use different packages. Nipype supports both local and remote execution on multi-core machines and clusters, without additional scripting. Nipype is Berkeley Software Distribution licensed, allowing anyone unrestricted usage. An open, community-driven development philosophy allows the software to quickly adapt and address the varied needs of the evolving neuroimaging community, especially in the context of increasing demand for reproducible research.
Nipype: A Flexible, Lightweight and Extensible Neuroimaging Data Processing Framework in Python

PubMed Central

Gorgolewski, Krzysztof; Burns, Christopher D.; Madison, Cindee; Clark, Dav; Halchenko, Yaroslav O.; Waskom, Michael L.; Ghosh, Satrajit S.

2011-01-01

Current neuroimaging software offer users an incredible opportunity to analyze their data in different ways, with different underlying assumptions. Several sophisticated software packages (e.g., AFNI, BrainVoyager, FSL, FreeSurfer, Nipy, R, SPM) are used to process and analyze large and often diverse (highly multi-dimensional) data. However, this heterogeneous collection of specialized applications creates several issues that hinder replicable, efficient, and optimal use of neuroimaging analysis approaches: (1) No uniform access to neuroimaging analysis software and usage information; (2) No framework for comparative algorithm development and dissemination; (3) Personnel turnover in laboratories often limits methodological continuity and training new personnel takes time; (4) Neuroimaging software packages do not address computational efficiency; and (5) Methods sections in journal articles are inadequate for reproducing results. To address these issues, we present Nipype (Neuroimaging in Python: Pipelines and Interfaces; http://nipy.org/nipype), an open-source, community-developed, software package, and scriptable library. Nipype solves the issues by providing Interfaces to existing neuroimaging software with uniform usage semantics and by facilitating interaction between these packages using Workflows. Nipype provides an environment that encourages interactive exploration of algorithms, eases the design of Workflows within and between packages, allows rapid comparative development of algorithms and reduces the learning curve necessary to use different packages. Nipype supports both local and remote execution on multi-core machines and clusters, without additional scripting. Nipype is Berkeley Software Distribution licensed, allowing anyone unrestricted usage. An open, community-driven development philosophy allows the software to quickly adapt and address the varied needs of the evolving neuroimaging community, especially in the context of increasing demand for reproducible research. PMID:21897815
OIPAV: an integrated software system for ophthalmic image processing, analysis and visualization

NASA Astrophysics Data System (ADS)

Zhang, Lichun; Xiang, Dehui; Jin, Chao; Shi, Fei; Yu, Kai; Chen, Xinjian

2018-03-01

OIPAV (Ophthalmic Images Processing, Analysis and Visualization) is a cross-platform software which is specially oriented to ophthalmic images. It provides a wide range of functionalities including data I/O, image processing, interaction, ophthalmic diseases detection, data analysis and visualization to help researchers and clinicians deal with various ophthalmic images such as optical coherence tomography (OCT) images and color photo of fundus, etc. It enables users to easily access to different ophthalmic image data manufactured from different imaging devices, facilitate workflows of processing ophthalmic images and improve quantitative evaluations. In this paper, we will present the system design and functional modules of the platform and demonstrate various applications. With a satisfying function scalability and expandability, we believe that the software can be widely applied in ophthalmology field.
Enhancing and Customizing Laboratory Information Systems to Improve/Enhance Pathologist Workflow.

PubMed

Hartman, Douglas J

2015-06-01

Optimizing pathologist workflow can be difficult because it is affected by many variables. Surgical pathologists must complete many tasks that culminate in a final pathology report. Several software systems can be used to enhance/improve pathologist workflow. These include voice recognition software, pre-sign-out quality assurance, image utilization, and computerized provider order entry. Recent changes in the diagnostic coding and the more prominent role of centralized electronic health records represent potential areas for increased ways to enhance/improve the workflow for surgical pathologists. Additional unforeseen changes to the pathologist workflow may accompany the introduction of whole-slide imaging technology to the routine diagnostic work. Copyright © 2015 Elsevier Inc. All rights reserved.
Enhancing and Customizing Laboratory Information Systems to Improve/Enhance Pathologist Workflow.

PubMed

Hartman, Douglas J

2016-03-01

Optimizing pathologist workflow can be difficult because it is affected by many variables. Surgical pathologists must complete many tasks that culminate in a final pathology report. Several software systems can be used to enhance/improve pathologist workflow. These include voice recognition software, pre-sign-out quality assurance, image utilization, and computerized provider order entry. Recent changes in the diagnostic coding and the more prominent role of centralized electronic health records represent potential areas for increased ways to enhance/improve the workflow for surgical pathologists. Additional unforeseen changes to the pathologist workflow may accompany the introduction of whole-slide imaging technology to the routine diagnostic work. Copyright © 2016 Elsevier Inc. All rights reserved.
Reproducible Bioconductor workflows using browser-based interactive notebooks and containers.

PubMed

Almugbel, Reem; Hung, Ling-Hong; Hu, Jiaming; Almutairy, Abeer; Ortogero, Nicole; Tamta, Yashaswi; Yeung, Ka Yee

2018-01-01

Bioinformatics publications typically include complex software workflows that are difficult to describe in a manuscript. We describe and demonstrate the use of interactive software notebooks to document and distribute bioinformatics research. We provide a user-friendly tool, BiocImageBuilder, that allows users to easily distribute their bioinformatics protocols through interactive notebooks uploaded to either a GitHub repository or a private server. We present four different interactive Jupyter notebooks using R and Bioconductor workflows to infer differential gene expression, analyze cross-platform datasets, process RNA-seq data and KinomeScan data. These interactive notebooks are available on GitHub. The analytical results can be viewed in a browser. Most importantly, the software contents can be executed and modified. This is accomplished using Binder, which runs the notebook inside software containers, thus avoiding the need to install any software and ensuring reproducibility. All the notebooks were produced using custom files generated by BiocImageBuilder. BiocImageBuilder facilitates the publication of workflows with a point-and-click user interface. We demonstrate that interactive notebooks can be used to disseminate a wide range of bioinformatics analyses. The use of software containers to mirror the original software environment ensures reproducibility of results. Parameters and code can be dynamically modified, allowing for robust verification of published results and encouraging rapid adoption of new methods. Given the increasing complexity of bioinformatics workflows, we anticipate that these interactive software notebooks will become as necessary for documenting software methods as traditional laboratory notebooks have been for documenting bench protocols, and as ubiquitous. © The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com
State of the art in pathology business process analysis, modeling, design and optimization.

PubMed

Schrader, Thomas; Blobel, Bernd; García-Rojo, Marcial; Daniel, Christel; Słodkowska, Janina

2012-01-01

For analyzing current workflows and processes, for improving them, for quality management and quality assurance, for integrating hardware and software components, but also for education, training and communication between different domains' experts, modeling business process in a pathology department is inevitable. The authors highlight three main processes in pathology: general diagnostic, cytology diagnostic, and autopsy. In this chapter, those processes are formally modeled and described in detail. Finally, specialized processes such as immunohistochemistry and frozen section have been considered.
Science Gateways, Scientific Workflows and Open Community Software

NASA Astrophysics Data System (ADS)

Pierce, M. E.; Marru, S.

2014-12-01

Science gateways and scientific workflows occupy different ends of the spectrum of user-focused cyberinfrastructure. Gateways, sometimes called science portals, provide a way for enabling large numbers of users to take advantage of advanced computing resources (supercomputers, advanced storage systems, science clouds) by providing Web and desktop interfaces and supporting services. Scientific workflows, at the other end of the spectrum, support advanced usage of cyberinfrastructure that enable "power users" to undertake computational experiments that are not easily done through the usual mechanisms (managing simulations across multiple sites, for example). Despite these different target communities, gateways and workflows share many similarities and can potentially be accommodated by the same software system. For example, pipelines to process InSAR imagery sets or to datamine GPS time series data are workflows. The results and the ability to make downstream products may be made available through a gateway, and power users may want to provide their own custom pipelines. In this abstract, we discuss our efforts to build an open source software system, Apache Airavata, that can accommodate both gateway and workflow use cases. Our approach is general, and we have applied the software to problems in a number of scientific domains. In this talk, we discuss our applications to usage scenarios specific to earth science, focusing on earthquake physics examples drawn from the QuakSim.org and GeoGateway.org efforts. We also examine the role of the Apache Software Foundation's open community model as a way to build up common commmunity codes that do not depend upon a single "owner" to sustain. Pushing beyond open source software, we also see the need to provide gateways and workflow systems as cloud services. These services centralize operations, provide well-defined programming interfaces, scale elastically, and have global-scale fault tolerance. We discuss our work providing Apache Airavata as a hosted service to provide these features.
Eleven quick tips for architecting biomedical informatics workflows with cloud computing.

PubMed

Cole, Brian S; Moore, Jason H

2018-03-01

Cloud computing has revolutionized the development and operations of hardware and software across diverse technological arenas, yet academic biomedical research has lagged behind despite the numerous and weighty advantages that cloud computing offers. Biomedical researchers who embrace cloud computing can reap rewards in cost reduction, decreased development and maintenance workload, increased reproducibility, ease of sharing data and software, enhanced security, horizontal and vertical scalability, high availability, a thriving technology partner ecosystem, and much more. Despite these advantages that cloud-based workflows offer, the majority of scientific software developed in academia does not utilize cloud computing and must be migrated to the cloud by the user. In this article, we present 11 quick tips for architecting biomedical informatics workflows on compute clouds, distilling knowledge gained from experience developing, operating, maintaining, and distributing software and virtualized appliances on the world's largest cloud. Researchers who follow these tips stand to benefit immediately by migrating their workflows to cloud computing and embracing the paradigm of abstraction.
Eleven quick tips for architecting biomedical informatics workflows with cloud computing

PubMed Central

Moore, Jason H.

2018-01-01

Cloud computing has revolutionized the development and operations of hardware and software across diverse technological arenas, yet academic biomedical research has lagged behind despite the numerous and weighty advantages that cloud computing offers. Biomedical researchers who embrace cloud computing can reap rewards in cost reduction, decreased development and maintenance workload, increased reproducibility, ease of sharing data and software, enhanced security, horizontal and vertical scalability, high availability, a thriving technology partner ecosystem, and much more. Despite these advantages that cloud-based workflows offer, the majority of scientific software developed in academia does not utilize cloud computing and must be migrated to the cloud by the user. In this article, we present 11 quick tips for architecting biomedical informatics workflows on compute clouds, distilling knowledge gained from experience developing, operating, maintaining, and distributing software and virtualized appliances on the world’s largest cloud. Researchers who follow these tips stand to benefit immediately by migrating their workflows to cloud computing and embracing the paradigm of abstraction. PMID:29596416
Developing integrated workflows for the digitisation of herbarium specimens using a modular and scalable approach.

PubMed

Haston, Elspeth; Cubey, Robert; Pullan, Martin; Atkins, Hannah; Harris, David J

2012-01-01

Digitisation programmes in many institutes frequently involve disparate and irregular funding, diverse selection criteria and scope, with different members of staff managing and operating the processes. These factors have influenced the decision at the Royal Botanic Garden Edinburgh to develop an integrated workflow for the digitisation of herbarium specimens which is modular and scalable to enable a single overall workflow to be used for all digitisation projects. This integrated workflow is comprised of three principal elements: a specimen workflow, a data workflow and an image workflow.The specimen workflow is strongly linked to curatorial processes which will impact on the prioritisation, selection and preparation of the specimens. The importance of including a conservation element within the digitisation workflow is highlighted. The data workflow includes the concept of three main categories of collection data: label data, curatorial data and supplementary data. It is shown that each category of data has its own properties which influence the timing of data capture within the workflow. Development of software has been carried out for the rapid capture of curatorial data, and optical character recognition (OCR) software is being used to increase the efficiency of capturing label data and supplementary data. The large number and size of the images has necessitated the inclusion of automated systems within the image workflow.
Workflow-Based Software Development Environment

NASA Technical Reports Server (NTRS)

Izygon, Michel E.

2013-01-01

The Software Developer's Assistant (SDA) helps software teams more efficiently and accurately conduct or execute software processes associated with NASA mission-critical software. SDA is a process enactment platform that guides software teams through project-specific standards, processes, and procedures. Software projects are decomposed into all of their required process steps or tasks, and each task is assigned to project personnel. SDA orchestrates the performance of work required to complete all process tasks in the correct sequence. The software then notifies team members when they may begin work on their assigned tasks and provides the tools, instructions, reference materials, and supportive artifacts that allow users to compliantly perform the work. A combination of technology components captures and enacts any software process use to support the software lifecycle. It creates an adaptive workflow environment that can be modified as needed. SDA achieves software process automation through a Business Process Management (BPM) approach to managing the software lifecycle for mission-critical projects. It contains five main parts: TieFlow (workflow engine), Business Rules (rules to alter process flow), Common Repository (storage for project artifacts, versions, history, schedules, etc.), SOA (interface to allow internal, GFE, or COTS tools integration), and the Web Portal Interface (collaborative web environment
Agile parallel bioinformatics workflow management using Pwrake.

PubMed

Mishima, Hiroyuki; Sasaki, Kensaku; Tanaka, Masahiro; Tatebe, Osamu; Yoshiura, Koh-Ichiro

2011-09-08

In bioinformatics projects, scientific workflow systems are widely used to manage computational procedures. Full-featured workflow systems have been proposed to fulfil the demand for workflow management. However, such systems tend to be over-weighted for actual bioinformatics practices. We realize that quick deployment of cutting-edge software implementing advanced algorithms and data formats, and continuous adaptation to changes in computational resources and the environment are often prioritized in scientific workflow management. These features have a greater affinity with the agile software development method through iterative development phases after trial and error.Here, we show the application of a scientific workflow system Pwrake to bioinformatics workflows. Pwrake is a parallel workflow extension of Ruby's standard build tool Rake, the flexibility of which has been demonstrated in the astronomy domain. Therefore, we hypothesize that Pwrake also has advantages in actual bioinformatics workflows. We implemented the Pwrake workflows to process next generation sequencing data using the Genomic Analysis Toolkit (GATK) and Dindel. GATK and Dindel workflows are typical examples of sequential and parallel workflows, respectively. We found that in practice, actual scientific workflow development iterates over two phases, the workflow definition phase and the parameter adjustment phase. We introduced separate workflow definitions to help focus on each of the two developmental phases, as well as helper methods to simplify the descriptions. This approach increased iterative development efficiency. Moreover, we implemented combined workflows to demonstrate modularity of the GATK and Dindel workflows. Pwrake enables agile management of scientific workflows in the bioinformatics domain. The internal domain specific language design built on Ruby gives the flexibility of rakefiles for writing scientific workflows. Furthermore, readability and maintainability of rakefiles may facilitate sharing workflows among the scientific community. Workflows for GATK and Dindel are available at http://github.com/misshie/Workflows.
Agile parallel bioinformatics workflow management using Pwrake

PubMed Central

2011-01-01

Background In bioinformatics projects, scientific workflow systems are widely used to manage computational procedures. Full-featured workflow systems have been proposed to fulfil the demand for workflow management. However, such systems tend to be over-weighted for actual bioinformatics practices. We realize that quick deployment of cutting-edge software implementing advanced algorithms and data formats, and continuous adaptation to changes in computational resources and the environment are often prioritized in scientific workflow management. These features have a greater affinity with the agile software development method through iterative development phases after trial and error. Here, we show the application of a scientific workflow system Pwrake to bioinformatics workflows. Pwrake is a parallel workflow extension of Ruby's standard build tool Rake, the flexibility of which has been demonstrated in the astronomy domain. Therefore, we hypothesize that Pwrake also has advantages in actual bioinformatics workflows. Findings We implemented the Pwrake workflows to process next generation sequencing data using the Genomic Analysis Toolkit (GATK) and Dindel. GATK and Dindel workflows are typical examples of sequential and parallel workflows, respectively. We found that in practice, actual scientific workflow development iterates over two phases, the workflow definition phase and the parameter adjustment phase. We introduced separate workflow definitions to help focus on each of the two developmental phases, as well as helper methods to simplify the descriptions. This approach increased iterative development efficiency. Moreover, we implemented combined workflows to demonstrate modularity of the GATK and Dindel workflows. Conclusions Pwrake enables agile management of scientific workflows in the bioinformatics domain. The internal domain specific language design built on Ruby gives the flexibility of rakefiles for writing scientific workflows. Furthermore, readability and maintainability of rakefiles may facilitate sharing workflows among the scientific community. Workflows for GATK and Dindel are available at http://github.com/misshie/Workflows. PMID:21899774
Workflow oriented software support for image guided radiofrequency ablation of focal liver malignancies

NASA Astrophysics Data System (ADS)

Weihusen, Andreas; Ritter, Felix; Kröger, Tim; Preusser, Tobias; Zidowitz, Stephan; Peitgen, Heinz-Otto

2007-03-01

Image guided radiofrequency (RF) ablation has taken a significant part in the clinical routine as a minimally invasive method for the treatment of focal liver malignancies. Medical imaging is used in all parts of the clinical workflow of an RF ablation, incorporating treatment planning, interventional targeting and result assessment. This paper describes a software application, which has been designed to support the RF ablation workflow under consideration of the requirements of clinical routine, such as easy user interaction and a high degree of robust and fast automatic procedures, in order to keep the physician from spending too much time at the computer. The application therefore provides a collection of specialized image processing and visualization methods for treatment planning and result assessment. The algorithms are adapted to CT as well as to MR imaging. The planning support contains semi-automatic methods for the segmentation of liver tumors and the surrounding vascular system as well as an interactive virtual positioning of RF applicators and a concluding numerical estimation of the achievable heat distribution. The assessment of the ablation result is supported by the segmentation of the coagulative necrosis and an interactive registration of pre- and post-interventional image data for the comparison of tumor and necrosis segmentation masks. An automatic quantification of surface distances is performed to verify the embedding of the tumor area into the thermal lesion area. The visualization methods support representations in the commonly used orthogonal 2D view as well as in 3D scenes.
Developing integrated workflows for the digitisation of herbarium specimens using a modular and scalable approach

PubMed Central

Haston, Elspeth; Cubey, Robert; Pullan, Martin; Atkins, Hannah; Harris, David J

2012-01-01

Abstract Digitisation programmes in many institutes frequently involve disparate and irregular funding, diverse selection criteria and scope, with different members of staff managing and operating the processes. These factors have influenced the decision at the Royal Botanic Garden Edinburgh to develop an integrated workflow for the digitisation of herbarium specimens which is modular and scalable to enable a single overall workflow to be used for all digitisation projects. This integrated workflow is comprised of three principal elements: a specimen workflow, a data workflow and an image workflow. The specimen workflow is strongly linked to curatorial processes which will impact on the prioritisation, selection and preparation of the specimens. The importance of including a conservation element within the digitisation workflow is highlighted. The data workflow includes the concept of three main categories of collection data: label data, curatorial data and supplementary data. It is shown that each category of data has its own properties which influence the timing of data capture within the workflow. Development of software has been carried out for the rapid capture of curatorial data, and optical character recognition (OCR) software is being used to increase the efficiency of capturing label data and supplementary data. The large number and size of the images has necessitated the inclusion of automated systems within the image workflow. PMID:22859881
Optimizing CyberShake Seismic Hazard Workflows for Large HPC Resources

NASA Astrophysics Data System (ADS)

Callaghan, S.; Maechling, P. J.; Juve, G.; Vahi, K.; Deelman, E.; Jordan, T. H.

2014-12-01

The CyberShake computational platform is a well-integrated collection of scientific software and middleware that calculates 3D simulation-based probabilistic seismic hazard curves and hazard maps for the Los Angeles region. Currently each CyberShake model comprises about 235 million synthetic seismograms from about 415,000 rupture variations computed at 286 sites. CyberShake integrates large-scale parallel and high-throughput serial seismological research codes into a processing framework in which early stages produce files used as inputs by later stages. Scientific workflow tools are used to manage the jobs, data, and metadata. The Southern California Earthquake Center (SCEC) developed the CyberShake platform using USC High Performance Computing and Communications systems and open-science NSF resources.CyberShake calculations were migrated to the NSF Track 1 system NCSA Blue Waters when it became operational in 2013, via an interdisciplinary team approach including domain scientists, computer scientists, and middleware developers. Due to the excellent performance of Blue Waters and CyberShake software optimizations, we reduced the makespan (a measure of wallclock time-to-solution) of a CyberShake study from 1467 to 342 hours. We will describe the technical enhancements behind this improvement, including judicious introduction of new GPU software, improved scientific software components, increased workflow-based automation, and Blue Waters-specific workflow optimizations.Our CyberShake performance improvements highlight the benefits of scientific workflow tools. The CyberShake workflow software stack includes the Pegasus Workflow Management System (Pegasus-WMS, which includes Condor DAGMan), HTCondor, and Globus GRAM, with Pegasus-mpi-cluster managing the high-throughput tasks on the HPC resources. The workflow tools handle data management, automatically transferring about 13 TB back to SCEC storage.We will present performance metrics from the most recent CyberShake study, executed on Blue Waters. We will compare the performance of CPU and GPU versions of our large-scale parallel wave propagation code, AWP-ODC-SGT. Finally, we will discuss how these enhancements have enabled SCEC to move forward with plans to increase the CyberShake simulation frequency to 1.0 Hz.

Bridging experiment and theory: A template for unifying NMR data and electronic structure calculations

DOE PAGES

Brown, David M. L.; Cho, Herman; de Jong, Wibe A.

2016-02-09

Here, the testing of theoretical models with experimental data is an integral part of the scientific method, and a logical place to search for new ways of stimulating scientific productivity. Often experiment/theory comparisons may be viewed as a workflow comprised of well-defined, rote operations distributed over several distinct computers, as exemplified by the way in which predictions from electronic structure theories are evaluated with results from spectroscopic experiments. For workflows such as this, which may be laborious and time consuming to perform manually, software that could orchestrate the operations and transfer results between computers in a seamless and automated fashionmore » would offer major efficiency gains. Such tools also promise to alter how researchers interact with data outside their field of specialization by, e.g., making raw experimental results more accessible to theorists, and the outputs of theoretical calculations more readily comprehended by experimentalists.« less
Bridging experiment and theory: A template for unifying NMR data and electronic structure calculations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brown, David M. L.; Cho, Herman; de Jong, Wibe A.

Here, the testing of theoretical models with experimental data is an integral part of the scientific method, and a logical place to search for new ways of stimulating scientific productivity. Often experiment/theory comparisons may be viewed as a workflow comprised of well-defined, rote operations distributed over several distinct computers, as exemplified by the way in which predictions from electronic structure theories are evaluated with results from spectroscopic experiments. For workflows such as this, which may be laborious and time consuming to perform manually, software that could orchestrate the operations and transfer results between computers in a seamless and automated fashionmore » would offer major efficiency gains. Such tools also promise to alter how researchers interact with data outside their field of specialization by, e.g., making raw experimental results more accessible to theorists, and the outputs of theoretical calculations more readily comprehended by experimentalists.« less
Developing a Taxonomy of Characteristics and Features of Collaboration Tools for Teams in Distributed Environments

DTIC Science & Technology

2007-09-01

Motion URL: http://www.blackberry.com/products/blackberry/index.shtml Software Name: Bricolage Company: Bricolage URL: http://www.bricolage.cc...Workflow Customizable control over editorial content. Bricolage Bricolage Feature Description Software Company Workflow Allows development...content for Nuxeo Collaborative Portal projects. Nuxeo Workspace Add, edit, delete, content through web interface. Bricolage Bricolage
Workflow in interventional radiology: nerve blocks and facet blocks

NASA Astrophysics Data System (ADS)

Siddoway, Donald; Ingeholm, Mary Lou; Burgert, Oliver; Neumuth, Thomas; Watson, Vance; Cleary, Kevin

2006-03-01

Workflow analysis has the potential to dramatically improve the efficiency and clinical outcomes of medical procedures. In this study, we recorded the workflow for nerve block and facet block procedures in the interventional radiology suite at Georgetown University Hospital in Washington, DC, USA. We employed a custom client/server software architecture developed by the Innovation Center for Computer Assisted Surgery (ICCAS) at the University of Leipzig, Germany. This software runs in an internet browser, and allows the user to record the actions taken by the physician during a procedure. The data recorded during the procedure is stored as an XML document, which can then be further processed. We have successfully gathered data on a number if cases using a tablet PC, and these preliminary results show the feasibility of using this software in an interventional radiology setting. We are currently accruing additional cases and when more data has been collected we will analyze the workflow of these procedures to look for inefficiencies and potential improvements.
Big data analytics workflow management for eScience

NASA Astrophysics Data System (ADS)

Fiore, Sandro; D'Anca, Alessandro; Palazzo, Cosimo; Elia, Donatello; Mariello, Andrea; Nassisi, Paola; Aloisio, Giovanni

2015-04-01

In many domains such as climate and astrophysics, scientific data is often n-dimensional and requires tools that support specialized data types and primitives if it is to be properly stored, accessed, analysed and visualized. Currently, scientific data analytics relies on domain-specific software and libraries providing a huge set of operators and functionalities. However, most of these software fail at large scale since they: (i) are desktop based, rely on local computing capabilities and need the data locally; (ii) cannot benefit from available multicore/parallel machines since they are based on sequential codes; (iii) do not provide declarative languages to express scientific data analysis tasks, and (iv) do not provide newer or more scalable storage models to better support the data multidimensionality. Additionally, most of them: (v) are domain-specific, which also means they support a limited set of data formats, and (vi) do not provide a workflow support, to enable the construction, execution and monitoring of more complex "experiments". The Ophidia project aims at facing most of the challenges highlighted above by providing a big data analytics framework for eScience. Ophidia provides several parallel operators to manipulate large datasets. Some relevant examples include: (i) data sub-setting (slicing and dicing), (ii) data aggregation, (iii) array-based primitives (the same operator applies to all the implemented UDF extensions), (iv) data cube duplication, (v) data cube pivoting, (vi) NetCDF-import and export. Metadata operators are available too. Additionally, the Ophidia framework provides array-based primitives to perform data sub-setting, data aggregation (i.e. max, min, avg), array concatenation, algebraic expressions and predicate evaluation on large arrays of scientific data. Bit-oriented plugins have also been implemented to manage binary data cubes. Defining processing chains and workflows with tens, hundreds of data analytics operators is the real challenge in many practical scientific use cases. This talk will specifically address the main needs, requirements and challenges regarding data analytics workflow management applied to large scientific datasets. Three real use cases concerning analytics workflows for sea situational awareness, fire danger prevention, climate change and biodiversity will be discussed in detail.
A Model of Workflow Composition for Emergency Management

NASA Astrophysics Data System (ADS)

Xin, Chen; Bin-ge, Cui; Feng, Zhang; Xue-hui, Xu; Shan-shan, Fu

The common-used workflow technology is not flexible enough in dealing with concurrent emergency situations. The paper proposes a novel model for defining emergency plans, in which workflow segments appear as a constituent part. A formal abstraction, which contains four operations, is defined to compose workflow segments under constraint rule. The software system of the business process resources construction and composition is implemented and integrated into Emergency Plan Management Application System.
A comprehensive evaluation of popular proteomics software workflows for label-free proteome quantification and imputation.

PubMed

Välikangas, Tommi; Suomi, Tomi; Elo, Laura L

2017-05-31

Label-free mass spectrometry (MS) has developed into an important tool applied in various fields of biological and life sciences. Several software exist to process the raw MS data into quantified protein abundances, including open source and commercial solutions. Each software includes a set of unique algorithms for different tasks of the MS data processing workflow. While many of these algorithms have been compared separately, a thorough and systematic evaluation of their overall performance is missing. Moreover, systematic information is lacking about the amount of missing values produced by the different proteomics software and the capabilities of different data imputation methods to account for them.In this study, we evaluated the performance of five popular quantitative label-free proteomics software workflows using four different spike-in data sets. Our extensive testing included the number of proteins quantified and the number of missing values produced by each workflow, the accuracy of detecting differential expression and logarithmic fold change and the effect of different imputation and filtering methods on the differential expression results. We found that the Progenesis software performed consistently well in the differential expression analysis and produced few missing values. The missing values produced by the other software decreased their performance, but this difference could be mitigated using proper data filtering or imputation methods. Among the imputation methods, we found that the local least squares (lls) regression imputation consistently increased the performance of the software in the differential expression analysis, and a combination of both data filtering and local least squares imputation increased performance the most in the tested data sets. © The Author 2017. Published by Oxford University Press.
AutoDrug: fully automated macromolecular crystallography workflows for fragment-based drug discovery

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tsai, Yingssu; Stanford University, 333 Campus Drive, Mudd Building, Stanford, CA 94305-5080; McPhillips, Scott E.

New software has been developed for automating the experimental and data-processing stages of fragment-based drug discovery at a macromolecular crystallography beamline. A new workflow-automation framework orchestrates beamline-control and data-analysis software while organizing results from multiple samples. AutoDrug is software based upon the scientific workflow paradigm that integrates the Stanford Synchrotron Radiation Lightsource macromolecular crystallography beamlines and third-party processing software to automate the crystallography steps of the fragment-based drug-discovery process. AutoDrug screens a cassette of fragment-soaked crystals, selects crystals for data collection based on screening results and user-specified criteria and determines optimal data-collection strategies. It then collects and processes diffraction data,more » performs molecular replacement using provided models and detects electron density that is likely to arise from bound fragments. All processes are fully automated, i.e. are performed without user interaction or supervision. Samples can be screened in groups corresponding to particular proteins, crystal forms and/or soaking conditions. A single AutoDrug run is only limited by the capacity of the sample-storage dewar at the beamline: currently 288 samples. AutoDrug was developed in conjunction with RestFlow, a new scientific workflow-automation framework. RestFlow simplifies the design of AutoDrug by managing the flow of data and the organization of results and by orchestrating the execution of computational pipeline steps. It also simplifies the execution and interaction of third-party programs and the beamline-control system. Modeling AutoDrug as a scientific workflow enables multiple variants that meet the requirements of different user groups to be developed and supported. A workflow tailored to mimic the crystallography stages comprising the drug-discovery pipeline of CoCrystal Discovery Inc. has been deployed and successfully demonstrated. This workflow was run once on the same 96 samples that the group had examined manually and the workflow cycled successfully through all of the samples, collected data from the same samples that were selected manually and located the same peaks of unmodeled density in the resulting difference Fourier maps.« less
Observing System Simulation Experiment (OSSE) for the HyspIRI Spectrometer Mission

NASA Technical Reports Server (NTRS)

Turmon, Michael J.; Block, Gary L.; Green, Robert O.; Hua, Hook; Jacob, Joseph C.; Sobel, Harold R.; Springer, Paul L.; Zhang, Qingyuan

2010-01-01

The OSSE software provides an integrated end-to-end environment to simulate an Earth observing system by iteratively running a distributed modeling workflow based on the HyspIRI Mission, including atmospheric radiative transfer, surface albedo effects, detection, and retrieval for agile exploration of the mission design space. The software enables an Observing System Simulation Experiment (OSSE) and can be used for design trade space exploration of science return for proposed instruments by modeling the whole ground truth, sensing, and retrieval chain and to assess retrieval accuracy for a particular instrument and algorithm design. The OSSE in fra struc ture is extensible to future National Research Council (NRC) Decadal Survey concept missions where integrated modeling can improve the fidelity of coupled science and engineering analyses for systematic analysis and science return studies. This software has a distributed architecture that gives it a distinct advantage over other similar efforts. The workflow modeling components are typically legacy computer programs implemented in a variety of programming languages, including MATLAB, Excel, and FORTRAN. Integration of these diverse components is difficult and time-consuming. In order to hide this complexity, each modeling component is wrapped as a Web Service, and each component is able to pass analysis parameterizations, such as reflectance or radiance spectra, on to the next component downstream in the service workflow chain. In this way, the interface to each modeling component becomes uniform and the entire end-to-end workflow can be run using any existing or custom workflow processing engine. The architecture lets users extend workflows as new modeling components become available, chain together the components using any existing or custom workflow processing engine, and distribute them across any Internet-accessible Web Service endpoints. The workflow components can be hosted on any Internet-accessible machine. This has the advantages that the computations can be distributed to make best use of the available computing resources, and each workflow component can be hosted and maintained by their respective domain experts.
Flexible Workflow Software enables the Management of an Increased Volume and Heterogeneity of Sensors, and evolves with the Expansion of Complex Ocean Observatory Infrastructures.

NASA Astrophysics Data System (ADS)

Tomlin, M. C.; Jenkyns, R.

2015-12-01

Ocean Networks Canada (ONC) collects data from observatories in the northeast Pacific, Salish Sea, Arctic Ocean, Atlantic Ocean, and land-based sites in British Columbia. Data are streamed, collected autonomously, or transmitted via satellite from a variety of instruments. The Software Engineering group at ONC develops and maintains Oceans 2.0, an in-house software system that acquires and archives data from sensors, and makes data available to scientists, the public, government and non-government agencies. The Oceans 2.0 workflow tool was developed by ONC to manage a large volume of tasks and processes required for instrument installation, recovery and maintenance activities. Since 2013, the workflow tool has supported 70 expeditions and grown to include 30 different workflow processes for the increasing complexity of infrastructures at ONC. The workflow tool strives to keep pace with an increasing heterogeneity of sensors, connections and environments by supporting versioning of existing workflows, and allowing the creation of new processes and tasks. Despite challenges in training and gaining mutual support from multidisciplinary teams, the workflow tool has become invaluable in project management in an innovative setting. It provides a collective place to contribute to ONC's diverse projects and expeditions and encourages more repeatable processes, while promoting interactions between the multidisciplinary teams who manage various aspects of instrument development and the data they produce. The workflow tool inspires documentation of terminologies and procedures, and effectively links to other tools at ONC such as JIRA, Alfresco and Wiki. Motivated by growing sensor schemes, modes of collecting data, archiving, and data distribution at ONC, the workflow tool ensures that infrastructure is managed completely from instrument purchase to data distribution. It integrates all areas of expertise and helps fulfill ONC's mandate to offer quality data to users.
Progress in digital color workflow understanding in the International Color Consortium (ICC) Workflow WG

NASA Astrophysics Data System (ADS)

McCarthy, Ann

2006-01-01

The ICC Workflow WG serves as the bridge between ICC color management technologies and use of those technologies in real world color production applications. ICC color management is applicable to and is used in a wide range of color systems, from highly specialized digital cinema color special effects to high volume publications printing to home photography. The ICC Workflow WG works to align ICC technologies so that the color management needs of these diverse use case systems are addressed in an open, platform independent manner. This report provides a high level summary of the ICC Workflow WG objectives and work to date, focusing on the ways in which workflow can impact image quality and color systems performance. The 'ICC Workflow Primitives' and 'ICC Workflow Patterns and Dimensions' workflow models are covered in some detail. Consider the questions, "How much of dissatisfaction with color management today is the result of 'the wrong color transformation at the wrong time' and 'I can't get to the right conversion at the right point in my work process'?" Put another way, consider how image quality through a workflow can be negatively affected when the coordination and control level of the color management system is not sufficient.
Pathview Web: user friendly pathway visualization and data integration

PubMed Central

Pant, Gaurav; Bhavnasi, Yeshvant K.; Blanchard, Steven G.; Brouwer, Cory

2017-01-01

Abstract Pathway analysis is widely used in omics studies. Pathway-based data integration and visualization is a critical component of the analysis. To address this need, we recently developed a novel R package called Pathview. Pathview maps, integrates and renders a large variety of biological data onto molecular pathway graphs. Here we developed the Pathview Web server, as to make pathway visualization and data integration accessible to all scientists, including those without the special computing skills or resources. Pathview Web features an intuitive graphical web interface and a user centered design. The server not only expands the core functions of Pathview, but also provides many useful features not available in the offline R package. Importantly, the server presents a comprehensive workflow for both regular and integrated pathway analysis of multiple omics data. In addition, the server also provides a RESTful API for programmatic access and conveniently integration in third-party software or workflows. Pathview Web is openly and freely accessible at https://pathview.uncc.edu/. PMID:28482075
LHCb migration from Subversion to Git

NASA Astrophysics Data System (ADS)

Clemencic, M.; Couturier, B.; Closier, J.; Cattaneo, M.

2017-10-01

Due to user demand and to support new development workflows based on code review and multiple development streams, LHCb decided to port the source code management from Subversion to Git, using the CERN GitLab hosting service. Although tools exist for this kind of migration, LHCb specificities and development models required careful planning of the migration, development of migration tools, changes to the development model, and redefinition of the release procedures. Moreover we had to support a hybrid situation with some software projects hosted in Git and others still in Subversion, or even branches of one projects hosted in different systems. We present the way we addressed the special LHCb requirements, the technical details of migrating large non standard Subversion repositories, and how we managed to smoothly migrate the software projects following the schedule of each project manager.
CONNJUR Workflow Builder: A software integration environment for spectral reconstruction

PubMed Central

Fenwick, Matthew; Weatherby, Gerard; Vyas, Jay; Sesanker, Colbert; Martyn, Timothy O.; Ellis, Heidi J.C.; Gryk, Michael R.

2015-01-01

CONNJUR Workflow Builder (WB) is an open-source software integration environment that leverages existing spectral reconstruction tools to create a synergistic, coherent platform for converting biomolecular NMR data from the time domain to the frequency domain. WB provides data integration of primary data and metadata using a relational database, and includes a library of pre-built workflows for processing time domain data. WB simplifies maximum entropy reconstruction, facilitating the processing of non-uniformly sampled time domain data. As will be shown in the paper, the unique features of WB provide it with novel abilities to enhance the quality, accuracy, and fidelity of the spectral reconstruction process. WB also provides features which promote collaboration, education, parameterization, and non-uniform data sets along with processing integrated with the Rowland NMR Toolkit (RNMRTK) and NMRPipe software packages. WB is available free of charge in perpetuity, dual-licensed under the MIT and GPL open source licenses. PMID:26066803
CONNJUR Workflow Builder: a software integration environment for spectral reconstruction.

PubMed

Fenwick, Matthew; Weatherby, Gerard; Vyas, Jay; Sesanker, Colbert; Martyn, Timothy O; Ellis, Heidi J C; Gryk, Michael R

2015-07-01

CONNJUR Workflow Builder (WB) is an open-source software integration environment that leverages existing spectral reconstruction tools to create a synergistic, coherent platform for converting biomolecular NMR data from the time domain to the frequency domain. WB provides data integration of primary data and metadata using a relational database, and includes a library of pre-built workflows for processing time domain data. WB simplifies maximum entropy reconstruction, facilitating the processing of non-uniformly sampled time domain data. As will be shown in the paper, the unique features of WB provide it with novel abilities to enhance the quality, accuracy, and fidelity of the spectral reconstruction process. WB also provides features which promote collaboration, education, parameterization, and non-uniform data sets along with processing integrated with the Rowland NMR Toolkit (RNMRTK) and NMRPipe software packages. WB is available free of charge in perpetuity, dual-licensed under the MIT and GPL open source licenses.
VisTrails SAHM: visualization and workflow management for species habitat modeling

USGS Publications Warehouse

Morisette, Jeffrey T.; Jarnevich, Catherine S.; Holcombe, Tracy R.; Talbert, Colin B.; Ignizio, Drew A.; Talbert, Marian; Silva, Claudio; Koop, David; Swanson, Alan; Young, Nicholas E.

2013-01-01

The Software for Assisted Habitat Modeling (SAHM) has been created to both expedite habitat modeling and help maintain a record of the various input data, pre- and post-processing steps and modeling options incorporated in the construction of a species distribution model through the established workflow management and visualization VisTrails software. This paper provides an overview of the VisTrails:SAHM software including a link to the open source code, a table detailing the current SAHM modules, and a simple example modeling an invasive weed species in Rocky Mountain National Park, USA.
Analog to digital workflow improvement: a quantitative study.

PubMed

Wideman, Catherine; Gallet, Jacqueline

2006-01-01

This study tracked a radiology department's conversion from utilization of a Kodak Amber analog system to a Kodak DirectView DR 5100 digital system. Through the use of ProModel Optimization Suite, a workflow simulation software package, significant quantitative information was derived from workflow process data measured before and after the change to a digital system. Once the digital room was fully operational and the radiology staff comfortable with the new system, average patient examination time was reduced from 9.24 to 5.28 min, indicating that a higher patient throughput could be achieved. Compared to the analog system, chest examination time for modality specific activities was reduced by 43%. The percentage of repeat examinations experienced with the digital system also decreased to 8% vs. the level of 9.5% experienced with the analog system. The study indicated that it is possible to quantitatively study clinical workflow and productivity by using commercially available software.
Workflow based framework for life science informatics.

PubMed

Tiwari, Abhishek; Sekhar, Arvind K T

2007-10-01

Workflow technology is a generic mechanism to integrate diverse types of available resources (databases, servers, software applications and different services) which facilitate knowledge exchange within traditionally divergent fields such as molecular biology, clinical research, computational science, physics, chemistry and statistics. Researchers can easily incorporate and access diverse, distributed tools and data to develop their own research protocols for scientific analysis. Application of workflow technology has been reported in areas like drug discovery, genomics, large-scale gene expression analysis, proteomics, and system biology. In this article, we have discussed the existing workflow systems and the trends in applications of workflow based systems.
Common Data Models and Efficient Reproducible Workflows for Distributed Ocean Model Skill Assessment

NASA Astrophysics Data System (ADS)

Signell, R. P.; Snowden, D. P.; Howlett, E.; Fernandes, F. A.

2014-12-01

Model skill assessment requires discovery, access, analysis, and visualization of information from both sensors and models, and traditionally has been possible only by a few experts. The US Integrated Ocean Observing System (US-IOOS) consists of 17 Federal Agencies and 11 Regional Associations that produce data from various sensors and numerical models; exactly the information required for model skill assessment. US-IOOS is seeking to develop documented skill assessment workflows that are standardized, efficient, and reproducible so that a much wider community can participate in the use and assessment of model results. Standardization requires common data models for observational and model data. US-IOOS relies on the CF Conventions for observations and structured grid data, and on the UGRID Conventions for unstructured (e.g. triangular) grid data. This allows applications to obtain only the data they require in a uniform and parsimonious way using web services: OPeNDAP for model output and OGC Sensor Observation Service (SOS) for observed data. Reproducibility is enabled with IPython Notebooks shared on GitHub (http://github.com/ioos). These capture the entire skill assessment workflow, including user input, search, access, analysis, and visualization, ensuring that workflows are self-documenting and reproducible by anyone, using free software. Python packages for common data models are Pyugrid and the British Met Office Iris package. Python packages required to run the workflows (pyugrid, pyoos, and the British Met Office Iris package) are also available on GitHub and on Binstar.org so that users can run scenarios using the free Anaconda Python distribution. Hosted services such as Wakari enable anyone to reproduce these workflows for free, without installing any software locally, using just their web browser. We are also experimenting with Wakari Enterprise, which allows multi-user access from a web browser to an IPython Server running where large quantities of model output reside, increasing the efficiency. The open development and distribution of these workflows, and the software on which they depend, is an educational resource for those new to the field and a center of focus where practitioners can contribute new software and ideas.
A software tool to analyze clinical workflows from direct observations.

PubMed

Schweitzer, Marco; Lasierra, Nelia; Hoerbst, Alexander

2015-01-01

Observational data of clinical processes need to be managed in a convenient way, so that process information is reliable, valid and viable for further analysis. However, existing tools for allocating observations fail in systematic data collection of specific workflow recordings. We present a software tool which was developed to facilitate the analysis of clinical process observations. The tool was successfully used in the project OntoHealth, to build, store and analyze observations of diabetes routine consultations.

[Guided and computer-assisted implant surgery and prosthetic: The continuous digital workflow].

PubMed

Pascual, D; Vaysse, J

2016-02-01

New continuous digital workflow protocols of guided and computer-assisted implant surgery improve accuracy of implant positioning. The design of the future prosthesis is based on the available prosthetic space, gingival height and occlusal relationship with the opposing and adjacent teeth. The implant position and length depend on volume, density and bone quality, gingival height, tooth-implant and implant-implant distances, implant parallelism, axis and type of the future prosthesis. The crown modeled on the software will therefore serve as a guide to the future implant axis and not the reverse. The guide is made by 3D printing. The software determines surgical protocol with the drilling sequences. The unitary or plural prosthesis, modeled on the software and built before surgery, is loaded directly after implant placing, if needed. These protocols allow for a full continuity of the digital workflow. The software provides the surgeon and the dental technician a total freedom for the prosthetic-surgery guide design and the position of the implants. The prosthetic project, occlusal and aesthetic, taking the bony and surgical constraints into account, is optimized. The implant surgery is simplified and becomes less "stressful" for the patient and the surgeon. Guided and computer-assisted surgery with continuous digital workflow is becoming the technique of choice to improve the accuracy and quality of implant rehabilitation. Copyright © 2015 Elsevier Masson SAS. All rights reserved.
LFQProfiler and RNP(xl): Open-Source Tools for Label-Free Quantification and Protein-RNA Cross-Linking Integrated into Proteome Discoverer.

PubMed

Veit, Johannes; Sachsenberg, Timo; Chernev, Aleksandar; Aicheler, Fabian; Urlaub, Henning; Kohlbacher, Oliver

2016-09-02

Modern mass spectrometry setups used in today's proteomics studies generate vast amounts of raw data, calling for highly efficient data processing and analysis tools. Software for analyzing these data is either monolithic (easy to use, but sometimes too rigid) or workflow-driven (easy to customize, but sometimes complex). Thermo Proteome Discoverer (PD) is a powerful software for workflow-driven data analysis in proteomics which, in our eyes, achieves a good trade-off between flexibility and usability. Here, we present two open-source plugins for PD providing additional functionality: LFQProfiler for label-free quantification of peptides and proteins, and RNP(xl) for UV-induced peptide-RNA cross-linking data analysis. LFQProfiler interacts with existing PD nodes for peptide identification and validation and takes care of the entire quantitative part of the workflow. We show that it performs at least on par with other state-of-the-art software solutions for label-free quantification in a recently published benchmark ( Ramus, C.; J. Proteomics 2016 , 132 , 51 - 62 ). The second workflow, RNP(xl), represents the first software solution to date for identification of peptide-RNA cross-links including automatic localization of the cross-links at amino acid resolution and localization scoring. It comes with a customized integrated cross-link fragment spectrum viewer for convenient manual inspection and validation of the results.
Randomized controlled within-subject evaluation of digital and conventional workflows for the fabrication of lithium disilicate single crowns. Part II: CAD-CAM versus conventional laboratory procedures.

PubMed

Sailer, Irena; Benic, Goran I; Fehmer, Vincent; Hämmerle, Christoph H F; Mühlemann, Sven

2017-07-01

Clinical studies are needed to evaluate the entire digital and conventional workflows in prosthetic dentistry. The purpose of the second part of this clinical study was to compare the laboratory production time for tooth-supported single crowns made with 4 different digital workflows and 1 conventional workflow and to compare these crowns clinically. For each of 10 participants, a monolithic crown was fabricated in lithium disilicate-reinforced glass ceramic (IPS e.max CAD). The computer-aided design and computer-aided manufacturing (CAD-CAM) systems were Lava C.O.S. CAD software and centralized CAM (group L), Cares CAD software and centralized CAM (group iT), Cerec Connect CAD software and lab side CAM (group CiL), and Cerec Connect CAD software with centralized CAM (group CiD). The conventional fabrication (group K) included a wax pattern of the crown and heat pressing according to the lost-wax technique (IPS e.max Press). The time for the fabrication of the casts and the crowns was recorded. Subsequently, the crowns were clinically evaluated and the corresponding treatment times were recorded. The Paired Wilcoxon test with the Bonferroni correction was applied to detect differences among treatment groups (α=.05). The total mean (±standard deviation) active working time for the dental technician was 88 ±6 minutes in group L, 74 ±12 minutes in group iT, 74 ±5 minutes in group CiL, 92 ±8 minutes in group CiD, and 148 ±11 minutes in group K. The dental technician spent significantly more working time for the conventional workflow than for the digital workflows (P<.001). No statistically significant differences were found between group L and group CiD or between group iT and group CiL. No statistical differences in time for the clinical evaluation were found among groups, indicating similar outcomes (P>.05). Irrespective of the CAD-CAM system, the overall laboratory working time for a digital workflow was significantly shorter than for the conventional workflow, since the dental technician needed less active working time. Copyright © 2016 Editorial Council for the Journal of Prosthetic Dentistry. Published by Elsevier Inc. All rights reserved.
Extending software repository hosting to code review and testing

NASA Astrophysics Data System (ADS)

Gonzalez Alvarez, A.; Aparicio Cotarelo, B.; Lossent, A.; Andersen, T.; Trzcinska, A.; Asbury, D.; Hłimyr, N.; Meinhard, H.

2015-12-01

We will describe how CERN's services around Issue Tracking and Version Control have evolved, and what the plans for the future are. We will describe the services main design, integration and structure, giving special attention to the new requirements from the community of users in terms of collaboration and integration tools and how we address this challenge when defining new services based on GitLab for collaboration to replace our current Gitolite service and Code Review and Jenkins for Continuous Integration. These new services complement the existing ones to create a new global "development tool stack" where each working group can place its particular development work-flow.
Disseminating Metaproteomic Informatics Capabilities and Knowledge Using the Galaxy-P Framework

PubMed Central

Easterly, Caleb; Gruening, Bjoern; Johnson, James; Kolmeder, Carolin A.; Kumar, Praveen; May, Damon; Mehta, Subina; Mesuere, Bart; Brown, Zachary; Elias, Joshua E.; Hervey, W. Judson; McGowan, Thomas; Muth, Thilo; Rudney, Joel; Griffin, Timothy J.

2018-01-01

The impact of microbial communities, also known as the microbiome, on human health and the environment is receiving increased attention. Studying translated gene products (proteins) and comparing metaproteomic profiles may elucidate how microbiomes respond to specific environmental stimuli, and interact with host organisms. Characterizing proteins expressed by a complex microbiome and interpreting their functional signature requires sophisticated informatics tools and workflows tailored to metaproteomics. Additionally, there is a need to disseminate these informatics resources to researchers undertaking metaproteomic studies, who could use them to make new and important discoveries in microbiome research. The Galaxy for proteomics platform (Galaxy-P) offers an open source, web-based bioinformatics platform for disseminating metaproteomics software and workflows. Within this platform, we have developed easily-accessible and documented metaproteomic software tools and workflows aimed at training researchers in their operation and disseminating the tools for more widespread use. The modular workflows encompass the core requirements of metaproteomic informatics: (a) database generation; (b) peptide spectral matching; (c) taxonomic analysis and (d) functional analysis. Much of the software available via the Galaxy-P platform was selected, packaged and deployed through an online metaproteomics “Contribution Fest“ undertaken by a unique consortium of expert software developers and users from the metaproteomics research community, who have co-authored this manuscript. These resources are documented on GitHub and freely available through the Galaxy Toolshed, as well as a publicly accessible metaproteomics gateway Galaxy instance. These documented workflows are well suited for the training of novice metaproteomics researchers, through online resources such as the Galaxy Training Network, as well as hands-on training workshops. Here, we describe the metaproteomics tools available within these Galaxy-based resources, as well as the process by which they were selected and implemented in our community-based work. We hope this description will increase access to and utilization of metaproteomics tools, as well as offer a framework for continued community-based development and dissemination of cutting edge metaproteomics software. PMID:29385081
Disseminating Metaproteomic Informatics Capabilities and Knowledge Using the Galaxy-P Framework.

PubMed

Blank, Clemens; Easterly, Caleb; Gruening, Bjoern; Johnson, James; Kolmeder, Carolin A; Kumar, Praveen; May, Damon; Mehta, Subina; Mesuere, Bart; Brown, Zachary; Elias, Joshua E; Hervey, W Judson; McGowan, Thomas; Muth, Thilo; Nunn, Brook; Rudney, Joel; Tanca, Alessandro; Griffin, Timothy J; Jagtap, Pratik D

2018-01-31

The impact of microbial communities, also known as the microbiome, on human health and the environment is receiving increased attention. Studying translated gene products (proteins) and comparing metaproteomic profiles may elucidate how microbiomes respond to specific environmental stimuli, and interact with host organisms. Characterizing proteins expressed by a complex microbiome and interpreting their functional signature requires sophisticated informatics tools and workflows tailored to metaproteomics. Additionally, there is a need to disseminate these informatics resources to researchers undertaking metaproteomic studies, who could use them to make new and important discoveries in microbiome research. The Galaxy for proteomics platform (Galaxy-P) offers an open source, web-based bioinformatics platform for disseminating metaproteomics software and workflows. Within this platform, we have developed easily-accessible and documented metaproteomic software tools and workflows aimed at training researchers in their operation and disseminating the tools for more widespread use. The modular workflows encompass the core requirements of metaproteomic informatics: (a) database generation; (b) peptide spectral matching; (c) taxonomic analysis and (d) functional analysis. Much of the software available via the Galaxy-P platform was selected, packaged and deployed through an online metaproteomics "Contribution Fest" undertaken by a unique consortium of expert software developers and users from the metaproteomics research community, who have co-authored this manuscript. These resources are documented on GitHub and freely available through the Galaxy Toolshed, as well as a publicly accessible metaproteomics gateway Galaxy instance. These documented workflows are well suited for the training of novice metaproteomics researchers, through online resources such as the Galaxy Training Network, as well as hands-on training workshops. Here, we describe the metaproteomics tools available within these Galaxy-based resources, as well as the process by which they were selected and implemented in our community-based work. We hope this description will increase access to and utilization of metaproteomics tools, as well as offer a framework for continued community-based development and dissemination of cutting edge metaproteomics software.
Software Project Management and Measurement on the World-Wide-Web (WWW)

NASA Technical Reports Server (NTRS)

Callahan, John; Ramakrishnan, Sudhaka

1996-01-01

We briefly describe a system for forms-based, work-flow management that helps members of a software development team overcome geographical barriers to collaboration. Our system, called the Web Integrated Software Environment (WISE), is implemented as a World-Wide-Web service that allows for management and measurement of software development projects based on dynamic analysis of change activity in the workflow. WISE tracks issues in a software development process, provides informal communication between the users with different roles, supports to-do lists, and helps in software process improvement. WISE minimizes the time devoted to metrics collection and analysis by providing implicit delivery of messages between users based on the content of project documents. The use of a database in WISE is hidden from the users who view WISE as maintaining a personal 'to-do list' of tasks related to the many projects on which they may play different roles.
Software for rapid prototyping in the pharmaceutical and biotechnology industries.

PubMed

Kappler, Michael A

2008-05-01

The automation of drug discovery methods continues to develop, especially techniques that process information, represent workflow and facilitate decision-making. The magnitude of data and the plethora of questions in pharmaceutical and biotechnology research give rise to the need for rapid prototyping software. This review describes the advantages and disadvantages of three solutions: Competitive Workflow, Taverna and Pipeline Pilot. Each of these systems processes large amounts of data, integrates diverse systems and assists novice programmers and human experts in critical decision-making steps.
BPELPower—A BPEL execution engine for geospatial web services

NASA Astrophysics Data System (ADS)

Yu, Genong (Eugene); Zhao, Peisheng; Di, Liping; Chen, Aijun; Deng, Meixia; Bai, Yuqi

2012-10-01

The Business Process Execution Language (BPEL) has become a popular choice for orchestrating and executing workflows in the Web environment. As one special kind of scientific workflow, geospatial Web processing workflows are data-intensive, deal with complex structures in data and geographic features, and execute automatically with limited human intervention. To enable the proper execution and coordination of geospatial workflows, a specially enhanced BPEL execution engine is required. BPELPower was designed, developed, and implemented as a generic BPEL execution engine with enhancements for executing geospatial workflows. The enhancements are especially in its capabilities in handling Geography Markup Language (GML) and standard geospatial Web services, such as the Web Processing Service (WPS) and the Web Feature Service (WFS). BPELPower has been used in several demonstrations over the decade. Two scenarios were discussed in detail to demonstrate the capabilities of BPELPower. That study showed a standard-compliant, Web-based approach for properly supporting geospatial processing, with the only enhancement at the implementation level. Pattern-based evaluation and performance improvement of the engine are discussed: BPELPower directly supports 22 workflow control patterns and 17 workflow data patterns. In the future, the engine will be enhanced with high performance parallel processing and broad Web paradigms.
What Not To Do: Anti-patterns for Developing Scientific Workflow Software Components

NASA Astrophysics Data System (ADS)

Futrelle, J.; Maffei, A. R.; Sosik, H. M.; Gallager, S. M.; York, A.

2013-12-01

Scientific workflows promise to enable efficient scaling-up of researcher code to handle large datasets and workloads, as well as documentation of scientific processing via standardized provenance records, etc. Workflow systems and related frameworks for coordinating the execution of otherwise separate components are limited, however, in their ability to overcome software engineering design problems commonly encountered in pre-existing components, such as scripts developed externally by scientists in their laboratories. In practice, this often means that components must be rewritten or replaced in a time-consuming, expensive process. In the course of an extensive workflow development project involving large-scale oceanographic image processing, we have begun to identify and codify 'anti-patterns'--problematic design characteristics of software--that make components fit poorly into complex automated workflows. We have gone on to develop and document low-effort solutions and best practices that efficiently address the anti-patterns we have identified. The issues, solutions, and best practices can be used to evaluate and improve existing code, as well as guiding the development of new components. For example, we have identified a common anti-pattern we call 'batch-itis' in which a script fails and then cannot perform more work, even if that work is not precluded by the failure. The solution we have identified--removing unnecessary looping over independent units of work--is often easier to code than the anti-pattern, as it eliminates the need for complex control flow logic in the component. Other anti-patterns we have identified are similarly easy to identify and often easy to fix. We have drawn upon experience working with three science teams at Woods Hole Oceanographic Institution, each of which has designed novel imaging instruments and associated image analysis code. By developing use cases and prototypes within these teams, we have undertaken formal evaluations of software components developed by programmers with widely varying levels of expertise, and have been able to discover and characterize a number of anti-patterns. Our evaluation methodology and testbed have also enabled us to assess the efficacy of strategies to address these anti-patterns according to scientifically relevant metrics, such as ability of algorithms to perform faster than the rate of data acquisition and the accuracy of workflow component output relative to ground truth. The set of anti-patterns and solutions we have identified augments of the body of more well-known software engineering anti-patterns by addressing additional concerns that obtain when a software component has to function as part of a workflow assembled out of independently-developed codebases. Our experience shows that identifying and resolving these anti-patterns reduces development time and improves performance without reducing component reusability.
SISYPHUS: A high performance seismic inversion factory

NASA Astrophysics Data System (ADS)

Gokhberg, Alexey; Simutė, Saulė; Boehm, Christian; Fichtner, Andreas

2016-04-01

In the recent years the massively parallel high performance computers became the standard instruments for solving the forward and inverse problems in seismology. The respective software packages dedicated to forward and inverse waveform modelling specially designed for such computers (SPECFEM3D, SES3D) became mature and widely available. These packages achieve significant computational performance and provide researchers with an opportunity to solve problems of bigger size at higher resolution within a shorter time. However, a typical seismic inversion process contains various activities that are beyond the common solver functionality. They include management of information on seismic events and stations, 3D models, observed and synthetic seismograms, pre-processing of the observed signals, computation of misfits and adjoint sources, minimization of misfits, and process workflow management. These activities are time consuming, seldom sufficiently automated, and therefore represent a bottleneck that can substantially offset performance benefits provided by even the most powerful modern supercomputers. Furthermore, a typical system architecture of modern supercomputing platforms is oriented towards the maximum computational performance and provides limited standard facilities for automation of the supporting activities. We present a prototype solution that automates all aspects of the seismic inversion process and is tuned for the modern massively parallel high performance computing systems. We address several major aspects of the solution architecture, which include (1) design of an inversion state database for tracing all relevant aspects of the entire solution process, (2) design of an extensible workflow management framework, (3) integration with wave propagation solvers, (4) integration with optimization packages, (5) computation of misfits and adjoint sources, and (6) process monitoring. The inversion state database represents a hierarchical structure with branches for the static process setup, inversion iterations, and solver runs, each branch specifying information at the event, station and channel levels. The workflow management framework is based on an embedded scripting engine that allows definition of various workflow scenarios using a high-level scripting language and provides access to all available inversion components represented as standard library functions. At present the SES3D wave propagation solver is integrated in the solution; the work is in progress for interfacing with SPECFEM3D. A separate framework is designed for interoperability with an optimization module; the workflow manager and optimization process run in parallel and cooperate by exchanging messages according to a specially designed protocol. A library of high-performance modules implementing signal pre-processing, misfit and adjoint computations according to established good practices is included. Monitoring is based on information stored in the inversion state database and at present implements a command line interface; design of a graphical user interface is in progress. The software design fits well into the common massively parallel system architecture featuring a large number of computational nodes running distributed applications under control of batch-oriented resource managers. The solution prototype has been implemented on the "Piz Daint" supercomputer provided by the Swiss Supercomputing Centre (CSCS).
DOE Office of Scientific and Technical Information (OSTI.GOV)

Sheu, R; Ghafar, R; Powers, A

Purpose: Demonstrate the effectiveness of in-house software in ensuring EMR workflow efficiency and safety. Methods: A web-based dashboard system (WBDS) was developed to monitor clinical workflow in real time using web technology (WAMP) through ODBC (Open Database Connectivity). Within Mosaiq (Elekta Inc), operational workflow is driven and indicated by Quality Check Lists (QCLs), which is triggered by automation software IQ Scripts (Elekta Inc); QCLs rely on user completion to propagate. The WBDS retrieves data directly from the Mosaig SQL database and tracks clinical events in real time. For example, the necessity of a physics initial chart check can be determinedmore » by screening all patients on treatment who have received their first fraction and who have not yet had their first chart check. Monitoring similar “real” events with our in-house software creates a safety net as its propagation does not rely on individual users input. Results: The WBDS monitors the following: patient care workflow (initial consult to end of treatment), daily treatment consistency (scheduling, technique, charges), physics chart checks (initial, EOT, weekly), new starts, missing treatments (>3 warning/>5 fractions, action required), and machine overrides. The WBDS can be launched from any web browser which allows the end user complete transparency and timely information. Since the creation of the dashboards, workflow interruptions due to accidental deletion or completion of QCLs were eliminated. Additionally, all physics chart checks were completed timely. Prompt notifications of treatment record inconsistency and machine overrides have decreased the amount of time between occurrence and execution of corrective action. Conclusion: Our clinical workflow relies primarily on QCLs and IQ Scripts; however, this functionality is not the panacea of safety and efficiency. The WBDS creates a more thorough system of checks to provide a safer and near error-less working environment.« less
Clinical, information and business process modeling to promote development of safe and flexible software.

PubMed

Liaw, Siaw-Teng; Deveny, Elizabeth; Morrison, Iain; Lewis, Bryn

2006-09-01

Using a factorial vignette survey and modeling methodology, we developed clinical and information models - incorporating evidence base, key concepts, relevant terms, decision-making and workflow needed to practice safely and effectively - to guide the development of an integrated rule-based knowledge module to support prescribing decisions in asthma. We identified workflows, decision-making factors, factor use, and clinician information requirements. The Unified Modeling Language (UML) and public domain software and knowledge engineering tools (e.g. Protégé) were used, with the Australian GP Data Model as the starting point for expressing information needs. A Web Services service-oriented architecture approach was adopted within which to express functional needs, and clinical processes and workflows were expressed in the Business Process Execution Language (BPEL). This formal analysis and modeling methodology to define and capture the process and logic of prescribing best practice in a reference implementation is fundamental to tackling deficiencies in prescribing decision support software.
SimpleITK Image-Analysis Notebooks: a Collaborative Environment for Education and Reproducible Research.

PubMed

Yaniv, Ziv; Lowekamp, Bradley C; Johnson, Hans J; Beare, Richard

2018-06-01

Modern scientific endeavors increasingly require team collaborations to construct and interpret complex computational workflows. This work describes an image-analysis environment that supports the use of computational tools that facilitate reproducible research and support scientists with varying levels of software development skills. The Jupyter notebook web application is the basis of an environment that enables flexible, well-documented, and reproducible workflows via literate programming. Image-analysis software development is made accessible to scientists with varying levels of programming experience via the use of the SimpleITK toolkit, a simplified interface to the Insight Segmentation and Registration Toolkit. Additional features of the development environment include user friendly data sharing using online data repositories and a testing framework that facilitates code maintenance. SimpleITK provides a large number of examples illustrating educational and research-oriented image analysis workflows for free download from GitHub under an Apache 2.0 license: github.com/InsightSoftwareConsortium/SimpleITK-Notebooks .
Interactive reconstructions of cranial 3D implants under MeVisLab as an alternative to commercial planning software.

PubMed

Egger, Jan; Gall, Markus; Tax, Alois; Ücal, Muammer; Zefferer, Ulrike; Li, Xing; von Campe, Gord; Schäfer, Ute; Schmalstieg, Dieter; Chen, Xiaojun

2017-01-01

In this publication, the interactive planning and reconstruction of cranial 3D Implants under the medical prototyping platform MeVisLab as alternative to commercial planning software is introduced. In doing so, a MeVisLab prototype consisting of a customized data-flow network and an own C++ module was set up. As a result, the Computer-Aided Design (CAD) software prototype guides a user through the whole workflow to generate an implant. Therefore, the workflow begins with loading and mirroring the patients head for an initial curvature of the implant. Then, the user can perform an additional Laplacian smoothing, followed by a Delaunay triangulation. The result is an aesthetic looking and well-fitting 3D implant, which can be stored in a CAD file format, e.g. STereoLithography (STL), for 3D printing. The 3D printed implant can finally be used for an in-depth pre-surgical evaluation or even as a real implant for the patient. In a nutshell, our research and development shows that a customized MeVisLab software prototype can be used as an alternative to complex commercial planning software, which may also not be available in every clinic. Finally, not to conform ourselves directly to available commercial software and look for other options that might improve the workflow.
Interactive reconstructions of cranial 3D implants under MeVisLab as an alternative to commercial planning software

PubMed Central

Egger, Jan; Gall, Markus; Tax, Alois; Ücal, Muammer; Zefferer, Ulrike; Li, Xing; von Campe, Gord; Schäfer, Ute; Schmalstieg, Dieter; Chen, Xiaojun

2017-01-01

In this publication, the interactive planning and reconstruction of cranial 3D Implants under the medical prototyping platform MeVisLab as alternative to commercial planning software is introduced. In doing so, a MeVisLab prototype consisting of a customized data-flow network and an own C++ module was set up. As a result, the Computer-Aided Design (CAD) software prototype guides a user through the whole workflow to generate an implant. Therefore, the workflow begins with loading and mirroring the patients head for an initial curvature of the implant. Then, the user can perform an additional Laplacian smoothing, followed by a Delaunay triangulation. The result is an aesthetic looking and well-fitting 3D implant, which can be stored in a CAD file format, e.g. STereoLithography (STL), for 3D printing. The 3D printed implant can finally be used for an in-depth pre-surgical evaluation or even as a real implant for the patient. In a nutshell, our research and development shows that a customized MeVisLab software prototype can be used as an alternative to complex commercial planning software, which may also not be available in every clinic. Finally, not to conform ourselves directly to available commercial software and look for other options that might improve the workflow. PMID:28264062
SU-E-T-419: Workflow and FMEA in a New Proton Therapy (PT) Facility

DOE Office of Scientific and Technical Information (OSTI.GOV)

Cheng, C; Wessels, B; Hamilton, H

2014-06-01

Purpose: Workflow is an important component in the operational planning of a new proton facility. By integrating the concept of failure mode and effect analysis (FMEA) and traditional QA requirements, a workflow for a proton therapy treatment course is set up. This workflow serves as the blue print for the planning of computer hardware/software requirements and network flow. A slight modification of the workflow generates a process map(PM) for FMEA and the planning of QA program in PT. Methods: A flowchart is first developed outlining the sequence of processes involved in a PT treatment course. Each process consists of amore » number of sub-processes to encompass a broad scope of treatment and QA procedures. For each subprocess, the personnel involved, the equipment needed and the computer hardware/software as well as network requirements are defined by a team of clinical staff, administrators and IT personnel. Results: Eleven intermediate processes with a total of 70 sub-processes involved in a PT treatment course are identified. The number of sub-processes varies, ranging from 2-12. The sub-processes within each process are used for the operational planning. For example, in the CT-Sim process, there are 12 sub-processes: three involve data entry/retrieval from a record-and-verify system, two controlled by the CT computer, two require department/hospital network, and the other five are setup procedures. IT then decides the number of computers needed and the software and network requirement. By removing the traditional QA procedures from the workflow, a PM is generated for FMEA analysis to design a QA program for PT. Conclusion: Significant efforts are involved in the development of the workflow in a PT treatment course. Our hybrid model of combining FMEA and traditional QA program serves a duo purpose of efficient operational planning and designing of a QA program in PT.« less
Modernizing Earth and Space Science Modeling Workflows in the Big Data Era

NASA Astrophysics Data System (ADS)

Kinter, J. L.; Feigelson, E.; Walker, R. J.; Tino, C.

2017-12-01

Modeling is a major aspect of the Earth and space science research. The development of numerical models of the Earth system, planetary systems or astrophysical systems is essential to linking theory with observations. Optimal use of observations that are quite expensive to obtain and maintain typically requires data assimilation that involves numerical models. In the Earth sciences, models of the physical climate system are typically used for data assimilation, climate projection, and inter-disciplinary research, spanning applications from analysis of multi-sensor data sets to decision-making in climate-sensitive sectors with applications to ecosystems, hazards, and various biogeochemical processes. In space physics, most models are from first principles, require considerable expertise to run and are frequently modified significantly for each case study. The volume and variety of model output data from modeling Earth and space systems are rapidly increasing and have reached a scale where human interaction with data is prohibitively inefficient. A major barrier to progress is that modeling workflows isn't deemed by practitioners to be a design problem. Existing workflows have been created by a slow accretion of software, typically based on undocumented, inflexible scripts haphazardly modified by a succession of scientists and students not trained in modern software engineering methods. As a result, existing modeling workflows suffer from an inability to onboard new datasets into models; an inability to keep pace with accelerating data production rates; and irreproducibility, among other problems. These factors are creating an untenable situation for those conducting and supporting Earth system and space science. Improving modeling workflows requires investments in hardware, software and human resources. This paper describes the critical path issues that must be targeted to accelerate modeling workflows, including script modularization, parallelization, and automation in the near term, and longer term investments in virtualized environments for improved scalability, tolerance for lossy data compression, novel data-centric memory and storage technologies, and tools for peer reviewing, preserving and sharing workflows, as well as fundamental statistical and machine learning algorithms.
Temporary Shell Proof-of-Concept Technique: Digital-Assisted Workflow to Enable Customized Immediate Function in Two Visits in Partially Edentulous Patients

PubMed

Pozzi, Alessandro; Arcuri, Lorenzo; Moy, Peter K

2018-03-01

The growing interest in minimally invasive implant placement and delivery of a prefabricated provisional prosthesis immediately, thus minimizing "time to teeth," has led to the development of numerous 3-dimensional (3D) planning software programs. Given the enhancements associated with fully digital workflows, such as better 3D soft-tissue visualization and virtual tooth rendering, computer-guided implant surgery and immediate function has become an effective and reliable procedure. This article describes how modern implant planning software programs provide a comprehensive digital platform that enables efficient interplay between the surgical and restorative aspects of implant treatment. These new technologies that streamline the overall digital workflow allow transformation of the digital wax-up into a personalized, CAD/CAM-milled provisional restoration. Thus, collaborative digital workflows provide a novel approach for time-efficient delivery of a customized, screw-retained provisional restoration on the day of implant surgery, resulting in improved predictability for immediate function in the partially edentate patient.
The MPO system for automatic workflow documentation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Abla, G.; Coviello, E. N.; Flanagan, S. M.

Data from large-scale experiments and extreme-scale computing is expensive to produce and may be used for critical applications. However, it is not the mere existence of data that is important, but our ability to make use of it. Experience has shown that when metadata is better organized and more complete, the underlying data becomes more useful. Traditionally, capturing the steps of scientific workflows and metadata was the role of the lab notebook, but the digital era has resulted instead in the fragmentation of data, processing, and annotation. Here, this article presents the Metadata, Provenance, and Ontology (MPO) System, the softwaremore » that can automate the documentation of scientific workflows and associated information. Based on recorded metadata, it provides explicit information about the relationships among the elements of workflows in notebook form augmented with directed acyclic graphs. A set of web-based graphical navigation tools and Application Programming Interface (API) have been created for searching and browsing, as well as programmatically accessing the workflows and data. We describe the MPO concepts and its software architecture. We also report the current status of the software as well as the initial deployment experience.« less

The MPO system for automatic workflow documentation

DOE PAGES

Abla, G.; Coviello, E. N.; Flanagan, S. M.; ...

2016-04-18

Data from large-scale experiments and extreme-scale computing is expensive to produce and may be used for critical applications. However, it is not the mere existence of data that is important, but our ability to make use of it. Experience has shown that when metadata is better organized and more complete, the underlying data becomes more useful. Traditionally, capturing the steps of scientific workflows and metadata was the role of the lab notebook, but the digital era has resulted instead in the fragmentation of data, processing, and annotation. Here, this article presents the Metadata, Provenance, and Ontology (MPO) System, the softwaremore » that can automate the documentation of scientific workflows and associated information. Based on recorded metadata, it provides explicit information about the relationships among the elements of workflows in notebook form augmented with directed acyclic graphs. A set of web-based graphical navigation tools and Application Programming Interface (API) have been created for searching and browsing, as well as programmatically accessing the workflows and data. We describe the MPO concepts and its software architecture. We also report the current status of the software as well as the initial deployment experience.« less
A Computational Workflow for the Automated Generation of Models of Genetic Designs.

PubMed

Misirli, Göksel; Nguyen, Tramy; McLaughlin, James Alastair; Vaidyanathan, Prashant; Jones, Timothy S; Densmore, Douglas; Myers, Chris; Wipat, Anil

2018-06-05

Computational models are essential to engineer predictable biological systems and to scale up this process for complex systems. Computational modeling often requires expert knowledge and data to build models. Clearly, manual creation of models is not scalable for large designs. Despite several automated model construction approaches, computational methodologies to bridge knowledge in design repositories and the process of creating computational models have still not been established. This paper describes a workflow for automatic generation of computational models of genetic circuits from data stored in design repositories using existing standards. This workflow leverages the software tool SBOLDesigner to build structural models that are then enriched by the Virtual Parts Repository API using Systems Biology Open Language (SBOL) data fetched from the SynBioHub design repository. The iBioSim software tool is then utilized to convert this SBOL description into a computational model encoded using the Systems Biology Markup Language (SBML). Finally, this SBML model can be simulated using a variety of methods. This workflow provides synthetic biologists with easy to use tools to create predictable biological systems, hiding away the complexity of building computational models. This approach can further be incorporated into other computational workflows for design automation.
The roles of the AAS Journals' Data Editors

NASA Astrophysics Data System (ADS)

Muench, August; NASA/SAO ADS, CERN/Zenodo.org, Harvard/CfA Wolbach Library

2018-01-01

I will summarize the community services provided by the AAS Journals' Data Editors to support authors’ when citing and preserving the software and data used in the published literature. In addition I will describe the life of a piece of code as it passes through the current workflows for software citation in astronomy. Using this “lifecycle” I will detail the ongoing work funded by a grant from the Alfred P. Sloan Foundation to the American Astronomical Society to improve the citation of software in the literature. The funded development team and advisory boards, made up of non-profit publishers, literature indexers, and preservation archives, is implementing the Force11 Software citation principles for astronomy Journals. The outcome of this work will be new workflows for authors and developers that fit in their current practices while enabling versioned citation of software and granular credit for its creators.
Pathview Web: user friendly pathway visualization and data integration.

PubMed

Luo, Weijun; Pant, Gaurav; Bhavnasi, Yeshvant K; Blanchard, Steven G; Brouwer, Cory

2017-07-03

Pathway analysis is widely used in omics studies. Pathway-based data integration and visualization is a critical component of the analysis. To address this need, we recently developed a novel R package called Pathview. Pathview maps, integrates and renders a large variety of biological data onto molecular pathway graphs. Here we developed the Pathview Web server, as to make pathway visualization and data integration accessible to all scientists, including those without the special computing skills or resources. Pathview Web features an intuitive graphical web interface and a user centered design. The server not only expands the core functions of Pathview, but also provides many useful features not available in the offline R package. Importantly, the server presents a comprehensive workflow for both regular and integrated pathway analysis of multiple omics data. In addition, the server also provides a RESTful API for programmatic access and conveniently integration in third-party software or workflows. Pathview Web is openly and freely accessible at https://pathview.uncc.edu/. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Implementation of workflow engine technology to deliver basic clinical decision support functionality.

PubMed

Huser, Vojtech; Rasmussen, Luke V; Oberg, Ryan; Starren, Justin B

2011-04-10

Workflow engine technology represents a new class of software with the ability to graphically model step-based knowledge. We present application of this novel technology to the domain of clinical decision support. Successful implementation of decision support within an electronic health record (EHR) remains an unsolved research challenge. Previous research efforts were mostly based on healthcare-specific representation standards and execution engines and did not reach wide adoption. We focus on two challenges in decision support systems: the ability to test decision logic on retrospective data prior prospective deployment and the challenge of user-friendly representation of clinical logic. We present our implementation of a workflow engine technology that addresses the two above-described challenges in delivering clinical decision support. Our system is based on a cross-industry standard of XML (extensible markup language) process definition language (XPDL). The core components of the system are a workflow editor for modeling clinical scenarios and a workflow engine for execution of those scenarios. We demonstrate, with an open-source and publicly available workflow suite, that clinical decision support logic can be executed on retrospective data. The same flowchart-based representation can also function in a prospective mode where the system can be integrated with an EHR system and respond to real-time clinical events. We limit the scope of our implementation to decision support content generation (which can be EHR system vendor independent). We do not focus on supporting complex decision support content delivery mechanisms due to lack of standardization of EHR systems in this area. We present results of our evaluation of the flowchart-based graphical notation as well as architectural evaluation of our implementation using an established evaluation framework for clinical decision support architecture. We describe an implementation of a free workflow technology software suite (available at http://code.google.com/p/healthflow) and its application in the domain of clinical decision support. Our implementation seamlessly supports clinical logic testing on retrospective data and offers a user-friendly knowledge representation paradigm. With the presented software implementation, we demonstrate that workflow engine technology can provide a decision support platform which evaluates well against an established clinical decision support architecture evaluation framework. Due to cross-industry usage of workflow engine technology, we can expect significant future functionality enhancements that will further improve the technology's capacity to serve as a clinical decision support platform.
Big Data Challenges in Global Seismic 'Adjoint Tomography' (Invited)

NASA Astrophysics Data System (ADS)

Tromp, J.; Bozdag, E.; Krischer, L.; Lefebvre, M.; Lei, W.; Smith, J.

2013-12-01

The challenge of imaging Earth's interior on a global scale is closely linked to the challenge of handling large data sets. The related iterative workflow involves five distinct phases, namely, 1) data gathering and culling, 2) synthetic seismogram calculations, 3) pre-processing (time-series analysis and time-window selection), 4) data assimilation and adjoint calculations, 5) post-processing (pre-conditioning, regularization, model update). In order to implement this workflow on modern high-performance computing systems, a new seismic data format is being developed. The Adaptable Seismic Data Format (ASDF) is designed to replace currently used data formats with a more flexible format that allows for fast parallel I/O. The metadata is divided into abstract categories, such as "source" and "receiver", along with provenance information for complete reproducibility. The structure of ASDF is designed keeping in mind three distinct applications: earthquake seismology, seismic interferometry, and exploration seismology. Existing time-series analysis tool kits, such as SAC and ObsPy, can be easily interfaced with ASDF so that seismologists can use robust, previously developed software packages. ASDF accommodates an automated, efficient workflow for global adjoint tomography. Manually managing the large number of simulations associated with the workflow can rapidly become a burden, especially with increasing numbers of earthquakes and stations. Therefore, it is of importance to investigate the possibility of automating the entire workflow. Scientific Workflow Management Software (SWfMS) allows users to execute workflows almost routinely. SWfMS provides additional advantages. In particular, it is possible to group independent simulations in a single job to fit the available computational resources. They also give a basic level of fault resilience as the workflow can be resumed at the correct state preceding a failure. Some of the best candidates for our particular workflow are Kepler and Swift, and the latter appears to be the most serious candidate for a large-scale workflow on a single supercomputer, remaining sufficiently simple to accommodate further modifications and improvements.
Overview of geostationary ocean color imager (GOCI) and GOCI data processing system (GDPS)

NASA Astrophysics Data System (ADS)

Ryu, Joo-Hyung; Han, Hee-Jeong; Cho, Seongick; Park, Young-Je; Ahn, Yu-Hwan

2012-09-01

GOCI, the world's first geostationary ocean color satellite, provides images with a spatial resolution of 500 m at hourly intervals up to 8 times a day, allowing observations of short-term changes in the Northeast Asian region. The GOCI Data Processing System (GDPS), a specialized data processing software for GOCI, was developed for real-time generation of various products. This paper describes GOCI characteristics and GDPS workflow/products, so as to enable the efficient utilization of GOCI. To provide quality images and data, atmospheric correction and data analysis algorithms must be improved through continuous Cal/Val. GOCI-II will be developed by 2018 to facilitate in-depth studies on geostationary ocean color satellites.
Computer-aided dental prostheses construction using reverse engineering.

PubMed

Solaberrieta, E; Minguez, R; Barrenetxea, L; Sierra, E; Etxaniz, O

2014-01-01

The implementation of computer-aided design/computer-aided manufacturing (CAD/CAM) systems with virtual articulators, which take into account the kinematics, constitutes a breakthrough in the construction of customised dental prostheses. This paper presents a multidisciplinary protocol involving CAM techniques to produce dental prostheses. This protocol includes a step-by-step procedure using innovative reverse engineering technologies to transform completely virtual design processes into customised prostheses. A special emphasis is placed on a novel method that permits a virtual location of the models. The complete workflow includes the optical scanning of the patient, the use of reverse engineering software and, if necessary, the use of rapid prototyping to produce CAD temporary prostheses.
Monitoring of computing resource use of active software releases at ATLAS

NASA Astrophysics Data System (ADS)

Limosani, Antonio; ATLAS Collaboration

2017-10-01

The LHC is the world’s most powerful particle accelerator, colliding protons at centre of mass energy of 13 TeV. As the energy and frequency of collisions has grown in the search for new physics, so too has demand for computing resources needed for event reconstruction. We will report on the evolution of resource usage in terms of CPU and RAM in key ATLAS offline reconstruction workflows at the TierO at CERN and on the WLCG. Monitoring of workflows is achieved using the ATLAS PerfMon package, which is the standard ATLAS performance monitoring system running inside Athena jobs. Systematic daily monitoring has recently been expanded to include all workflows beginning at Monte Carlo generation through to end-user physics analysis, beyond that of event reconstruction. Moreover, the move to a multiprocessor mode in production jobs has facilitated the use of tools, such as “MemoryMonitor”, to measure the memory shared across processors in jobs. Resource consumption is broken down into software domains and displayed in plots generated using Python visualization libraries and collected into pre-formatted auto-generated Web pages, which allow the ATLAS developer community to track the performance of their algorithms. This information is however preferentially filtered to domain leaders and developers through the use of JIRA and via reports given at ATLAS software meetings. Finally, we take a glimpse of the future by reporting on the expected CPU and RAM usage in benchmark workflows associated with the High Luminosity LHC and anticipate the ways performance monitoring will evolve to understand and benchmark future workflows.
Remote Sensing Image Analysis Without Expert Knowledge - A Web-Based Classification Tool On Top of Taverna Workflow Management System

NASA Astrophysics Data System (ADS)

Selsam, Peter; Schwartze, Christian

2016-10-01

Providing software solutions via internet has been known for quite some time and is now an increasing trend marketed as "software as a service". A lot of business units accept the new methods and streamlined IT strategies by offering web-based infrastructures for external software usage - but geospatial applications featuring very specialized services or functionalities on demand are still rare. Originally applied in desktop environments, the ILMSimage tool for remote sensing image analysis and classification was modified in its communicating structures and enabled for running on a high-power server and benefiting from Tavema software. On top, a GIS-like and web-based user interface guides the user through the different steps in ILMSimage. ILMSimage combines object oriented image segmentation with pattern recognition features. Basic image elements form a construction set to model for large image objects with diverse and complex appearance. There is no need for the user to set up detailed object definitions. Training is done by delineating one or more typical examples (templates) of the desired object using a simple vector polygon. The template can be large and does not need to be homogeneous. The template is completely independent from the segmentation. The object definition is done completely by the software.
Development of a user customizable imaging informatics-based intelligent workflow engine system to enhance rehabilitation clinical trials

NASA Astrophysics Data System (ADS)

Wang, Ximing; Martinez, Clarisa; Wang, Jing; Liu, Ye; Liu, Brent

2014-03-01

Clinical trials usually have a demand to collect, track and analyze multimedia data according to the workflow. Currently, the clinical trial data management requirements are normally addressed with custom-built systems. Challenges occur in the workflow design within different trials. The traditional pre-defined custom-built system is usually limited to a specific clinical trial and normally requires time-consuming and resource-intensive software development. To provide a solution, we present a user customizable imaging informatics-based intelligent workflow engine system for managing stroke rehabilitation clinical trials with intelligent workflow. The intelligent workflow engine provides flexibility in building and tailoring the workflow in various stages of clinical trials. By providing a solution to tailor and automate the workflow, the system will save time and reduce errors for clinical trials. Although our system is designed for clinical trials for rehabilitation, it may be extended to other imaging based clinical trials as well.
Scientific Workflow Management in Proteomics

PubMed Central

de Bruin, Jeroen S.; Deelder, André M.; Palmblad, Magnus

2012-01-01

Data processing in proteomics can be a challenging endeavor, requiring extensive knowledge of many different software packages, all with different algorithms, data format requirements, and user interfaces. In this article we describe the integration of a number of existing programs and tools in Taverna Workbench, a scientific workflow manager currently being developed in the bioinformatics community. We demonstrate how a workflow manager provides a single, visually clear and intuitive interface to complex data analysis tasks in proteomics, from raw mass spectrometry data to protein identifications and beyond. PMID:22411703
Comprehensive, powerful, efficient, intuitive: a new software framework for clinical imaging applications

NASA Astrophysics Data System (ADS)

Augustine, Kurt E.; Holmes, David R., III; Hanson, Dennis P.; Robb, Richard A.

2006-03-01

One of the greatest challenges for a software engineer is to create a complex application that is comprehensive enough to be useful to a diverse set of users, yet focused enough for individual tasks to be carried out efficiently with minimal training. This "powerful yet simple" paradox is particularly prevalent in advanced medical imaging applications. Recent research in the Biomedical Imaging Resource (BIR) at Mayo Clinic has been directed toward development of an imaging application framework that provides powerful image visualization/analysis tools in an intuitive, easy-to-use interface. It is based on two concepts very familiar to physicians - Cases and Workflows. Each case is associated with a unique patient and a specific set of routine clinical tasks, or a workflow. Each workflow is comprised of an ordered set of general-purpose modules which can be re-used for each unique workflow. Clinicians help describe and design the workflows, and then are provided with an intuitive interface to both patient data and analysis tools. Since most of the individual steps are common to many different workflows, the use of general-purpose modules reduces development time and results in applications that are consistent, stable, and robust. While the development of individual modules may reflect years of research by imaging scientists, new customized workflows based on the new modules can be developed extremely fast. If a powerful, comprehensive application is difficult to learn and complicated to use, it will be unacceptable to most clinicians. Clinical image analysis tools must be intuitive and effective or they simply will not be used.
3D Printing of CT Dataset: Validation of an Open Source and Consumer-Available Workflow.

PubMed

Bortolotto, Chandra; Eshja, Esmeralda; Peroni, Caterina; Orlandi, Matteo A; Bizzotto, Nicola; Poggi, Paolo

2016-02-01

The broad availability of cheap three-dimensional (3D) printing equipment has raised the need for a thorough analysis on its effects on clinical accuracy. Our aim is to determine whether the accuracy of 3D printing process is affected by the use of a low-budget workflow based on open source software and consumer's commercially available 3D printers. A group of test objects was scanned with a 64-slice computed tomography (CT) in order to build their 3D copies. CT datasets were elaborated using a software chain based on three free and open source software. Objects were printed out with a commercially available 3D printer. Both the 3D copies and the test objects were measured using a digital professional caliper. Overall, the objects' mean absolute difference between test objects and 3D copies is 0.23 mm and the mean relative difference amounts to 0.55 %. Our results demonstrate that the accuracy of 3D printing process remains high despite the use of a low-budget workflow.
Software and the Scientist: Coding and Citation Practices in Geodynamics

NASA Astrophysics Data System (ADS)

Hwang, Lorraine; Fish, Allison; Soito, Laura; Smith, MacKenzie; Kellogg, Louise H.

2017-11-01

In geodynamics as in other scientific areas, computation has become a core component of research, complementing field observation, laboratory analysis, experiment, and theory. Computational tools for data analysis, mapping, visualization, modeling, and simulation are essential for all aspects of the scientific workflow. Specialized scientific software is often developed by geodynamicists for their own use, and this effort represents a distinctive intellectual contribution. Drawing on a geodynamics community that focuses on developing and disseminating scientific software, we assess the current practices of software development and attribution, as well as attitudes about the need and best practices for software citation. We analyzed publications by participants in the Computational Infrastructure for Geodynamics and conducted mixed method surveys of the solid earth geophysics community. From this we learned that coding skills are typically learned informally. Participants considered good code as trusted, reusable, readable, and not overly complex and considered a good coder as one that participates in the community in an open and reasonable manor contributing to both long- and short-term community projects. Participants strongly supported citing software reflected by the high rate a software package was named in the literature and the high rate of citations in the references. However, lacking are clear instructions from developers on how to cite and education of users on what to cite. In addition, citations did not always lead to discoverability of the resource. A unique identifier to the software package itself, community education, and citation tools would contribute to better attribution practices.
SYRMEP Tomo Project: a graphical user interface for customizing CT reconstruction workflows.

PubMed

Brun, Francesco; Massimi, Lorenzo; Fratini, Michela; Dreossi, Diego; Billé, Fulvio; Accardo, Agostino; Pugliese, Roberto; Cedola, Alessia

2017-01-01

When considering the acquisition of experimental synchrotron radiation (SR) X-ray CT data, the reconstruction workflow cannot be limited to the essential computational steps of flat fielding and filtered back projection (FBP). More refined image processing is often required, usually to compensate artifacts and enhance the quality of the reconstructed images. In principle, it would be desirable to optimize the reconstruction workflow at the facility during the experiment (beamtime). However, several practical factors affect the image reconstruction part of the experiment and users are likely to conclude the beamtime with sub-optimal reconstructed images. Through an example of application, this article presents SYRMEP Tomo Project (STP), an open-source software tool conceived to let users design custom CT reconstruction workflows. STP has been designed for post-beamtime (off-line use) and for a new reconstruction of past archived data at user's home institution where simple computing resources are available. Releases of the software can be downloaded at the Elettra Scientific Computing group GitHub repository https://github.com/ElettraSciComp/STP-Gui.
MassCascade: Visual Programming for LC-MS Data Processing in Metabolomics.

PubMed

Beisken, Stephan; Earll, Mark; Portwood, David; Seymour, Mark; Steinbeck, Christoph

2014-04-01

Liquid chromatography coupled to mass spectrometry (LC-MS) is commonly applied to investigate the small molecule complement of organisms. Several software tools are typically joined in custom pipelines to semi-automatically process and analyse the resulting data. General workflow environments like the Konstanz Information Miner (KNIME) offer the potential of an all-in-one solution to process LC-MS data by allowing easy integration of different tools and scripts. We describe MassCascade and its workflow plug-in for processing LC-MS data. The Java library integrates frequently used algorithms in a modular fashion, thus enabling it to serve as back-end for graphical front-ends. The functions available in MassCascade have been encapsulated in a plug-in for the workflow environment KNIME, allowing combined use with e.g. statistical workflow nodes from other providers and making the tool intuitive to use without knowledge of programming. The design of the software guarantees a high level of modularity where processing functions can be quickly replaced or concatenated. MassCascade is an open-source library for LC-MS data processing in metabolomics. It embraces the concept of visual programming through its KNIME plug-in, simplifying the process of building complex workflows. The library was validated using open data.
The development and implementation of MOSAIQ Integration Platform (MIP) based on the radiotherapy workflow

NASA Astrophysics Data System (ADS)

Yang, Xin; He, Zhen-yu; Jiang, Xiao-bo; Lin, Mao-sheng; Zhong, Ning-shan; Hu, Jiang; Qi, Zhen-yu; Bao, Yong; Li, Qiao-qiao; Li, Bao-yue; Hu, Lian-ying; Lin, Cheng-guang; Gao, Yuan-hong; Liu, Hui; Huang, Xiao-yan; Deng, Xiao-wu; Xia, Yun-fei; Liu, Meng-zhong; Sun, Ying

2017-03-01

To meet the special demands in China and the particular needs for the radiotherapy department, a MOSAIQ Integration Platform CHN (MIP) based on the workflow of radiation therapy (RT) has been developed, as a supplement system to the Elekta MOSAIQ. The MIP adopts C/S (client-server) structure mode, and its database is based on the Treatment Planning System (TPS) and MOSAIQ SQL Server 2008, running on the hospital local network. Five network servers, as a core hardware, supply data storage and network service based on the cloud services. The core software, using C# programming language, is developed based on Microsoft Visual Studio Platform. The MIP server could offer network service, including entry, query, statistics and print information for about 200 workstations at the same time. The MIP was implemented in the past one and a half years, and some practical patient-oriented functions were developed. And now the MIP is almost covering the whole workflow of radiation therapy. There are 15 function modules, such as: Notice, Appointment, Billing, Document Management (application/execution), System Management, and so on. By June of 2016, recorded data in the MIP are as following: 13546 patients, 13533 plan application, 15475 RT records, 14656 RT summaries, 567048 billing records and 506612 workload records, etc. The MIP based on the RT workflow has been successfully developed and clinically implemented with real-time performance, data security, stable operation. And it is demonstrated to be user-friendly and is proven to significantly improve the efficiency of the department. It is a key to facilitate the information sharing and department management. More functions can be added or modified for further enhancement its potentials in research and clinical practice.
SPARTA: Simple Program for Automated reference-based bacterial RNA-seq Transcriptome Analysis.

PubMed

Johnson, Benjamin K; Scholz, Matthew B; Teal, Tracy K; Abramovitch, Robert B

2016-02-04

Many tools exist in the analysis of bacterial RNA sequencing (RNA-seq) transcriptional profiling experiments to identify differentially expressed genes between experimental conditions. Generally, the workflow includes quality control of reads, mapping to a reference, counting transcript abundance, and statistical tests for differentially expressed genes. In spite of the numerous tools developed for each component of an RNA-seq analysis workflow, easy-to-use bacterially oriented workflow applications to combine multiple tools and automate the process are lacking. With many tools to choose from for each step, the task of identifying a specific tool, adapting the input/output options to the specific use-case, and integrating the tools into a coherent analysis pipeline is not a trivial endeavor, particularly for microbiologists with limited bioinformatics experience. To make bacterial RNA-seq data analysis more accessible, we developed a Simple Program for Automated reference-based bacterial RNA-seq Transcriptome Analysis (SPARTA). SPARTA is a reference-based bacterial RNA-seq analysis workflow application for single-end Illumina reads. SPARTA is turnkey software that simplifies the process of analyzing RNA-seq data sets, making bacterial RNA-seq analysis a routine process that can be undertaken on a personal computer or in the classroom. The easy-to-install, complete workflow processes whole transcriptome shotgun sequencing data files by trimming reads and removing adapters, mapping reads to a reference, counting gene features, calculating differential gene expression, and, importantly, checking for potential batch effects within the data set. SPARTA outputs quality analysis reports, gene feature counts and differential gene expression tables and scatterplots. SPARTA provides an easy-to-use bacterial RNA-seq transcriptional profiling workflow to identify differentially expressed genes between experimental conditions. This software will enable microbiologists with limited bioinformatics experience to analyze their data and integrate next generation sequencing (NGS) technologies into the classroom. The SPARTA software and tutorial are available at sparta.readthedocs.org.
Extending DICOM imaging to new clinical specialties in the healthcare enterprise

NASA Astrophysics Data System (ADS)

Kuzmak, Peter M.; Dayhoff, Ruth E.

2002-05-01

DICOM is a success for radiology and cardiology and it is now starting to be used for the other clinical specialties. The US Department of Veterans Affairs has been instrumental in promoting this advancement. We have worked with a number of non-radiology clinical speciality imaging vendors over the past two years, encouraging them to support DICOM, providing requirement specifications. Validating their implementations, using their products, and integrating their systems with the VA healthcare enterprise. We require each new clinical speciality vendor to support the DICOM Modality Worklist and Storage services and insist that they perform validation testing with us over the Internet. Two years ago we began working with two commercial DICOM image acquisition applications in ophthalmology and endoscopy. We are now dealing with over a dozen: five in ophthalmology, two in endoscopy, and six in dental. This has been a very productive endeavor. Because mature software development toolkits now exist, the vendors can quickly integrate DICOM with their existing imaging products. Each of the dental vendors, for example, was able to accomplish this task in less than three months. Getting the imaging modality vendors to support DICOM is only part of the story, however. We are also working on getting the VistA hospital information system to properly handle DICOM interfaces to various clinical specialties. This has been more difficult than expected because the workflow in clinical specialties is much more varied than that in radiology. This required us to develop software that is much more flexible than that used for radiology. Fortunately, the standard DICOM Modality Worklist and Storage services can be used without change. In addition to a more variable workflow, the use of structured reporting is much more advanced in clinical specialties than in radiology, and significant work is needed to define templates and communicate this data using DICOM. Since some speciality modules of our hospital information system currently store only report text, we also have to figure out how to store and display the discrete structured report data. The work involved in extending DICOM to the clinical specialties, and in integrating them with the hospital information system is an ongoing and worthwhile challenge. Our goal is to incorporate al of the patient's data into the electronic record, and DICOM is making this easier for everyone. Considerable investment, however, has to be made in the hospital information system software to accrue the full benefit.

Supporting metabolomics with adaptable software: design architectures for the end-user.

PubMed

Sarpe, Vladimir; Schriemer, David C

2017-02-01

Large and disparate sets of LC-MS data are generated by modern metabolomics profiling initiatives, and while useful software tools are available to annotate and quantify compounds, the field requires continued software development in order to sustain methodological innovation. Advances in software development practices allow for a new paradigm in tool development for metabolomics, where increasingly the end-user can develop or redeploy utilities ranging from simple algorithms to complex workflows. Resources that provide an organized framework for development are described and illustrated with LC-MS processing packages that have leveraged their design tools. Full access to these resources depends in part on coding experience, but the emergence of workflow builders and pluggable frameworks strongly reduces the skill level required. Developers in the metabolomics community are encouraged to use these resources and design content for uptake and reuse. Copyright © 2016 Elsevier Ltd. All rights reserved.
Workflow in interventional radiology: uterine fibroid embolization (UFE)

NASA Astrophysics Data System (ADS)

Lindisch, David; Neumuth, Thomas; Burgert, Oliver; Spies, James; Cleary, Kevin

2008-03-01

Workflow analysis can be used to record the steps taken during clinical interventions with the goal of identifying bottlenecks and streamlining the procedure efficiency. In this study, we recorded the workflow for uterine fibroid embolization (UFE) procedures in the interventional radiology suite at Georgetown University Hospital in Washington, DC, USA. We employed a custom client/server software architecture developed by the Innovation Center for Computer Assisted Surgery (ICCAS) at the University of Leipzig, Germany. This software runs in a JAVA environment and enables an observer to record the actions taken by the physician and surgical team during these interventions. The data recorded is stored as an XML document, which can then be further processed. We recorded data from 30 patients and found a mean intervention time of 01:49:46 (+/- 16:04) minutes. The critical intervention step, the embolization, had a mean time of 00:15:42 (+/- 05:49) minutes, which was only 15% of the total intervention time.
Workflow for high-content, individual cell quantification of fluorescent markers from universal microscope data, supported by open source software.

PubMed

Stockwell, Simon R; Mittnacht, Sibylle

2014-12-16

Advances in understanding the control mechanisms governing the behavior of cells in adherent mammalian tissue culture models are becoming increasingly dependent on modes of single-cell analysis. Methods which deliver composite data reflecting the mean values of biomarkers from cell populations risk losing subpopulation dynamics that reflect the heterogeneity of the studied biological system. In keeping with this, traditional approaches are being replaced by, or supported with, more sophisticated forms of cellular assay developed to allow assessment by high-content microscopy. These assays potentially generate large numbers of images of fluorescent biomarkers, which enabled by accompanying proprietary software packages, allows for multi-parametric measurements per cell. However, the relatively high capital costs and overspecialization of many of these devices have prevented their accessibility to many investigators. Described here is a universally applicable workflow for the quantification of multiple fluorescent marker intensities from specific subcellular regions of individual cells suitable for use with images from most fluorescent microscopes. Key to this workflow is the implementation of the freely available Cell Profiler software(1) to distinguish individual cells in these images, segment them into defined subcellular regions and deliver fluorescence marker intensity values specific to these regions. The extraction of individual cell intensity values from image data is the central purpose of this workflow and will be illustrated with the analysis of control data from a siRNA screen for G1 checkpoint regulators in adherent human cells. However, the workflow presented here can be applied to analysis of data from other means of cell perturbation (e.g., compound screens) and other forms of fluorescence based cellular markers and thus should be useful for a wide range of laboratories.
Coupled RipCAS-DFLOW (CoRD) Software and Data Management System for Reproducible Floodplain Vegetation Succession Modeling

NASA Astrophysics Data System (ADS)

Turner, M. A.; Miller, S.; Gregory, A.; Cadol, D. D.; Stone, M. C.; Sheneman, L.

2016-12-01

We present the Coupled RipCAS-DFLOW (CoRD) modeling system created to encapsulate the workflow to analyze the effects of stream flooding on vegetation succession. CoRD provides an intuitive command-line and web interface to run DFLOW and RipCAS in succession over many years automatically, which is a challenge because, for our application, DFLOW must be run on a supercomputing cluster via the PBS job scheduler. RipCAS is a vegetation succession model, and DFLOW is a 2D open channel flow model. Data adaptors have been developed to seamlessly connect DFLOW output data to be RipCAS inputs, and vice-versa. CoRD provides automated statistical analysis and visualization, plus automatic syncing of input and output files and model run metadata to the hydrological data management system HydroShare using its excellent Python REST client. This combination of technologies and data management techniques allows the results to be shared with collaborators and eventually published. Perhaps most importantly, it allows results to be easily reproduced via either the command-line or web user interface. This system is a result of collaboration between software developers and hydrologists participating in the Western Consortium for Watershed Analysis, Visualization, and Exploration (WC-WAVE). Because of the computing-intensive nature of this particular workflow, including automating job submission/monitoring and data adaptors, software engineering expertise is required. However, the hydrologists provide the software developers with a purpose and ensure a useful, intuitive tool is developed. Our hydrologists contribute software, too: RipCAS was developed from scratch by hydrologists on the team as a specialized, open-source version of the Computer Aided Simulation Model for Instream Flow and Riparia (CASiMiR) vegetation model; our hydrologists running DFLOW provided numerous examples and help with the supercomputing system. This project is written in Python, a popular language in the geosciences and a good beginner programming language, and is completely open source. It can be accessed at https://github.com/VirtualWatershed/CoRD with documentation available at http://virtualwatershed.github.io/CoRD. These facts enable continued development and use beyond the involvement of the current authors.
Implementation of workflow engine technology to deliver basic clinical decision support functionality

PubMed Central

2011-01-01

Background Workflow engine technology represents a new class of software with the ability to graphically model step-based knowledge. We present application of this novel technology to the domain of clinical decision support. Successful implementation of decision support within an electronic health record (EHR) remains an unsolved research challenge. Previous research efforts were mostly based on healthcare-specific representation standards and execution engines and did not reach wide adoption. We focus on two challenges in decision support systems: the ability to test decision logic on retrospective data prior prospective deployment and the challenge of user-friendly representation of clinical logic. Results We present our implementation of a workflow engine technology that addresses the two above-described challenges in delivering clinical decision support. Our system is based on a cross-industry standard of XML (extensible markup language) process definition language (XPDL). The core components of the system are a workflow editor for modeling clinical scenarios and a workflow engine for execution of those scenarios. We demonstrate, with an open-source and publicly available workflow suite, that clinical decision support logic can be executed on retrospective data. The same flowchart-based representation can also function in a prospective mode where the system can be integrated with an EHR system and respond to real-time clinical events. We limit the scope of our implementation to decision support content generation (which can be EHR system vendor independent). We do not focus on supporting complex decision support content delivery mechanisms due to lack of standardization of EHR systems in this area. We present results of our evaluation of the flowchart-based graphical notation as well as architectural evaluation of our implementation using an established evaluation framework for clinical decision support architecture. Conclusions We describe an implementation of a free workflow technology software suite (available at http://code.google.com/p/healthflow) and its application in the domain of clinical decision support. Our implementation seamlessly supports clinical logic testing on retrospective data and offers a user-friendly knowledge representation paradigm. With the presented software implementation, we demonstrate that workflow engine technology can provide a decision support platform which evaluates well against an established clinical decision support architecture evaluation framework. Due to cross-industry usage of workflow engine technology, we can expect significant future functionality enhancements that will further improve the technology's capacity to serve as a clinical decision support platform. PMID:21477364
A Tool Supporting Collaborative Data Analytics Workflow Design and Management

NASA Astrophysics Data System (ADS)

Zhang, J.; Bao, Q.; Lee, T. J.

2016-12-01

Collaborative experiment design could significantly enhance the sharing and adoption of the data analytics algorithms and models emerged in Earth science. Existing data-oriented workflow tools, however, are not suitable to support collaborative design of such a workflow, to name a few, to support real-time co-design; to track how a workflow evolves over time based on changing designs contributed by multiple Earth scientists; and to capture and retrieve collaboration knowledge on workflow design (discussions that lead to a design). To address the aforementioned challenges, we have designed and developed a technique supporting collaborative data-oriented workflow composition and management, as a key component toward supporting big data collaboration through the Internet. Reproducibility and scalability are two major targets demanding fundamental infrastructural support. One outcome of the project os a software tool, supporting an elastic number of groups of Earth scientists to collaboratively design and compose data analytics workflows through the Internet. Instead of recreating the wheel, we have extended an existing workflow tool VisTrails into an online collaborative environment as a proof of concept.
Experiences of Nursing Personnel Using PDAs in Home Health Care Services in Norwegian Municipalities

PubMed Central

Hansen, Linda M.; Fossum, Mariann; Söderhamn, Olle; Fruhling, Ann

2012-01-01

Although nursing personnel have used personal digital assistants (PDAs) to support home health care services for the past ten years, little is known about their experiences. This study was conducted to examine experiences of nursing personnel using a specialized home health care computer software application called Gerica. In addition, this research analyzed how well this application aligned with the workflow of the nursing personnel in their daily care of patients. The evaluation methods included user observations and learnability testing. Nursing personnel from two different municipalities were observed while performing real tasks in natural settings. This study shows that the nursing personnel were satisfied with the PDA user interface and the Gerica software; however, they identified areas for improvement. For example, the nursing personnel were concerned about trusting the reliability of the PDA in order to eliminate the need for handwritten documentation. Solutions to meet these shortcomings for nursing managers and vendors are discussed. PMID:24199073
Sustaining an Online, Shared Community Resource for Models, Robust Open source Software Tools and Data for Volcanology - the Vhub Experience

NASA Astrophysics Data System (ADS)

Patra, A. K.; Valentine, G. A.; Bursik, M. I.; Connor, C.; Connor, L.; Jones, M.; Simakov, N.; Aghakhani, H.; Jones-Ivey, R.; Kosar, T.; Zhang, B.

2015-12-01

Over the last 5 years we have created a community collaboratory Vhub.org [Palma et al, J. App. Volc. 3:2 doi:10.1186/2191-5040-3-2] as a place to find volcanology-related resources, and a venue for users to disseminate tools, teaching resources, data, and an online platform to support collaborative efforts. As the community (current active users > 6000 from an estimated community of comparable size) embeds the tools in the collaboratory into educational and research workflows it became imperative to: a) redesign tools into robust, open source reusable software for online and offline usage/enhancement; b) share large datasets with remote collaborators and other users seamlessly with security; c) support complex workflows for uncertainty analysis, validation and verification and data assimilation with large data. The focus on tool development/redevelopment has been twofold - firstly to use best practices in software engineering and new hardware like multi-core and graphic processing units. Secondly we wish to enhance capabilities to support inverse modeling, uncertainty quantification using large ensembles and design of experiments, calibration, validation. Among software engineering practices we practice are open source facilitating community contributions, modularity and reusability. Our initial targets are four popular tools on Vhub - TITAN2D, TEPHRA2, PUFF and LAVA. Use of tools like these requires many observation driven data sets e.g. digital elevation models of topography, satellite imagery, field observations on deposits etc. These data are often maintained in private repositories that are privately shared by "sneaker-net". As a partial solution to this we tested mechanisms using irods software for online sharing of private data with public metadata and access limits. Finally, we adapted use of workflow engines (e.g. Pegasus) to support the complex data and computing workflows needed for usage like uncertainty quantification for hazard analysis using physical models.
Using Kepler for Tool Integration in Microarray Analysis Workflows.

PubMed

Gan, Zhuohui; Stowe, Jennifer C; Altintas, Ilkay; McCulloch, Andrew D; Zambon, Alexander C

Increasing numbers of genomic technologies are leading to massive amounts of genomic data, all of which requires complex analysis. More and more bioinformatics analysis tools are being developed by scientist to simplify these analyses. However, different pipelines have been developed using different software environments. This makes integrations of these diverse bioinformatics tools difficult. Kepler provides an open source environment to integrate these disparate packages. Using Kepler, we integrated several external tools including Bioconductor packages, AltAnalyze, a python-based open source tool, and R-based comparison tool to build an automated workflow to meta-analyze both online and local microarray data. The automated workflow connects the integrated tools seamlessly, delivers data flow between the tools smoothly, and hence improves efficiency and accuracy of complex data analyses. Our workflow exemplifies the usage of Kepler as a scientific workflow platform for bioinformatics pipelines.
The Perfect Neuroimaging-Genetics-Computation Storm: Collision of Petabytes of Data, Millions of Hardware Devices and Thousands of Software Tools

PubMed Central

Dinov, Ivo D.; Petrosyan, Petros; Liu, Zhizhong; Eggert, Paul; Zamanyan, Alen; Torri, Federica; Macciardi, Fabio; Hobel, Sam; Moon, Seok Woo; Sung, Young Hee; Jiang, Zhiguo; Labus, Jennifer; Kurth, Florian; Ashe-McNalley, Cody; Mayer, Emeran; Vespa, Paul M.; Van Horn, John D.; Toga, Arthur W.

2013-01-01

The volume, diversity and velocity of biomedical data are exponentially increasing providing petabytes of new neuroimaging and genetics data every year. At the same time, tens-of-thousands of computational algorithms are developed and reported in the literature along with thousands of software tools and services. Users demand intuitive, quick and platform-agnostic access to data, software tools, and infrastructure from millions of hardware devices. This explosion of information, scientific techniques, computational models, and technological advances leads to enormous challenges in data analysis, evidence-based biomedical inference and reproducibility of findings. The Pipeline workflow environment provides a crowd-based distributed solution for consistent management of these heterogeneous resources. The Pipeline allows multiple (local) clients and (remote) servers to connect, exchange protocols, control the execution, monitor the states of different tools or hardware, and share complete protocols as portable XML workflows. In this paper, we demonstrate several advanced computational neuroimaging and genetics case-studies, and end-to-end pipeline solutions. These are implemented as graphical workflow protocols in the context of analyzing imaging (sMRI, fMRI, DTI), phenotypic (demographic, clinical), and genetic (SNP) data. PMID:23975276
The CARMEN software as a service infrastructure.

PubMed

Weeks, Michael; Jessop, Mark; Fletcher, Martyn; Hodge, Victoria; Jackson, Tom; Austin, Jim

2013-01-28

The CARMEN platform allows neuroscientists to share data, metadata, services and workflows, and to execute these services and workflows remotely via a Web portal. This paper describes how we implemented a service-based infrastructure into the CARMEN Virtual Laboratory. A Software as a Service framework was developed to allow generic new and legacy code to be deployed as services on a heterogeneous execution framework. Users can submit analysis code typically written in Matlab, Python, C/C++ and R as non-interactive standalone command-line applications and wrap them as services in a form suitable for deployment on the platform. The CARMEN Service Builder tool enables neuroscientists to quickly wrap their analysis software for deployment to the CARMEN platform, as a service without knowledge of the service framework or the CARMEN system. A metadata schema describes each service in terms of both system and user requirements. The search functionality allows services to be quickly discovered from the many services available. Within the platform, services may be combined into more complicated analyses using the workflow tool. CARMEN and the service infrastructure are targeted towards the neuroscience community; however, it is a generic platform, and can be targeted towards any discipline.
NeuroManager: a workflow analysis based simulation management engine for computational neuroscience

PubMed Central

Stockton, David B.; Santamaria, Fidel

2015-01-01

We developed NeuroManager, an object-oriented simulation management software engine for computational neuroscience. NeuroManager automates the workflow of simulation job submissions when using heterogeneous computational resources, simulators, and simulation tasks. The object-oriented approach (1) provides flexibility to adapt to a variety of neuroscience simulators, (2) simplifies the use of heterogeneous computational resources, from desktops to super computer clusters, and (3) improves tracking of simulator/simulation evolution. We implemented NeuroManager in MATLAB, a widely used engineering and scientific language, for its signal and image processing tools, prevalence in electrophysiology analysis, and increasing use in college Biology education. To design and develop NeuroManager we analyzed the workflow of simulation submission for a variety of simulators, operating systems, and computational resources, including the handling of input parameters, data, models, results, and analyses. This resulted in 22 stages of simulation submission workflow. The software incorporates progress notification, automatic organization, labeling, and time-stamping of data and results, and integrated access to MATLAB's analysis and visualization tools. NeuroManager provides users with the tools to automate daily tasks, and assists principal investigators in tracking and recreating the evolution of research projects performed by multiple people. Overall, NeuroManager provides the infrastructure needed to improve workflow, manage multiple simultaneous simulations, and maintain provenance of the potentially large amounts of data produced during the course of a research project. PMID:26528175
NeuroManager: a workflow analysis based simulation management engine for computational neuroscience.

PubMed

Stockton, David B; Santamaria, Fidel

2015-01-01

We developed NeuroManager, an object-oriented simulation management software engine for computational neuroscience. NeuroManager automates the workflow of simulation job submissions when using heterogeneous computational resources, simulators, and simulation tasks. The object-oriented approach (1) provides flexibility to adapt to a variety of neuroscience simulators, (2) simplifies the use of heterogeneous computational resources, from desktops to super computer clusters, and (3) improves tracking of simulator/simulation evolution. We implemented NeuroManager in MATLAB, a widely used engineering and scientific language, for its signal and image processing tools, prevalence in electrophysiology analysis, and increasing use in college Biology education. To design and develop NeuroManager we analyzed the workflow of simulation submission for a variety of simulators, operating systems, and computational resources, including the handling of input parameters, data, models, results, and analyses. This resulted in 22 stages of simulation submission workflow. The software incorporates progress notification, automatic organization, labeling, and time-stamping of data and results, and integrated access to MATLAB's analysis and visualization tools. NeuroManager provides users with the tools to automate daily tasks, and assists principal investigators in tracking and recreating the evolution of research projects performed by multiple people. Overall, NeuroManager provides the infrastructure needed to improve workflow, manage multiple simultaneous simulations, and maintain provenance of the potentially large amounts of data produced during the course of a research project.
ProcessGene-Connect: SOA Integration between Business Process Models and Enactment Transactions of Enterprise Software Systems

NASA Astrophysics Data System (ADS)

Wasser, Avi; Lincoln, Maya

In recent years, both practitioners and applied researchers have become increasingly interested in methods for integrating business process models and enterprise software systems through the deployment of enabling middleware. Integrative BPM research has been mainly focusing on the conversion of workflow notations into enacted application procedures, and less effort has been invested in enhancing the connectivity between design level, non-workflow business process models and related enactment systems such as: ERP, SCM and CRM. This type of integration is useful at several stages of an IT system lifecycle, from design and implementation through change management, upgrades and rollout. The paper presents an integration method that utilizes SOA for connecting business process models with corresponding enterprise software systems. The method is then demonstrated through an Oracle E-Business Suite procurement process and its ERP transactions.
ATAQS: A computational software tool for high throughput transition optimization and validation for selected reaction monitoring mass spectrometry

PubMed Central

2011-01-01

Background Since its inception, proteomics has essentially operated in a discovery mode with the goal of identifying and quantifying the maximal number of proteins in a sample. Increasingly, proteomic measurements are also supporting hypothesis-driven studies, in which a predetermined set of proteins is consistently detected and quantified in multiple samples. Selected reaction monitoring (SRM) is a targeted mass spectrometric technique that supports the detection and quantification of specific proteins in complex samples at high sensitivity and reproducibility. Here, we describe ATAQS, an integrated software platform that supports all stages of targeted, SRM-based proteomics experiments including target selection, transition optimization and post acquisition data analysis. This software will significantly facilitate the use of targeted proteomic techniques and contribute to the generation of highly sensitive, reproducible and complete datasets that are particularly critical for the discovery and validation of targets in hypothesis-driven studies in systems biology. Result We introduce a new open source software pipeline, ATAQS (Automated and Targeted Analysis with Quantitative SRM), which consists of a number of modules that collectively support the SRM assay development workflow for targeted proteomic experiments (project management and generation of protein, peptide and transitions and the validation of peptide detection by SRM). ATAQS provides a flexible pipeline for end-users by allowing the workflow to start or end at any point of the pipeline, and for computational biologists, by enabling the easy extension of java algorithm classes for their own algorithm plug-in or connection via an external web site. This integrated system supports all steps in a SRM-based experiment and provides a user-friendly GUI that can be run by any operating system that allows the installation of the Mozilla Firefox web browser. Conclusions Targeted proteomics via SRM is a powerful new technique that enables the reproducible and accurate identification and quantification of sets of proteins of interest. ATAQS is the first open-source software that supports all steps of the targeted proteomics workflow. ATAQS also provides software API (Application Program Interface) documentation that enables the addition of new algorithms to each of the workflow steps. The software, installation guide and sample dataset can be found in http://tools.proteomecenter.org/ATAQS/ATAQS.html PMID:21414234
Separating Business Logic from Medical Knowledge in Digital Clinical Workflows Using Business Process Model and Notation and Arden Syntax.

PubMed

de Bruin, Jeroen S; Adlassnig, Klaus-Peter; Leitich, Harald; Rappelsberger, Andrea

2018-01-01

Evidence-based clinical guidelines have a major positive effect on the physician's decision-making process. Computer-executable clinical guidelines allow for automated guideline marshalling during a clinical diagnostic process, thus improving the decision-making process. Implementation of a digital clinical guideline for the prevention of mother-to-child transmission of hepatitis B as a computerized workflow, thereby separating business logic from medical knowledge and decision-making. We used the Business Process Model and Notation language system Activiti for business logic and workflow modeling. Medical decision-making was performed by an Arden-Syntax-based medical rule engine, which is part of the ARDENSUITE software. We succeeded in creating an electronic clinical workflow for the prevention of mother-to-child transmission of hepatitis B, where institution-specific medical decision-making processes could be adapted without modifying the workflow business logic. Separation of business logic and medical decision-making results in more easily reusable electronic clinical workflows.
Implementation of a software for REmote COMparison of PARticlE and photon treatment plans: ReCompare.

PubMed

Löck, Steffen; Roth, Klaus; Skripcak, Tomas; Worbs, Mario; Helmbrecht, Stephan; Jakobi, Annika; Just, Uwe; Krause, Mechthild; Baumann, Michael; Enghardt, Wolfgang; Lühr, Armin

2015-09-01

To guarantee equal access to optimal radiotherapy, a concept of patient assignment to photon or particle radiotherapy using remote treatment plan exchange and comparison - ReCompare - was proposed. We demonstrate the implementation of this concept and present its clinical applicability. The ReCompare concept was implemented using a client-server based software solution. A clinical workflow for the remote treatment plan exchange and comparison was defined. The steps required by the user and performed by the software for a complete plan transfer were described and an additional module for dose-response modeling was added. The ReCompare software was successfully tested in cooperation with three external partner clinics and worked meeting all required specifications. It was compatible with several standard treatment planning systems, ensured patient data protection, and integrated in the clinical workflow. The ReCompare software can be applied to support non-particle radiotherapy institutions with the patient-specific treatment decision on the optimal irradiation modality by remote treatment plan exchange and comparison. Copyright © 2015. Published by Elsevier GmbH.
Building Digital Audio Preservation Infrastructure and Workflows

ERIC Educational Resources Information Center

Young, Anjanette; Olivieri, Blynne; Eckler, Karl; Gerontakos, Theodore

2010-01-01

In 2009 the University of Washington (UW) Libraries special collections received funding for the digital preservation of its audio indigenous language holdings. The university libraries, where the authors work in various capacities, had begun digitizing image and text collections in 1997. Because of this, at the onset of the project, workflows (a…
A semi-automated workflow for biodiversity data retrieval, cleaning, and quality control

PubMed Central

Mathew, Cherian; Obst, Matthias; Vicario, Saverio; Haines, Robert; Williams, Alan R.; de Jong, Yde; Goble, Carole

2014-01-01

Abstract The compilation and cleaning of data needed for analyses and prediction of species distributions is a time consuming process requiring a solid understanding of data formats and service APIs provided by biodiversity informatics infrastructures. We designed and implemented a Taverna-based Data Refinement Workflow which integrates taxonomic data retrieval, data cleaning, and data selection into a consistent, standards-based, and effective system hiding the complexity of underlying service infrastructures. The workflow can be freely used both locally and through a web-portal which does not require additional software installations by users. PMID:25535486
Data processing workflow for time of flight polarized neutrons inelastic measurements

DOE PAGES

Savici, Andrei T.; Zaliznyak, Igor A.; Ovidiu Garlea, V.; ...

2017-06-01

We discuss the data processing workflow for polarized neutron scattering measurements performed at HYSPEC spectrometer at the Spallation Neutron Source, Oak Ridge National Laboratory. The effects of the focusing Heusler crystal polarizer and the wide-angle supermirror transmission polarization analyzer are added to the data processing flow of the non-polarized case. The implementation is done using the Mantid software package.

A Scalable, Open Source Platform for Data Processing, Archiving and Dissemination

DTIC Science & Technology

2016-01-01

Object Oriented Data Technology (OODT) big data toolkit developed by NASA and the Work-flow INstance Generation and Selection (WINGS) scientific work...to several challenge big data problems and demonstrated the utility of OODT-WINGS in addressing them. Specific demonstrated analyses address i...source software, Apache, Object Oriented Data Technology, OODT, semantic work-flows, WINGS, big data , work- flow management 16. SECURITY CLASSIFICATION OF
Benchmarking quantitative label-free LC-MS data processing workflows using a complex spiked proteomic standard dataset.

PubMed

Ramus, Claire; Hovasse, Agnès; Marcellin, Marlène; Hesse, Anne-Marie; Mouton-Barbosa, Emmanuelle; Bouyssié, David; Vaca, Sebastian; Carapito, Christine; Chaoui, Karima; Bruley, Christophe; Garin, Jérôme; Cianférani, Sarah; Ferro, Myriam; Van Dorssaeler, Alain; Burlet-Schiltz, Odile; Schaeffer, Christine; Couté, Yohann; Gonzalez de Peredo, Anne

2016-01-30

Proteomic workflows based on nanoLC-MS/MS data-dependent-acquisition analysis have progressed tremendously in recent years. High-resolution and fast sequencing instruments have enabled the use of label-free quantitative methods, based either on spectral counting or on MS signal analysis, which appear as an attractive way to analyze differential protein expression in complex biological samples. However, the computational processing of the data for label-free quantification still remains a challenge. Here, we used a proteomic standard composed of an equimolar mixture of 48 human proteins (Sigma UPS1) spiked at different concentrations into a background of yeast cell lysate to benchmark several label-free quantitative workflows, involving different software packages developed in recent years. This experimental design allowed to finely assess their performances in terms of sensitivity and false discovery rate, by measuring the number of true and false-positive (respectively UPS1 or yeast background proteins found as differential). The spiked standard dataset has been deposited to the ProteomeXchange repository with the identifier PXD001819 and can be used to benchmark other label-free workflows, adjust software parameter settings, improve algorithms for extraction of the quantitative metrics from raw MS data, or evaluate downstream statistical methods. Bioinformatic pipelines for label-free quantitative analysis must be objectively evaluated in their ability to detect variant proteins with good sensitivity and low false discovery rate in large-scale proteomic studies. This can be done through the use of complex spiked samples, for which the "ground truth" of variant proteins is known, allowing a statistical evaluation of the performances of the data processing workflow. We provide here such a controlled standard dataset and used it to evaluate the performances of several label-free bioinformatics tools (including MaxQuant, Skyline, MFPaQ, IRMa-hEIDI and Scaffold) in different workflows, for detection of variant proteins with different absolute expression levels and fold change values. The dataset presented here can be useful for tuning software tool parameters, and also testing new algorithms for label-free quantitative analysis, or for evaluation of downstream statistical methods. Copyright © 2015 Elsevier B.V. All rights reserved.
Applying Content Management to Automated Provenance Capture

DOE Office of Scientific and Technical Information (OSTI.GOV)

Schuchardt, Karen L.; Gibson, Tara D.; Stephan, Eric G.

2008-04-10

Workflows and data pipelines are becoming increasingly valuable in both computational and experimen-tal sciences. These automated systems are capable of generating significantly more data within the same amount of time than their manual counterparts. Automatically capturing and recording data prove-nance and annotation as part of these workflows is critical for data management, verification, and dis-semination. Our goal in addressing the provenance challenge was to develop and end-to-end system that demonstrates real-time capture, persistent content management, and ad-hoc searches of both provenance and metadata using open source software and standard protocols. We describe our prototype, which extends the Kepler workflow toolsmore » for the execution environment, the Scientific Annotation Middleware (SAM) content management software for data services, and an existing HTTP-based query protocol. Our implementation offers several unique capabilities, and through the use of standards, is able to pro-vide access to the provenance record to a variety of commonly available client tools.« less
Future Software Sizing Metrics and Estimation Challenges

DTIC Science & Technology

2011-07-01

systems 4. Ultrahigh software system assurance 5. Legacy maintenance and Brownfield development 6. Agile and Lean/ Kanban development. This paper...refined as the design of the maintenance modifications or Brownfield re-engineering is determined. VII. 6. AGILE AND LEAN/ KANBAN DEVELOPMENT The...difficulties of software maintenance estimation can often be mitigated by using lean workflow management techniques such as Kanban [25]. In Kanban
Integrative Lifecourse and Genetic Analysis of Military Working Dogs

DTIC Science & Technology

2012-10-01

Recognition), ICR (Intelligent Character Recognition) and HWR ( Handwriting Recognition). A number of various software packages were evaluated and we have...the third-party software is able to recognize check-boxes and columns and do a reasonable job with handwriting – which is does. This workflow will
Integrative Lifecourse and Genetic Analysis of Military Working Dogs

DTIC Science & Technology

2012-10-01

Intelligent Character Recognition) and HWR ( Handwriting Recognition). A number of various software packages were evaluated and we have settled on a...third-party software is able to recognize check-boxes and columns and do a reasonable job with handwriting – which is does. This workflow will
NMRbox: A Resource for Biomolecular NMR Computation.

PubMed

Maciejewski, Mark W; Schuyler, Adam D; Gryk, Michael R; Moraru, Ion I; Romero, Pedro R; Ulrich, Eldon L; Eghbalnia, Hamid R; Livny, Miron; Delaglio, Frank; Hoch, Jeffrey C

2017-04-25

Advances in computation have been enabling many recent advances in biomolecular applications of NMR. Due to the wide diversity of applications of NMR, the number and variety of software packages for processing and analyzing NMR data is quite large, with labs relying on dozens, if not hundreds of software packages. Discovery, acquisition, installation, and maintenance of all these packages is a burdensome task. Because the majority of software packages originate in academic labs, persistence of the software is compromised when developers graduate, funding ceases, or investigators turn to other projects. To simplify access to and use of biomolecular NMR software, foster persistence, and enhance reproducibility of computational workflows, we have developed NMRbox, a shared resource for NMR software and computation. NMRbox employs virtualization to provide a comprehensive software environment preconfigured with hundreds of software packages, available as a downloadable virtual machine or as a Platform-as-a-Service supported by a dedicated compute cloud. Ongoing development includes a metadata harvester to regularize, annotate, and preserve workflows and facilitate and enhance data depositions to BioMagResBank, and tools for Bayesian inference to enhance the robustness and extensibility of computational analyses. In addition to facilitating use and preservation of the rich and dynamic software environment for biomolecular NMR, NMRbox fosters the development and deployment of a new class of metasoftware packages. NMRbox is freely available to not-for-profit users. Copyright © 2017 Biophysical Society. All rights reserved.
Modeling stroke rehabilitation processes using the Unified Modeling Language (UML).

PubMed

Ferrante, Simona; Bonacina, Stefano; Pinciroli, Francesco

2013-10-01

In organising and providing rehabilitation procedures for stroke patients, the usual need for many refinements makes it inappropriate to attempt rigid standardisation, but greater detail is required concerning workflow. The aim of this study was to build a model of the post-stroke rehabilitation process. The model, implemented in the Unified Modeling Language, was grounded on international guidelines and refined following the clinical pathway adopted at local level by a specialized rehabilitation centre. The model describes the organisation of the rehabilitation delivery and it facilitates the monitoring of recovery during the process. Indeed, a system software was developed and tested to support clinicians in the digital administration of clinical scales. The model flexibility assures easy updating after process evolution. Copyright © 2013 Elsevier Ltd. All rights reserved.
Comparison of peak-picking workflows for untargeted liquid chromatography/high-resolution mass spectrometry metabolomics data analysis.

PubMed

Rafiei, Atefeh; Sleno, Lekha

2015-01-15

Data analysis is a key step in mass spectrometry based untargeted metabolomics, starting with the generation of generic peak lists from raw liquid chromatography/mass spectrometry (LC/MS) data. Due to the use of various algorithms by different workflows, the results of different peak-picking strategies often differ widely. Raw LC/HRMS data from two types of biological samples (bile and urine), as well as a standard mixture of 84 metabolites, were processed with four peak-picking softwares: Peakview®, Markerview™, MetabolitePilot™ and XCMS Online. The overlaps between the results of each peak-generating method were then investigated. To gauge the relevance of peak lists, a database search using the METLIN online database was performed to determine which features had accurate masses matching known metabolites as well as a secondary filtering based on MS/MS spectral matching. In this study, only a small proportion of all peaks (less than 10%) were common to all four software programs. Comparison of database searching results showed peaks found uniquely by one workflow have less chance of being found in the METLIN metabolomics database and are even less likely to be confirmed by MS/MS. It was shown that the performance of peak-generating workflows has a direct impact on untargeted metabolomics results. As it was demonstrated that the peaks found in more than one peak detection workflow have higher potential to be identified by accurate mass as well as MS/MS spectrum matching, it is suggested to use the overlap of different peak-picking workflows as preliminary peak lists for more rugged statistical analysis in global metabolomics investigations. Copyright © 2014 John Wiley & Sons, Ltd.
Developing a Collection of Composable Data Translation Software Units to Improve Efficiency and Reproducibility in Ecohydrologic Modeling Workflows

NASA Astrophysics Data System (ADS)

Olschanowsky, C.; Flores, A. N.; FitzGerald, K.; Masarik, M. T.; Rudisill, W. J.; Aguayo, M.

2017-12-01

Dynamic models of the spatiotemporal evolution of water, energy, and nutrient cycling are important tools to assess impacts of climate and other environmental changes on ecohydrologic systems. These models require spatiotemporally varying environmental forcings like precipitation, temperature, humidity, windspeed, and solar radiation. These input data originate from a variety of sources, including global and regional weather and climate models, global and regional reanalysis products, and geostatistically interpolated surface observations. Data translation measures, often subsetting in space and/or time and transforming and converting variable units, represent a seemingly mundane, but critical step in the application workflows. Translation steps can introduce errors, misrepresentations of data, slow execution time, and interrupt data provenance. We leverage a workflow that subsets a large regional dataset derived from the Weather Research and Forecasting (WRF) model and prepares inputs to the Parflow integrated hydrologic model to demonstrate the impact translation tool software quality on scientific workflow results and performance. We propose that such workflows will benefit from a community approved collection of data transformation components. The components should be self-contained composable units of code. This design pattern enables automated parallelization and software verification, improving performance and reliability. Ensuring that individual translation components are self-contained and target minute tasks increases reliability. The small code size of each component enables effective unit and regression testing. The components can be automatically composed for efficient execution. An efficient data translation framework should be written to minimize data movement. Composing components within a single streaming process reduces data movement. Each component will typically have a low arithmetic intensity, meaning that it requires about the same number of bytes to be read as the number of computations it performs. When several components' executions are coordinated the overall arithmetic intensity increases, leading to increased efficiency.
ESO Reflex: a graphical workflow engine for data reduction

NASA Astrophysics Data System (ADS)

Hook, Richard; Ullgrén, Marko; Romaniello, Martino; Maisala, Sami; Oittinen, Tero; Solin, Otto; Savolainen, Ville; Järveläinen, Pekka; Tyynelä, Jani; Péron, Michèle; Ballester, Pascal; Gabasch, Armin; Izzo, Carlo

ESO Reflex is a prototype software tool that provides a novel approach to astronomical data reduction by integrating a modern graphical workflow system (Taverna) with existing legacy data reduction algorithms. Most of the raw data produced by instruments at the ESO Very Large Telescope (VLT) in Chile are reduced using recipes. These are compiled C applications following an ESO standard and utilising routines provided by the Common Pipeline Library (CPL). Currently these are run in batch mode as part of the data flow system to generate the input to the ESO/VLT quality control process and are also exported for use offline. ESO Reflex can invoke CPL-based recipes in a flexible way through a general purpose graphical interface. ESO Reflex is based on the Taverna system that was originally developed within the UK life-sciences community. Workflows have been created so far for three VLT/VLTI instruments, and the GUI allows the user to make changes to these or create workflows of their own. Python scripts or IDL procedures can be easily brought into workflows and a variety of visualisation and display options, including custom product inspection and validation steps, are available. Taverna is intended for use with web services and experiments using ESO Reflex to access Virtual Observatory web services have been successfully performed. ESO Reflex is the main product developed by Sampo, a project led by ESO and conducted by a software development team from Finland as an in-kind contribution to joining ESO. The goal was to look into the needs of the ESO community in the area of data reduction environments and to create pilot software products that illustrate critical steps along the road to a new system. Sampo concluded early in 2008. This contribution will describe ESO Reflex and show several examples of its use both locally and using Virtual Observatory remote web services. ESO Reflex is expected to be released to the community in early 2009.
Developing Software to “Track and Catch” Missed Follow-up of Abnormal Test Results in a Complex Sociotechnical Environment

PubMed Central

Smith, M.; Murphy, D.; Laxmisan, A.; Sittig, D.; Reis, B.; Esquivel, A.; Singh, H.

2013-01-01

Summary Background Abnormal test results do not always receive timely follow-up, even when providers are notified through electronic health record (EHR)-based alerts. High workload, alert fatigue, and other demands on attention disrupt a provider’s prospective memory for tasks required to initiate follow-up. Thus, EHR-based tracking and reminding functionalities are needed to improve follow-up. Objectives The purpose of this study was to develop a decision-support software prototype enabling individual and system-wide tracking of abnormal test result alerts lacking follow-up, and to conduct formative evaluations, including usability testing. Methods We developed a working prototype software system, the Alert Watch And Response Engine (AWARE), to detect abnormal test result alerts lacking documented follow-up, and to present context-specific reminders to providers. Development and testing took place within the VA’s EHR and focused on four cancer-related abnormal test results. Design concepts emphasized mitigating the effects of high workload and alert fatigue while being minimally intrusive. We conducted a multifaceted formative evaluation of the software, addressing fit within the larger socio-technical system. Evaluations included usability testing with the prototype and interview questions about organizational and workflow factors. Participants included 23 physicians, 9 clinical information technology specialists, and 8 quality/safety managers. Results Evaluation results indicated that our software prototype fit within the technical environment and clinical workflow, and physicians were able to use it successfully. Quality/safety managers reported that the tool would be useful in future quality assurance activities to detect patients who lack documented follow-up. Additionally, we successfully installed the software on the local facility’s “test” EHR system, thus demonstrating technical compatibility. Conclusion To address the factors involved in missed test results, we developed a software prototype to account for technical, usability, organizational, and workflow needs. Our evaluation has shown the feasibility of the prototype as a means of facilitating better follow-up for cancer-related abnormal test results. PMID:24155789
Developing software to "track and catch" missed follow-up of abnormal test results in a complex sociotechnical environment.

PubMed

Smith, M; Murphy, D; Laxmisan, A; Sittig, D; Reis, B; Esquivel, A; Singh, H

2013-01-01

Abnormal test results do not always receive timely follow-up, even when providers are notified through electronic health record (EHR)-based alerts. High workload, alert fatigue, and other demands on attention disrupt a provider's prospective memory for tasks required to initiate follow-up. Thus, EHR-based tracking and reminding functionalities are needed to improve follow-up. The purpose of this study was to develop a decision-support software prototype enabling individual and system-wide tracking of abnormal test result alerts lacking follow-up, and to conduct formative evaluations, including usability testing. We developed a working prototype software system, the Alert Watch And Response Engine (AWARE), to detect abnormal test result alerts lacking documented follow-up, and to present context-specific reminders to providers. Development and testing took place within the VA's EHR and focused on four cancer-related abnormal test results. Design concepts emphasized mitigating the effects of high workload and alert fatigue while being minimally intrusive. We conducted a multifaceted formative evaluation of the software, addressing fit within the larger socio-technical system. Evaluations included usability testing with the prototype and interview questions about organizational and workflow factors. Participants included 23 physicians, 9 clinical information technology specialists, and 8 quality/safety managers. Evaluation results indicated that our software prototype fit within the technical environment and clinical workflow, and physicians were able to use it successfully. Quality/safety managers reported that the tool would be useful in future quality assurance activities to detect patients who lack documented follow-up. Additionally, we successfully installed the software on the local facility's "test" EHR system, thus demonstrating technical compatibility. To address the factors involved in missed test results, we developed a software prototype to account for technical, usability, organizational, and workflow needs. Our evaluation has shown the feasibility of the prototype as a means of facilitating better follow-up for cancer-related abnormal test results.
Talkoot Portals: Discover, Tag, Share, and Reuse Collaborative Science Workflows

NASA Astrophysics Data System (ADS)

Wilson, B. D.; Ramachandran, R.; Lynnes, C.

2009-05-01

A small but growing number of scientists are beginning to harness Web 2.0 technologies, such as wikis, blogs, and social tagging, as a transformative way of doing science. These technologies provide researchers easy mechanisms to critique, suggest and share ideas, data and algorithms. At the same time, large suites of algorithms for science analysis are being made available as remotely-invokable Web Services, which can be chained together to create analysis workflows. This provides the research community an unprecedented opportunity to collaborate by sharing their workflows with one another, reproducing and analyzing research results, and leveraging colleagues' expertise to expedite the process of scientific discovery. However, wikis and similar technologies are limited to text, static images and hyperlinks, providing little support for collaborative data analysis. A team of information technology and Earth science researchers from multiple institutions have come together to improve community collaboration in science analysis by developing a customizable "software appliance" to build collaborative portals for Earth Science services and analysis workflows. The critical requirement is that researchers (not just information technologists) be able to build collaborative sites around service workflows within a few hours. We envision online communities coming together, much like Finnish "talkoot" (a barn raising), to build a shared research space. Talkoot extends a freely available, open source content management framework with a series of modules specific to Earth Science for registering, creating, managing, discovering, tagging and sharing Earth Science web services and workflows for science data processing, analysis and visualization. Users will be able to author a "science story" in shareable web notebooks, including plots or animations, backed up by an executable workflow that directly reproduces the science analysis. New services and workflows of interest will be discoverable using tag search, and advertised using "service casts" and "interest casts" (Atom feeds). Multiple science workflow systems will be plugged into the system, with initial support for UAH's Mining Workflow Composer and the open-source Active BPEL engine, and JPL's SciFlo engine and the VizFlow visual programming interface. With the ability to share and execute analysis workflows, Talkoot portals can be used to do collaborative science in addition to communicate ideas and results. It will be useful for different science domains, mission teams, research projects and organizations. Thus, it will help to solve the "sociological" problem of bringing together disparate groups of researchers, and the technical problem of advertising, discovering, developing, documenting, and maintaining inter-agency science workflows. The presentation will discuss the goals of and barriers to Science 2.0, the social web technologies employed in the Talkoot software appliance (e.g. CMS, social tagging, personal presence, advertising by feeds, etc.), illustrate the resulting collaborative capabilities, and show early prototypes of the web interfaces (e.g. embedded workflows).
Technology-driven dietary assessment: a software developer’s perspective

PubMed Central

Buday, Richard; Tapia, Ramsey; Maze, Gary R.

2015-01-01

Dietary researchers need new software to improve nutrition data collection and analysis, but creating information technology is difficult. Software development projects may be unsuccessful due to inadequate understanding of needs, management problems, technology barriers or legal hurdles. Cost overruns and schedule delays are common. Barriers facing scientific researchers developing software include workflow, cost, schedule, and team issues. Different methods of software development and the role that intellectual property rights play are discussed. A dietary researcher must carefully consider multiple issues to maximize the likelihood of success when creating new software. PMID:22591224
Towards a New Paradigm of Software Development: an Ambassador Driven Process in Distributed Software Companies

NASA Astrophysics Data System (ADS)

Kumlander, Deniss

The globalization of companies operations and competitor between software vendors demand improving quality of delivered software and decreasing the overall cost. The same in fact introduce a lot of problem into software development process as produce distributed organization breaking the co-location rule of modern software development methodologies. Here we propose a reformulation of the ambassador position increasing its productivity in order to bridge communication and workflow gap by managing the entire communication process rather than concentrating purely on the communication result.
Formal verification of software-based medical devices considering medical guidelines.

PubMed

Daw, Zamira; Cleaveland, Rance; Vetter, Marcus

2014-01-01

Software-based devices have increasingly become an important part of several clinical scenarios. Due to their critical impact on human life, medical devices have very strict safety requirements. It is therefore necessary to apply verification methods to ensure that the safety requirements are met. Verification of software-based devices is commonly limited to the verification of their internal elements without considering the interaction that these elements have with other devices as well as the application environment in which they are used. Medical guidelines define clinical procedures, which contain the necessary information to completely verify medical devices. The objective of this work was to incorporate medical guidelines into the verification process in order to increase the reliability of the software-based medical devices. Medical devices are developed using the model-driven method deterministic models for signal processing of embedded systems (DMOSES). This method uses unified modeling language (UML) models as a basis for the development of medical devices. The UML activity diagram is used to describe medical guidelines as workflows. The functionality of the medical devices is abstracted as a set of actions that is modeled within these workflows. In this paper, the UML models are verified using the UPPAAL model-checker. For this purpose, a formalization approach for the UML models using timed automaton (TA) is presented. A set of requirements is verified by the proposed approach for the navigation-guided biopsy. This shows the capability for identifying errors or optimization points both in the workflow and in the system design of the navigation device. In addition to the above, an open source eclipse plug-in was developed for the automated transformation of UML models into TA models that are automatically verified using UPPAAL. The proposed method enables developers to model medical devices and their clinical environment using clinical workflows as one UML diagram. Additionally, the system design can be formally verified automatically.
Reusable, extensible, and modifiable R scripts and Kepler workflows for comprehensive single set ChIP-seq analysis.

PubMed

Cormier, Nathan; Kolisnik, Tyler; Bieda, Mark

2016-07-05

There has been an enormous expansion of use of chromatin immunoprecipitation followed by sequencing (ChIP-seq) technologies. Analysis of large-scale ChIP-seq datasets involves a complex series of steps and production of several specialized graphical outputs. A number of systems have emphasized custom development of ChIP-seq pipelines. These systems are primarily based on custom programming of a single, complex pipeline or supply libraries of modules and do not produce the full range of outputs commonly produced for ChIP-seq datasets. It is desirable to have more comprehensive pipelines, in particular ones addressing common metadata tasks, such as pathway analysis, and pipelines producing standard complex graphical outputs. It is advantageous if these are highly modular systems, available as both turnkey pipelines and individual modules, that are easily comprehensible, modifiable and extensible to allow rapid alteration in response to new analysis developments in this growing area. Furthermore, it is advantageous if these pipelines allow data provenance tracking. We present a set of 20 ChIP-seq analysis software modules implemented in the Kepler workflow system; most (18/20) were also implemented as standalone, fully functional R scripts. The set consists of four full turnkey pipelines and 16 component modules. The turnkey pipelines in Kepler allow data provenance tracking. Implementation emphasized use of common R packages and widely-used external tools (e.g., MACS for peak finding), along with custom programming. This software presents comprehensive solutions and easily repurposed code blocks for ChIP-seq analysis and pipeline creation. Tasks include mapping raw reads, peakfinding via MACS, summary statistics, peak location statistics, summary plots centered on the transcription start site (TSS), gene ontology, pathway analysis, and de novo motif finding, among others. These pipelines range from those performing a single task to those performing full analyses of ChIP-seq data. The pipelines are supplied as both Kepler workflows, which allow data provenance tracking, and, in the majority of cases, as standalone R scripts. These pipelines are designed for ease of modification and repurposing.
Standardizing clinical trials workflow representation in UML for international site comparison.

PubMed

de Carvalho, Elias Cesar Araujo; Jayanti, Madhav Kishore; Batilana, Adelia Portero; Kozan, Andreia M O; Rodrigues, Maria J; Shah, Jatin; Loures, Marco R; Patil, Sunita; Payne, Philip; Pietrobon, Ricardo

2010-11-09

With the globalization of clinical trials, a growing emphasis has been placed on the standardization of the workflow in order to ensure the reproducibility and reliability of the overall trial. Despite the importance of workflow evaluation, to our knowledge no previous studies have attempted to adapt existing modeling languages to standardize the representation of clinical trials. Unified Modeling Language (UML) is a computational language that can be used to model operational workflow, and a UML profile can be developed to standardize UML models within a given domain. This paper's objective is to develop a UML profile to extend the UML Activity Diagram schema into the clinical trials domain, defining a standard representation for clinical trial workflow diagrams in UML. Two Brazilian clinical trial sites in rheumatology and oncology were examined to model their workflow and collect time-motion data. UML modeling was conducted in Eclipse, and a UML profile was developed to incorporate information used in discrete event simulation software. Ethnographic observation revealed bottlenecks in workflow: these included tasks requiring full commitment of CRCs, transferring notes from paper to computers, deviations from standard operating procedures, and conflicts between different IT systems. Time-motion analysis revealed that nurses' activities took up the most time in the workflow and contained a high frequency of shorter duration activities. Administrative assistants performed more activities near the beginning and end of the workflow. Overall, clinical trial tasks had a greater frequency than clinic routines or other general activities. This paper describes a method for modeling clinical trial workflow in UML and standardizing these workflow diagrams through a UML profile. In the increasingly global environment of clinical trials, the standardization of workflow modeling is a necessary precursor to conducting a comparative analysis of international clinical trials workflows.
Standardizing Clinical Trials Workflow Representation in UML for International Site Comparison

PubMed Central

de Carvalho, Elias Cesar Araujo; Jayanti, Madhav Kishore; Batilana, Adelia Portero; Kozan, Andreia M. O.; Rodrigues, Maria J.; Shah, Jatin; Loures, Marco R.; Patil, Sunita; Payne, Philip; Pietrobon, Ricardo

2010-01-01

Background With the globalization of clinical trials, a growing emphasis has been placed on the standardization of the workflow in order to ensure the reproducibility and reliability of the overall trial. Despite the importance of workflow evaluation, to our knowledge no previous studies have attempted to adapt existing modeling languages to standardize the representation of clinical trials. Unified Modeling Language (UML) is a computational language that can be used to model operational workflow, and a UML profile can be developed to standardize UML models within a given domain. This paper's objective is to develop a UML profile to extend the UML Activity Diagram schema into the clinical trials domain, defining a standard representation for clinical trial workflow diagrams in UML. Methods Two Brazilian clinical trial sites in rheumatology and oncology were examined to model their workflow and collect time-motion data. UML modeling was conducted in Eclipse, and a UML profile was developed to incorporate information used in discrete event simulation software. Results Ethnographic observation revealed bottlenecks in workflow: these included tasks requiring full commitment of CRCs, transferring notes from paper to computers, deviations from standard operating procedures, and conflicts between different IT systems. Time-motion analysis revealed that nurses' activities took up the most time in the workflow and contained a high frequency of shorter duration activities. Administrative assistants performed more activities near the beginning and end of the workflow. Overall, clinical trial tasks had a greater frequency than clinic routines or other general activities. Conclusions This paper describes a method for modeling clinical trial workflow in UML and standardizing these workflow diagrams through a UML profile. In the increasingly global environment of clinical trials, the standardization of workflow modeling is a necessary precursor to conducting a comparative analysis of international clinical trials workflows. PMID:21085484

Development of a novel imaging informatics-based system with an intelligent workflow engine (IWEIS) to support imaging-based clinical trials

PubMed Central

Wang, Ximing; Liu, Brent J; Martinez, Clarisa; Zhang, Xuejun; Winstein, Carolee J

2015-01-01

Imaging based clinical trials can benefit from a solution to efficiently collect, analyze, and distribute multimedia data at various stages within the workflow. Currently, the data management needs of these trials are typically addressed with custom-built systems. However, software development of the custom- built systems for versatile workflows can be resource-consuming. To address these challenges, we present a system with a workflow engine for imaging based clinical trials. The system enables a project coordinator to build a data collection and management system specifically related to study protocol workflow without programming. Web Access to DICOM Objects (WADO) module with novel features is integrated to further facilitate imaging related study. The system was initially evaluated by an imaging based rehabilitation clinical trial. The evaluation shows that the cost of the development of system can be much reduced compared to the custom-built system. By providing a solution to customize a system and automate the workflow, the system will save on development time and reduce errors especially for imaging clinical trials. PMID:25870169
GUEST EDITOR'S INTRODUCTION: Guest Editor's introduction

NASA Astrophysics Data System (ADS)

Chrysanthis, Panos K.

1996-12-01

Computer Science Department, University of Pittsburgh, Pittsburgh, PA 15260, USA This special issue focuses on current efforts to represent and support workflows that integrate information systems and human resources within a business or manufacturing enterprise. Workflows may also be viewed as an emerging computational paradigm for effective structuring of cooperative applications involving human users and access to diverse data types not necessarily maintained by traditional database management systems. A workflow is an automated organizational process (also called business process) which consists of a set of activities or tasks that need to be executed in a particular controlled order over a combination of heterogeneous database systems and legacy systems. Within workflows, tasks are performed cooperatively by either human or computational agents in accordance with their roles in the organizational hierarchy. The challenge in facilitating the implementation of workflows lies in developing efficient workflow management systems. A workflow management system (also called workflow server, workflow engine or workflow enactment system) provides the necessary interfaces for coordination and communication among human and computational agents to execute the tasks involved in a workflow and controls the execution orderings of tasks as well as the flow of data that these tasks manipulate. That is, the workflow management system is responsible for correctly and reliably supporting the specification, execution, and monitoring of workflows. The six papers selected (out of the twenty-seven submitted for this special issue of Distributed Systems Engineering) address different aspects of these three functional components of a workflow management system. In the first paper, `Correctness issues in workflow management', Kamath and Ramamritham discuss the important issue of correctness in workflow management that constitutes a prerequisite for the use of workflows in the automation of the critical organizational/business processes. In particular, this paper examines the issues of execution atomicity and failure atomicity, differentiating between correctness requirements of system failures and logical failures, and surveys techniques that can be used to ensure data consistency in workflow management systems. While the first paper is concerned with correctness assuming transactional workflows in which selective transactional properties are associated with individual tasks or the entire workflow, the second paper, `Scheduling workflows by enforcing intertask dependencies' by Attie et al, assumes that the tasks can be either transactions or other activities involving legacy systems. This second paper describes the modelling and specification of conditions involving events and dependencies among tasks within a workflow using temporal logic and finite state automata. It also presents a scheduling algorithm that enforces all stated dependencies by executing at any given time only those events that are allowed by all the dependency automata and in an order as specified by the dependencies. In any system with decentralized control, there is a need to effectively cope with the tension that exists between autonomy and consistency requirements. In `A three-level atomicity model for decentralized workflow management systems', Ben-Shaul and Heineman focus on the specific requirement of enforcing failure atomicity in decentralized, autonomous and interacting workflow management systems. Their paper describes a model in which each workflow manager must be able to specify the sequence of tasks that comprise an atomic unit for the purposes of correctness, and the degrees of local and global atomicity for the purpose of cooperation with other workflow managers. The paper also discusses a realization of this model in which treaties and summits provide an agreement mechanism, while underlying transaction managers are responsible for maintaining failure atomicity. The fourth and fifth papers are experience papers describing a workflow management system and a large scale workflow application, respectively. Schill and Mittasch, in `Workflow management systems on top of OSF DCE and OMG CORBA', describe a decentralized workflow management system and discuss its implementation using two standardized middleware platforms, namely, OSF DCE and OMG CORBA. The system supports a new approach to workflow management, introducing several new concepts such as data type management for integrating various types of data and quality of service for various services provided by servers. A problem common to both database applications and workflows is the handling of missing and incomplete information. This is particularly pervasive in an `electronic market' with a huge number of retail outlets producing and exchanging volumes of data, the application discussed in `Information flow in the DAMA project beyond database managers: information flow managers'. Motivated by the need for a method that allows a task to proceed in a timely manner if not all data produced by other tasks are available by its deadline, Russell et al propose an architectural framework and a language that can be used to detect, approximate and, later on, to adjust missing data if necessary. The final paper, `The evolution towards flexible workflow systems' by Nutt, is complementary to the other papers and is a survey of issues and of work related to both workflow and computer supported collaborative work (CSCW) areas. In particular, the paper provides a model and a categorization of the dimensions which workflow management and CSCW systems share. Besides summarizing the recent advancements towards efficient workflow management, the papers in this special issue suggest areas open to investigation and it is our hope that they will also provide the stimulus for further research and development in the area of workflow management systems.
Geometric Calibration and Validation of Ultracam Aerial Sensors

NASA Astrophysics Data System (ADS)

Gruber, Michael; Schachinger, Bernhard; Muick, Marc; Neuner, Christian; Tschemmernegg, Helfried

2016-03-01

We present details of the calibration and validation procedure of UltraCam Aerial Camera systems. Results from the laboratory calibration and from validation flights are presented for both, the large format nadir cameras and the oblique cameras as well. Thus in this contribution we show results from the UltraCam Eagle and the UltraCam Falcon, both nadir mapping cameras, and the UltraCam Osprey, our oblique camera system. This sensor offers a mapping grade nadir component together with the four oblique camera heads. The geometric processing after the flight mission is being covered by the UltraMap software product. Thus we present details about the workflow as well. The first part consists of the initial post-processing which combines image information as well as camera parameters derived from the laboratory calibration. The second part, the traditional automated aerial triangulation (AAT) is the step from single images to blocks and enables an additional optimization process. We also present some special features of our software, which are designed to better support the operator to analyze large blocks of aerial images and to judge the quality of the photogrammetric set-up.
ibex: An open infrastructure software platform to facilitate collaborative work in radiomics

PubMed Central

Zhang, Lifei; Fried, David V.; Fave, Xenia J.; Hunter, Luke A.; Court, Laurence E.

2015-01-01

Purpose: Radiomics, which is the high-throughput extraction and analysis of quantitative image features, has been shown to have considerable potential to quantify the tumor phenotype. However, at present, a lack of software infrastructure has impeded the development of radiomics and its applications. Therefore, the authors developed the imaging biomarker explorer (ibex), an open infrastructure software platform that flexibly supports common radiomics workflow tasks such as multimodality image data import and review, development of feature extraction algorithms, model validation, and consistent data sharing among multiple institutions. Methods: The ibex software package was developed using the matlab and c/c++ programming languages. The software architecture deploys the modern model-view-controller, unit testing, and function handle programming concepts to isolate each quantitative imaging analysis task, to validate if their relevant data and algorithms are fit for use, and to plug in new modules. On one hand, ibex is self-contained and ready to use: it has implemented common data importers, common image filters, and common feature extraction algorithms. On the other hand, ibex provides an integrated development environment on top of matlab and c/c++, so users are not limited to its built-in functions. In the ibex developer studio, users can plug in, debug, and test new algorithms, extending ibex’s functionality. ibex also supports quality assurance for data and feature algorithms: image data, regions of interest, and feature algorithm-related data can be reviewed, validated, and/or modified. More importantly, two key elements in collaborative workflows, the consistency of data sharing and the reproducibility of calculation result, are embedded in the ibex workflow: image data, feature algorithms, and model validation including newly developed ones from different users can be easily and consistently shared so that results can be more easily reproduced between institutions. Results: Researchers with a variety of technical skill levels, including radiation oncologists, physicists, and computer scientists, have found the ibex software to be intuitive, powerful, and easy to use. ibex can be run at any computer with the windows operating system and 1GB RAM. The authors fully validated the implementation of all importers, preprocessing algorithms, and feature extraction algorithms. Windows version 1.0 beta of stand-alone ibex and ibex’s source code can be downloaded. Conclusions: The authors successfully implemented ibex, an open infrastructure software platform that streamlines common radiomics workflow tasks. Its transparency, flexibility, and portability can greatly accelerate the pace of radiomics research and pave the way toward successful clinical translation. PMID:25735289
IBEX: an open infrastructure software platform to facilitate collaborative work in radiomics.

PubMed

Zhang, Lifei; Fried, David V; Fave, Xenia J; Hunter, Luke A; Yang, Jinzhong; Court, Laurence E

2015-03-01

Radiomics, which is the high-throughput extraction and analysis of quantitative image features, has been shown to have considerable potential to quantify the tumor phenotype. However, at present, a lack of software infrastructure has impeded the development of radiomics and its applications. Therefore, the authors developed the imaging biomarker explorer (IBEX), an open infrastructure software platform that flexibly supports common radiomics workflow tasks such as multimodality image data import and review, development of feature extraction algorithms, model validation, and consistent data sharing among multiple institutions. The IBEX software package was developed using the MATLAB and c/c++ programming languages. The software architecture deploys the modern model-view-controller, unit testing, and function handle programming concepts to isolate each quantitative imaging analysis task, to validate if their relevant data and algorithms are fit for use, and to plug in new modules. On one hand, IBEX is self-contained and ready to use: it has implemented common data importers, common image filters, and common feature extraction algorithms. On the other hand, IBEX provides an integrated development environment on top of MATLAB and c/c++, so users are not limited to its built-in functions. In the IBEX developer studio, users can plug in, debug, and test new algorithms, extending IBEX's functionality. IBEX also supports quality assurance for data and feature algorithms: image data, regions of interest, and feature algorithm-related data can be reviewed, validated, and/or modified. More importantly, two key elements in collaborative workflows, the consistency of data sharing and the reproducibility of calculation result, are embedded in the IBEX workflow: image data, feature algorithms, and model validation including newly developed ones from different users can be easily and consistently shared so that results can be more easily reproduced between institutions. Researchers with a variety of technical skill levels, including radiation oncologists, physicists, and computer scientists, have found the IBEX software to be intuitive, powerful, and easy to use. IBEX can be run at any computer with the windows operating system and 1GB RAM. The authors fully validated the implementation of all importers, preprocessing algorithms, and feature extraction algorithms. Windows version 1.0 beta of stand-alone IBEX and IBEX's source code can be downloaded. The authors successfully implemented IBEX, an open infrastructure software platform that streamlines common radiomics workflow tasks. Its transparency, flexibility, and portability can greatly accelerate the pace of radiomics research and pave the way toward successful clinical translation.
A standard-enabled workflow for synthetic biology.

PubMed

Myers, Chris J; Beal, Jacob; Gorochowski, Thomas E; Kuwahara, Hiroyuki; Madsen, Curtis; McLaughlin, James Alastair; Mısırlı, Göksel; Nguyen, Tramy; Oberortner, Ernst; Samineni, Meher; Wipat, Anil; Zhang, Michael; Zundel, Zach

2017-06-15

A synthetic biology workflow is composed of data repositories that provide information about genetic parts, sequence-level design tools to compose these parts into circuits, visualization tools to depict these designs, genetic design tools to select parts to create systems, and modeling and simulation tools to evaluate alternative design choices. Data standards enable the ready exchange of information within such a workflow, allowing repositories and tools to be connected from a diversity of sources. The present paper describes one such workflow that utilizes, among others, the Synthetic Biology Open Language (SBOL) to describe genetic designs, the Systems Biology Markup Language to model these designs, and SBOL Visual to visualize these designs. We describe how a standard-enabled workflow can be used to produce types of design information, including multiple repositories and software tools exchanging information using a variety of data standards. Recently, the ACS Synthetic Biology journal has recommended the use of SBOL in their publications. © 2017 The Author(s); published by Portland Press Limited on behalf of the Biochemical Society.
Clinical Decision Support Systems (CDSS) for preventive management of COPD patients.

PubMed

Velickovski, Filip; Ceccaroni, Luigi; Roca, Josep; Burgos, Felip; Galdiz, Juan B; Marina, Nuria; Lluch-Ariet, Magí

2014-11-28

The use of information and communication technologies to manage chronic diseases allows the application of integrated care pathways, and the optimization and standardization of care processes. Decision support tools can assist in the adherence to best-practice medicine in critical decision points during the execution of a care pathway. The objectives are to design, develop, and assess a clinical decision support system (CDSS) offering a suite of services for the early detection and assessment of chronic obstructive pulmonary disease (COPD), which can be easily integrated into a healthcare providers' work-flow. The software architecture model for the CDSS, interoperable clinical-knowledge representation, and inference engine were designed and implemented to form a base CDSS framework. The CDSS functionalities were iteratively developed through requirement-adjustment/development/validation cycles using enterprise-grade software-engineering methodologies and technologies. Within each cycle, clinical-knowledge acquisition was performed by a health-informatics engineer and a clinical-expert team. A suite of decision-support web services for (i) COPD early detection and diagnosis, (ii) spirometry quality-control support, (iii) patient stratification, was deployed in a secured environment on-line. The CDSS diagnostic performance was assessed using a validation set of 323 cases with 90% specificity, and 96% sensitivity. Web services were integrated in existing health information system platforms. Specialized decision support can be offered as a complementary service to existing policies of integrated care for chronic-disease management. The CDSS was able to issue recommendations that have a high degree of accuracy to support COPD case-finding. Integration into healthcare providers' work-flow can be achieved seamlessly through the use of a modular design and service-oriented architecture that connect to existing health information systems.
Clinical Decision Support Systems (CDSS) for preventive management of COPD patients

PubMed Central

2014-01-01

Background The use of information and communication technologies to manage chronic diseases allows the application of integrated care pathways, and the optimization and standardization of care processes. Decision support tools can assist in the adherence to best-practice medicine in critical decision points during the execution of a care pathway. Objectives The objectives are to design, develop, and assess a clinical decision support system (CDSS) offering a suite of services for the early detection and assessment of chronic obstructive pulmonary disease (COPD), which can be easily integrated into a healthcare providers' work-flow. Methods The software architecture model for the CDSS, interoperable clinical-knowledge representation, and inference engine were designed and implemented to form a base CDSS framework. The CDSS functionalities were iteratively developed through requirement-adjustment/development/validation cycles using enterprise-grade software-engineering methodologies and technologies. Within each cycle, clinical-knowledge acquisition was performed by a health-informatics engineer and a clinical-expert team. Results A suite of decision-support web services for (i) COPD early detection and diagnosis, (ii) spirometry quality-control support, (iii) patient stratification, was deployed in a secured environment on-line. The CDSS diagnostic performance was assessed using a validation set of 323 cases with 90% specificity, and 96% sensitivity. Web services were integrated in existing health information system platforms. Conclusions Specialized decision support can be offered as a complementary service to existing policies of integrated care for chronic-disease management. The CDSS was able to issue recommendations that have a high degree of accuracy to support COPD case-finding. Integration into healthcare providers' work-flow can be achieved seamlessly through the use of a modular design and service-oriented architecture that connect to existing health information systems. PMID:25471545
VDJServer: A Cloud-Based Analysis Portal and Data Commons for Immune Repertoire Sequences and Rearrangements.

PubMed

Christley, Scott; Scarborough, Walter; Salinas, Eddie; Rounds, William H; Toby, Inimary T; Fonner, John M; Levin, Mikhail K; Kim, Min; Mock, Stephen A; Jordan, Christopher; Ostmeyer, Jared; Buntzman, Adam; Rubelt, Florian; Davila, Marco L; Monson, Nancy L; Scheuermann, Richard H; Cowell, Lindsay G

2018-01-01

Recent technological advances in immune repertoire sequencing have created tremendous potential for advancing our understanding of adaptive immune response dynamics in various states of health and disease. Immune repertoire sequencing produces large, highly complex data sets, however, which require specialized methods and software tools for their effective analysis and interpretation. VDJServer is a cloud-based analysis portal for immune repertoire sequence data that provide access to a suite of tools for a complete analysis workflow, including modules for preprocessing and quality control of sequence reads, V(D)J gene segment assignment, repertoire characterization, and repertoire comparison. VDJServer also provides sophisticated visualizations for exploratory analysis. It is accessible through a standard web browser via a graphical user interface designed for use by immunologists, clinicians, and bioinformatics researchers. VDJServer provides a data commons for public sharing of repertoire sequencing data, as well as private sharing of data between users. We describe the main functionality and architecture of VDJServer and demonstrate its capabilities with use cases from cancer immunology and autoimmunity. VDJServer provides a complete analysis suite for human and mouse T-cell and B-cell receptor repertoire sequencing data. The combination of its user-friendly interface and high-performance computing allows large immune repertoire sequencing projects to be analyzed with no programming or software installation required. VDJServer is a web-accessible cloud platform that provides access through a graphical user interface to a data management infrastructure, a collection of analysis tools covering all steps in an analysis, and an infrastructure for sharing data along with workflows, results, and computational provenance. VDJServer is a free, publicly available, and open-source licensed resource.
Planning bioinformatics workflows using an expert system.

PubMed

Chen, Xiaoling; Chang, Jeffrey T

2017-04-15

Bioinformatic analyses are becoming formidably more complex due to the increasing number of steps required to process the data, as well as the proliferation of methods that can be used in each step. To alleviate this difficulty, pipelines are commonly employed. However, pipelines are typically implemented to automate a specific analysis, and thus are difficult to use for exploratory analyses requiring systematic changes to the software or parameters used. To automate the development of pipelines, we have investigated expert systems. We created the Bioinformatics ExperT SYstem (BETSY) that includes a knowledge base where the capabilities of bioinformatics software is explicitly and formally encoded. BETSY is a backwards-chaining rule-based expert system comprised of a data model that can capture the richness of biological data, and an inference engine that reasons on the knowledge base to produce workflows. Currently, the knowledge base is populated with rules to analyze microarray and next generation sequencing data. We evaluated BETSY and found that it could generate workflows that reproduce and go beyond previously published bioinformatics results. Finally, a meta-investigation of the workflows generated from the knowledge base produced a quantitative measure of the technical burden imposed by each step of bioinformatics analyses, revealing the large number of steps devoted to the pre-processing of data. In sum, an expert system approach can facilitate exploratory bioinformatic analysis by automating the development of workflows, a task that requires significant domain expertise. https://github.com/jefftc/changlab. jeffrey.t.chang@uth.tmc.edu. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
Planning bioinformatics workflows using an expert system

PubMed Central

Chen, Xiaoling; Chang, Jeffrey T.

2017-01-01

Abstract Motivation: Bioinformatic analyses are becoming formidably more complex due to the increasing number of steps required to process the data, as well as the proliferation of methods that can be used in each step. To alleviate this difficulty, pipelines are commonly employed. However, pipelines are typically implemented to automate a specific analysis, and thus are difficult to use for exploratory analyses requiring systematic changes to the software or parameters used. Results: To automate the development of pipelines, we have investigated expert systems. We created the Bioinformatics ExperT SYstem (BETSY) that includes a knowledge base where the capabilities of bioinformatics software is explicitly and formally encoded. BETSY is a backwards-chaining rule-based expert system comprised of a data model that can capture the richness of biological data, and an inference engine that reasons on the knowledge base to produce workflows. Currently, the knowledge base is populated with rules to analyze microarray and next generation sequencing data. We evaluated BETSY and found that it could generate workflows that reproduce and go beyond previously published bioinformatics results. Finally, a meta-investigation of the workflows generated from the knowledge base produced a quantitative measure of the technical burden imposed by each step of bioinformatics analyses, revealing the large number of steps devoted to the pre-processing of data. In sum, an expert system approach can facilitate exploratory bioinformatic analysis by automating the development of workflows, a task that requires significant domain expertise. Availability and Implementation: https://github.com/jefftc/changlab Contact: jeffrey.t.chang@uth.tmc.edu PMID:28052928
Reconfigurable Software for Mission Operations

NASA Technical Reports Server (NTRS)

Trimble, Jay

2014-01-01

We developed software that provides flexibility to mission organizations through modularity and composability. Modularity enables removal and addition of functionality through the installation of plug-ins. Composability enables users to assemble software from pre-built reusable objects, thus reducing or eliminating the walls associated with traditional application architectures and enabling unique combinations of functionality. We have used composable objects to reduce display build time, create workflows, and build scenarios to test concepts for lunar roving operations. The software is open source, and may be downloaded from https:github.comnasamct.
A novel control software that improves the experimental workflow of scanning photostimulation experiments.

PubMed

Bendels, Michael H K; Beed, Prateep; Leibold, Christian; Schmitz, Dietmar; Johenning, Friedrich W

2008-10-30

Optical uncaging of caged compounds is a well-established method to study the functional anatomy of a brain region on the circuit level. We present an alternative approach to existing experimental setups. Using a low-magnification objective we acquire images for planning the spatial patterns of stimulation. Then high-magnification objectives are used during laser stimulation providing a laser spot between 2 microm and 20 microm size. The core of this system is a video-based control software that monitors and controls the connected devices, allows for planning of the experiment, coordinates the stimulation process and manages automatic data storage. This combines a high-resolution analysis of neuronal circuits with flexible and efficient online planning and execution of a grid of spatial stimulation patterns on a larger scale. The software offers special optical features that enable the system to achieve a maximum degree of spatial reliability. The hardware is mainly built upon standard laboratory devices and thus ideally suited to cost-effectively complement existing electrophysiological setups with a minimal amount of additional equipment. Finally, we demonstrate the performance of the system by mapping the excitatory and inhibitory connections of entorhinal cortex layer II stellate neurons and present an approach for the analysis of photo-induced synaptic responses in high spontaneous activity.
Prevention of gross setup errors in radiotherapy with an efficient automatic patient safety system.

PubMed

Yan, Guanghua; Mittauer, Kathryn; Huang, Yin; Lu, Bo; Liu, Chihray; Li, Jonathan G

2013-11-04

Treatment of the wrong body part due to incorrect setup is among the leading types of errors in radiotherapy. The purpose of this paper is to report an efficient automatic patient safety system (PSS) to prevent gross setup errors. The system consists of a pair of charge-coupled device (CCD) cameras mounted in treatment room, a single infrared reflective marker (IRRM) affixed on patient or immobilization device, and a set of in-house developed software. Patients are CT scanned with a CT BB placed over their surface close to intended treatment site. Coordinates of the CT BB relative to treatment isocenter are used as reference for tracking. The CT BB is replaced with an IRRM before treatment starts. PSS evaluates setup accuracy by comparing real-time IRRM position with reference position. To automate system workflow, PSS synchronizes with the record-and-verify (R&V) system in real time and automatically loads in reference data for patient under treatment. Special IRRMs, which can permanently stick to patient face mask or body mold throughout the course of treatment, were designed to minimize therapist's workload. Accuracy of the system was examined on an anthropomorphic phantom with a designed end-to-end test. Its performance was also evaluated on head and neck as well as abdominalpelvic patients using cone-beam CT (CBCT) as standard. The PSS system achieved a seamless clinic workflow by synchronizing with the R&V system. By permanently mounting specially designed IRRMs on patient immobilization devices, therapist intervention is eliminated or minimized. Overall results showed that the PSS system has sufficient accuracy to catch gross setup errors greater than 1 cm in real time. An efficient automatic PSS with sufficient accuracy has been developed to prevent gross setup errors in radiotherapy. The system can be applied to all treatment sites for independent positioning verification. It can be an ideal complement to complex image-guidance systems due to its advantages of continuous tracking ability, no radiation dose, and fully automated clinic workflow.
Pathology economic model tool: a novel approach to workflow and budget cost analysis in an anatomic pathology laboratory.

PubMed

Muirhead, David; Aoun, Patricia; Powell, Michael; Juncker, Flemming; Mollerup, Jens

2010-08-01

The need for higher efficiency, maximum quality, and faster turnaround time is a continuous focus for anatomic pathology laboratories and drives changes in work scheduling, instrumentation, and management control systems. To determine the costs of generating routine, special, and immunohistochemical microscopic slides in a large, academic anatomic pathology laboratory using a top-down approach. The Pathology Economic Model Tool was used to analyze workflow processes at The Nebraska Medical Center's anatomic pathology laboratory. Data from the analysis were used to generate complete cost estimates, which included not only materials, consumables, and instrumentation but also specific labor and overhead components for each of the laboratory's subareas. The cost data generated by the Pathology Economic Model Tool were compared with the cost estimates generated using relative value units. Despite the use of automated systems for different processes, the workflow in the laboratory was found to be relatively labor intensive. The effect of labor and overhead on per-slide costs was significantly underestimated by traditional relative-value unit calculations when compared with the Pathology Economic Model Tool. Specific workflow defects with significant contributions to the cost per slide were identified. The cost of providing routine, special, and immunohistochemical slides may be significantly underestimated by traditional methods that rely on relative value units. Furthermore, a comprehensive analysis may identify specific workflow processes requiring improvement.
wft4galaxy: a workflow testing tool for galaxy.

PubMed

Piras, Marco Enrico; Pireddu, Luca; Zanetti, Gianluigi

2017-12-01

Workflow managers for scientific analysis provide a high-level programming platform facilitating standardization, automation, collaboration and access to sophisticated computing resources. The Galaxy workflow manager provides a prime example of this type of platform. As compositions of simpler tools, workflows effectively comprise specialized computer programs implementing often very complex analysis procedures. To date, no simple way to automatically test Galaxy workflows and ensure their correctness has appeared in the literature. With wft4galaxy we offer a tool to bring automated testing to Galaxy workflows, making it feasible to bring continuous integration to their development and ensuring that defects are detected promptly. wft4galaxy can be easily installed as a regular Python program or launched directly as a Docker container-the latter reducing installation effort to a minimum. Available at https://github.com/phnmnl/wft4galaxy under the Academic Free License v3.0. marcoenrico.piras@crs4.it. © The Author 2017. Published by Oxford University Press.
A recommended workflow methodology in the creation of an educational and training application incorporating a digital reconstruction of the cerebral ventricular system and cerebrospinal fluid circulation to aid anatomical understanding.

PubMed

Manson, Amy; Poyade, Matthieu; Rea, Paul

2015-10-19

The use of computer-aided learning in education can be advantageous, especially when interactive three-dimensional (3D) models are used to aid learning of complex 3D structures. The anatomy of the ventricular system of the brain is difficult to fully understand as it is seldom seen in 3D, as is the flow of cerebrospinal fluid (CSF). This article outlines a workflow for the creation of an interactive training tool for the cerebral ventricular system, an educationally challenging area of anatomy. This outline is based on the use of widely available computer software packages. Using MR images of the cerebral ventricular system and several widely available commercial and free software packages, the techniques of 3D modelling, texturing, sculpting, image editing and animations were combined to create a workflow in the creation of an interactive educational and training tool. This was focussed on cerebral ventricular system anatomy, and the flow of cerebrospinal fluid. We have successfully created a robust methodology by using key software packages in the creation of an interactive education and training tool. This has resulted in an application being developed which details the anatomy of the ventricular system, and flow of cerebrospinal fluid using an anatomically accurate 3D model. In addition to this, our established workflow pattern presented here also shows how tutorials, animations and self-assessment tools can also be embedded into the training application. Through our creation of an established workflow in the generation of educational and training material for demonstrating cerebral ventricular anatomy and flow of cerebrospinal fluid, it has enormous potential to be adopted into student training in this field. With the digital age advancing rapidly, this has the potential to be used as an innovative tool alongside other methodologies for the training of future healthcare practitioners and scientists. This workflow could be used in the creation of other tools, which could be developed for use not only on desktop and laptop computers but also smartphones, tablets and fully immersive stereoscopic environments. It also could form the basis on which to build surgical simulations enhanced with haptic interaction.
Seamless online science workflow development and collaboration using IDL and the ENVI Services Engine

NASA Astrophysics Data System (ADS)

Harris, A. T.; Ramachandran, R.; Maskey, M.

2013-12-01

The Exelis-developed IDL and ENVI software are ubiquitous tools in Earth science research environments. The IDL Workbench is used by the Earth science community for programming custom data analysis and visualization modules. ENVI is a software solution for processing and analyzing geospatial imagery that combines support for multiple Earth observation scientific data types (optical, thermal, multi-spectral, hyperspectral, SAR, LiDAR) with advanced image processing and analysis algorithms. The ENVI & IDL Services Engine (ESE) is an Earth science data processing engine that allows researchers to use open standards to rapidly create, publish and deploy advanced Earth science data analytics within any existing enterprise infrastructure. Although powerful in many ways, the tools lack collaborative features out-of-box. Thus, as part of the NASA funded project, Collaborative Workbench to Accelerate Science Algorithm Development, researchers at the University of Alabama in Huntsville and Exelis have developed plugins that allow seamless research collaboration from within IDL workbench. Such additional features within IDL workbench are possible because IDL workbench is built using the Eclipse Rich Client Platform (RCP). RCP applications allow custom plugins to be dropped in for extended functionalities. Specific functionalities of the plugins include creating complex workflows based on IDL application source code, submitting workflows to be executed by ESE in the cloud, and sharing and cloning of workflows among collaborators. All these functionalities are available to scientists without leaving their IDL workbench. Because ESE can interoperate with any middleware, scientific programmers can readily string together IDL processing tasks (or tasks written in other languages like C++, Java or Python) to create complex workflows for deployment within their current enterprise architecture (e.g. ArcGIS Server, GeoServer, Apache ODE or SciFlo from JPL). Using the collaborative IDL Workbench, coupled with ESE for execution in the cloud, asynchronous workflows could be executed in batch mode on large data in the cloud. We envision that a scientist will initially develop a scientific workflow locally on a small set of data. Once tested, the scientist will deploy the workflow to the cloud for execution. Depending on the results, the scientist may share the workflow and results, allowing them to be stored in a community catalog and instantly loaded into the IDL Workbench of other scientists. Thereupon, scientists can clone and modify or execute the workflow with different input parameters. The Collaborative Workbench will provide a platform for collaboration in the cloud, helping Earth scientists solve big-data problems in the Earth and planetary sciences.
Talkoot Portals: Discover, Tag, Share, and Reuse Collaborative Science Workflows (Invited)

NASA Astrophysics Data System (ADS)

Wilson, B. D.; Ramachandran, R.; Lynnes, C.

2009-12-01

A small but growing number of scientists are beginning to harness Web 2.0 technologies, such as wikis, blogs, and social tagging, as a transformative way of doing science. These technologies provide researchers easy mechanisms to critique, suggest and share ideas, data and algorithms. At the same time, large suites of algorithms for science analysis are being made available as remotely-invokable Web Services, which can be chained together to create analysis workflows. This provides the research community an unprecedented opportunity to collaborate by sharing their workflows with one another, reproducing and analyzing research results, and leveraging colleagues’ expertise to expedite the process of scientific discovery. However, wikis and similar technologies are limited to text, static images and hyperlinks, providing little support for collaborative data analysis. A team of information technology and Earth science researchers from multiple institutions have come together to improve community collaboration in science analysis by developing a customizable “software appliance” to build collaborative portals for Earth Science services and analysis workflows. The critical requirement is that researchers (not just information technologists) be able to build collaborative sites around service workflows within a few hours. We envision online communities coming together, much like Finnish “talkoot” (a barn raising), to build a shared research space. Talkoot extends a freely available, open source content management framework with a series of modules specific to Earth Science for registering, creating, managing, discovering, tagging and sharing Earth Science web services and workflows for science data processing, analysis and visualization. Users will be able to author a “science story” in shareable web notebooks, including plots or animations, backed up by an executable workflow that directly reproduces the science analysis. New services and workflows of interest will be discoverable using tag search, and advertised using “service casts” and “interest casts” (Atom feeds). Multiple science workflow systems will be plugged into the system, with initial support for UAH’s Mining Workflow Composer and the open-source Active BPEL engine, and JPL’s SciFlo engine and the VizFlow visual programming interface. With the ability to share and execute analysis workflows, Talkoot portals can be used to do collaborative science in addition to communicate ideas and results. It will be useful for different science domains, mission teams, research projects and organizations. Thus, it will help to solve the “sociological” problem of bringing together disparate groups of researchers, and the technical problem of advertising, discovering, developing, documenting, and maintaining inter-agency science workflows. The presentation will discuss the goals of and barriers to Science 2.0, the social web technologies employed in the Talkoot software appliance (e.g. CMS, social tagging, personal presence, advertising by feeds, etc.), illustrate the resulting collaborative capabilities, and show early prototypes of the web interfaces (e.g. embedded workflows).
Evaluation of DICOM viewer software for workflow integration in clinical trials

NASA Astrophysics Data System (ADS)

Haak, Daniel; Page, Charles E.; Kabino, Klaus; Deserno, Thomas M.

2015-03-01

The digital imaging and communications in medicine (DICOM) protocol is nowadays the leading standard for capture, exchange and storage of image data in medical applications. A broad range of commercial, free, and open source software tools supporting a variety of DICOM functionality exists. However, different from patient's care in hospital, DICOM has not yet arrived in electronic data capture systems (EDCS) for clinical trials. Due to missing integration, even just the visualization of patient's image data in electronic case report forms (eCRFs) is impossible. Four increasing levels for integration of DICOM components into EDCS are conceivable, raising functionality but also demands on interfaces with each level. Hence, in this paper, a comprehensive evaluation of 27 DICOM viewer software projects is performed, investigating viewing functionality as well as interfaces for integration. Concerning general, integration, and viewing requirements the survey involves the criteria (i) license, (ii) support, (iii) platform, (iv) interfaces, (v) two-dimensional (2D) and (vi) three-dimensional (3D) image viewing functionality. Optimal viewers are suggested for applications in clinical trials for 3D imaging, hospital communication, and workflow. Focusing on open source solutions, the viewers ImageJ and MicroView are superior for 3D visualization, whereas GingkoCADx is advantageous for hospital integration. Concerning workflow optimization in multi-centered clinical trials, we suggest the open source viewer Weasis. Covering most use cases, an EDCS and PACS interconnection with Weasis is suggested.

Software Productivity of Field Experiments Using the Mobile Agents Open Architecture with Workflow Interoperability

NASA Technical Reports Server (NTRS)

Clancey, William J.; Lowry, Michael R.; Nado, Robert Allen; Sierhuis, Maarten

2011-01-01

We analyzed a series of ten systematically developed surface exploration systems that integrated a variety of hardware and software components. Design, development, and testing data suggest that incremental buildup of an exploration system for long-duration capabilities is facilitated by an open architecture with appropriate-level APIs, specifically designed to facilitate integration of new components. This improves software productivity by reducing changes required for reconfiguring an existing system.
Radiology information system: a workflow-based approach.

PubMed

Zhang, Jinyan; Lu, Xudong; Nie, Hongchao; Huang, Zhengxing; van der Aalst, W M P

2009-09-01

Introducing workflow management technology in healthcare seems to be prospective in dealing with the problem that the current healthcare Information Systems cannot provide sufficient support for the process management, although several challenges still exist. The purpose of this paper is to study the method of developing workflow-based information system in radiology department as a use case. First, a workflow model of typical radiology process was established. Second, based on the model, the system could be designed and implemented as a group of loosely coupled components. Each component corresponded to one task in the process and could be assembled by the workflow management system. The legacy systems could be taken as special components, which also corresponded to the tasks and were integrated through transferring non-work- flow-aware interfaces to the standard ones. Finally, a workflow dashboard was designed and implemented to provide an integral view of radiology processes. The workflow-based Radiology Information System was deployed in the radiology department of Zhejiang Chinese Medicine Hospital in China. The results showed that it could be adjusted flexibly in response to the needs of changing process, and enhance the process management in the department. It can also provide a more workflow-aware integration method, comparing with other methods such as IHE-based ones. The workflow-based approach is a new method of developing radiology information system with more flexibility, more functionalities of process management and more workflow-aware integration. The work of this paper is an initial endeavor for introducing workflow management technology in healthcare.
An Accessible Proteogenomics Informatics Resource for Cancer Researchers.

PubMed

Chambers, Matthew C; Jagtap, Pratik D; Johnson, James E; McGowan, Thomas; Kumar, Praveen; Onsongo, Getiria; Guerrero, Candace R; Barsnes, Harald; Vaudel, Marc; Martens, Lennart; Grüning, Björn; Cooke, Ira R; Heydarian, Mohammad; Reddy, Karen L; Griffin, Timothy J

2017-11-01

Proteogenomics has emerged as a valuable approach in cancer research, which integrates genomic and transcriptomic data with mass spectrometry-based proteomics data to directly identify expressed, variant protein sequences that may have functional roles in cancer. This approach is computationally intensive, requiring integration of disparate software tools into sophisticated workflows, challenging its adoption by nonexpert, bench scientists. To address this need, we have developed an extensible, Galaxy-based resource aimed at providing more researchers access to, and training in, proteogenomic informatics. Our resource brings together software from several leading research groups to address two foundational aspects of proteogenomics: (i) generation of customized, annotated protein sequence databases from RNA-Seq data; and (ii) accurate matching of tandem mass spectrometry data to putative variants, followed by filtering to confirm their novelty. Directions for accessing software tools and workflows, along with instructional documentation, can be found at z.umn.edu/canresgithub. Cancer Res; 77(21); e43-46. ©2017 AACR . ©2017 American Association for Cancer Research.
MPA Portable: A Stand-Alone Software Package for Analyzing Metaproteome Samples on the Go.

PubMed

Muth, Thilo; Kohrs, Fabian; Heyer, Robert; Benndorf, Dirk; Rapp, Erdmann; Reichl, Udo; Martens, Lennart; Renard, Bernhard Y

2018-01-02

Metaproteomics, the mass spectrometry-based analysis of proteins from multispecies samples faces severe challenges concerning data analysis and results interpretation. To overcome these shortcomings, we here introduce the MetaProteomeAnalyzer (MPA) Portable software. In contrast to the original server-based MPA application, this newly developed tool no longer requires computational expertise for installation and is now independent of any relational database system. In addition, MPA Portable now supports state-of-the-art database search engines and a convenient command line interface for high-performance data processing tasks. While search engine results can easily be combined to increase the protein identification yield, an additional two-step workflow is implemented to provide sufficient analysis resolution for further postprocessing steps, such as protein grouping as well as taxonomic and functional annotation. Our new application has been developed with a focus on intuitive usability, adherence to data standards, and adaptation to Web-based workflow platforms. The open source software package can be found at https://github.com/compomics/meta-proteome-analyzer .
Selecting Advanced Software Technology in Two Small Manufacturing Enterprises

DTIC Science & Technology

2004-05-01

improving workflow to further reduce delivery times, enhance customer service, and obtain a competitive advantage . The company wanted help... environment , stakeholders’ needs, ecommerce , shop floor visualization, and collaboration capability. These statements are not significantly different...for the purpose of describing a software environment . This identification does not imply any recommendation or endorsement by NIST, the SEI, CMU, or
A simple tool for stereological assessment of digital images: the STEPanizer.

PubMed

Tschanz, S A; Burri, P H; Weibel, E R

2011-07-01

STEPanizer is an easy-to-use computer-based software tool for the stereological assessment of digitally captured images from all kinds of microscopical (LM, TEM, LSM) and macroscopical (radiology, tomography) imaging modalities. The program design focuses on providing the user a defined workflow adapted to most basic stereological tasks. The software is compact, that is user friendly without being bulky. STEPanizer comprises the creation of test systems, the appropriate display of digital images with superimposed test systems, a scaling facility, a counting module and an export function for the transfer of results to spreadsheet programs. Here we describe the major workflow of the tool illustrating the application on two examples from transmission electron microscopy and light microscopy, respectively. © 2011 The Authors Journal of Microscopy © 2011 Royal Microscopical Society.
Integrating Remote and Social Sensing Data for a Scenario on Secure Societies in Big Data Platform

NASA Astrophysics Data System (ADS)

Albani, Sergio; Lazzarini, Michele; Koubarakis, Manolis; Taniskidou, Efi Karra; Papadakis, George; Karkaletsis, Vangelis; Giannakopoulos, George

2016-08-01

In the framework of the Horizon 2020 project BigDataEurope (Integrating Big Data, Software & Communities for Addressing Europe's Societal Challenges), a pilot for the Secure Societies Societal Challenge was designed considering the requirements coming from relevant stakeholders. The pilot is focusing on the integration in a Big Data platform of data coming from remote and social sensing.The information on land changes coming from the Copernicus Sentinel 1A sensor (Change Detection workflow) is integrated with information coming from selected Twitter and news agencies accounts (Event Detection workflow) in order to provide the user with multiple sources of information.The Change Detection workflow implements a processing chain in a distributed parallel manner, exploiting the Big Data capabilities in place; the Event Detection workflow implements parallel and distributed social media and news agencies monitoring as well as suitable mechanisms to detect and geo-annotate the related events.
Rapid analysis and exploration of fluorescence microscopy images.

PubMed

Pavie, Benjamin; Rajaram, Satwik; Ouyang, Austin; Altschuler, Jason M; Steininger, Robert J; Wu, Lani F; Altschuler, Steven J

2014-03-19

Despite rapid advances in high-throughput microscopy, quantitative image-based assays still pose significant challenges. While a variety of specialized image analysis tools are available, most traditional image-analysis-based workflows have steep learning curves (for fine tuning of analysis parameters) and result in long turnaround times between imaging and analysis. In particular, cell segmentation, the process of identifying individual cells in an image, is a major bottleneck in this regard. Here we present an alternate, cell-segmentation-free workflow based on PhenoRipper, an open-source software platform designed for the rapid analysis and exploration of microscopy images. The pipeline presented here is optimized for immunofluorescence microscopy images of cell cultures and requires minimal user intervention. Within half an hour, PhenoRipper can analyze data from a typical 96-well experiment and generate image profiles. Users can then visually explore their data, perform quality control on their experiment, ensure response to perturbations and check reproducibility of replicates. This facilitates a rapid feedback cycle between analysis and experiment, which is crucial during assay optimization. This protocol is useful not just as a first pass analysis for quality control, but also may be used as an end-to-end solution, especially for screening. The workflow described here scales to large data sets such as those generated by high-throughput screens, and has been shown to group experimental conditions by phenotype accurately over a wide range of biological systems. The PhenoBrowser interface provides an intuitive framework to explore the phenotypic space and relate image properties to biological annotations. Taken together, the protocol described here will lower the barriers to adopting quantitative analysis of image based screens.
nmsBuilder: Freeware to create subject-specific musculoskeletal models for OpenSim.

PubMed

Valente, Giordano; Crimi, Gianluigi; Vanella, Nicola; Schileo, Enrico; Taddei, Fulvia

2017-12-01

Musculoskeletal modeling and simulations of movement have been increasingly used in orthopedic and neurological scenarios, with increased attention to subject-specific applications. In general, musculoskeletal modeling applications have been facilitated by the development of dedicated software tools; however, subject-specific studies have been limited also by time-consuming modeling workflows and high skilled expertise required. In addition, no reference tools exist to standardize the process of musculoskeletal model creation and make it more efficient. Here we present a freely available software application, nmsBuilder 2.0, to create musculoskeletal models in the file format of OpenSim, a widely-used open-source platform for musculoskeletal modeling and simulation. nmsBuilder 2.0 is the result of a major refactoring of a previous implementation that moved a first step toward an efficient workflow for subject-specific model creation. nmsBuilder includes a graphical user interface that provides access to all functionalities, based on a framework for computer-aided medicine written in C++. The operations implemented can be used in a workflow to create OpenSim musculoskeletal models from 3D surfaces. A first step includes data processing to create supporting objects necessary to create models, e.g. surfaces, anatomical landmarks, reference systems; and a second step includes the creation of OpenSim objects, e.g. bodies, joints, muscles, and the corresponding model. We present a case study using nmsBuilder 2.0: the creation of an MRI-based musculoskeletal model of the lower limb. The model included four rigid bodies, five degrees of freedom and 43 musculotendon actuators, and was created from 3D surfaces of the segmented images of a healthy subject through the modeling workflow implemented in the software application. We have presented nmsBuilder 2.0 for the creation of musculoskeletal OpenSim models from image-based data, and made it freely available via nmsbuilder.org. This application provides an efficient workflow for model creation and helps standardize the process. We hope this would help promote personalized applications in musculoskeletal biomechanics, including larger sample size studies, and might also represent a basis for future developments for specific applications. Copyright © 2017 Elsevier B.V. All rights reserved.
Implementation of Epic Beaker Anatomic Pathology at an Academic Medical Center.

PubMed

Blau, John Larry; Wilford, Joseph D; Dane, Susan K; Karandikar, Nitin J; Fuller, Emily S; Jacobsmeier, Debbie J; Jans, Melissa A; Horning, Elisabeth A; Krasowski, Matthew D; Ford, Bradley A; Becker, Kent R; Beranek, Jeanine M; Robinson, Robert A

2017-01-01

Beaker is a relatively new laboratory information system (LIS) offered by Epic Systems Corporation as part of its suite of health-care software and bundled with its electronic medical record, EpicCare. It is divided into two modules, Beaker anatomic pathology (Beaker AP) and Beaker Clinical Pathology. In this report, we describe our experience implementing Beaker AP version 2014 at an academic medical center with a go-live date of October 2015. This report covers preimplementation preparations and challenges beginning in September 2014, issues discovered soon after go-live in October 2015, and some post go-live optimizations using data from meetings, debriefings, and the project closure document. We share specific issues that we encountered during implementation, including difficulties with the proposed frozen section workflow, developing a shared specimen source dictionary, and implementation of the standard Beaker workflow in large institution with trainees. We share specific strategies that we used to overcome these issues for a successful Beaker AP implementation. Several areas of the laboratory-required adaptation of the default Beaker build parameters to meet the needs of the workflow in a busy academic medical center. In a few areas, our laboratory was unable to use the Beaker functionality to support our workflow, and we have continued to use paper or have altered our workflow. In spite of several difficulties that required creative solutions before go-live, the implementation has been successful based on satisfaction surveys completed by pathologists and others who use the software. However, optimization of Beaker workflows has continued to be an ongoing process after go-live to the present time. The Beaker AP LIS can be successfully implemented at an academic medical center but requires significant forethought, creative adaptation, and continued shared management of the ongoing product by institutional and departmental information technology staff as well as laboratory managers to meet the needs of the laboratory.
The use of Graphic User Interface for development of a user-friendly CRS-Stack software

NASA Astrophysics Data System (ADS)

Sule, Rachmat; Prayudhatama, Dythia; Perkasa, Muhammad D.; Hendriyana, Andri; Fatkhan; Sardjito; Adriansyah

2017-04-01

The development of a user-friendly Common Reflection Surface (CRS) Stack software that has been built by implementing Graphical User Interface (GUI) is described in this paper. The original CRS-Stack software developed by WIT Consortium is compiled in the unix/linux environment, which is not a user-friendly software, so that a user must write the commands and parameters manually in a script file. Due to this limitation, the CRS-Stack become a non popular method, although applying this method is actually a promising way in order to obtain better seismic sections, which have better reflector continuity and S/N ratio. After obtaining successful results that have been tested by using several seismic data belong to oil companies in Indonesia, it comes to an idea to develop a user-friendly software in our own laboratory. Graphical User Interface (GUI) is a type of user interface that allows people to interact with computer programs in a better way. Rather than typing commands and module parameters, GUI allows the users to use computer programs in much simple and easy. Thus, GUI can transform the text-based interface into graphical icons and visual indicators. The use of complicated seismic unix shell script can be avoided. The Java Swing GUI library is used to develop this CRS-Stack GUI. Every shell script that represents each seismic process is invoked from Java environment. Besides developing interactive GUI to perform CRS-Stack processing, this CRS-Stack GUI is design to help geophysicists to manage a project with complex seismic processing procedures. The CRS-Stack GUI software is composed by input directory, operators, and output directory, which are defined as a seismic data processing workflow. The CRS-Stack processing workflow involves four steps; i.e. automatic CMP stack, initial CRS-Stack, optimized CRS-Stack, and CRS-Stack Supergather. Those operations are visualized in an informative flowchart with self explanatory system to guide the user inputting the parameter values for each operation. The knowledge of CRS-Stack processing procedure is still preserved in the software, which is easy and efficient to be learned. The software will still be developed in the future. Any new innovative seismic processing workflow will also be added into this GUI software.
The LHCb software and computing upgrade for Run 3: opportunities and challenges

NASA Astrophysics Data System (ADS)

Bozzi, C.; Roiser, S.; LHCb Collaboration

2017-10-01

The LHCb detector will be upgraded for the LHC Run 3 and will be readout at 30 MHz, corresponding to the full inelastic collision rate, with major implications on the full software trigger and offline computing. If the current computing model and software framework are kept, the data storage capacity and computing power required to process data at this rate, and to generate and reconstruct equivalent samples of simulated events, will exceed the current capacity by at least one order of magnitude. A redesign of the software framework, including scheduling, the event model, the detector description and the conditions database, is needed to fully exploit the computing power of multi-, many-core architectures, and coprocessors. Data processing and the analysis model will also change towards an early streaming of different data types, in order to limit storage resources, with further implications for the data analysis workflows. Fast simulation options will allow to obtain a reasonable parameterization of the detector response in considerably less computing time. Finally, the upgrade of LHCb will be a good opportunity to review and implement changes in the domains of software design, test and review, and analysis workflow and preservation. In this contribution, activities and recent results in all the above areas are presented.
The radiologist's workflow environment: evaluation of disruptors and potential implications.

PubMed

Yu, John-Paul J; Kansagra, Akash P; Mongan, John

2014-06-01

Workflow interruptions in the health care delivery environment are a major contributor to medical errors and have been extensively studied within numerous hospital settings, including the nursing environment and the operating room, along with their effects on physician workflow. Less understood, though, is the role of interruptions in other highly specialized clinical domains and subspecialty services, such as diagnostic radiology. The workflow of the on-call radiologist, in particular, is especially susceptible to disruption by telephone calls and other modes of physician-to-physician communication. Herein, the authors describe their initial efforts to quantify the degree of interruption experienced by on-call radiologists and examine its potential implications in patient safety and overall clinical care. Copyright © 2014 American College of Radiology. Published by Elsevier Inc. All rights reserved.
The Globus Galaxies Platform. Delivering Science Gateways as a Service

DOE Office of Scientific and Technical Information (OSTI.GOV)

Madduri, Ravi; Chard, Kyle; Chard, Ryan

We use public cloud computers to host sophisticated scientific data; software is then used to transform scientific practice by enabling broad access to capabilities previously available only to the few. The primary obstacle to more widespread use of public clouds to host scientific software (‘cloud-based science gateways’) has thus far been the considerable gap between the specialized needs of science applications and the capabilities provided by cloud infrastructures. We describe here a domain-independent, cloud-based science gateway platform, the Globus Galaxies platform, which overcomes this gap by providing a set of hosted services that directly address the needs of science gatewaymore » developers. The design and implementation of this platform leverages our several years of experience with Globus Genomics, a cloud-based science gateway that has served more than 200 genomics researchers across 30 institutions. Building on that foundation, we have also implemented a platform that leverages the popular Galaxy system for application hosting and workflow execution; Globus services for data transfer, user and group management, and authentication; and a cost-aware elastic provisioning model specialized for public cloud resources. We describe here the capabilities and architecture of this platform, present six scientific domains in which we have successfully applied it, report on user experiences, and analyze the economics of our deployments. Published 2015. This article is a U.S. Government work and is in the public domain in the USA.« less
Calibration of radio-astronomical data on the cloud. LOFAR, the pathway to SKA

NASA Astrophysics Data System (ADS)

Sabater, J.; Sánchez-Expósito, S.; Garrido, J.; Ruiz, J. E.; Best, P. N.; Verdes-Montenegro, L.

2015-05-01

The radio interferometer LOFAR (LOw Frequency ARray) is fully operational now. This Square Kilometre Array (SKA) pathfinder allows the observation of the sky at frequencies between 10 and 240 MHz, a relatively unexplored region of the spectrum. LOFAR is a software defined telescope: the data is mainly processed using specialized software running in common computing facilities. That means that the capabilities of the telescope are virtually defined by software and mainly limited by the available computing power. However, the quantity of data produced can quickly reach huge volumes (several Petabytes per day). After the correlation and pre-processing of the data in a dedicated cluster, the final dataset is handled to the user (typically several Terabytes). The calibration of these data requires a powerful computing facility in which the specific state of the art software under heavy continuous development can be easily installed and updated. That makes this case a perfect candidate for a cloud infrastructure which adds the advantages of an on demand, flexible solution. We present our approach to the calibration of LOFAR data using Ibercloud, the cloud infrastructure provided by Ibergrid. With the calibration work-flow adapted to the cloud, we can explore calibration strategies for the SKA and show how private or commercial cloud infrastructures (Ibercloud, Amazon EC2, Google Compute Engine, etc.) can help to solve the problems with big datasets that will be prevalent in the future of astronomy.
Digital transformation in home care. A case study.

PubMed

Bennis, Sandy; Costanzo, Diane; Flynn, Ann Marie; Reidy, Agatha; Tronni, Catherine

2007-01-01

Simply implementing software and technology does not assure that an organization's targeted clinical and financial goals will be realized. No longer is it possible to roll out a new system--by solely providing end user training and overlaying it on top of already inefficient workflows and outdated roles--and know with certainty that targets will be met. At Virtua Health's Home Care, based in south New Jersey, implementation of their electronic system initially followed this more traditional approach. Unable to completely attain their earlier identified return on investment, they enlisted the help of a new role within their health system, that of the nurse informaticist. Knowledgeable in complex clinical processes and not bound by the technology at hand, the informaticist analyzed physical workflow, digital workflow, roles and physical layout. Leveraging specific tools such as change acceleration, workouts and LEAN, the informaticist was able to redesign workflow and support new levels of functionality. This article provides a view from the "finish line", recounting how this role worked with home care to assimilate information delivery into more efficient processes and align resources to support the new workflow, ultimately achieving real tangible returns.
The View from a Few Hundred Feet : A New Transparent and Integrated Workflow for UAV-collected Data

NASA Astrophysics Data System (ADS)

Peterson, F. S.; Barbieri, L.; Wyngaard, J.

2015-12-01

Unmanned Aerial Vehicles (UAVs) allow scientists and civilians to monitor earth and atmospheric conditions in remote locations. To keep up with the rapid evolution of UAV technology, data workflows must also be flexible, integrated, and introspective. Here, we present our data workflow for a project to assess the feasibility of detecting threshold levels of methane, carbon-dioxide, and other aerosols by mounting consumer-grade gas analysis sensors on UAV's. Particularly, we highlight our use of Project Jupyter, a set of open-source software tools and documentation designed for developing "collaborative narratives" around scientific workflows. By embracing the GitHub-backed, multi-language systems available in Project Jupyter, we enable interaction and exploratory computation while simultaneously embracing distributed version control. Additionally, the transparency of this method builds trust with civilians and decision-makers and leverages collaboration and communication to resolve problems. The goal of this presentation is to provide a generic data workflow for scientific inquiries involving UAVs and to invite the participation of the AGU community in its improvement and curation.
Deploying and sharing U-Compare workflows as web services.

PubMed

Kontonatsios, Georgios; Korkontzelos, Ioannis; Kolluru, Balakrishna; Thompson, Paul; Ananiadou, Sophia

2013-02-18

U-Compare is a text mining platform that allows the construction, evaluation and comparison of text mining workflows. U-Compare contains a large library of components that are tuned to the biomedical domain. Users can rapidly develop biomedical text mining workflows by mixing and matching U-Compare's components. Workflows developed using U-Compare can be exported and sent to other users who, in turn, can import and re-use them. However, the resulting workflows are standalone applications, i.e., software tools that run and are accessible only via a local machine, and that can only be run with the U-Compare platform. We address the above issues by extending U-Compare to convert standalone workflows into web services automatically, via a two-click process. The resulting web services can be registered on a central server and made publicly available. Alternatively, users can make web services available on their own servers, after installing the web application framework, which is part of the extension to U-Compare. We have performed a user-oriented evaluation of the proposed extension, by asking users who have tested the enhanced functionality of U-Compare to complete questionnaires that assess its functionality, reliability, usability, efficiency and maintainability. The results obtained reveal that the new functionality is well received by users. The web services produced by U-Compare are built on top of open standards, i.e., REST and SOAP protocols, and therefore, they are decoupled from the underlying platform. Exported workflows can be integrated with any application that supports these open standards. We demonstrate how the newly extended U-Compare enhances the cross-platform interoperability of workflows, by seamlessly importing a number of text mining workflow web services exported from U-Compare into Taverna, i.e., a generic scientific workflow construction platform.
Deploying and sharing U-Compare workflows as web services

PubMed Central

2013-01-01

Background U-Compare is a text mining platform that allows the construction, evaluation and comparison of text mining workflows. U-Compare contains a large library of components that are tuned to the biomedical domain. Users can rapidly develop biomedical text mining workflows by mixing and matching U-Compare’s components. Workflows developed using U-Compare can be exported and sent to other users who, in turn, can import and re-use them. However, the resulting workflows are standalone applications, i.e., software tools that run and are accessible only via a local machine, and that can only be run with the U-Compare platform. Results We address the above issues by extending U-Compare to convert standalone workflows into web services automatically, via a two-click process. The resulting web services can be registered on a central server and made publicly available. Alternatively, users can make web services available on their own servers, after installing the web application framework, which is part of the extension to U-Compare. We have performed a user-oriented evaluation of the proposed extension, by asking users who have tested the enhanced functionality of U-Compare to complete questionnaires that assess its functionality, reliability, usability, efficiency and maintainability. The results obtained reveal that the new functionality is well received by users. Conclusions The web services produced by U-Compare are built on top of open standards, i.e., REST and SOAP protocols, and therefore, they are decoupled from the underlying platform. Exported workflows can be integrated with any application that supports these open standards. We demonstrate how the newly extended U-Compare enhances the cross-platform interoperability of workflows, by seamlessly importing a number of text mining workflow web services exported from U-Compare into Taverna, i.e., a generic scientific workflow construction platform. PMID:23419017
Workflows in bioinformatics: meta-analysis and prototype implementation of a workflow generator.

PubMed

Garcia Castro, Alexander; Thoraval, Samuel; Garcia, Leyla J; Ragan, Mark A

2005-04-07

Computational methods for problem solving need to interleave information access and algorithm execution in a problem-specific workflow. The structures of these workflows are defined by a scaffold of syntactic, semantic and algebraic objects capable of representing them. Despite the proliferation of GUIs (Graphic User Interfaces) in bioinformatics, only some of them provide workflow capabilities; surprisingly, no meta-analysis of workflow operators and components in bioinformatics has been reported. We present a set of syntactic components and algebraic operators capable of representing analytical workflows in bioinformatics. Iteration, recursion, the use of conditional statements, and management of suspend/resume tasks have traditionally been implemented on an ad hoc basis and hard-coded; by having these operators properly defined it is possible to use and parameterize them as generic re-usable components. To illustrate how these operations can be orchestrated, we present GPIPE, a prototype graphic pipeline generator for PISE that allows the definition of a pipeline, parameterization of its component methods, and storage of metadata in XML formats. This implementation goes beyond the macro capacities currently in PISE. As the entire analysis protocol is defined in XML, a complete bioinformatic experiment (linked sets of methods, parameters and results) can be reproduced or shared among users. http://if-web1.imb.uq.edu.au/Pise/5.a/gpipe.html (interactive), ftp://ftp.pasteur.fr/pub/GenSoft/unix/misc/Pise/ (download). From our meta-analysis we have identified syntactic structures and algebraic operators common to many workflows in bioinformatics. The workflow components and algebraic operators can be assimilated into re-usable software components. GPIPE, a prototype implementation of this framework, provides a GUI builder to facilitate the generation of workflows and integration of heterogeneous analytical tools.

Provenance-Powered Automatic Workflow Generation and Composition

NASA Astrophysics Data System (ADS)

Zhang, J.; Lee, S.; Pan, L.; Lee, T. J.

2015-12-01

In recent years, scientists have learned how to codify tools into reusable software modules that can be chained into multi-step executable workflows. Existing scientific workflow tools, created by computer scientists, require domain scientists to meticulously design their multi-step experiments before analyzing data. However, this is oftentimes contradictory to a domain scientist's daily routine of conducting research and exploration. We hope to resolve this dispute. Imagine this: An Earth scientist starts her day applying NASA Jet Propulsion Laboratory (JPL) published climate data processing algorithms over ARGO deep ocean temperature and AMSRE sea surface temperature datasets. Throughout the day, she tunes the algorithm parameters to study various aspects of the data. Suddenly, she notices some interesting results. She then turns to a computer scientist and asks, "can you reproduce my results?" By tracking and reverse engineering her activities, the computer scientist creates a workflow. The Earth scientist can now rerun the workflow to validate her findings, modify the workflow to discover further variations, or publish the workflow to share the knowledge. In this way, we aim to revolutionize computer-supported Earth science. We have developed a prototyping system to realize the aforementioned vision, in the context of service-oriented science. We have studied how Earth scientists conduct service-oriented data analytics research in their daily work, developed a provenance model to record their activities, and developed a technology to automatically generate workflow starting from user behavior and adaptability and reuse of these workflows for replicating/improving scientific studies. A data-centric repository infrastructure is established to catch richer provenance to further facilitate collaboration in the science community. We have also established a Petri nets-based verification instrument for provenance-based automatic workflow generation and recommendation.
A Digital Repository and Execution Platform for Interactive Scholarly Publications in Neuroscience.

PubMed

Hodge, Victoria; Jessop, Mark; Fletcher, Martyn; Weeks, Michael; Turner, Aaron; Jackson, Tom; Ingram, Colin; Smith, Leslie; Austin, Jim

2016-01-01

The CARMEN Virtual Laboratory (VL) is a cloud-based platform which allows neuroscientists to store, share, develop, execute, reproduce and publicise their work. This paper describes new functionality in the CARMEN VL: an interactive publications repository. This new facility allows users to link data and software to publications. This enables other users to examine data and software associated with the publication and execute the associated software within the VL using the same data as the authors used in the publication. The cloud-based architecture and SaaS (Software as a Service) framework allows vast data sets to be uploaded and analysed using software services. Thus, this new interactive publications facility allows others to build on research results through reuse. This aligns with recent developments by funding agencies, institutions, and publishers with a move to open access research. Open access provides reproducibility and verification of research resources and results. Publications and their associated data and software will be assured of long-term preservation and curation in the repository. Further, analysing research data and the evaluations described in publications frequently requires a number of execution stages many of which are iterative. The VL provides a scientific workflow environment to combine software services into a processing tree. These workflows can also be associated with publications and executed by users. The VL also provides a secure environment where users can decide the access rights for each resource to ensure copyright and privacy restrictions are met.
Workflows and Provenance: Toward Information Science Solutions for the Natural Sciences.

PubMed

Gryk, Michael R; Ludäscher, Bertram

2017-01-01

The era of big data and ubiquitous computation has brought with it concerns about ensuring reproducibility in this new research environment. It is easy to assume computational methods self-document by their very nature of being exact, deterministic processes. However, similar to laboratory experiments, ensuring reproducibility in the computational realm requires the documentation of both the protocols used (workflows) as well as a detailed description of the computational environment: algorithms, implementations, software environments as well as the data ingested and execution logs of the computation. These two aspects of computational reproducibility (workflows and execution details) are discussed in the context of biomolecular Nuclear Magnetic Resonance spectroscopy (bioNMR) as well as the PRIMAD model for computational reproducibility.
Automated data collection in single particle electron microscopy

PubMed Central

Tan, Yong Zi; Cheng, Anchi; Potter, Clinton S.; Carragher, Bridget

2016-01-01

Automated data collection is an integral part of modern workflows in single particle electron microscopy (EM) research. This review surveys the software packages available for automated single particle EM data collection. The degree of automation at each stage of data collection is evaluated, and the capabilities of the software packages are described. Finally, future trends in automation are discussed. PMID:26671944
Deploying an Intelligent Pairing Assistant for Air Operation Centers

DTIC Science & Technology

2016-06-23

primary contributions of this case study are applying artificial intelligence techniques to a novel domain and discussing the software evaluation...their standard workflows. The primary contributions of this case study are applying artificial intelligence techniques to a novel domain and...users for more efficient and accurate pairing? Participants Participants in the evaluation consisted of three SMEs employed at Intelligent Software
Enhancing reproducibility in scientific computing: Metrics and registry for Singularity containers.

PubMed

Sochat, Vanessa V; Prybol, Cameron J; Kurtzer, Gregory M

2017-01-01

Here we present Singularity Hub, a framework to build and deploy Singularity containers for mobility of compute, and the singularity-python software with novel metrics for assessing reproducibility of such containers. Singularity containers make it possible for scientists and developers to package reproducible software, and Singularity Hub adds automation to this workflow by building, capturing metadata for, visualizing, and serving containers programmatically. Our novel metrics, based on custom filters of content hashes of container contents, allow for comparison of an entire container, including operating system, custom software, and metadata. First we will review Singularity Hub's primary use cases and how the infrastructure has been designed to support modern, common workflows. Next, we conduct three analyses to demonstrate build consistency, reproducibility metric and performance and interpretability, and potential for discovery. This is the first effort to demonstrate a rigorous assessment of measurable similarity between containers and operating systems. We provide these capabilities within Singularity Hub, as well as the source software singularity-python that provides the underlying functionality. Singularity Hub is available at https://singularity-hub.org, and we are excited to provide it as an openly available platform for building, and deploying scientific containers.
Enhancing reproducibility in scientific computing: Metrics and registry for Singularity containers

PubMed Central

Prybol, Cameron J.; Kurtzer, Gregory M.

2017-01-01

Here we present Singularity Hub, a framework to build and deploy Singularity containers for mobility of compute, and the singularity-python software with novel metrics for assessing reproducibility of such containers. Singularity containers make it possible for scientists and developers to package reproducible software, and Singularity Hub adds automation to this workflow by building, capturing metadata for, visualizing, and serving containers programmatically. Our novel metrics, based on custom filters of content hashes of container contents, allow for comparison of an entire container, including operating system, custom software, and metadata. First we will review Singularity Hub’s primary use cases and how the infrastructure has been designed to support modern, common workflows. Next, we conduct three analyses to demonstrate build consistency, reproducibility metric and performance and interpretability, and potential for discovery. This is the first effort to demonstrate a rigorous assessment of measurable similarity between containers and operating systems. We provide these capabilities within Singularity Hub, as well as the source software singularity-python that provides the underlying functionality. Singularity Hub is available at https://singularity-hub.org, and we are excited to provide it as an openly available platform for building, and deploying scientific containers. PMID:29186161
Unified Software Solution for Efficient SPR Data Analysis in Drug Research

PubMed Central

Dahl, Göran; Steigele, Stephan; Hillertz, Per; Tigerström, Anna; Egnéus, Anders; Mehrle, Alexander; Ginkel, Martin; Edfeldt, Fredrik; Holdgate, Geoff; O’Connell, Nichole; Kappler, Bernd; Brodte, Annette; Rawlins, Philip B.; Davies, Gareth; Westberg, Eva-Lotta; Folmer, Rutger H. A.; Heyse, Stephan

2016-01-01

Surface plasmon resonance (SPR) is a powerful method for obtaining detailed molecular interaction parameters. Modern instrumentation with its increased throughput has enabled routine screening by SPR in hit-to-lead and lead optimization programs, and SPR has become a mainstream drug discovery technology. However, the processing and reporting of SPR data in drug discovery are typically performed manually, which is both time-consuming and tedious. Here, we present the workflow concept, design and experiences with a software module relying on a single, browser-based software platform for the processing, analysis, and reporting of SPR data. The efficiency of this concept lies in the immediate availability of end results: data are processed and analyzed upon loading the raw data file, allowing the user to immediately quality control the results. Once completed, the user can automatically report those results to data repositories for corporate access and quickly generate printed reports or documents. The software module has resulted in a very efficient and effective workflow through saved time and improved quality control. We discuss these benefits and show how this process defines a new benchmark in the drug discovery industry for the handling, interpretation, visualization, and sharing of SPR data. PMID:27789754
Development of an impairment-based individualized treatment workflow using an iPad-based software platform.

PubMed

Kiran, Swathi; Des Roches, Carrie; Balachandran, Isabel; Ascenso, Elsa

2014-02-01

Individuals with language and cognitive deficits following brain damage likely require long-term rehabilitation. Consequently, it is a huge practical problem to provide the continued communication therapy that these individuals require. The present project describes the development of an impairment-based individualized treatment workflow using a software platform called Constant Therapy. This article is organized into two sections. We will first describe the general methods of the treatment workflow for patients involved in this study. There are four steps in this process: (1) the patient's impairment is assessed using standardized tests, (2) the patient is assigned a specific and individualized treatment plan, (3) the patient practices the therapy at home and at the clinic, and (4) the clinician and the patient can analyze the results of the patient's performance remotely and monitor and alter the treatment plan accordingly. The second section provides four case studies that provide a representative sample of participants progressing through their individualized treatment plan. The preliminary results of the patient treatment provide encouraging evidence for the feasibility of a rehabilitation program for individuals with brain damage based on the iPad (Apple Inc., Cupertino, CA). Thieme Medical Publishers 333 Seventh Avenue, New York, NY 10001, USA.
Next-generation sequencing meets genetic diagnostics: development of a comprehensive workflow for the analysis of BRCA1 and BRCA2 genes

PubMed Central

Feliubadaló, Lídia; Lopez-Doriga, Adriana; Castellsagué, Ester; del Valle, Jesús; Menéndez, Mireia; Tornero, Eva; Montes, Eva; Cuesta, Raquel; Gómez, Carolina; Campos, Olga; Pineda, Marta; González, Sara; Moreno, Victor; Brunet, Joan; Blanco, Ignacio; Serra, Eduard; Capellá, Gabriel; Lázaro, Conxi

2013-01-01

Next-generation sequencing (NGS) is changing genetic diagnosis due to its huge sequencing capacity and cost-effectiveness. The aim of this study was to develop an NGS-based workflow for routine diagnostics for hereditary breast and ovarian cancer syndrome (HBOCS), to improve genetic testing for BRCA1 and BRCA2. A NGS-based workflow was designed using BRCA MASTR kit amplicon libraries followed by GS Junior pyrosequencing. Data analysis combined Variant Identification Pipeline freely available software and ad hoc R scripts, including a cascade of filters to generate coverage and variant calling reports. A BRCA homopolymer assay was performed in parallel. A research scheme was designed in two parts. A Training Set of 28 DNA samples containing 23 unique pathogenic mutations and 213 other variants (33 unique) was used. The workflow was validated in a set of 14 samples from HBOCS families in parallel with the current diagnostic workflow (Validation Set). The NGS-based workflow developed permitted the identification of all pathogenic mutations and genetic variants, including those located in or close to homopolymers. The use of NGS for detecting copy-number alterations was also investigated. The workflow meets the sensitivity and specificity requirements for the genetic diagnosis of HBOCS and improves on the cost-effectiveness of current approaches. PMID:23249957
Experiences Building Globus Genomics: A Next-Generation Sequencing Analysis Service using Galaxy, Globus, and Amazon Web Services

PubMed Central

Madduri, Ravi K.; Sulakhe, Dinanath; Lacinski, Lukasz; Liu, Bo; Rodriguez, Alex; Chard, Kyle; Dave, Utpal J.; Foster, Ian T.

2014-01-01

We describe Globus Genomics, a system that we have developed for rapid analysis of large quantities of next-generation sequencing (NGS) genomic data. This system achieves a high degree of end-to-end automation that encompasses every stage of data analysis including initial data retrieval from remote sequencing centers or storage (via the Globus file transfer system); specification, configuration, and reuse of multi-step processing pipelines (via the Galaxy workflow system); creation of custom Amazon Machine Images and on-demand resource acquisition via a specialized elastic provisioner (on Amazon EC2); and efficient scheduling of these pipelines over many processors (via the HTCondor scheduler). The system allows biomedical researchers to perform rapid analysis of large NGS datasets in a fully automated manner, without software installation or a need for any local computing infrastructure. We report performance and cost results for some representative workloads. PMID:25342933
Experiences Building Globus Genomics: A Next-Generation Sequencing Analysis Service using Galaxy, Globus, and Amazon Web Services.

PubMed

Madduri, Ravi K; Sulakhe, Dinanath; Lacinski, Lukasz; Liu, Bo; Rodriguez, Alex; Chard, Kyle; Dave, Utpal J; Foster, Ian T

2014-09-10

We describe Globus Genomics, a system that we have developed for rapid analysis of large quantities of next-generation sequencing (NGS) genomic data. This system achieves a high degree of end-to-end automation that encompasses every stage of data analysis including initial data retrieval from remote sequencing centers or storage (via the Globus file transfer system); specification, configuration, and reuse of multi-step processing pipelines (via the Galaxy workflow system); creation of custom Amazon Machine Images and on-demand resource acquisition via a specialized elastic provisioner (on Amazon EC2); and efficient scheduling of these pipelines over many processors (via the HTCondor scheduler). The system allows biomedical researchers to perform rapid analysis of large NGS datasets in a fully automated manner, without software installation or a need for any local computing infrastructure. We report performance and cost results for some representative workloads.
MOLNs: A CLOUD PLATFORM FOR INTERACTIVE, REPRODUCIBLE, AND SCALABLE SPATIAL STOCHASTIC COMPUTATIONAL EXPERIMENTS IN SYSTEMS BIOLOGY USING PyURDME.

PubMed

Drawert, Brian; Trogdon, Michael; Toor, Salman; Petzold, Linda; Hellander, Andreas

2016-01-01

Computational experiments using spatial stochastic simulations have led to important new biological insights, but they require specialized tools and a complex software stack, as well as large and scalable compute and data analysis resources due to the large computational cost associated with Monte Carlo computational workflows. The complexity of setting up and managing a large-scale distributed computation environment to support productive and reproducible modeling can be prohibitive for practitioners in systems biology. This results in a barrier to the adoption of spatial stochastic simulation tools, effectively limiting the type of biological questions addressed by quantitative modeling. In this paper, we present PyURDME, a new, user-friendly spatial modeling and simulation package, and MOLNs, a cloud computing appliance for distributed simulation of stochastic reaction-diffusion models. MOLNs is based on IPython and provides an interactive programming platform for development of sharable and reproducible distributed parallel computational experiments.
High-throughput bioinformatics with the Cyrille2 pipeline system

PubMed Central

Fiers, Mark WEJ; van der Burgt, Ate; Datema, Erwin; de Groot, Joost CW; van Ham, Roeland CHJ

2008-01-01

Background Modern omics research involves the application of high-throughput technologies that generate vast volumes of data. These data need to be pre-processed, analyzed and integrated with existing knowledge through the use of diverse sets of software tools, models and databases. The analyses are often interdependent and chained together to form complex workflows or pipelines. Given the volume of the data used and the multitude of computational resources available, specialized pipeline software is required to make high-throughput analysis of large-scale omics datasets feasible. Results We have developed a generic pipeline system called Cyrille2. The system is modular in design and consists of three functionally distinct parts: 1) a web based, graphical user interface (GUI) that enables a pipeline operator to manage the system; 2) the Scheduler, which forms the functional core of the system and which tracks what data enters the system and determines what jobs must be scheduled for execution, and; 3) the Executor, which searches for scheduled jobs and executes these on a compute cluster. Conclusion The Cyrille2 system is an extensible, modular system, implementing the stated requirements. Cyrille2 enables easy creation and execution of high throughput, flexible bioinformatics pipelines. PMID:18269742
webpic: A flexible web application for collecting distance and count measurements from images

PubMed Central

2018-01-01

Despite increasing ability to store and analyze large amounts of data for organismal and ecological studies, the process of collecting distance and count measurements from images has largely remained time consuming and error-prone, particularly for tasks for which automation is difficult or impossible. Improving the efficiency of these tasks, which allows for more high quality data to be collected in a shorter amount of time, is therefore a high priority. The open-source web application, webpic, implements common web languages and widely available libraries and productivity apps to streamline the process of collecting distance and count measurements from images. In this paper, I introduce the framework of webpic and demonstrate one readily available feature of this application, linear measurements, using fossil leaf specimens. This application fills the gap between workflows accomplishable by individuals through existing software and those accomplishable by large, unmoderated crowds. It demonstrates that flexible web languages can be used to streamline time-intensive research tasks without the use of specialized equipment or proprietary software and highlights the potential for web resources to facilitate data collection in research tasks and outreach activities with improved efficiency. PMID:29608592
Advantages and Disadvantages of 1-Incision, 2-Incision, 3-Incision, and 4-Incision Laparoscopic Cholecystectomy: A Workflow Comparison Study.

PubMed

Bartnicka, Joanna; Zietkiewicz, Agnieszka A; Kowalski, Grzegorz J

2016-08-01

A comparison of 1-port, 2-port, 3-port, and 4-port laparoscopic cholecystectomy techniques from the point of view of workflow criteria was made to both identify specific workflow components that can cause surgical disturbances and indicate good and bad practices. As a case study, laparoscopic cholecystectomies, including manual tasks and interactions within teamwork members, were video-recorded and analyzed on the basis of specially encoded workflow information. The parameters for comparison were defined as follows: surgery time, tool and hand activeness, operator's passive work, collisions, and operator interventions. It was found that 1-port cholecystectomy is the worst technique because of nonergonomic body position, technical complexity, organizational anomalies, and operational dynamism. The differences between laparoscopic techniques are closely linked to the costs of the medical procedures. Hence, knowledge about the surgical workflow can be used for both planning surgical procedures and balancing the expenses associated with surgery.
Workflow computing. Improving management and efficiency of pathology diagnostic services.

PubMed

Buffone, G J; Moreau, D; Beck, J R

1996-04-01

Traditionally, information technology in health care has helped practitioners to collect, store, and present information and also to add a degree of automation to simple tasks (instrument interfaces supporting result entry, for example). Thus commercially available information systems do little to support the need to model, execute, monitor, coordinate, and revise the various complex clinical processes required to support health-care delivery. Workflow computing, which is already implemented and improving the efficiency of operations in several nonmedical industries, can address the need to manage complex clinical processes. Workflow computing not only provides a means to define and manage the events, roles, and information integral to health-care delivery but also supports the explicit implementation of policy or rules appropriate to the process. This article explains how workflow computing may be applied to health-care and the inherent advantages of the technology, and it defines workflow system requirements for use in health-care delivery with special reference to diagnostic pathology.
Methylation Integration (Mint) | Informatics Technology for Cancer Research (ITCR)

Cancer.gov

A comprehensive software pipeline and set of Galaxy tools/workflows for integrative analysis of genome-wide DNA methylation and hydroxymethylation data. Data types can be either bisulfite sequencing and/or pull-down methods.
Clinical data miner: an electronic case report form system with integrated data preprocessing and machine-learning libraries supporting clinical diagnostic model research.

PubMed

Installé, Arnaud Jf; Van den Bosch, Thierry; De Moor, Bart; Timmerman, Dirk

2014-10-20

Using machine-learning techniques, clinical diagnostic model research extracts diagnostic models from patient data. Traditionally, patient data are often collected using electronic Case Report Form (eCRF) systems, while mathematical software is used for analyzing these data using machine-learning techniques. Due to the lack of integration between eCRF systems and mathematical software, extracting diagnostic models is a complex, error-prone process. Moreover, due to the complexity of this process, it is usually only performed once, after a predetermined number of data points have been collected, without insight into the predictive performance of the resulting models. The objective of the study of Clinical Data Miner (CDM) software framework is to offer an eCRF system with integrated data preprocessing and machine-learning libraries, improving efficiency of the clinical diagnostic model research workflow, and to enable optimization of patient inclusion numbers through study performance monitoring. The CDM software framework was developed using a test-driven development (TDD) approach, to ensure high software quality. Architecturally, CDM's design is split over a number of modules, to ensure future extendability. The TDD approach has enabled us to deliver high software quality. CDM's eCRF Web interface is in active use by the studies of the International Endometrial Tumor Analysis consortium, with over 4000 enrolled patients, and more studies planned. Additionally, a derived user interface has been used in six separate interrater agreement studies. CDM's integrated data preprocessing and machine-learning libraries simplify some otherwise manual and error-prone steps in the clinical diagnostic model research workflow. Furthermore, CDM's libraries provide study coordinators with a method to monitor a study's predictive performance as patient inclusions increase. To our knowledge, CDM is the only eCRF system integrating data preprocessing and machine-learning libraries. This integration improves the efficiency of the clinical diagnostic model research workflow. Moreover, by simplifying the generation of learning curves, CDM enables study coordinators to assess more accurately when data collection can be terminated, resulting in better models or lower patient recruitment costs.
Dynamic reusable workflows for ocean science

USGS Publications Warehouse

Signell, Richard; Fernandez, Filipe; Wilcox, Kyle

2016-01-01

Digital catalogs of ocean data have been available for decades, but advances in standardized services and software for catalog search and data access make it now possible to create catalog-driven workflows that automate — end-to-end — data search, analysis and visualization of data from multiple distributed sources. Further, these workflows may be shared, reused and adapted with ease. Here we describe a workflow developed within the US Integrated Ocean Observing System (IOOS) which automates the skill-assessment of water temperature forecasts from multiple ocean forecast models, allowing improved forecast products to be delivered for an open water swim event. A series of Jupyter Notebooks are used to capture and document the end-to-end workflow using a collection of Python tools that facilitate working with standardized catalog and data services. The workflow first searches a catalog of metadata using the Open Geospatial Consortium (OGC) Catalog Service for the Web (CSW), then accesses data service endpoints found in the metadata records using the OGC Sensor Observation Service (SOS) for in situ sensor data and OPeNDAP services for remotely-sensed and model data. Skill metrics are computed and time series comparisons of forecast model and observed data are displayed interactively, leveraging the capabilities of modern web browsers. The resulting workflow not only solves a challenging specific problem, but highlights the benefits of dynamic, reusable workflows in general. These workflows adapt as new data enters the data system, facilitate reproducible science, provide templates from which new scientific workflows can be developed, and encourage data providers to use standardized services. As applied to the ocean swim event, the workflow exposed problems with two of the ocean forecast products which led to improved regional forecasts once errors were corrected. While the example is specific, the approach is general, and we hope to see increased use of dynamic notebooks across the geoscience domains.

GO2OGS 1.0: a versatile workflow to integrate complex geological information with fault data into numerical simulation models

NASA Astrophysics Data System (ADS)

Fischer, T.; Naumov, D.; Sattler, S.; Kolditz, O.; Walther, M.

2015-11-01

We offer a versatile workflow to convert geological models built with the ParadigmTM GOCAD© (Geological Object Computer Aided Design) software into the open-source VTU (Visualization Toolkit unstructured grid) format for usage in numerical simulation models. Tackling relevant scientific questions or engineering tasks often involves multidisciplinary approaches. Conversion workflows are needed as a way of communication between the diverse tools of the various disciplines. Our approach offers an open-source, platform-independent, robust, and comprehensible method that is potentially useful for a multitude of environmental studies. With two application examples in the Thuringian Syncline, we show how a heterogeneous geological GOCAD model including multiple layers and faults can be used for numerical groundwater flow modeling, in our case employing the OpenGeoSys open-source numerical toolbox for groundwater flow simulations. The presented workflow offers the chance to incorporate increasingly detailed data, utilizing the growing availability of computational power to simulate numerical models.
Decaf: Decoupled Dataflows for In Situ High-Performance Workflows

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dreher, M.; Peterka, T.

Decaf is a dataflow system for the parallel communication of coupled tasks in an HPC workflow. The dataflow can perform arbitrary data transformations ranging from simply forwarding data to complex data redistribution. Decaf does this by allowing the user to allocate resources and execute custom code in the dataflow. All communication through the dataflow is efficient parallel message passing over MPI. The runtime for calling tasks is entirely message-driven; Decaf executes a task when all messages for the task have been received. Such a messagedriven runtime allows cyclic task dependencies in the workflow graph, for example, to enact computational steeringmore » based on the result of downstream tasks. Decaf includes a simple Python API for describing the workflow graph. This allows Decaf to stand alone as a complete workflow system, but Decaf can also be used as the dataflow layer by one or more other workflow systems to form a heterogeneous task-based computing environment. In one experiment, we couple a molecular dynamics code with a visualization tool using the FlowVR and Damaris workflow systems and Decaf for the dataflow. In another experiment, we test the coupling of a cosmology code with Voronoi tessellation and density estimation codes using MPI for the simulation, the DIY programming model for the two analysis codes, and Decaf for the dataflow. Such workflows consisting of heterogeneous software infrastructures exist because components are developed separately with different programming models and runtimes, and this is the first time that such heterogeneous coupling of diverse components was demonstrated in situ on HPC systems.« less
Experiences and lessons learned from creating a generalized workflow for data publication of field campaign datasets

NASA Astrophysics Data System (ADS)

Santhana Vannan, S. K.; Ramachandran, R.; Deb, D.; Beaty, T.; Wright, D.

2017-12-01

This paper summarizes the workflow challenges of curating and publishing data produced from disparate data sources and provides a generalized workflow solution to efficiently archive data generated by researchers. The Oak Ridge National Laboratory Distributed Active Archive Center (ORNL DAAC) for biogeochemical dynamics and the Global Hydrology Resource Center (GHRC) DAAC have been collaborating on the development of a generalized workflow solution to efficiently manage the data publication process. The generalized workflow presented here are built on lessons learned from implementations of the workflow system. Data publication consists of the following steps: Accepting the data package from the data providers, ensuring the full integrity of the data files. Identifying and addressing data quality issues Assembling standardized, detailed metadata and documentation, including file level details, processing methodology, and characteristics of data files Setting up data access mechanisms Setup of the data in data tools and services for improved data dissemination and user experience Registering the dataset in online search and discovery catalogues Preserving the data location through Digital Object Identifiers (DOI) We will describe the steps taken to automate, and realize efficiencies to the above process. The goals of the workflow system are to reduce the time taken to publish a dataset, to increase the quality of documentation and metadata, and to track individual datasets through the data curation process. Utilities developed to achieve these goal will be described. We will also share metrics driven value of the workflow system and discuss the future steps towards creation of a common software framework.
Structured recording of intraoperative surgical workflows

NASA Astrophysics Data System (ADS)

Neumuth, T.; Durstewitz, N.; Fischer, M.; Strauss, G.; Dietz, A.; Meixensberger, J.; Jannin, P.; Cleary, K.; Lemke, H. U.; Burgert, O.

2006-03-01

Surgical Workflows are used for the methodical and scientific analysis of surgical interventions. The approach described here is a step towards developing surgical assist systems based on Surgical Workflows and integrated control systems for the operating room of the future. This paper describes concepts and technologies for the acquisition of Surgical Workflows by monitoring surgical interventions and their presentation. Establishing systems which support the Surgical Workflow in operating rooms requires a multi-staged development process beginning with the description of these workflows. A formalized description of surgical interventions is needed to create a Surgical Workflow. This description can be used to analyze and evaluate surgical interventions in detail. We discuss the subdivision of surgical interventions into work steps regarding different levels of granularity and propose a recording scheme for the acquisition of manual surgical work steps from running interventions. To support the recording process during the intervention, we introduce a new software architecture. Core of the architecture is our Surgical Workflow editor that is intended to deal with the manifold, complex and concurrent relations during an intervention. Furthermore, a method for an automatic generation of graphs is shown which is able to display the recorded surgical work steps of the interventions. Finally we conclude with considerations about extensions of our recording scheme to close the gap to S-PACS systems. The approach was used to record 83 surgical interventions from 6 intervention types from 3 different surgical disciplines: ENT surgery, neurosurgery and interventional radiology. The interventions were recorded at the University Hospital Leipzig, Germany and at the Georgetown University Hospital, Washington, D.C., USA.
HTAPP: High-Throughput Autonomous Proteomic Pipeline

PubMed Central

Yu, Kebing; Salomon, Arthur R.

2011-01-01

Recent advances in the speed and sensitivity of mass spectrometers and in analytical methods, the exponential acceleration of computer processing speeds, and the availability of genomic databases from an array of species and protein information databases have led to a deluge of proteomic data. The development of a lab-based automated proteomic software platform for the automated collection, processing, storage, and visualization of expansive proteomic datasets is critically important. The high-throughput autonomous proteomic pipeline (HTAPP) described here is designed from the ground up to provide critically important flexibility for diverse proteomic workflows and to streamline the total analysis of a complex proteomic sample. This tool is comprised of software that controls the acquisition of mass spectral data along with automation of post-acquisition tasks such as peptide quantification, clustered MS/MS spectral database searching, statistical validation, and data exploration within a user-configurable lab-based relational database. The software design of HTAPP focuses on accommodating diverse workflows and providing missing software functionality to a wide range of proteomic researchers to accelerate the extraction of biological meaning from immense proteomic data sets. Although individual software modules in our integrated technology platform may have some similarities to existing tools, the true novelty of the approach described here is in the synergistic and flexible combination of these tools to provide an integrated and efficient analysis of proteomic samples. PMID:20336676
Using conceptual work products of health care to design health IT.

PubMed

Berry, Andrew B L; Butler, Keith A; Harrington, Craig; Braxton, Melissa O; Walker, Amy J; Pete, Nikki; Johnson, Trevor; Oberle, Mark W; Haselkorn, Jodie; Paul Nichol, W; Haselkorn, Mark

2016-02-01

This paper introduces a new, model-based design method for interactive health information technology (IT) systems. This method extends workflow models with models of conceptual work products. When the health care work being modeled is substantially cognitive, tacit, and complex in nature, graphical workflow models can become too complex to be useful to designers. Conceptual models complement and simplify workflows by providing an explicit specification for the information product they must produce. We illustrate how conceptual work products can be modeled using standard software modeling language, which allows them to provide fundamental requirements for what the workflow must accomplish and the information that a new system should provide. Developers can use these specifications to envision how health IT could enable an effective cognitive strategy as a workflow with precise information requirements. We illustrate the new method with a study conducted in an outpatient multiple sclerosis (MS) clinic. This study shows specifically how the different phases of the method can be carried out, how the method allows for iteration across phases, and how the method generated a health IT design for case management of MS that is efficient and easy to use. Copyright © 2015 Elsevier Inc. All rights reserved.
Conventions and workflows for using Situs

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wriggers, Willy, E-mail: wriggers@biomachina.org

2012-04-01

Recent developments of the Situs software suite for multi-scale modeling are reviewed. Typical workflows and conventions encountered during processing of biophysical data from electron microscopy, tomography or small-angle X-ray scattering are described. Situs is a modular program package for the multi-scale modeling of atomic resolution structures and low-resolution biophysical data from electron microscopy, tomography or small-angle X-ray scattering. This article provides an overview of recent developments in the Situs package, with an emphasis on workflows and conventions that are important for practical applications. The modular design of the programs facilitates scripting in the bash shell that allows specific programs tomore » be combined in creative ways that go beyond the original intent of the developers. Several scripting-enabled functionalities, such as flexible transformations of data type, the use of symmetry constraints or the creation of two-dimensional projection images, are described. The processing of low-resolution biophysical maps in such workflows follows not only first principles but often relies on implicit conventions. Situs conventions related to map formats, resolution, correlation functions and feature detection are reviewed and summarized. The compatibility of the Situs workflow with CCP4 conventions and programs is discussed.« less
An end-to-end workflow for engineering of biological networks from high-level specifications.

PubMed

Beal, Jacob; Weiss, Ron; Densmore, Douglas; Adler, Aaron; Appleton, Evan; Babb, Jonathan; Bhatia, Swapnil; Davidsohn, Noah; Haddock, Traci; Loyall, Joseph; Schantz, Richard; Vasilev, Viktor; Yaman, Fusun

2012-08-17

We present a workflow for the design and production of biological networks from high-level program specifications. The workflow is based on a sequence of intermediate models that incrementally translate high-level specifications into DNA samples that implement them. We identify algorithms for translating between adjacent models and implement them as a set of software tools, organized into a four-stage toolchain: Specification, Compilation, Part Assignment, and Assembly. The specification stage begins with a Boolean logic computation specified in the Proto programming language. The compilation stage uses a library of network motifs and cellular platforms, also specified in Proto, to transform the program into an optimized Abstract Genetic Regulatory Network (AGRN) that implements the programmed behavior. The part assignment stage assigns DNA parts to the AGRN, drawing the parts from a database for the target cellular platform, to create a DNA sequence implementing the AGRN. Finally, the assembly stage computes an optimized assembly plan to create the DNA sequence from available part samples, yielding a protocol for producing a sample of engineered plasmids with robotics assistance. Our workflow is the first to automate the production of biological networks from a high-level program specification. Furthermore, the workflow's modular design allows the same program to be realized on different cellular platforms simply by swapping workflow configurations. We validated our workflow by specifying a small-molecule sensor-reporter program and verifying the resulting plasmids in both HEK 293 mammalian cells and in E. coli bacterial cells.
STSE: Spatio-Temporal Simulation Environment Dedicated to Biology.

PubMed

Stoma, Szymon; Fröhlich, Martina; Gerber, Susanne; Klipp, Edda

2011-04-28

Recently, the availability of high-resolution microscopy together with the advancements in the development of biomarkers as reporters of biomolecular interactions increased the importance of imaging methods in molecular cell biology. These techniques enable the investigation of cellular characteristics like volume, size and geometry as well as volume and geometry of intracellular compartments, and the amount of existing proteins in a spatially resolved manner. Such detailed investigations opened up many new areas of research in the study of spatial, complex and dynamic cellular systems. One of the crucial challenges for the study of such systems is the design of a well stuctured and optimized workflow to provide a systematic and efficient hypothesis verification. Computer Science can efficiently address this task by providing software that facilitates handling, analysis, and evaluation of biological data to the benefit of experimenters and modelers. The Spatio-Temporal Simulation Environment (STSE) is a set of open-source tools provided to conduct spatio-temporal simulations in discrete structures based on microscopy images. The framework contains modules to digitize, represent, analyze, and mathematically model spatial distributions of biochemical species. Graphical user interface (GUI) tools provided with the software enable meshing of the simulation space based on the Voronoi concept. In addition, it supports to automatically acquire spatial information to the mesh from the images based on pixel luminosity (e.g. corresponding to molecular levels from microscopy images). STSE is freely available either as a stand-alone version or included in the linux live distribution Systems Biology Operational Software (SB.OS) and can be downloaded from http://www.stse-software.org/. The Python source code as well as a comprehensive user manual and video tutorials are also offered to the research community. We discuss main concepts of the STSE design and workflow. We demonstrate it's usefulness using the example of a signaling cascade leading to formation of a morphological gradient of Fus3 within the cytoplasm of the mating yeast cell Saccharomyces cerevisiae. STSE is an efficient and powerful novel platform, designed for computational handling and evaluation of microscopic images. It allows for an uninterrupted workflow including digitization, representation, analysis, and mathematical modeling. By providing the means to relate the simulation to the image data it allows for systematic, image driven model validation or rejection. STSE can be scripted and extended using the Python language. STSE should be considered rather as an API together with workflow guidelines and a collection of GUI tools than a stand alone application. The priority of the project is to provide an easy and intuitive way of extending and customizing software using the Python language.
Enhancing GIS Capabilities for High Resolution Earth Science Grids

NASA Astrophysics Data System (ADS)

Koziol, B. W.; Oehmke, R.; Li, P.; O'Kuinghttons, R.; Theurich, G.; DeLuca, C.

2017-12-01

Applications for high performance GIS will continue to increase as Earth system models pursue more realistic representations of Earth system processes. Finer spatial resolution model input and output, unstructured or irregular modeling grids, data assimilation, and regional coordinate systems present novel challenges for GIS frameworks operating in the Earth system modeling domain. This presentation provides an overview of two GIS-driven applications that combine high performance software with big geospatial datasets to produce value-added tools for the modeling and geoscientific community. First, a large-scale interpolation experiment using National Hydrography Dataset (NHD) catchments, a high resolution rectilinear CONUS grid, and the Earth System Modeling Framework's (ESMF) conservative interpolation capability will be described. ESMF is a parallel, high-performance software toolkit that provides capabilities (e.g. interpolation) for building and coupling Earth science applications. ESMF is developed primarily by the NOAA Environmental Software Infrastructure and Interoperability (NESII) group. The purpose of this experiment was to test and demonstrate the utility of high performance scientific software in traditional GIS domains. Special attention will be paid to the nuanced requirements for dealing with high resolution, unstructured grids in scientific data formats. Second, a chunked interpolation application using ESMF and OpenClimateGIS (OCGIS) will demonstrate how spatial subsetting can virtually remove computing resource ceilings for very high spatial resolution interpolation operations. OCGIS is a NESII-developed Python software package designed for the geospatial manipulation of high-dimensional scientific datasets. An overview of the data processing workflow, why a chunked approach is required, and how the application could be adapted to meet operational requirements will be discussed here. In addition, we'll provide a general overview of OCGIS's parallel subsetting capabilities including challenges in the design and implementation of a scientific data subsetter.
Integrating data from an online diabetes prevention program into an electronic health record and clinical workflow, a design phase usability study.

PubMed

Mishuris, Rebecca Grochow; Yoder, Jordan; Wilson, Dan; Mann, Devin

2016-07-11

Health information is increasingly being digitally stored and exchanged. The public is regularly collecting and storing health-related data on their own electronic devices and in the cloud. Diabetes prevention is an increasingly important preventive health measure, and diet and exercise are key components of this. Patients are turning to online programs to help them lose weight. Despite primary care physicians being important in patients' weight loss success, there is no exchange of information between the primary care provider (PCP) and these online weight loss programs. There is an emerging opportunity to integrate this data directly into the electronic health record (EHR), but little is known about what information to share or how to share it most effectively. This study aims to characterize the preferences of providers concerning the integration of externally generated lifestyle modification data into a primary care EHR workflow. We performed a qualitative study using two rounds of semi-structured interviews with primary care providers. We used an iterative design process involving primary care providers, health information technology software developers and health services researchers to develop the interface. Using grounded-theory thematic analysis 4 themes emerged from the interviews: 1) barriers to establishing healthy lifestyles, 2) features of a lifestyle modification program, 3) reporting of outcomes to the primary care provider, and 4) integration with primary care. These themes guided the rapid-cycle agile design process of an interface of data from an online diabetes prevention program into the primary care EHR workflow. The integration of external health-related data into the EHR must be embedded into the provider workflow in order to be useful to the provider and beneficial for the patient. Accomplishing this requires evaluation of that clinical workflow during software design. The development of this novel interface used rapid cycle iterative design, early involvement by providers, and usability testing methodology. This provides a framework for how to integrate external data into provider workflow in efficient and effective ways. There is now the potential to realize the importance of having this data available in the clinical setting for patient engagement and health outcomes.
High Resolution Topography of Polar Regions from Commercial Satellite Imagery, Petascale Computing and Open Source Software

NASA Astrophysics Data System (ADS)

Morin, Paul; Porter, Claire; Cloutier, Michael; Howat, Ian; Noh, Myoung-Jong; Willis, Michael; Kramer, WIlliam; Bauer, Greg; Bates, Brian; Williamson, Cathleen

2017-04-01

Surface topography is among the most fundamental data sets for geosciences, essential for disciplines ranging from glaciology to geodynamics. Two new projects are using sub-meter, commercial imagery licensed by the National Geospatial-Intelligence Agency and open source photogrammetry software to produce a time-tagged 2m posting elevation model of the Arctic and an 8m posting reference elevation model for the Antarctic. When complete, this publically available data will be at higher resolution than any elevation models that cover the entirety of the Western United States. These two polar projects are made possible due to three equally important factors: 1) open-source photogrammetry software, 2) petascale computing, and 3) sub-meter imagery licensed to the United States Government. Our talk will detail the technical challenges of using automated photogrammetry software; the rapid workflow evolution to allow DEM production; the task of deploying the workflow on one of the world's largest supercomputers; the trials of moving massive amounts of data, and the management strategies the team needed to solve in order to meet deadlines. Finally, we will discuss the implications of this type of collaboration for future multi-team use of leadership-class systems such as Blue Waters, and for further elevation mapping.
Geometric processing workflow for vertical and oblique hyperspectral frame images collected using UAV

NASA Astrophysics Data System (ADS)

Markelin, L.; Honkavaara, E.; Näsi, R.; Nurminen, K.; Hakala, T.

2014-08-01

Remote sensing based on unmanned airborne vehicles (UAVs) is a rapidly developing field of technology. UAVs enable accurate, flexible, low-cost and multiangular measurements of 3D geometric, radiometric, and temporal properties of land and vegetation using various sensors. In this paper we present a geometric processing chain for multiangular measurement system that is designed for measuring object directional reflectance characteristics in a wavelength range of 400-900 nm. The technique is based on a novel, lightweight spectral camera designed for UAV use. The multiangular measurement is conducted by collecting vertical and oblique area-format spectral images. End products of the geometric processing are image exterior orientations, 3D point clouds and digital surface models (DSM). This data is needed for the radiometric processing chain that produces reflectance image mosaics and multiangular bidirectional reflectance factor (BRF) observations. The geometric processing workflow consists of the following three steps: (1) determining approximate image orientations using Visual Structure from Motion (VisualSFM) software, (2) calculating improved orientations and sensor calibration using a method based on self-calibrating bundle block adjustment (standard photogrammetric software) (this step is optional), and finally (3) creating dense 3D point clouds and DSMs using Photogrammetric Surface Reconstruction from Imagery (SURE) software that is based on semi-global-matching algorithm and it is capable of providing a point density corresponding to the pixel size of the image. We have tested the geometric processing workflow over various targets, including test fields, agricultural fields, lakes and complex 3D structures like forests.
Software Prototyping

PubMed Central

Del Fiol, Guilherme; Hanseler, Haley; Crouch, Barbara Insley; Cummins, Mollie R.

2016-01-01

Summary Background Health information exchange (HIE) between Poison Control Centers (PCCs) and Emergency Departments (EDs) could improve care of poisoned patients. However, PCC information systems are not designed to facilitate HIE with EDs; therefore, we are developing specialized software to support HIE within the normal workflow of the PCC using user-centered design and rapid prototyping. Objective To describe the design of an HIE dashboard and the refinement of user requirements through rapid prototyping. Methods Using previously elicited user requirements, we designed low-fidelity sketches of designs on paper with iterative refinement. Next, we designed an interactive high-fidelity prototype and conducted scenario-based usability tests with end users. Users were asked to think aloud while accomplishing tasks related to a case vignette. After testing, the users provided feedback and evaluated the prototype using the System Usability Scale (SUS). Results Survey results from three users provided useful feedback that was then incorporated into the design. After achieving a stable design, we used the prototype itself as the specification for development of the actual software. Benefits of prototyping included having 1) subject-matter experts heavily involved with the design; 2) flexibility to make rapid changes, 3) the ability to minimize software development efforts early in the design stage; 4) rapid finalization of requirements; 5) early visualization of designs; 6) and a powerful vehicle for communication of the design to the programmers. Challenges included 1) time and effort to develop the prototypes and case scenarios; 2) no simulation of system performance; 3) not having all proposed functionality available in the final product; and 4) missing needed data elements in the PCC information system. PMID:27081404
GABBs: Cyberinfrastructure for Self-Service Geospatial Data Exploration, Computation, and Sharing

NASA Astrophysics Data System (ADS)

Song, C. X.; Zhao, L.; Biehl, L. L.; Merwade, V.; Villoria, N.

2016-12-01

Geospatial data are present everywhere today with the proliferation of location-aware computing devices. This is especially true in the scientific community where large amounts of data are driving research and education activities in many domains. Collaboration over geospatial data, for example, in modeling, data analysis and visualization, must still overcome the barriers of specialized software and expertise among other challenges. In addressing these needs, the Geospatial data Analysis Building Blocks (GABBs) project aims at building geospatial modeling, data analysis and visualization capabilities in an open source web platform, HUBzero. Funded by NSF's Data Infrastructure Building Blocks initiative, GABBs is creating a geospatial data architecture that integrates spatial data management, mapping and visualization, and interfaces in the HUBzero platform for scientific collaborations. The geo-rendering enabled Rappture toolkit, a generic Python mapping library, geospatial data exploration and publication tools, and an integrated online geospatial data management solution are among the software building blocks from the project. The GABBS software will be available through Amazon's AWS Marketplace VM images and open source. Hosting services are also available to the user community. The outcome of the project will enable researchers and educators to self-manage their scientific data, rapidly create GIS-enable tools, share geospatial data and tools on the web, and build dynamic workflows connecting data and tools, all without requiring significant software development skills, GIS expertise or IT administrative privileges. This presentation will describe the GABBs architecture, toolkits and libraries, and showcase the scientific use cases that utilize GABBs capabilities, as well as the challenges and solutions for GABBs to interoperate with other cyberinfrastructure platforms.
Structuring research methods and data with the research object model: genomics workflows as a case study.

PubMed

Hettne, Kristina M; Dharuri, Harish; Zhao, Jun; Wolstencroft, Katherine; Belhajjame, Khalid; Soiland-Reyes, Stian; Mina, Eleni; Thompson, Mark; Cruickshank, Don; Verdes-Montenegro, Lourdes; Garrido, Julian; de Roure, David; Corcho, Oscar; Klyne, Graham; van Schouwen, Reinout; 't Hoen, Peter A C; Bechhofer, Sean; Goble, Carole; Roos, Marco

2014-01-01

One of the main challenges for biomedical research lies in the computer-assisted integrative study of large and increasingly complex combinations of data in order to understand molecular mechanisms. The preservation of the materials and methods of such computational experiments with clear annotations is essential for understanding an experiment, and this is increasingly recognized in the bioinformatics community. Our assumption is that offering means of digital, structured aggregation and annotation of the objects of an experiment will provide necessary meta-data for a scientist to understand and recreate the results of an experiment. To support this we explored a model for the semantic description of a workflow-centric Research Object (RO), where an RO is defined as a resource that aggregates other resources, e.g., datasets, software, spreadsheets, text, etc. We applied this model to a case study where we analysed human metabolite variation by workflows. We present the application of the workflow-centric RO model for our bioinformatics case study. Three workflows were produced following recently defined Best Practices for workflow design. By modelling the experiment as an RO, we were able to automatically query the experiment and answer questions such as "which particular data was input to a particular workflow to test a particular hypothesis?", and "which particular conclusions were drawn from a particular workflow?". Applying a workflow-centric RO model to aggregate and annotate the resources used in a bioinformatics experiment, allowed us to retrieve the conclusions of the experiment in the context of the driving hypothesis, the executed workflows and their input data. The RO model is an extendable reference model that can be used by other systems as well. The Research Object is available at http://www.myexperiment.org/packs/428 The Wf4Ever Research Object Model is available at http://wf4ever.github.io/ro.
Abstracted Workow Framework with a Structure from Motion Application

NASA Astrophysics Data System (ADS)

Rossi, Adam J.

In scientific and engineering disciplines, from academia to industry, there is an increasing need for the development of custom software to perform experiments, construct systems, and develop products. The natural mindset initially is to shortcut and bypass all overhead and process rigor in order to obtain an immediate result for the problem at hand, with the misconception that the software will simply be thrown away at the end. In a majority of the cases, it turns out the software persists for many years, and likely ends up in production systems for which it was not initially intended. In the current study, a framework that can be used in both industry and academic applications mitigates underlying problems associated with developing scientific and engineering software. This results in software that is much more maintainable, documented, and usable by others, specifically allowing new users to extend capabilities of components already implemented in the framework. There is a multi-disciplinary need in the fields of imaging science, computer science, and software engineering for a unified implementation model, which motivates the development of an abstracted software framework. Structure from motion (SfM) has been identified as one use case where the abstracted workflow framework can improve research efficiencies and eliminate implementation redundancies in scientific fields. The SfM process begins by obtaining 2D images of a scene from different perspectives. Features from the images are extracted and correspondences are established. This provides a sufficient amount of information to initialize the problem for fully automated processing. Transformations are established between views, and 3D points are established via triangulation algorithms. The parameters for the camera models for all views / images are solved through bundle adjustment, establishing a highly consistent point cloud. The initial sparse point cloud and camera matrices are used to generate a dense point cloud through patch based techniques or densification algorithms such as Semi-Global Matching (SGM). The point cloud can be visualized or exploited by both humans and automated techniques. In some cases the point cloud is "draped" with original imagery in order to enhance the 3D model for a human viewer. The SfM workflow can be implemented in the abstracted framework, making it easily leverageable and extensible by multiple users. Like many processes in scientific and engineering domains, the workflow described for SfM is complex and requires many disparate components to form a functional system, often utilizing algorithms implemented by many users in different languages / environments and without knowledge of how the component fits into the larger system. In practice, this generally leads to issues interfacing the components, building the software for desired platforms, understanding its concept of operations, and how it can be manipulated in order to fit the desired function for a particular application. In addition, other scientists and engineers instinctively wish to analyze the performance of the system, establish new algorithms, optimize existing processes, and establish new functionality based on current research. This requires a framework whereby new components can be easily plugged in without affecting the current implemented functionality. The need for a universal programming environment establishes the motivation for the development of the abstracted workflow framework. This software implementation, named Catena, provides base classes from which new components must derive in order to operate within the framework. The derivation mandates requirements be satisfied in order to provide a complete implementation. Additionally, the developer must provide documentation of the component in terms of its overall function and inputs. The interface input and output values corresponding to the component must be defined in terms of their respective data types, and the implementation uses mechanisms within the framework to retrieve and send the values. This process requires the developer to componentize their algorithm rather than implement it monolithically. Although the requirements of the developer are slightly greater, the benefits realized from using Catena far outweigh the overhead, and results in extensible software. This thesis provides a basis for the abstracted workflow framework concept and the Catena software implementation. The benefits are also illustrated using a detailed examination of the SfM process as an example application.
Git Replacement for the

DOE Office of Scientific and Technical Information (OSTI.GOV)

Robinson, P.

2014-09-23

GRAPE is a tool for managing software project workflows for the Git version control system. It provides a suite of tools to simplify and configure branch based development, integration with a project's testing suite, and integration with the Atlassian Stash repository hosting tool.
Hydrography for the non-Hydrographer: A Paradigm shift in Data Processing

NASA Astrophysics Data System (ADS)

Malzone, C.; Bruce, S.

2017-12-01

Advancements in technology have led to overall systematic improvements including; hardware design, software architecture, data transmission/ telepresence. Historically, utilization of this technology has required a high knowledge level obtained with many years of experience, training and/or education. High training costs are incurred to achieve and maintain an acceptable level proficiency within an organization. Recently, engineers have developed off-the-shelf software technology called Qimera that has simplified the processing of hydrographic data. The core technology is centered around the isolation of tasks within the work- flow to capitalize on the technological advances in computing technology to automate the mundane error prone tasks to bring more value to the stages in which the human brain brings value. Key design features include: guided workflow, transcription automation, processing state management, real-time QA, dynamic workflow for validation, collaborative cleaning and production line processing. Since, Qimera is designed to guide the user, it allows expedition leaders to focus on science while providing an educational opportunity for students to quickly learn the hydrographic processing workflow including ancillary data analysis, trouble-shooting, calibration and cleaning. This paper provides case studies on how Qimera is currently implemented in scientific expeditions, benefits of implementation and how it is directing the future of on-board research for the non-hydrographer.
DNAseq Workflow in a Diagnostic Context and an Example of a User Friendly Implementation.

PubMed

Wolf, Beat; Kuonen, Pierre; Dandekar, Thomas; Atlan, David

2015-01-01

Over recent years next generation sequencing (NGS) technologies evolved from costly tools used by very few, to a much more accessible and economically viable technology. Through this recently gained popularity, its use-cases expanded from research environments into clinical settings. But the technical know-how and infrastructure required to analyze the data remain an obstacle for a wider adoption of this technology, especially in smaller laboratories. We present GensearchNGS, a commercial DNAseq software suite distributed by Phenosystems SA. The focus of GensearchNGS is the optimal usage of already existing infrastructure, while keeping its use simple. This is achieved through the integration of existing tools in a comprehensive software environment, as well as custom algorithms developed with the restrictions of limited infrastructures in mind. This includes the possibility to connect multiple computers to speed up computing intensive parts of the analysis such as sequence alignments. We present a typical DNAseq workflow for NGS data analysis and the approach GensearchNGS takes to implement it. The presented workflow goes from raw data quality control to the final variant report. This includes features such as gene panels and the integration of online databases, like Ensembl for annotations or Cafe Variome for variant sharing.

Bioinformatic Workflows for Generating Complete Plastid Genome Sequences-An Example from Cabomba (Cabombaceae) in the Context of the Phylogenomic Analysis of the Water-Lily Clade.

PubMed

Gruenstaeudl, Michael; Gerschler, Nico; Borsch, Thomas

2018-06-21

The sequencing and comparison of plastid genomes are becoming a standard method in plant genomics, and many researchers are using this approach to infer plant phylogenetic relationships. Due to the widespread availability of next-generation sequencing, plastid genome sequences are being generated at breakneck pace. This trend towards massive sequencing of plastid genomes highlights the need for standardized bioinformatic workflows. In particular, documentation and dissemination of the details of genome assembly, annotation, alignment and phylogenetic tree inference are needed, as these processes are highly sensitive to the choice of software and the precise settings used. Here, we present the procedure and results of sequencing, assembling, annotating and quality-checking of three complete plastid genomes of the aquatic plant genus Cabomba as well as subsequent gene alignment and phylogenetic tree inference. We accompany our findings by a detailed description of the bioinformatic workflow employed. Importantly, we share a total of eleven software scripts for each of these bioinformatic processes, enabling other researchers to evaluate and replicate our analyses step by step. The results of our analyses illustrate that the plastid genomes of Cabomba are highly conserved in both structure and gene content.
msCompare: A Framework for Quantitative Analysis of Label-free LC-MS Data for Comparative Candidate Biomarker Studies*

PubMed Central

Hoekman, Berend; Breitling, Rainer; Suits, Frank; Bischoff, Rainer; Horvatovich, Peter

2012-01-01

Data processing forms an integral part of biomarker discovery and contributes significantly to the ultimate result. To compare and evaluate various publicly available open source label-free data processing workflows, we developed msCompare, a modular framework that allows the arbitrary combination of different feature detection/quantification and alignment/matching algorithms in conjunction with a novel scoring method to evaluate their overall performance. We used msCompare to assess the performance of workflows built from modules of publicly available data processing packages such as SuperHirn, OpenMS, and MZmine and our in-house developed modules on peptide-spiked urine and trypsin-digested cerebrospinal fluid (CSF) samples. We found that the quality of results varied greatly among workflows, and interestingly, heterogeneous combinations of algorithms often performed better than the homogenous workflows. Our scoring method showed that the union of feature matrices of different workflows outperformed the original homogenous workflows in some cases. msCompare is open source software (https://trac.nbic.nl/mscompare), and we provide a web-based data processing service for our framework by integration into the Galaxy server of the Netherlands Bioinformatics Center (http://galaxy.nbic.nl/galaxy) to allow scientists to determine which combination of modules provides the most accurate processing for their particular LC-MS data sets. PMID:22318370
a Standardized Approach to Topographic Data Processing and Workflow Management

NASA Astrophysics Data System (ADS)

Wheaton, J. M.; Bailey, P.; Glenn, N. F.; Hensleigh, J.; Hudak, A. T.; Shrestha, R.; Spaete, L.

2013-12-01

An ever-increasing list of options exist for collecting high resolution topographic data, including airborne LIDAR, terrestrial laser scanners, bathymetric SONAR and structure-from-motion. An equally rich, arguably overwhelming, variety of tools exists with which to organize, quality control, filter, analyze and summarize these data. However, scientists are often left to cobble together their analysis as a series of ad hoc steps, often using custom scripts and one-time processes that are poorly documented and rarely shared with the community. Even when literature-cited software tools are used, the input and output parameters differ from tool to tool. These parameters are rarely archived and the steps performed lost, making the analysis virtually impossible to replicate precisely. What is missing is a coherent, robust, framework for combining reliable, well-documented topographic data-processing steps into a workflow that can be repeated and even shared with others. We have taken several popular topographic data processing tools - including point cloud filtering and decimation as well as DEM differencing - and defined a common protocol for passing inputs and outputs between them. This presentation describes a free, public online portal that enables scientists to create custom workflows for processing topographic data using a number of popular topographic processing tools. Users provide the inputs required for each tool and in what sequence they want to combine them. This information is then stored for future reuse (and optionally sharing with others) before the user then downloads a single package that contains all the input and output specifications together with the software tools themselves. The user then launches the included batch file that executes the workflow on their local computer against their topographic data. This ZCloudTools architecture helps standardize, automate and archive topographic data processing. It also represents a forum for discovering and sharing effective topographic processing workflows.
ESO Reflex: A Graphical Workflow Engine for Data Reduction

NASA Astrophysics Data System (ADS)

Hook, R.; Romaniello, M.; Péron, M.; Ballester, P.; Gabasch, A.; Izzo, C.; Ullgrén, M.; Maisala, S.; Oittinen, T.; Solin, O.; Savolainen, V.; Järveläinen, P.; Tyynelä, J.

2008-08-01

Sampo {http://www.eso.org/sampo} (Hook et al. 2005) is a project led by ESO and conducted by a software development team from Finland as an in-kind contribution to joining ESO. The goal is to assess the needs of the ESO community in the area of data reduction environments and to create pilot software products that illustrate critical steps along the road to a new system. Those prototypes will not only be used to validate concepts and understand requirements but will also be tools of immediate value for the community. Most of the raw data produced by ESO instruments can be reduced using CPL {http://www.eso.org/cpl} recipes: compiled C programs following an ESO standard and utilizing routines provided by the Common Pipeline Library. Currently reduction recipes are run in batch mode as part of the data flow system to generate the input to the ESO VLT/VLTI quality control process and are also made public for external users. Sampo has developed a prototype application called ESO Reflex {http://www.eso.org/sampo/reflex/} that integrates a graphical user interface and existing data reduction algorithms. ESO Reflex can invoke CPL-based recipes in a flexible way through a dedicated interface. ESO Reflex is based on the graphical workflow engine Taverna {http://taverna.sourceforge.net} that was originally developed by the UK eScience community, mostly for work in the life sciences. Workflows have been created so far for three VLT/VLTI instrument modes ( VIMOS/IFU {http://www.eso.org/instruments/vimos/}, FORS spectroscopy {http://www.eso.org/instruments/fors/} and AMBER {http://www.eso.org/instruments/amber/}), and the easy-to-use GUI allows the user to make changes to these or create workflows of their own. Python scripts and IDL procedures can be easily brought into workflows and a variety of visualisation and display options, including custom product inspection and validation steps, are available.
qPortal: A platform for data-driven biomedical research.

PubMed

Mohr, Christopher; Friedrich, Andreas; Wojnar, David; Kenar, Erhan; Polatkan, Aydin Can; Codrea, Marius Cosmin; Czemmel, Stefan; Kohlbacher, Oliver; Nahnsen, Sven

2018-01-01

Modern biomedical research aims at drawing biological conclusions from large, highly complex biological datasets. It has become common practice to make extensive use of high-throughput technologies that produce big amounts of heterogeneous data. In addition to the ever-improving accuracy, methods are getting faster and cheaper, resulting in a steadily increasing need for scalable data management and easily accessible means of analysis. We present qPortal, a platform providing users with an intuitive way to manage and analyze quantitative biological data. The backend leverages a variety of concepts and technologies, such as relational databases, data stores, data models and means of data transfer, as well as front-end solutions to give users access to data management and easy-to-use analysis options. Users are empowered to conduct their experiments from the experimental design to the visualization of their results through the platform. Here, we illustrate the feature-rich portal by simulating a biomedical study based on publically available data. We demonstrate the software's strength in supporting the entire project life cycle. The software supports the project design and registration, empowers users to do all-digital project management and finally provides means to perform analysis. We compare our approach to Galaxy, one of the most widely used scientific workflow and analysis platforms in computational biology. Application of both systems to a small case study shows the differences between a data-driven approach (qPortal) and a workflow-driven approach (Galaxy). qPortal, a one-stop-shop solution for biomedical projects offers up-to-date analysis pipelines, quality control workflows, and visualization tools. Through intensive user interactions, appropriate data models have been developed. These models build the foundation of our biological data management system and provide possibilities to annotate data, query metadata for statistics and future re-analysis on high-performance computing systems via coupling of workflow management systems. Integration of project and data management as well as workflow resources in one place present clear advantages over existing solutions.
Automated reporting of pharmacokinetic study results: gaining efficiency downstream from the laboratory.

PubMed

Schaefer, Peter

2011-07-01

The purpose of bioanalysis in the pharmaceutical industry is to provide 'raw' data about the concentration of a drug candidate and its metabolites as input for studies of drug properties such as pharmacokinetic (PK), toxicokinetic, bioavailability/bioequivalence and other studies. Building a seamless workflow from the laboratory to final reports is an ongoing challenge for IT groups and users alike. In such a workflow, PK automation can provide companies with the means to vastly increase the productivity of their scientific staff while improving the quality and consistency of their reports on PK analyses. This report presents the concept and benefits of PK automation and discuss which features of an automated reporting workflow should be translated into software requirements that pharmaceutical companies can use to select or build an efficient and effective PK automation solution that best meets their needs.
Opportunistic Computing with Lobster: Lessons Learned from Scaling up to 25k Non-Dedicated Cores

NASA Astrophysics Data System (ADS)

Wolf, Matthias; Woodard, Anna; Li, Wenzhao; Hurtado Anampa, Kenyi; Yannakopoulos, Anna; Tovar, Benjamin; Donnelly, Patrick; Brenner, Paul; Lannon, Kevin; Hildreth, Mike; Thain, Douglas

2017-10-01

We previously described Lobster, a workflow management tool for exploiting volatile opportunistic computing resources for computation in HEP. We will discuss the various challenges that have been encountered while scaling up the simultaneous CPU core utilization and the software improvements required to overcome these challenges. Categories: Workflows can now be divided into categories based on their required system resources. This allows the batch queueing system to optimize assignment of tasks to nodes with the appropriate capabilities. Within each category, limits can be specified for the number of running jobs to regulate the utilization of communication bandwidth. System resource specifications for a task category can now be modified while a project is running, avoiding the need to restart the project if resource requirements differ from the initial estimates. Lobster now implements time limits on each task category to voluntarily terminate tasks. This allows partially completed work to be recovered. Workflow dependency specification: One workflow often requires data from other workflows as input. Rather than waiting for earlier workflows to be completed before beginning later ones, Lobster now allows dependent tasks to begin as soon as sufficient input data has accumulated. Resource monitoring: Lobster utilizes a new capability in Work Queue to monitor the system resources each task requires in order to identify bottlenecks and optimally assign tasks. The capability of the Lobster opportunistic workflow management system for HEP computation has been significantly increased. We have demonstrated efficient utilization of 25 000 non-dedicated cores and achieved a data input rate of 30 Gb/s and an output rate of 500GB/h. This has required new capabilities in task categorization, workflow dependency specification, and resource monitoring.
Flexible workflow sharing and execution services for e-scientists

NASA Astrophysics Data System (ADS)

Kacsuk, Péter; Terstyanszky, Gábor; Kiss, Tamas; Sipos, Gergely

2013-04-01

The sequence of computational and data manipulation steps required to perform a specific scientific analysis is called a workflow. Workflows that orchestrate data and/or compute intensive applications on Distributed Computing Infrastructures (DCIs) recently became standard tools in e-science. At the same time the broad and fragmented landscape of workflows and DCIs slows down the uptake of workflow-based work. The development, sharing, integration and execution of workflows is still a challenge for many scientists. The FP7 "Sharing Interoperable Workflow for Large-Scale Scientific Simulation on Available DCIs" (SHIWA) project significantly improved the situation, with a simulation platform that connects different workflow systems, different workflow languages, different DCIs and workflows into a single, interoperable unit. The SHIWA Simulation Platform is a service package, already used by various scientific communities, and used as a tool by the recently started ER-flow FP7 project to expand the use of workflows among European scientists. The presentation will introduce the SHIWA Simulation Platform and the services that ER-flow provides based on the platform to space and earth science researchers. The SHIWA Simulation Platform includes: 1. SHIWA Repository: A database where workflows and meta-data about workflows can be stored. The database is a central repository to discover and share workflows within and among communities . 2. SHIWA Portal: A web portal that is integrated with the SHIWA Repository and includes a workflow executor engine that can orchestrate various types of workflows on various grid and cloud platforms. 3. SHIWA Desktop: A desktop environment that provides similar access capabilities than the SHIWA Portal, however it runs on the users' desktops/laptops instead of a portal server. 4. Workflow engines: the ASKALON, Galaxy, GWES, Kepler, LONI Pipeline, MOTEUR, Pegasus, P-GRADE, ProActive, Triana, Taverna and WS-PGRADE workflow engines are already integrated with the execution engine of the SHIWA Portal. Other engines can be added when required. Through the SHIWA Portal one can define and run simulations on the SHIWA Virtual Organisation, an e-infrastructure that gathers computing and data resources from various DCIs, including the European Grid Infrastructure. The Portal via third party workflow engines provides support for the most widely used academic workflow engines and it can be extended with other engines on demand. Such extensions translate between workflow languages and facilitate the nesting of workflows into larger workflows even when those are written in different languages and require different interpreters for execution. Through the workflow repository and the portal lonely scientists and scientific collaborations can share and offer workflows for reuse and execution. Given the integrated nature of the SHIWA Simulation Platform the shared workflows can be executed online, without installing any special client environment and downloading workflows. The FP7 "Building a European Research Community through Interoperable Workflows and Data" (ER-flow) project disseminates the achievements of the SHIWA project and use these achievements to build workflow user communities across Europe. ER-flow provides application supports to research communities within and beyond the project consortium to develop, share and run workflows with the SHIWA Simulation Platform.
Overview of open resources to support automated structure verification and elucidation

EPA Science Inventory

Cheminformatics methods form an essential basis for providing analytical scientists with access to data, algorithms and workflows. There are an increasing number of free online databases (compound databases, spectral libraries, data repositories) and a rich collection of software...
A dedicated software application for treatment verification with off-line PET/CT imaging at the Heidelberg Ion Beam Therapy Center

NASA Astrophysics Data System (ADS)

Chen, W.; Bauer, J.; Kurz, C.; Tessonnier, T.; Handrack, J.; Haberer, T.; Debus, J.; Parodi, K.

2017-01-01

We present the workflow of the offline-PET based range verification method used at the Heidelberg Ion Beam Therapy Center, detailing the functionalities of an in-house developed software application, SimInterface14, with which range analysis is performed. Moreover, we introduce the design of a decision support system assessing uncertainties and facilitating physicians in decisions making for plan adaptation.
PeptidePicker: a scientific workflow with web interface for selecting appropriate peptides for targeted proteomics experiments.

PubMed

Mohammed, Yassene; Domański, Dominik; Jackson, Angela M; Smith, Derek S; Deelder, André M; Palmblad, Magnus; Borchers, Christoph H

2014-06-25

One challenge in Multiple Reaction Monitoring (MRM)-based proteomics is to select the most appropriate surrogate peptides to represent a target protein. We present here a software package to automatically generate these most appropriate surrogate peptides for an LC/MRM-MS analysis. Our method integrates information about the proteins, their tryptic peptides, and the suitability of these peptides for MRM which is available online in UniProtKB, NCBI's dbSNP, ExPASy, PeptideAtlas, PRIDE, and GPMDB. The scoring algorithm reflects our knowledge in choosing the best candidate peptides for MRM, based on the uniqueness of the peptide in the targeted proteome, its physiochemical properties, and whether it previously has been observed. The modularity of the workflow allows further extension and additional selection criteria to be incorporated. We have developed a simple Web interface where the researcher provides the protein accession number, the subject organism, and peptide-specific options. Currently, the software is designed for human and mouse proteomes, but additional species can be easily be added. Our software improved the peptide selection by eliminating human error, considering multiple data sources and all of the isoforms of the protein, and resulted in faster peptide selection - approximately 50 proteins per hour compared to 8 per day. Compiling a list of optimal surrogate peptides for target proteins to be analyzed by LC/MRM-MS has been a cumbersome process, in which expert researchers retrieved information from different online repositories and used their own reasoning to find the most appropriate peptides. Our scientific workflow automates this process by integrating information from different data sources including UniProt, Global Proteome Machine, NCBI's dbSNP, and PeptideAtlas, simulating the researchers' reasoning, and incorporating their knowledge of how to select the best proteotypic peptides for an MRM analysis. The developed software can help to standardize the selection of peptides, eliminate human error, and increase productivity. Copyright © 2014 Elsevier B.V. All rights reserved.
Real-Time Multimission Event Notification System for Mars Relay

NASA Technical Reports Server (NTRS)

Wallick, Michael N.; Allard, Daniel A.; Gladden, Roy E.; Wang, Paul; Hy, Franklin H.

2013-01-01

As the Mars Relay Network is in constant flux (missions and teams going through their daily workflow), it is imperative that users are aware of such state changes. For example, a change by an orbiter team can affect operations on a lander team. This software provides an ambient view of the real-time status of the Mars network. The Mars Relay Operations Service (MaROS) comprises a number of tools to coordinate, plan, and visualize various aspects of the Mars Relay Network. As part of MaROS, a feature set was developed that operates on several levels of the software architecture. These levels include a Web-based user interface, a back-end "ReSTlet" built in Java, and databases that store the data as it is received from the network. The result is a real-time event notification and management system, so mission teams can track and act upon events on a moment-by-moment basis. This software retrieves events from MaROS and displays them to the end user. Updates happen in real time, i.e., messages are pushed to the user while logged into the system, and queued when the user is not online for later viewing. The software does not do away with the email notifications, but augments them with in-line notifications. Further, this software expands the events that can generate a notification, and allows user-generated notifications. Existing software sends a smaller subset of mission-generated notifications via email. A common complaint of users was that the system-generated e-mails often "get lost" with other e-mail that comes in. This software allows for an expanded set (including user-generated) of notifications displayed in-line of the program. By separating notifications, this can improve a user's workflow.
Reproducible Research in the Geosciences at Scale: Achievable Goal or Elusive Dream?

NASA Astrophysics Data System (ADS)

Wyborn, L. A.; Evans, B. J. K.

2016-12-01

Reproducibility is a fundamental tenant of the scientific method: it implies that any researcher, or a third party working independently, can duplicate any experiment or investigation and produce the same results. Historically computationally based research involved an individual using their own data and processing it in their own private area, often using software they wrote or inherited from close collaborators. Today, a researcher is likely to be part of a large team that will use a subset of data from an external repository and then process the data on a public or private cloud or on a large centralised supercomputer, using a mixture of their own code, third party software and libraries, or global community codes. In 'Big Geoscience' research it is common for data inputs to be extracts from externally managed dynamic data collections, where new data is being regularly appended, or existing data is revised when errors are detected and/or as processing methods are improved. New workflows increasingly use services to access data dynamically to create subsets on-the-fly from distributed sources, each of which can have a complex history. At major computational facilities, underlying systems, libraries, software and services are being constantly tuned and optimised, or as new or replacement infrastructure being installed. Likewise code used from a community repository is continually being refined, re-packaged and ported to the target platform. To achieve reproducibility, today's researcher increasingly needs to track their workflow, including querying information on the current or historical state of facilities used. Versioning methods are standard practice for software repositories or packages, but it is not common for either data repositories or data services to provide information about their state, or for systems to provide query-able access to changes in the underlying software. While a researcher can achieve transparency and describe steps in their workflow so that others can repeat them and replicate processes undertaken, they cannot achieve exact reproducibility or even transparency of results generated. In Big Geoscience, full reproducibiliy will be an elusive dream until data repositories and compute facilities can provide provenance information in a standards compliant, machine query-able way.
Realizing the Living Paper using the ProvONE Model for Reproducible Research

NASA Astrophysics Data System (ADS)

Jones, M. B.; Jones, C. S.; Ludäscher, B.; Missier, P.; Walker, L.; Slaughter, P.; Schildhauer, M.; Cuevas-Vicenttín, V.

2015-12-01

Science has advanced through traditional publications that codify research results as a permenant part of the scientific record. But because publications are static and atomic, researchers can only cite and reference a whole work when building on prior work of colleagues. The open source software model has demonstrated a new approach in which strong version control in an open environment can nurture an open ecosystem of software. Developers now commonly fork and extend software giving proper credit, with less repetition, and with confidence in the relationship to original software. Through initiatives like 'Beyond the PDF', an analogous model has been imagined for open science, in which software, data, analyses, and derived products become first class objects within a publishing ecosystem that has evolved to be finer-grained and is realized through a web of linked open data. We have prototyped a Living Paper concept by developing the ProvONE provenance model for scientific workflows, with prototype deployments in DataONE. ProvONE promotes transparency and openness by describing the authenticity, origin, structure, and processing history of research artifacts and by detailing the steps in computational workflows that produce derived products. To realize the Living Paper, we decompose scientific papers into their constituent products and publish these as compound objects in the DataONE federation of archival repositories. Each individual finding and sub-product of a reseach project (such as a derived data table, a workflow or script, a figure, an image, or a finding) can be independently stored, versioned, and cited. ProvONE provenance traces link these fine-grained products within and across versions of a paper, and across related papers that extend an original analysis. This allows for open scientific publishing in which researchers extend and modify findings, creating a dynamic, evolving web of results that collectively represent the scientific enterprise. The Living Paper provides detailed metadata for properly interpreting and verifying individual research findings, for tracing the origin of ideas, for launching new lines of inquiry, and for implementing transitive credit for research and engineering.
LipidMatch: an automated workflow for rule-based lipid identification using untargeted high-resolution tandem mass spectrometry data.

PubMed

Koelmel, Jeremy P; Kroeger, Nicholas M; Ulmer, Candice Z; Bowden, John A; Patterson, Rainey E; Cochran, Jason A; Beecher, Christopher W W; Garrett, Timothy J; Yost, Richard A

2017-07-10

Lipids are ubiquitous and serve numerous biological functions; thus lipids have been shown to have great potential as candidates for elucidating biomarkers and pathway perturbations associated with disease. Methods expanding coverage of the lipidome increase the likelihood of biomarker discovery and could lead to more comprehensive understanding of disease etiology. We introduce LipidMatch, an R-based tool for lipid identification for liquid chromatography tandem mass spectrometry workflows. LipidMatch currently has over 250,000 lipid species spanning 56 lipid types contained in in silico fragmentation libraries. Unique fragmentation libraries, compared to other open source software, include oxidized lipids, bile acids, sphingosines, and previously uncharacterized adducts, including ammoniated cardiolipins. LipidMatch uses rule-based identification. For each lipid type, the user can select which fragments must be observed for identification. Rule-based identification allows for correct annotation of lipids based on the fragments observed, unlike typical identification based solely on spectral similarity scores, where over-reporting structural details that are not conferred by fragmentation data is common. Another unique feature of LipidMatch is ranking lipid identifications for a given feature by the sum of fragment intensities. For each lipid candidate, the intensities of experimental fragments with exact mass matches to expected in silico fragments are summed. The lipid identifications with the greatest summed intensity using this ranking algorithm were comparable to other lipid identification software annotations, MS-DIAL and Greazy. For example, for features with identifications from all 3 software, 92% of LipidMatch identifications by fatty acyl constituents were corroborated by at least one other software in positive mode and 98% in negative ion mode. LipidMatch allows users to annotate lipids across a wide range of high resolution tandem mass spectrometry experiments, including imaging experiments, direct infusion experiments, and experiments employing liquid chromatography. LipidMatch leverages the most extensive in silico fragmentation libraries of freely available software. When integrated into a larger lipidomics workflow, LipidMatch may increase the probability of finding lipid-based biomarkers and determining etiology of disease by covering a greater portion of the lipidome and using annotation which does not over-report biologically relevant structural details of identified lipid molecules.
Handling Metadata in a Neurophysiology Laboratory

PubMed Central

Zehl, Lyuba; Jaillet, Florent; Stoewer, Adrian; Grewe, Jan; Sobolev, Andrey; Wachtler, Thomas; Brochier, Thomas G.; Riehle, Alexa; Denker, Michael; Grün, Sonja

2016-01-01

To date, non-reproducibility of neurophysiological research is a matter of intense discussion in the scientific community. A crucial component to enhance reproducibility is to comprehensively collect and store metadata, that is, all information about the experiment, the data, and the applied preprocessing steps on the data, such that they can be accessed and shared in a consistent and simple manner. However, the complexity of experiments, the highly specialized analysis workflows and a lack of knowledge on how to make use of supporting software tools often overburden researchers to perform such a detailed documentation. For this reason, the collected metadata are often incomplete, incomprehensible for outsiders or ambiguous. Based on our research experience in dealing with diverse datasets, we here provide conceptual and technical guidance to overcome the challenges associated with the collection, organization, and storage of metadata in a neurophysiology laboratory. Through the concrete example of managing the metadata of a complex experiment that yields multi-channel recordings from monkeys performing a behavioral motor task, we practically demonstrate the implementation of these approaches and solutions with the intention that they may be generalized to other projects. Moreover, we detail five use cases that demonstrate the resulting benefits of constructing a well-organized metadata collection when processing or analyzing the recorded data, in particular when these are shared between laboratories in a modern scientific collaboration. Finally, we suggest an adaptable workflow to accumulate, structure and store metadata from different sources using, by way of example, the odML metadata framework. PMID:27486397
Handling Metadata in a Neurophysiology Laboratory.

PubMed

Zehl, Lyuba; Jaillet, Florent; Stoewer, Adrian; Grewe, Jan; Sobolev, Andrey; Wachtler, Thomas; Brochier, Thomas G; Riehle, Alexa; Denker, Michael; Grün, Sonja

2016-01-01

To date, non-reproducibility of neurophysiological research is a matter of intense discussion in the scientific community. A crucial component to enhance reproducibility is to comprehensively collect and store metadata, that is, all information about the experiment, the data, and the applied preprocessing steps on the data, such that they can be accessed and shared in a consistent and simple manner. However, the complexity of experiments, the highly specialized analysis workflows and a lack of knowledge on how to make use of supporting software tools often overburden researchers to perform such a detailed documentation. For this reason, the collected metadata are often incomplete, incomprehensible for outsiders or ambiguous. Based on our research experience in dealing with diverse datasets, we here provide conceptual and technical guidance to overcome the challenges associated with the collection, organization, and storage of metadata in a neurophysiology laboratory. Through the concrete example of managing the metadata of a complex experiment that yields multi-channel recordings from monkeys performing a behavioral motor task, we practically demonstrate the implementation of these approaches and solutions with the intention that they may be generalized to other projects. Moreover, we detail five use cases that demonstrate the resulting benefits of constructing a well-organized metadata collection when processing or analyzing the recorded data, in particular when these are shared between laboratories in a modern scientific collaboration. Finally, we suggest an adaptable workflow to accumulate, structure and store metadata from different sources using, by way of example, the odML metadata framework.
Automated quality control in a file-based broadcasting workflow

NASA Astrophysics Data System (ADS)

Zhang, Lina

2014-04-01

Benefit from the development of information and internet technologies, television broadcasting is transforming from inefficient tape-based production and distribution to integrated file-based workflows. However, no matter how many changes have took place, successful broadcasting still depends on the ability to deliver a consistent high quality signal to the audiences. After the transition from tape to file, traditional methods of manual quality control (QC) become inadequate, subjective, and inefficient. Based on China Central Television's full file-based workflow in the new site, this paper introduces an automated quality control test system for accurate detection of hidden troubles in media contents. It discusses the system framework and workflow control when the automated QC is added. It puts forward a QC criterion and brings forth a QC software followed this criterion. It also does some experiments on QC speed by adopting parallel processing and distributed computing. The performance of the test system shows that the adoption of automated QC can make the production effective and efficient, and help the station to achieve a competitive advantage in the media market.
Orchid: a novel management, annotation and machine learning framework for analyzing cancer mutations.

PubMed

Cario, Clinton L; Witte, John S

2018-03-15

As whole-genome tumor sequence and biological annotation datasets grow in size, number and content, there is an increasing basic science and clinical need for efficient and accurate data management and analysis software. With the emergence of increasingly sophisticated data stores, execution environments and machine learning algorithms, there is also a need for the integration of functionality across frameworks. We present orchid, a python based software package for the management, annotation and machine learning of cancer mutations. Building on technologies of parallel workflow execution, in-memory database storage and machine learning analytics, orchid efficiently handles millions of mutations and hundreds of features in an easy-to-use manner. We describe the implementation of orchid and demonstrate its ability to distinguish tissue of origin in 12 tumor types based on 339 features using a random forest classifier. Orchid and our annotated tumor mutation database are freely available at https://github.com/wittelab/orchid. Software is implemented in python 2.7, and makes use of MySQL or MemSQL databases. Groovy 2.4.5 is optionally required for parallel workflow execution. JWitte@ucsf.edu. Supplementary data are available at Bioinformatics online.
Designing the safety of healthcare. Participation of ergonomics to the design of cooperative systems in radiotherapy.

PubMed

Munoz, Maria Isabel; Bouldi, Nadia; Barcellini, Flore; Nascimento, Adelaide

2012-01-01

This communication deals with the involvement of ergonomists in a research-action design process of a software platform in radiotherapy. The goal of the design project is to enhance patient safety by designing a workflow software that supports cooperation between professionals producing treatment in radiotherapy. The general framework of our approach is the ergonomics management of a design process, which is based in activity analysis and grounded in participatory design. Two fields are concerned by the present action: a design environment which is a participatory design process that involves software designers, caregivers as future users and ergonomists; and a reference real work setting in radiotherapy. Observations, semi-structured interviews and participatory workshops allow the characterization of activity in radiotherapy dealing with uses of cooperative tools, sources of variability and non-ruled strategies to manage the variability of the situations. This production of knowledge about work searches to enhance the articulation between technocentric and anthropocentric approaches, and helps in clarifying design requirements. An issue of this research-action is to develop a framework to define the parameters of the workflow tool, and the conditions of its deployment.

COMPASS: a suite of pre- and post-search proteomics software tools for OMSSA

PubMed Central

Wenger, Craig D.; Phanstiel, Douglas H.; Lee, M. Violet; Bailey, Derek J.; Coon, Joshua J.

2011-01-01

Here we present the Coon OMSSA Proteomic Analysis Software Suite (COMPASS): a free and open-source software pipeline for high-throughput analysis of proteomics data, designed around the Open Mass Spectrometry Search Algorithm. We detail a synergistic set of tools for protein database generation, spectral reduction, peptide false discovery rate analysis, peptide quantitation via isobaric labeling, protein parsimony and protein false discovery rate analysis, and protein quantitation. We strive for maximum ease of use, utilizing graphical user interfaces and working with data files in the original instrument vendor format. Results are stored in plain text comma-separated values files, which are easy to view and manipulate with a text editor or spreadsheet program. We illustrate the operation and efficacy of COMPASS through the use of two LC–MS/MS datasets. The first is a dataset of a highly annotated mixture of standard proteins and manually validated contaminants that exhibits the identification workflow. The second is a dataset of yeast peptides, labeled with isobaric stable isotope tags and mixed in known ratios, to demonstrate the quantitative workflow. For these two datasets, COMPASS performs equivalently or better than the current de facto standard, the Trans-Proteomic Pipeline. PMID:21298793
Ergatis: a web interface and scalable software system for bioinformatics workflows

PubMed Central

Orvis, Joshua; Crabtree, Jonathan; Galens, Kevin; Gussman, Aaron; Inman, Jason M.; Lee, Eduardo; Nampally, Sreenath; Riley, David; Sundaram, Jaideep P.; Felix, Victor; Whitty, Brett; Mahurkar, Anup; Wortman, Jennifer; White, Owen; Angiuoli, Samuel V.

2010-01-01

Motivation: The growth of sequence data has been accompanied by an increasing need to analyze data on distributed computer clusters. The use of these systems for routine analysis requires scalable and robust software for data management of large datasets. Software is also needed to simplify data management and make large-scale bioinformatics analysis accessible and reproducible to a wide class of target users. Results: We have developed a workflow management system named Ergatis that enables users to build, execute and monitor pipelines for computational analysis of genomics data. Ergatis contains preconfigured components and template pipelines for a number of common bioinformatics tasks such as prokaryotic genome annotation and genome comparisons. Outputs from many of these components can be loaded into a Chado relational database. Ergatis was designed to be accessible to a broad class of users and provides a user friendly, web-based interface. Ergatis supports high-throughput batch processing on distributed compute clusters and has been used for data management in a number of genome annotation and comparative genomics projects. Availability: Ergatis is an open-source project and is freely available at http://ergatis.sourceforge.net Contact: jorvis@users.sourceforge.net PMID:20413634
PyGOLD: a python based API for docking based virtual screening workflow generation.

PubMed

Patel, Hitesh; Brinkjost, Tobias; Koch, Oliver

2017-08-15

Molecular docking is one of the successful approaches in structure based discovery and development of bioactive molecules in chemical biology and medicinal chemistry. Due to the huge amount of computational time that is still required, docking is often the last step in a virtual screening approach. Such screenings are set as workflows spanned over many steps, each aiming at different filtering task. These workflows can be automatized in large parts using python based toolkits except for docking using the docking software GOLD. However, within an automated virtual screening workflow it is not feasible to use the GUI in between every step to change the GOLD configuration file. Thus, a python module called PyGOLD was developed, to parse, edit and write the GOLD configuration file and to automate docking based virtual screening workflows. The latest version of PyGOLD, its documentation and example scripts are available at: http://www.ccb.tu-dortmund.de/koch or http://www.agkoch.de. PyGOLD is implemented in Python and can be imported as a standard python module without any further dependencies. oliver.koch@agkoch.de, oliver.koch@tu-dortmund.de. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
Service Oriented Architecture for Coast Guard Command and Control

DTIC Science & Technology

2007-03-01

Operations BPEL4WS The Business Process Execution Language for Web Services BPMN Business Process Modeling Notation CASP Computer Aided Search Planning...Business Process Modeling Notation ( BPMN ) provides a standardized graphical notation for drawing business processes in a workflow. Software tools
Grid-based platform for training in Earth Observation

NASA Astrophysics Data System (ADS)

Petcu, Dana; Zaharie, Daniela; Panica, Silviu; Frincu, Marc; Neagul, Marian; Gorgan, Dorian; Stefanut, Teodor

2010-05-01

GiSHEO platform [1] providing on-demand services for training and high education in Earth Observation is developed, in the frame of an ESA funded project through its PECS programme, to respond to the needs of powerful education resources in remote sensing field. It intends to be a Grid-based platform of which potential for experimentation and extensibility are the key benefits compared with a desktop software solution. Near-real time applications requiring simultaneous multiple short-time-response data-intensive tasks, as in the case of a short time training event, are the ones that are proved to be ideal for this platform. The platform is based on Globus Toolkit 4 facilities for security and process management, and on the clusters of four academic institutions involved in the project. The authorization uses a VOMS service. The main public services are the followings: the EO processing services (represented through special WSRF-type services); the workflow service exposing a particular workflow engine; the data indexing and discovery service for accessing the data management mechanisms; the processing services, a collection allowing easy access to the processing platform. The WSRF-type services for basic satellite image processing are reusing free image processing tools, OpenCV and GDAL. New algorithms and workflows were develop to tackle with challenging problems like detecting the underground remains of old fortifications, walls or houses. More details can be found in [2]. Composed services can be specified through workflows and are easy to be deployed. The workflow engine, OSyRIS (Orchestration System using a Rule based Inference Solution), is based on DROOLS, and a new rule-based workflow language, SILK (SImple Language for worKflow), has been built. Workflow creation in SILK can be done with or without a visual designing tools. The basics of SILK are the tasks and relations (rules) between them. It is similar with the SCUFL language, but not relying on XML in order to allow the introduction of more workflow specific issues. Moreover, an event-condition-action (ECA) approach allows a greater flexibility when expressing data and task dependencies, as well as the creation of adaptive workflows which can react to changes in the configuration of the Grid or in the workflow itself. Changes inside the grid are handled by creating specific rules which allow resource selection based on various task scheduling criteria. Modifications of the workflow are usually accomplished either by inserting or retracting at runtime rules belonging to it or by modifying the executor of the task in case a better one is found. The former implies changes in its structure while the latter does not necessarily mean changes of the resource but more precisely changes of the algorithm used for solving the task. More details can be found in [3]. Another important platform component is the data indexing and storage service, GDIS, providing features for data storage, indexing data using a specialized RDBMS, finding data by various conditions, querying external services and keeping track of temporary data generated by other components. The data storage component part of GDIS is responsible for storing the data by using available storage backends such as local disk file systems (ext3), local cluster storage (GFS) or distributed file systems (HDFS). A front-end GridFTP service is capable of interacting with the storage domains on behalf of the clients and in a uniform way and also enforces the security restrictions provided by other specialized services and related with data access. The data indexing is performed by PostGIS. An advanced and flexible interface for searching the project's geographical repository is built around a custom query language (LLQL - Lisp Like Query Language) designed to provide fine grained access to the data in the repository and to query external services (e.g. for exploiting the connection with GENESI-DR catalog). More details can be found in [4]. The Workload Management System (WMS) provides two types of resource managers. The first one will be based on Condor HTC and use Condor as a job manager for task dispatching and working nodes (for development purposes) while the second one will use GT4 GRAM (for production purposes). The WMS main component, the Grid Task Dispatcher (GTD), is responsible for the interaction with other internal services as the composition engine in order to facilitate access to the processing platform. Its main responsibilities are to receive tasks from the workflow engine or directly from user interface, to use a task description language (the ClassAd meta language in case of Condor HTC) for job units, to submit and check the status of jobs inside the workload management system and to retrieve job logs for debugging purposes. More details can be found in [4]. A particular component of the platform is eGLE, the eLearning environment. It provides the functionalities necessary to create the visual appearance of the lessons through the usage of visual containers like tools, patterns and templates. The teacher uses the platform for testing the already created lessons, as well as for developing new lesson resources, such as new images and workflows describing graph-based processing. The students execute the lessons or describe and experiment with new workflows or different data. The eGLE database includes several workflow-based lesson descriptions, teaching materials and lesson resources, selected satellite and spatial data. More details can be found in [5]. A first training event of using the platform was organized in September 2009 during 11th SYNASC symposium (links to the demos, testing interface, and exercises are available on project site [1]). The eGLE component was presented at 4th GPC conference in May 2009. Moreover, the functionality of the platform will be presented as demo in April 2010 at 5th EGEE User forum. References: [1] GiSHEO consortium, Project site, http://gisheo.info.uvt.ro [2] D. Petcu, D. Zaharie, M. Neagul, S. Panica, M. Frincu, D. Gorgan, T. Stefanut, V. Bacu, Remote Sensed Image Processing on Grids for Training in Earth Observation. In Image Processing, V. Kordic (ed.), In-Tech, January 2010. [3] M. Neagul, S. Panica, D. Petcu, D. Zaharie, D. Gorgan, Web and Grid Services for Training in Earth Observation, IDAACS 2009, IEEE Computer Press, 241-246 [4] M. Frincu, S. Panica, M. Neagul, D. Petcu, Gisheo: On Demand Grid Service Based Platform for EO Data Processing. HiperGrid 2009, Politehnica Press, 415-422. [5] D. Gorgan, T. Stefanut, V. Bacu, Grid Based Training Environment for Earth Observation, GPC 2009, LNCS 5529, 98-109
Multidisciplinary HIS DICOM interfaces at the Department of Veterans Affairs

NASA Astrophysics Data System (ADS)

Kuzmak, Peter M.; Dayhoff, Ruth E.

2000-05-01

The U.S. Department of Veterans Affairs (VA) is using the Digital Imaging and Communications in Medicine (DICOM) standard to integrate image data objects from multiple systems for use across the healthcare enterprise. DICOM uses a structured representation of image data and a communication mechanism that allows the VA to easily acquire images from multiple sources and store them directly into the online patient record. The VA can obtain both radiology and non- radiology images using DICOM, and can display them on low-cost clinician's color workstations throughout the medical center. High-resolution gray-scale diagnostic quality multi-monitor workstations with specialized viewing software can be used for reading radiology images. The VA's DICOM capabilities can interface six different commercial Picture Archiving and Communication Systems (PACS) and over twenty different image acquisition modalities. The VA is advancing its use of DICOM beyond radiology. New color imaging applications for Gastrointestinal Endoscopy and Ophthalmology using DICOM are under development. These are the first DICOM offerings for the vendors, who are planning to support the recently passed DICOM Visible Light and Structured Reporting service classes. Implementing these in VistA is a challenge because of the different workflow and software support for these disciplines within the VA HIS environment.
Reducing Time to Science: Unidata and JupyterHub Technology Using the Jetstream Cloud

NASA Astrophysics Data System (ADS)

Chastang, J.; Signell, R. P.; Fischer, J. L.

2017-12-01

Cloud computing can accelerate scientific workflows, discovery, and collaborations by reducing research and data friction. We describe the deployment of Unidata and JupyterHub technologies on the NSF-funded XSEDE Jetstream cloud. With the aid of virtual machines and Docker technology, we deploy a Unidata JupyterHub server co-located with a Local Data Manager (LDM), THREDDS data server (TDS), and RAMADDA geoscience content management system. We provide Jupyter Notebooks and the pre-built Python environments needed to run them. The notebooks can be used for instruction and as templates for scientific experimentation and discovery. We also supply a large quantity of NCEP forecast model results to allow data-proximate analysis and visualization. In addition, users can transfer data using Globus command line tools, and perform their own data-proximate analysis and visualization with Notebook technology. These data can be shared with others via a dedicated TDS server for scientific distribution and collaboration. There are many benefits of this approach. Not only is the cloud computing environment fast, reliable and scalable, but scientists can analyze, visualize, and share data using only their web browser. No local specialized desktop software or a fast internet connection is required. This environment will enable scientists to spend less time managing their software and more time doing science.
Development and implementation of a radiation therapy incident learning system compatible with local workflow and a national taxonomy.

PubMed

Montgomery, Logan; Fava, Palma; Freeman, Carolyn R; Hijal, Tarek; Maietta, Ciro; Parker, William; Kildea, John

2018-01-01

Collaborative incident learning initiatives in radiation therapy promise to improve and standardize the quality of care provided by participating institutions. However, the software interfaces provided with such initiatives must accommodate all participants and thus are not optimized for the workflows of individual radiation therapy centers. This article describes the development and implementation of a radiation therapy incident learning system that is optimized for a clinical workflow and uses the taxonomy of the Canadian National System for Incident Reporting - Radiation Treatment (NSIR-RT). The described incident learning system is a novel version of an open-source software called the Safety and Incident Learning System (SaILS). A needs assessment was conducted prior to development to ensure SaILS (a) was intuitive and efficient (b) met changing staff needs and (c) accommodated revisions to NSIR-RT. The core functionality of SaILS includes incident reporting, investigations, tracking, and data visualization. Postlaunch modifications of SaILS were informed by discussion and a survey of radiation therapy staff. There were 240 incidents detected and reported using SaILS in 2016 and the number of incidents per month tended to increase throughout the year. An increase in incident reporting occurred after switching to fully online incident reporting from an initial hybrid paper-electronic system. Incident templating functionality and a connection with our center's oncology information system were incorporated into the investigation interface to minimize repetitive data entry. A taskable actions feature was also incorporated to document outcomes of incident reports and has since been utilized for 36% of reported incidents. Use of SaILS and the NSIR-RT taxonomy has improved the structure of, and staff engagement with, incident learning in our center. Software and workflow modifications informed by staff feedback improved the utility of SaILS and yielded an efficient and transparent solution to categorize incidents with the NSIR-RT taxonomy. © 2017 The Authors. Journal of Applied Clinical Medical Physics published by Wiley Periodicals, Inc. on behalf of American Association of Physicists in Medicine.
Using CyberShake Workflows to Manage Big Seismic Hazard Data on Large-Scale Open-Science HPC Resources

NASA Astrophysics Data System (ADS)

Callaghan, S.; Maechling, P. J.; Juve, G.; Vahi, K.; Deelman, E.; Jordan, T. H.

2015-12-01

The CyberShake computational platform, developed by the Southern California Earthquake Center (SCEC), is an integrated collection of scientific software and middleware that performs 3D physics-based probabilistic seismic hazard analysis (PSHA) for Southern California. CyberShake integrates large-scale and high-throughput research codes to produce probabilistic seismic hazard curves for individual locations of interest and hazard maps for an entire region. A recent CyberShake calculation produced about 500,000 two-component seismograms for each of 336 locations, resulting in over 300 million synthetic seismograms in a Los Angeles-area probabilistic seismic hazard model. CyberShake calculations require a series of scientific software programs. Early computational stages produce data used as inputs by later stages, so we describe CyberShake calculations using a workflow definition language. Scientific workflow tools automate and manage the input and output data and enable remote job execution on large-scale HPC systems. To satisfy the requests of broad impact users of CyberShake data, such as seismologists, utility companies, and building code engineers, we successfully completed CyberShake Study 15.4 in April and May 2015, calculating a 1 Hz urban seismic hazard map for Los Angeles. We distributed the calculation between the NSF Track 1 system NCSA Blue Waters, the DOE Leadership-class system OLCF Titan, and USC's Center for High Performance Computing. This study ran for over 5 weeks, burning about 1.1 million node-hours and producing over half a petabyte of data. The CyberShake Study 15.4 results doubled the maximum simulated seismic frequency from 0.5 Hz to 1.0 Hz as compared to previous studies, representing a factor of 16 increase in computational complexity. We will describe how our workflow tools supported splitting the calculation across multiple systems. We will explain how we modified CyberShake software components, including GPU implementations and migrating from file-based communication to MPI messaging, to greatly reduce the I/O demands and node-hour requirements of CyberShake. We will also present performance metrics from CyberShake Study 15.4, and discuss challenges that producers of Big Data on open-science HPC resources face moving forward.
lumpR 2.0.0: an R package facilitating landscape discretisation for hillslope-based hydrological models

NASA Astrophysics Data System (ADS)

Pilz, Tobias; Francke, Till; Bronstert, Axel

2017-08-01

The characteristics of a landscape pose essential factors for hydrological processes. Therefore, an adequate representation of the landscape of a catchment in hydrological models is vital. However, many of such models exist differing, amongst others, in spatial concept and discretisation. The latter constitutes an essential pre-processing step, for which many different algorithms along with numerous software implementations exist. In that context, existing solutions are often model specific, commercial, or depend on commercial back-end software, and allow only a limited or no workflow automation at all. Consequently, a new package for the scientific software and scripting environment R, called lumpR, was developed. lumpR employs an algorithm for hillslope-based landscape discretisation directed to large-scale application via a hierarchical multi-scale approach. The package addresses existing limitations as it is free and open source, easily extendible to other hydrological models, and the workflow can be fully automated. Moreover, it is user-friendly as the direct coupling to a GIS allows for immediate visual inspection and manual adjustment. Sufficient control is furthermore retained via parameter specification and the option to include expert knowledge. Conversely, completely automatic operation also allows for extensive analysis of aspects related to landscape discretisation. In a case study, the application of the package is presented. A sensitivity analysis of the most important discretisation parameters demonstrates its efficient workflow automation. Considering multiple streamflow metrics, the employed model proved reasonably robust to the discretisation parameters. However, parameters determining the sizes of subbasins and hillslopes proved to be more important than the others, including the number of representative hillslopes, the number of attributes employed for the lumping algorithm, and the number of sub-discretisations of the representative hillslopes.
An integrated development workflow for community-driven FOSS-projects using continuous integration tools

NASA Astrophysics Data System (ADS)

Bilke, Lars; Watanabe, Norihiro; Naumov, Dmitri; Kolditz, Olaf

2016-04-01

A complex software project in general with high standards regarding code quality requires automated tools to help developers in doing repetitive and tedious tasks such as compilation on different platforms and configurations, doing unit testing as well as end-to-end tests and generating distributable binaries and documentation. This is known as continuous integration (CI). A community-driven FOSS-project within the Earth Sciences benefits even more from CI as time and resources regarding software development are often limited. Therefore testing developed code on more than the developers PC is a task which is often neglected and where CI can be the solution. We developed an integrated workflow based on GitHub, Travis and Jenkins for the community project OpenGeoSys - a coupled multiphysics modeling and simulation package - allowing developers to concentrate on implementing new features in a tight feedback loop. Every interested developer/user can create a pull request containing source code modifications on the online collaboration platform GitHub. The modifications are checked (compilation, compiler warnings, memory leaks, undefined behaviors, unit tests, end-to-end tests, analyzing differences in simulation run results between changes etc.) from the CI system which automatically responds to the pull request or by email on success or failure with detailed reports eventually requesting to improve the modifications. Core team developers review the modifications and merge them into the main development line once they satisfy agreed standards. We aim for efficient data structures and algorithms, self-explaining code, comprehensive documentation and high test code coverage. This workflow keeps entry barriers to get involved into the project low and permits an agile development process concentrating on feature additions rather than software maintenance procedures.
Biological Web Service Repositories Review

PubMed Central

Urdidiales‐Nieto, David; Navas‐Delgado, Ismael

2016-01-01

Abstract Web services play a key role in bioinformatics enabling the integration of database access and analysis of algorithms. However, Web service repositories do not usually publish information on the changes made to their registered Web services. Dynamism is directly related to the changes in the repositories (services registered or unregistered) and at service level (annotation changes). Thus, users, software clients or workflow based approaches lack enough relevant information to decide when they should review or re‐execute a Web service or workflow to get updated or improved results. The dynamism of the repository could be a measure for workflow developers to re‐check service availability and annotation changes in the services of interest to them. This paper presents a review on the most well‐known Web service repositories in the life sciences including an analysis of their dynamism. Freshness is introduced in this paper, and has been used as the measure for the dynamism of these repositories. PMID:27783459
Policy Driven Development: Flexible Policy Insertion for Large Scale Systems.

PubMed

Demchak, Barry; Krüger, Ingolf

2012-07-01

The success of a software system depends critically on how well it reflects and adapts to stakeholder requirements. Traditional development methods often frustrate stakeholders by creating long latencies between requirement articulation and system deployment, especially in large scale systems. One source of latency is the maintenance of policy decisions encoded directly into system workflows at development time, including those involving access control and feature set selection. We created the Policy Driven Development (PDD) methodology to address these development latencies by enabling the flexible injection of decision points into existing workflows at runtime , thus enabling policy composition that integrates requirements furnished by multiple, oblivious stakeholder groups. Using PDD, we designed and implemented a production cyberinfrastructure that demonstrates policy and workflow injection that quickly implements stakeholder requirements, including features not contemplated in the original system design. PDD provides a path to quickly and cost effectively evolve such applications over a long lifetime.
STAR Data Reconstruction at NERSC/Cori, an adaptable Docker container approach for HPC

NASA Astrophysics Data System (ADS)

Mustafa, Mustafa; Balewski, Jan; Lauret, Jérôme; Porter, Jefferson; Canon, Shane; Gerhardt, Lisa; Hajdu, Levente; Lukascsyk, Mark

2017-10-01

As HPC facilities grow their resources, adaptation of classic HEP/NP workflows becomes a need. Linux containers may very well offer a way to lower the bar to exploiting such resources and at the time, help collaboration to reach vast elastic resources on such facilities and address their massive current and future data processing challenges. In this proceeding, we showcase STAR data reconstruction workflow at Cori HPC system at NERSC. STAR software is packaged in a Docker image and runs at Cori in Shifter containers. We highlight two of the typical end-to-end optimization challenges for such pipelines: 1) data transfer rate which was carried over ESnet after optimizing end points and 2) scalable deployment of conditions database in an HPC environment. Our tests demonstrate equally efficient data processing workflows on Cori/HPC, comparable to standard Linux clusters.
Integrated pathway-based transcription regulation network mining and visualization based on gene expression profiles.

PubMed

Kibinge, Nelson; Ono, Naoaki; Horie, Masafumi; Sato, Tetsuo; Sugiura, Tadao; Altaf-Ul-Amin, Md; Saito, Akira; Kanaya, Shigehiko

2016-06-01

Conventionally, workflows examining transcription regulation networks from gene expression data involve distinct analytical steps. There is a need for pipelines that unify data mining and inference deduction into a singular framework to enhance interpretation and hypotheses generation. We propose a workflow that merges network construction with gene expression data mining focusing on regulation processes in the context of transcription factor driven gene regulation. The pipeline implements pathway-based modularization of expression profiles into functional units to improve biological interpretation. The integrated workflow was implemented as a web application software (TransReguloNet) with functions that enable pathway visualization and comparison of transcription factor activity between sample conditions defined in the experimental design. The pipeline merges differential expression, network construction, pathway-based abstraction, clustering and visualization. The framework was applied in analysis of actual expression datasets related to lung, breast and prostrate cancer. Copyright © 2016 Elsevier Inc. All rights reserved.
Understanding dental CAD/CAM for restorations--the digital workflow from a mechanical engineering viewpoint.

PubMed

Tapie, L; Lebon, N; Mawussi, B; Fron Chabouis, H; Duret, F; Attal, J-P

2015-01-01

As digital technology infiltrates every area of daily life, including the field of medicine, so it is increasingly being introduced into dental practice. Apart from chairside practice, computer-aided design/computer-aided manufacturing (CAD/CAM) solutions are available for creating inlays, crowns, fixed partial dentures (FPDs), implant abutments, and other dental prostheses. CAD/CAM dental solutions can be considered a chain of digital devices and software for the almost automatic design and creation of dental restorations. However, dentists who want to use the technology often do not have the time or knowledge to understand it. A basic knowledge of the CAD/CAM digital workflow for dental restorations can help dentists to grasp the technology and purchase a CAM/CAM system that meets the needs of their office. This article provides a computer-science and mechanical-engineering approach to the CAD/CAM digital workflow to help dentists understand the technology.
75 FR 26320 - Fourteenth Plenary Meeting, RTCA Special Committee 205/EUROCAE WG 71: Software Considerations in...

Federal Register 2010, 2011, 2012, 2013, 2014

2010-05-11

... Special Committee 205/EUROCAE WG 71: Software Considerations in Aeronautical Systems AGENCY: Federal Aviation Administration (FAA), DOT. ACTION: Notice of RTCA Special Committee 205/EUROCAE WG 71: Software... meeting of RTCA Special Committee 205/EUROCAE WG 71: Software Considerations in Aeronautical Systems...
75 FR 2924 - Thirteenth Plenary Meeting, RTCA Special Committee 205/EUROCAE WG 71: Software Considerations in...

Federal Register 2010, 2011, 2012, 2013, 2014

2010-01-19

... Special Committee 205/EUROCAE WG 71: Software Considerations in Aeronautical Systems AGENCY: Federal Aviation Administration (FAA), DOT. ACTION: Notice of RTCA Special Committee 205/EUROCAE WG 71: Software... meeting of RTCA Special Committee 205/EUROCAE WG 71: Software Considerations in Aeronautical Systems...
Wearable Notification via Dissemination Service in a Pervasive Computing Environment

DTIC Science & Technology

2015-09-01

context, state, and environment in a manner that would be transparent to a Soldier’s common operations. 15. SUBJECT TERMS pervasive computing, Android ...of user context shifts, i.e., changes in the user’s position, history , workflow, or resource interests. If the PCE is described as a 2-component...convenient viewing on the Glass’s screen just above the line of sight. All of the software developed uses Google’s Android open-source software stack
GUIdock: Using Docker Containers with a Common Graphics User Interface to Address the Reproducibility of Research

PubMed Central

Yeung, Ka Yee

2016-01-01

Reproducibility is vital in science. For complex computational methods, it is often necessary, not just to recreate the code, but also the software and hardware environment to reproduce results. Virtual machines, and container software such as Docker, make it possible to reproduce the exact environment regardless of the underlying hardware and operating system. However, workflows that use Graphical User Interfaces (GUIs) remain difficult to replicate on different host systems as there is no high level graphical software layer common to all platforms. GUIdock allows for the facile distribution of a systems biology application along with its graphics environment. Complex graphics based workflows, ubiquitous in systems biology, can now be easily exported and reproduced on many different platforms. GUIdock uses Docker, an open source project that provides a container with only the absolutely necessary software dependencies and configures a common X Windows (X11) graphic interface on Linux, Macintosh and Windows platforms. As proof of concept, we present a Docker package that contains a Bioconductor application written in R and C++ called networkBMA for gene network inference. Our package also includes Cytoscape, a java-based platform with a graphical user interface for visualizing and analyzing gene networks, and the CyNetworkBMA app, a Cytoscape app that allows the use of networkBMA via the user-friendly Cytoscape interface. PMID:27045593

GUIdock: Using Docker Containers with a Common Graphics User Interface to Address the Reproducibility of Research.

PubMed

Hung, Ling-Hong; Kristiyanto, Daniel; Lee, Sung Bong; Yeung, Ka Yee

2016-01-01

Reproducibility is vital in science. For complex computational methods, it is often necessary, not just to recreate the code, but also the software and hardware environment to reproduce results. Virtual machines, and container software such as Docker, make it possible to reproduce the exact environment regardless of the underlying hardware and operating system. However, workflows that use Graphical User Interfaces (GUIs) remain difficult to replicate on different host systems as there is no high level graphical software layer common to all platforms. GUIdock allows for the facile distribution of a systems biology application along with its graphics environment. Complex graphics based workflows, ubiquitous in systems biology, can now be easily exported and reproduced on many different platforms. GUIdock uses Docker, an open source project that provides a container with only the absolutely necessary software dependencies and configures a common X Windows (X11) graphic interface on Linux, Macintosh and Windows platforms. As proof of concept, we present a Docker package that contains a Bioconductor application written in R and C++ called networkBMA for gene network inference. Our package also includes Cytoscape, a java-based platform with a graphical user interface for visualizing and analyzing gene networks, and the CyNetworkBMA app, a Cytoscape app that allows the use of networkBMA via the user-friendly Cytoscape interface.
A networked modular hardware and software system for MRI-guided robotic prostate interventions

NASA Astrophysics Data System (ADS)

Su, Hao; Shang, Weijian; Harrington, Kevin; Camilo, Alex; Cole, Gregory; Tokuda, Junichi; Hata, Nobuhiko; Tempany, Clare; Fischer, Gregory S.

2012-02-01

Magnetic resonance imaging (MRI) provides high resolution multi-parametric imaging, large soft tissue contrast, and interactive image updates making it an ideal modality for diagnosing prostate cancer and guiding surgical tools. Despite a substantial armamentarium of apparatuses and systems has been developed to assist surgical diagnosis and therapy for MRI-guided procedures over last decade, the unified method to develop high fidelity robotic systems in terms of accuracy, dynamic performance, size, robustness and modularity, to work inside close-bore MRI scanner still remains a challenge. In this work, we develop and evaluate an integrated modular hardware and software system to support the surgical workflow of intra-operative MRI, with percutaneous prostate intervention as an illustrative case. Specifically, the distinct apparatuses and methods include: 1) a robot controller system for precision closed loop control of piezoelectric motors, 2) a robot control interface software that connects the 3D Slicer navigation software and the robot controller to exchange robot commands and coordinates using the OpenIGTLink open network communication protocol, and 3) MRI scan plane alignment to the planned path and imaging of the needle as it is inserted into the target location. A preliminary experiment with ex-vivo phantom validates the system workflow, MRI-compatibility and shows that the robotic system has a better than 0.01mm positioning accuracy.
Custom software development for use in a clinical laboratory

PubMed Central

Sinard, John H.; Gershkovich, Peter

2012-01-01

In-house software development for use in a clinical laboratory is a controversial issue. Many of the objections raised are based on outdated software development practices, an exaggeration of the risks involved, and an underestimation of the benefits that can be realized. Buy versus build analyses typically do not consider total costs of ownership, and unfortunately decisions are often made by people who are not directly affected by the workflow obstacles or benefits that result from those decisions. We have been developing custom software for clinical use for over a decade, and this article presents our perspective on this practice. A complete analysis of the decision to develop or purchase must ultimately examine how the end result will mesh with the departmental workflow, and custom-developed solutions typically can have the greater positive impact on efficiency and productivity, substantially altering the decision balance sheet. Involving the end-users in preparation of the functional specifications is crucial to the success of the process. A large development team is not needed, and even a single programmer can develop significant solutions. Many of the risks associated with custom development can be mitigated by a well-structured development process, use of open-source tools, and embracing an agile development philosophy. In-house solutions have the significant advantage of being adaptable to changing departmental needs, contributing to efficient and higher quality patient care. PMID:23372985
Custom software development for use in a clinical laboratory.

PubMed

Sinard, John H; Gershkovich, Peter

2012-01-01

In-house software development for use in a clinical laboratory is a controversial issue. Many of the objections raised are based on outdated software development practices, an exaggeration of the risks involved, and an underestimation of the benefits that can be realized. Buy versus build analyses typically do not consider total costs of ownership, and unfortunately decisions are often made by people who are not directly affected by the workflow obstacles or benefits that result from those decisions. We have been developing custom software for clinical use for over a decade, and this article presents our perspective on this practice. A complete analysis of the decision to develop or purchase must ultimately examine how the end result will mesh with the departmental workflow, and custom-developed solutions typically can have the greater positive impact on efficiency and productivity, substantially altering the decision balance sheet. Involving the end-users in preparation of the functional specifications is crucial to the success of the process. A large development team is not needed, and even a single programmer can develop significant solutions. Many of the risks associated with custom development can be mitigated by a well-structured development process, use of open-source tools, and embracing an agile development philosophy. In-house solutions have the significant advantage of being adaptable to changing departmental needs, contributing to efficient and higher quality patient care.
ToxMiner Software Interface for Visualizing and Analyzing ToxCast Data

EPA Science Inventory

The ToxCast dataset represents a collection of assays and endpoints that will require both standard statistical approaches as well as customized data analysis workflows. To analyze this unique dataset, we have developed an integrated database with Javabased interface called ToxMi...
Towards usable and interdisciplinary e-infrastructure (Invited)

NASA Astrophysics Data System (ADS)

de Roure, D.

2010-12-01

e-Science and cyberinfrastucture at their outset tended to focus on ‘big science’ and cross-organisational infrastructures, demonstrating complex engineering with the promise of high returns. It soon became evident that the key to researchers harnessing new technology for everyday use is a user-centric approach which empowers the user - both from a developer and an end user viewpoint. For example, this philosophy is demonstrated in workflow systems for systematic data processing and in the Web 2.0 approach as exemplified by the myExperiment social web site for sharing workflows, methods and ‘research objects’. Hence the most disruptive aspect of Cloud and virtualisation is perhaps that they make new computational resources and applications usable, creating a flourishing ecosystem for routine processing and innovation alike - and in this we must consider software sustainability. This talk will discuss the changing nature of e-Science digital ecosystem, focus on the e-infrastructure for cross-disciplinary work, and highlight issues in sustainable software development in this context.
A software-aided workflow for precinct-scale residential redevelopment

DOE Office of Scientific and Technical Information (OSTI.GOV)

Glackin, Stephen, E-mail: sglackin@swin.edu.au; Trubka, Roman, E-mail: r.trubka@gmail.com; Dionisio, Maria Rita, E-mail: rita.dionisio@canterbury.ac.nz

2016-09-15

Growing urban populations, combined with environmental challenges, have placed significant pressure on urban planning to supply housing while addressing policy issues such as sustainability, affordability, and liveability. The interrelated nature of these issues, combined with the requirement of evidence-based planning, has made decision-making so complex that urban planners need to combine expertise on energy, water, carbon emissions, transport and economic development along with other bodies of knowledge necessary to make well-informed decisions. This paper presents two geospatial software systems that can assist in the mediation of complexity, by allowing users to assess a variety of planning metrics without expert knowledgemore » in those disciplines. Using Envision and Envision Scenario Planner (ESP), both products of the Greening the Greyfields research project funded by the Cooperative Research Centre for Spatial Information (CRCSI) in Australia, we demonstrate a workflow for identifying potential redevelopment precincts and designing and assessing possible redevelopment scenarios to optimise planning outcomes.« less
Workflow of CAD / CAM Scoliosis Brace Adjustment in Preparation Using 3D Printing.

PubMed

Weiss, Hans-Rudolf; Tournavitis, Nicos; Nan, Xiaofeng; Borysov, Maksym; Paul, Lothar

2017-01-01

High correction bracing is the most effective conservative treatment for patients with scoliosis during growth. Still today braces for the treatment of scoliosis are made by casting patients while computer aided design (CAD) and computer aided manufacturing (CAM) is available with all possibilities to standardize pattern specific brace treatment and improve wearing comfort. CAD / CAM brace production mainly relies on carving a polyurethane foam model which is the basis for vacuuming a polyethylene (PE) or polypropylene (PP) brace. Purpose of this short communication is to describe the workflow currently used and to outline future requirements with respect to 3D printing technology. Description of the steps of virtual brace adjustment as available today are content of this paper as well as an outline of the great potential there is for the future 3D printing technology. For 3D printing of scoliosis braces it is necessary to establish easy to use software plug-ins in order to allow adding 3D printing technology to the current workflow of virtual CAD / CAM brace adjustment. Textures and structures can be added to the brace models at certain well defined locations offering the potential of more wearing comfort without losing in-brace correction. Advances have to be made in the field of CAD / CAM software tools with respect to design and generation of individually structured brace models based on currently well established and standardized scoliosis brace libraries.
IQ-Station: A Low Cost Portable Immersive Environment

DOE Office of Scientific and Technical Information (OSTI.GOV)

Eric Whiting; Patrick O'Leary; William Sherman

2010-11-01

The emergence of inexpensive 3D TV’s, affordable input and rendering hardware and open-source software has created a yeasty atmosphere for the development of low-cost immersive environments (IE). A low cost IE system, or IQ-station, fashioned from commercial off the shelf technology (COTS), coupled with a targeted immersive application can be a viable laboratory instrument for enhancing scientific workflow for exploration and analysis. The use of an IQ-station in a laboratory setting also has the potential of quickening the adoption of a more sophisticated immersive environment as a critical enabler in modern scientific and engineering workflows. Prior work in immersive environmentsmore » generally required either a head mounted display (HMD) system or a large projector-based implementation both of which have limitations in terms of cost, usability, or space requirements. The solution presented here provides an alternative platform providing a reasonable immersive experience that addresses those limitations. Our work brings together the needed hardware and software to create a fully integrated immersive display and interface system that can be readily deployed in laboratories and common workspaces. By doing so, it is now feasible for immersive technologies to be included in researchers’ day-to-day workflows. The IQ-Station sets the stage for much wider adoption of immersive environments outside the small communities of virtual reality centers.« less
MOLNs: A CLOUD PLATFORM FOR INTERACTIVE, REPRODUCIBLE, AND SCALABLE SPATIAL STOCHASTIC COMPUTATIONAL EXPERIMENTS IN SYSTEMS BIOLOGY USING PyURDME

PubMed Central

Drawert, Brian; Trogdon, Michael; Toor, Salman; Petzold, Linda; Hellander, Andreas

2017-01-01

Computational experiments using spatial stochastic simulations have led to important new biological insights, but they require specialized tools and a complex software stack, as well as large and scalable compute and data analysis resources due to the large computational cost associated with Monte Carlo computational workflows. The complexity of setting up and managing a large-scale distributed computation environment to support productive and reproducible modeling can be prohibitive for practitioners in systems biology. This results in a barrier to the adoption of spatial stochastic simulation tools, effectively limiting the type of biological questions addressed by quantitative modeling. In this paper, we present PyURDME, a new, user-friendly spatial modeling and simulation package, and MOLNs, a cloud computing appliance for distributed simulation of stochastic reaction-diffusion models. MOLNs is based on IPython and provides an interactive programming platform for development of sharable and reproducible distributed parallel computational experiments. PMID:28190948
Lipidomics informatics for life-science.

PubMed

Schwudke, D; Shevchenko, A; Hoffmann, N; Ahrends, R

2017-11-10

Lipidomics encompasses analytical approaches that aim to identify and quantify the complete set of lipids, defined as lipidome in a given cell, tissue or organism as well as their interactions with other molecules. The majority of lipidomics workflows is based on mass spectrometry and has been proven as a powerful tool in system biology in concert with other Omics disciplines. Unfortunately, bioinformatics infrastructures for this relatively young discipline are limited only to some specialists. Search engines, quantification algorithms, visualization tools and databases developed by the 'Lipidomics Informatics for Life-Science' (LIFS) partners will be restructured and standardized to provide broad access to these specialized bioinformatics pipelines. There are many medical challenges related to lipid metabolic alterations that will be fostered by capacity building suggested by LIFS. LIFS as member of the 'German Network for Bioinformatics' (de.NBI) node for 'Bioinformatics for Proteomics' (BioInfra.Prot) and will provide access to the described software as well as to tutorials and consulting services via a unified web-portal. Copyright © 2017 Elsevier B.V. All rights reserved.
Bar Coding and Tracking in Pathology.

PubMed

Hanna, Matthew G; Pantanowitz, Liron

2016-03-01

Bar coding and specimen tracking are intricately linked to pathology workflow and efficiency. In the pathology laboratory, bar coding facilitates many laboratory practices, including specimen tracking, automation, and quality management. Data obtained from bar coding can be used to identify, locate, standardize, and audit specimens to achieve maximal laboratory efficiency and patient safety. Variables that need to be considered when implementing and maintaining a bar coding and tracking system include assets to be labeled, bar code symbologies, hardware, software, workflow, and laboratory and information technology infrastructure as well as interoperability with the laboratory information system. This article addresses these issues, primarily focusing on surgical pathology. Copyright © 2016 Elsevier Inc. All rights reserved.
An integrated workflow for analysis of ChIP-chip data.

PubMed

Weigelt, Karin; Moehle, Christoph; Stempfl, Thomas; Weber, Bernhard; Langmann, Thomas

2008-08-01

Although ChIP-chip is a powerful tool for genome-wide discovery of transcription factor target genes, the steps involving raw data analysis, identification of promoters, and correlation with binding sites are still laborious processes. Therefore, we report an integrated workflow for the analysis of promoter tiling arrays with the Genomatix ChipInspector system. We compare this tool with open-source software packages to identify PU.1 regulated genes in mouse macrophages. Our results suggest that ChipInspector data analysis, comparative genomics for binding site prediction, and pathway/network modeling significantly facilitate and enhance whole-genome promoter profiling to reveal in vivo sites of transcription factor-DNA interactions.
Clinical Note Creation, Binning, and Artificial Intelligence

PubMed Central

Deliberato, Rodrigo Octávio; Stone, David J

2017-01-01

The creation of medical notes in software applications poses an intrinsic problem in workflow as the technology inherently intervenes in the processes of collecting and assembling information, as well as the production of a data-driven note that meets both individual and healthcare system requirements. In addition, the note writing applications in currently available electronic health records (EHRs) do not function to support decision making to any substantial degree. We suggest that artificial intelligence (AI) could be utilized to facilitate the workflows of the data collection and assembly processes, as well as to support the development of personalized, yet data-driven assessments and plans. PMID:28778845
Bar Coding and Tracking in Pathology.

PubMed

Hanna, Matthew G; Pantanowitz, Liron

2015-06-01

Bar coding and specimen tracking are intricately linked to pathology workflow and efficiency. In the pathology laboratory, bar coding facilitates many laboratory practices, including specimen tracking, automation, and quality management. Data obtained from bar coding can be used to identify, locate, standardize, and audit specimens to achieve maximal laboratory efficiency and patient safety. Variables that need to be considered when implementing and maintaining a bar coding and tracking system include assets to be labeled, bar code symbologies, hardware, software, workflow, and laboratory and information technology infrastructure as well as interoperability with the laboratory information system. This article addresses these issues, primarily focusing on surgical pathology. Copyright © 2015 Elsevier Inc. All rights reserved.
Software Prototyping: A Case Report of Refining User Requirements for a Health Information Exchange Dashboard.

PubMed

Nelson, Scott D; Del Fiol, Guilherme; Hanseler, Haley; Crouch, Barbara Insley; Cummins, Mollie R

2016-01-01

Health information exchange (HIE) between Poison Control Centers (PCCs) and Emergency Departments (EDs) could improve care of poisoned patients. However, PCC information systems are not designed to facilitate HIE with EDs; therefore, we are developing specialized software to support HIE within the normal workflow of the PCC using user-centered design and rapid prototyping. To describe the design of an HIE dashboard and the refinement of user requirements through rapid prototyping. Using previously elicited user requirements, we designed low-fidelity sketches of designs on paper with iterative refinement. Next, we designed an interactive high-fidelity prototype and conducted scenario-based usability tests with end users. Users were asked to think aloud while accomplishing tasks related to a case vignette. After testing, the users provided feedback and evaluated the prototype using the System Usability Scale (SUS). Survey results from three users provided useful feedback that was then incorporated into the design. After achieving a stable design, we used the prototype itself as the specification for development of the actual software. Benefits of prototyping included having 1) subject-matter experts heavily involved with the design; 2) flexibility to make rapid changes, 3) the ability to minimize software development efforts early in the design stage; 4) rapid finalization of requirements; 5) early visualization of designs; 6) and a powerful vehicle for communication of the design to the programmers. Challenges included 1) time and effort to develop the prototypes and case scenarios; 2) no simulation of system performance; 3) not having all proposed functionality available in the final product; and 4) missing needed data elements in the PCC information system.
Application Examples for Handle System Usage

NASA Astrophysics Data System (ADS)

Toussaint, F.; Weigel, T.; Thiemann, H.; Höck, H.; Stockhause, M.; Lautenschlager, M.

2012-12-01

Besides the well-known DOI (Digital Object Identifiers) as a special form of Handles that resolve to scientific publications there are various other applications in use. Others perhaps are just not yet. We present some examples for the existing ones and some ideas for the future. The national German project C3-Grid provides a framework to implement a first solution for provenance tracing and explore unforeseen implications. Though project-specific, the high-level architecture is generic and represents well a common notion of data derivation. Users select one or many input datasets and a workflow software module (an agent in this context) to execute on the data. The output data is deposited in a repository to be delivered to the user. All data is accompanied by an XML metadata document. All input and output data, metadata and the workflow module receive Handles and are linked together to establish a directed acyclic graph of derived data objects and involved agents. Data that has been modified by a workflow module is linked to its predecessor data and the workflow module involved. Version control systems such as svn or git provide Internet access to software repositories using URLs. To refer to a specific state of the source code of for instance a C3 workflow module, it is sufficient to reference the URL to the svn revision or git hash. In consequence, individual revisions and the repository as a whole receive PIDs. Moreover, the revision specific PIDs are linked to their respective predecessors and become part of the provenance graph. Another example for usage of PIDs in a current major project is given in EUDAT (European Data Infrastructure) which will link scientific data of several research communities together. In many fields it is necessary to provide data objects at multiple locations for a variety of applications. To ensure consistency, not only the master of a data object but also its copies shall be provided with a PID. To verify transaction safety and to keep all copies consistent requires that the chain from master to copy and vice versa has to be resolvable, preferably through PIDs directly. As part of EUDAT necessary services are created on the basis of iRODS. These form the core structure of the data infrastructure developed within EUDAT. Though many implementations of PID systems already exist, many valuable web accessible data sources come with unresolvable identifiers like UUIDs, with instable recognition patterns like URLs, or even with proprietary implementations. However, other data collections would like to link to them in the data descriptions of their metadata. In addition, by usage of PIDs one can decouple the responsibilities for data and MD in projects where necessary. For some metadata entities like persons or even institutes it makes sense to give them single PIDs that point to contact and/or location information. ORCID (Open Researcher & Contributor ID), e.g., keeps track of persons working in scholarly fields, independent of name changes and linguistic variances. The ISO 27729 based International Standard Name Identifier (ISNI) also identifies legal entities and fictional characters besides natural persons. Other systems exist that, e.g., reference geographic localities. IDs of this kind may resolve to a URL where detailed information is given.
Improving hydrologic disaster forecasting and response for transportation by assimilating and fusing NASA and other data sets : final report.

DOT National Transportation Integrated Search

2017-04-15

In this 3-year project, the research team developed the Hydrologic Disaster Forecast and Response (HDFR) system, a set of integrated software tools for end users that streamlines hydrologic prediction workflows involving automated retrieval of hetero...
The ABCs of PDFs.

ERIC Educational Resources Information Center

Adler, Steve

2000-01-01

Explains the use of Adobe Acrobat's Portable Document Format (PDF) for school Web sites and Intranets. Explains the PDF workflow; components for Web-based PDF delivery, including the Web server, preparing content of the PDF files, and the browser; incorporating PDFs into the Web site; incorporating multimedia; and software. (LRW)
Workflows for ingest of research data into digital archives - tests with Archivematica

NASA Astrophysics Data System (ADS)

Kirchner, I.; Bertelmann, R.; Gebauer, P.; Hasler, T.; Hirt, M.; Klump, J. F.; Peters-Kotting, W.; Rusch, B.; Ulbricht, D.

2013-12-01

Publication of research data and future re-use of measured data require the long-term preservation of digital objects. The ISO OAIS reference model defines responsibilities for long-term preservation of digital objects and although there is software available to support preservation of digital data, there are still problems remaining to be solved. A key task in preservation is to make the datasets ready for ingest into the archive, which is called the creation of Submission Information Packages (SIPs) in the OAIS model. This includes the creation of appropriate preservation metadata. Scientists need to be trained to deal with different types of data and to heighten their awareness for quality metadata. Other problems arise during the assembly of SIPs and during ingest into the archive because file format validators may produce conflicting output for identical data files and these conflicts are difficult to resolve automatically. Also, validation and identification tools are notorious for their poor performance. In the project EWIG Zuse-Institute Berlin acts as an infrastructure facility, while the Institute for Meteorology at FU Berlin and the German research Centre for Geosciences GFZ act as two different data producers. The aim of the project is to develop workflows for the transfer of research data into digital archives and the future re-use of data from long-term archives with emphasis on data from the geosciences. The technical work is supplemented by interviews with data practitioners at several institutions to identify problems in digital preservation workflows and by the development of university teaching materials to train students in the curation of research data and metadata. The free and open-source software Archivematica [1] is used as digital preservation system. The creation and ingest of SIPs has to meet several archival standards and be compatible to the Metadata Encoding and Transmission Standard (METS). The two data producers use different software in their workflows to test the assembly of SIPs and ingest of SIPs into the archive. GFZ Potsdam uses a combination of eSciDoc [2], panMetaDocs [3], and bagit [4] to collect research data and assemble SIPs for ingest into Archivematica, while the Institute for Meteorology at FU Berlin evaluates a variety of software solutions to describe data and publications and to generate SIPs. [1] http://www.archivematica.org [2] http://www.escidoc.org [3] http://panmetadocs.sf.net [4] http://sourceforge.net/projects/loc-xferutils/

Development of Analytical Plug-ins for ENSITE: Version 1.0

DTIC Science & Technology

2017-11-01

ENSITE’s core-software platform builds upon leading geospatial platforms already in use by the Army and is designed to offer an easy-to-use, customized ...use by the Army and is designed to offer an easy-to-use, customized set of workflows for CB planners. Within this platform are added software compo...public good . Find out more at www.erdc.usace.army.mil. To search for other technical reports published by ERDC, visit the ERDC online library at
Server-based enterprise collaboration software improves safety and quality in high-volume PET/CT practice.

PubMed

McDonald, James E; Kessler, Marcus M; Hightower, Jeremy L; Henry, Susan D; Deloney, Linda A

2013-12-01

With increasing volumes of complex imaging cases and rising economic pressure on physician staffing, timely reporting will become progressively challenging. Current and planned iterations of PACS and electronic medical record systems do not offer workflow management tools to coordinate delivery of imaging interpretations with the needs of the patient and ordering physician. The adoption of a server-based enterprise collaboration software system by our Division of Nuclear Medicine has significantly improved our efficiency and quality of service.
Managing Written Directives: A Software Solution to Streamline Workflow.

PubMed

Wagner, Robert H; Savir-Baruch, Bital; Gabriel, Medhat S; Halama, James R; Bova, Davide

2017-06-01

A written directive is required by the U.S. Nuclear Regulatory Commission for any use of 131 I above 1.11 MBq (30 μCi) and for patients receiving radiopharmaceutical therapy. This requirement has also been adopted and must be enforced by the agreement states. As the introduction of new radiopharmaceuticals increases therapeutic options in nuclear medicine, time spent on regulatory paperwork also increases. The pressure of managing these time-consuming regulatory requirements may heighten the potential for inaccurate or incomplete directive data and subsequent regulatory violations. To improve on the paper-trail method of directive management, we created a software tool using a Health Insurance Portability and Accountability Act (HIPAA)-compliant database. This software allows for secure data-sharing among physicians, technologists, and managers while saving time, reducing errors, and eliminating the possibility of loss and duplication. Methods: The software tool was developed using Visual Basic, which is part of the Visual Studio development environment for the Windows platform. Patient data are deposited in an Access database on a local HIPAA-compliant secure server or hard disk. Once a working version had been developed, it was installed at our institution and used to manage directives. Updates and modifications of the software were released regularly until no more significant problems were found with its operation. Results: The software has been used at our institution for over 2 y and has reliably kept track of all directives. All physicians and technologists use the software daily and find it superior to paper directives. They can retrieve active directives at any stage of completion, as well as completed directives. Conclusion: We have developed a software solution for the management of written directives that streamlines and structures the departmental workflow. This solution saves time, centralizes the information for all staff to share, and decreases confusion about the creation, completion, filing, and retrieval of directives. © 2017 by the Society of Nuclear Medicine and Molecular Imaging.
GiA Roots: software for the high throughput analysis of plant root system architecture.

PubMed

Galkovskyi, Taras; Mileyko, Yuriy; Bucksch, Alexander; Moore, Brad; Symonova, Olga; Price, Charles A; Topp, Christopher N; Iyer-Pascuzzi, Anjali S; Zurek, Paul R; Fang, Suqin; Harer, John; Benfey, Philip N; Weitz, Joshua S

2012-07-26

Characterizing root system architecture (RSA) is essential to understanding the development and function of vascular plants. Identifying RSA-associated genes also represents an underexplored opportunity for crop improvement. Software tools are needed to accelerate the pace at which quantitative traits of RSA are estimated from images of root networks. We have developed GiA Roots (General Image Analysis of Roots), a semi-automated software tool designed specifically for the high-throughput analysis of root system images. GiA Roots includes user-assisted algorithms to distinguish root from background and a fully automated pipeline that extracts dozens of root system phenotypes. Quantitative information on each phenotype, along with intermediate steps for full reproducibility, is returned to the end-user for downstream analysis. GiA Roots has a GUI front end and a command-line interface for interweaving the software into large-scale workflows. GiA Roots can also be extended to estimate novel phenotypes specified by the end-user. We demonstrate the use of GiA Roots on a set of 2393 images of rice roots representing 12 genotypes from the species Oryza sativa. We validate trait measurements against prior analyses of this image set that demonstrated that RSA traits are likely heritable and associated with genotypic differences. Moreover, we demonstrate that GiA Roots is extensible and an end-user can add functionality so that GiA Roots can estimate novel RSA traits. In summary, we show that the software can function as an efficient tool as part of a workflow to move from large numbers of root images to downstream analysis.
A software tool for advanced MRgFUS prostate therapy planning and follow up

NASA Astrophysics Data System (ADS)

van Straaten, Dörte; Hoogenboom, Martijn; van Amerongen, Martinus J.; Weiler, Florian; Issawi, Jumana Al; Günther, Matthias; Fütterer, Jurgen; Jenne, Jürgen W.

2017-03-01

US guided HIFU/FUS ablation for the therapy of prostate cancer is a clinical established method, while MR guided HIFU/FUS applications for prostate recently started clinical evaluation. Even if MRI examination is an excellent diagnostic tool for prostate cancer, it is a time consuming procedure and not practicable within an MRgFUS therapy session. The aim of our ongoing work is to develop software to support therapy planning and post-therapy follow-up for MRgFUS on localized prostate cancer, based on multi-parametric MR protocols. The clinical workflow of diagnosis, therapy and follow-up of MR guided FUS on prostate cancer was deeply analyzed. Based on this, the image processing workflow was designed and all necessary components, e.g. GUI, viewer, registration tools etc. were defined and implemented. The software bases on MeVisLab with several implemented C++ modules for the image processing tasks. The developed software, called LTC (Local Therapy Control) will register and visualize automatically all images (T1w, T2w, DWI etc.) and ADC or perfusion maps gained from the diagnostic MRI session. This maximum of diagnostic information helps to segment all necessary ROIs, e.g. the tumor, for therapy planning. Final therapy planning will be performed based on these segmentation data in the following MRgFUS therapy session. In addition, the developed software should help to evaluate the therapy success, by synchronization and display of pre-therapeutic, therapy and follow-up image data including the therapy plan and thermal dose information. In this ongoing project, the first stand-alone prototype was completed and will be clinically evaluated.
KDE Bioscience: platform for bioinformatics analysis workflows.

PubMed

Lu, Qiang; Hao, Pei; Curcin, Vasa; He, Weizhong; Li, Yuan-Yuan; Luo, Qing-Ming; Guo, Yi-Ke; Li, Yi-Xue

2006-08-01

Bioinformatics is a dynamic research area in which a large number of algorithms and programs have been developed rapidly and independently without much consideration so far of the need for standardization. The lack of such common standards combined with unfriendly interfaces make it difficult for biologists to learn how to use these tools and to translate the data formats from one to another. Consequently, the construction of an integrative bioinformatics platform to facilitate biologists' research is an urgent and challenging task. KDE Bioscience is a java-based software platform that collects a variety of bioinformatics tools and provides a workflow mechanism to integrate them. Nucleotide and protein sequences from local flat files, web sites, and relational databases can be entered, annotated, and aligned. Several home-made or 3rd-party viewers are built-in to provide visualization of annotations or alignments. KDE Bioscience can also be deployed in client-server mode where simultaneous execution of the same workflow is supported for multiple users. Moreover, workflows can be published as web pages that can be executed from a web browser. The power of KDE Bioscience comes from the integrated algorithms and data sources. With its generic workflow mechanism other novel calculations and simulations can be integrated to augment the current sequence analysis functions. Because of this flexible and extensible architecture, KDE Bioscience makes an ideal integrated informatics environment for future bioinformatics or systems biology research.
pyNSMC: A Python Module for Null-Space Monte Carlo Uncertainty Analysis

NASA Astrophysics Data System (ADS)

White, J.; Brakefield, L. K.

2015-12-01

The null-space monte carlo technique is a non-linear uncertainty analyses technique that is well-suited to high-dimensional inverse problems. While the technique is powerful, the existing workflow for completing null-space monte carlo is cumbersome, requiring the use of multiple commandline utilities, several sets of intermediate files and even a text editor. pyNSMC is an open-source python module that automates the workflow of null-space monte carlo uncertainty analyses. The module is fully compatible with the PEST and PEST++ software suites and leverages existing functionality of pyEMU, a python framework for linear-based uncertainty analyses. pyNSMC greatly simplifies the existing workflow for null-space monte carlo by taking advantage of object oriented design facilities in python. The core of pyNSMC is the ensemble class, which draws and stores realized random vectors and also provides functionality for exporting and visualizing results. By relieving users of the tedium associated with file handling and command line utility execution, pyNSMC instead focuses the user on the important steps and assumptions of null-space monte carlo analysis. Furthermore, pyNSMC facilitates learning through flow charts and results visualization, which are available at many points in the algorithm. The ease-of-use of the pyNSMC workflow is compared to the existing workflow for null-space monte carlo for a synthetic groundwater model with hundreds of estimable parameters.
Workflow Management for Complex HEP Analyses

NASA Astrophysics Data System (ADS)

Erdmann, M.; Fischer, R.; Rieger, M.; von Cube, R. F.

2017-10-01

We present the novel Analysis Workflow Management (AWM) that provides users with the tools and competences of professional large scale workflow systems, e.g. Apache’s Airavata[1]. The approach presents a paradigm shift from executing parts of the analysis to defining the analysis. Within AWM an analysis consists of steps. For example, a step defines to run a certain executable for multiple files of an input data collection. Each call to the executable for one of those input files can be submitted to the desired run location, which could be the local computer or a remote batch system. An integrated software manager enables automated user installation of dependencies in the working directory at the run location. Each execution of a step item creates one report for bookkeeping purposes containing error codes and output data or file references. Required files, e.g. created by previous steps, are retrieved automatically. Since data storage and run locations are exchangeable from the steps perspective, computing resources can be used opportunistically. A visualization of the workflow as a graph of the steps in the web browser provides a high-level view on the analysis. The workflow system is developed and tested alongside of a ttbb cross section measurement where, for instance, the event selection is represented by one step and a Bayesian statistical inference is performed by another. The clear interface and dependencies between steps enables a make-like execution of the whole analysis.
Galaxy tools and workflows for sequence analysis with applications in molecular plant pathology.

PubMed

Cock, Peter J A; Grüning, Björn A; Paszkiewicz, Konrad; Pritchard, Leighton

2013-01-01

The Galaxy Project offers the popular web browser-based platform Galaxy for running bioinformatics tools and constructing simple workflows. Here, we present a broad collection of additional Galaxy tools for large scale analysis of gene and protein sequences. The motivating research theme is the identification of specific genes of interest in a range of non-model organisms, and our central example is the identification and prediction of "effector" proteins produced by plant pathogens in order to manipulate their host plant. This functional annotation of a pathogen's predicted capacity for virulence is a key step in translating sequence data into potential applications in plant pathology. This collection includes novel tools, and widely-used third-party tools such as NCBI BLAST+ wrapped for use within Galaxy. Individual bioinformatics software tools are typically available separately as standalone packages, or in online browser-based form. The Galaxy framework enables the user to combine these and other tools to automate organism scale analyses as workflows, without demanding familiarity with command line tools and scripting. Workflows created using Galaxy can be saved and are reusable, so may be distributed within and between research groups, facilitating the construction of a set of standardised, reusable bioinformatic protocols. The Galaxy tools and workflows described in this manuscript are open source and freely available from the Galaxy Tool Shed (http://usegalaxy.org/toolshed or http://toolshed.g2.bx.psu.edu).
PyDBS: an automated image processing workflow for deep brain stimulation surgery.

PubMed

D'Albis, Tiziano; Haegelen, Claire; Essert, Caroline; Fernández-Vidal, Sara; Lalys, Florent; Jannin, Pierre

2015-02-01

Deep brain stimulation (DBS) is a surgical procedure for treating motor-related neurological disorders. DBS clinical efficacy hinges on precise surgical planning and accurate electrode placement, which in turn call upon several image processing and visualization tasks, such as image registration, image segmentation, image fusion, and 3D visualization. These tasks are often performed by a heterogeneous set of software tools, which adopt differing formats and geometrical conventions and require patient-specific parameterization or interactive tuning. To overcome these issues, we introduce in this article PyDBS, a fully integrated and automated image processing workflow for DBS surgery. PyDBS consists of three image processing pipelines and three visualization modules assisting clinicians through the entire DBS surgical workflow, from the preoperative planning of electrode trajectories to the postoperative assessment of electrode placement. The system's robustness, speed, and accuracy were assessed by means of a retrospective validation, based on 92 clinical cases. The complete PyDBS workflow achieved satisfactory results in 92 % of tested cases, with a median processing time of 28 min per patient. The results obtained are compatible with the adoption of PyDBS in clinical practice.
An NGS Workflow Blueprint for DNA Sequencing Data and Its Application in Individualized Molecular Oncology

PubMed Central

Li, Jian; Batcha, Aarif Mohamed Nazeer; Grüning, Björn; Mansmann, Ulrich R.

2015-01-01

Next-generation sequencing (NGS) technologies that have advanced rapidly in the past few years possess the potential to classify diseases, decipher the molecular code of related cell processes, identify targets for decision-making on targeted therapy or prevention strategies, and predict clinical treatment response. Thus, NGS is on its way to revolutionize oncology. With the help of NGS, we can draw a finer map for the genetic basis of diseases and can improve our understanding of diagnostic and prognostic applications and therapeutic methods. Despite these advantages and its potential, NGS is facing several critical challenges, including reduction of sequencing cost, enhancement of sequencing quality, improvement of technical simplicity and reliability, and development of semiautomated and integrated analysis workflow. In order to address these challenges, we conducted a literature research and summarized a four-stage NGS workflow for providing a systematic review on NGS-based analysis, explaining the strength and weakness of diverse NGS-based software tools, and elucidating its potential connection to individualized medicine. By presenting this four-stage NGS workflow, we try to provide a minimal structural layout required for NGS data storage and reproducibility. PMID:27081306
Oqtans: the RNA-seq workbench in the cloud for complete and reproducible quantitative transcriptome analysis.

PubMed

Sreedharan, Vipin T; Schultheiss, Sebastian J; Jean, Géraldine; Kahles, André; Bohnert, Regina; Drewe, Philipp; Mudrakarta, Pramod; Görnitz, Nico; Zeller, Georg; Rätsch, Gunnar

2014-05-01

We present Oqtans, an open-source workbench for quantitative transcriptome analysis, that is integrated in Galaxy. Its distinguishing features include customizable computational workflows and a modular pipeline architecture that facilitates comparative assessment of tool and data quality. Oqtans integrates an assortment of machine learning-powered tools into Galaxy, which show superior or equal performance to state-of-the-art tools. Implemented tools comprise a complete transcriptome analysis workflow: short-read alignment, transcript identification/quantification and differential expression analysis. Oqtans and Galaxy facilitate persistent storage, data exchange and documentation of intermediate results and analysis workflows. We illustrate how Oqtans aids the interpretation of data from different experiments in easy to understand use cases. Users can easily create their own workflows and extend Oqtans by integrating specific tools. Oqtans is available as (i) a cloud machine image with a demo instance at cloud.oqtans.org, (ii) a public Galaxy instance at galaxy.cbio.mskcc.org, (iii) a git repository containing all installed software (oqtans.org/git); most of which is also available from (iv) the Galaxy Toolshed and (v) a share string to use along with Galaxy CloudMan.
Managing the CMS Data and Monte Carlo Processing during LHC Run 2

NASA Astrophysics Data System (ADS)

Wissing, C.; CMS Collaboration

2017-10-01

In order to cope with the challenges expected during the LHC Run 2 CMS put in a number of enhancements into the main software packages and the tools used for centrally managed processing. In the presentation we will highlight these improvements that allow CMS to deal with the increased trigger output rate, the increased pileup and the evolution in computing technology. The overall system aims at high flexibility, improved operational flexibility and largely automated procedures. The tight coupling of workflow classes to types of sites has been drastically relaxed. Reliable and high-performing networking between most of the computing sites and the successful deployment of a data-federation allow the execution of workflows using remote data access. That required the development of a largely automatized system to assign workflows and to handle necessary pre-staging of data. Another step towards flexibility has been the introduction of one large global HTCondor Pool for all types of processing workflows and analysis jobs. Besides classical Grid resources also some opportunistic resources as well as Cloud resources have been integrated into that Pool, which gives reach to more than 200k CPU cores.
LATUX: An Iterative Workflow for Designing, Validating, and Deploying Learning Analytics Visualizations

ERIC Educational Resources Information Center

Martinez-Maldonado, Roberto; Pardo, Abelardo; Mirriahi, Negin; Yacef, Kalina; Kay, Judy; Clayphan, Andrew

2015-01-01

Designing, validating, and deploying learning analytics tools for instructors or students is a challenge that requires techniques and methods from different disciplines, such as software engineering, human-computer interaction, computer graphics, educational design, and psychology. Whilst each has established its own design methodologies, we now…
Ontology-Driven Discovery of Scientific Computational Entities

ERIC Educational Resources Information Center

Brazier, Pearl W.

2010-01-01

Many geoscientists use modern computational resources, such as software applications, Web services, scientific workflows and datasets that are readily available on the Internet, to support their research and many common tasks. These resources are often shared via human contact and sometimes stored in data portals; however, they are not necessarily…
Engineering Documentation and Data Control

NASA Technical Reports Server (NTRS)

Matteson, Michael J.; Bramley, Craig; Ciaruffoli, Veronica

2001-01-01

Mississippi Space Services (MSS) the facility services contractor for NASA's John C. Stennis Space Center (SSC), is utilizing technology to improve engineering documentation and data control. Two identified improvement areas, labor intensive documentation research and outdated drafting standards, were targeted as top priority. MSS selected AutoManager(R) WorkFlow from Cyco software to manage engineering documentation. The software is currently installed on over 150 desctops. The outdated SSC drafting standard was written for pre-CADD drafting methods, in other words, board drafting. Implementation of COTS software solutions to manage engineering documentation and update the drafting standard resulted in significant increases in productivity by reducing the time spent searching for documents.
Software to Manage the Unmanageable

NASA Technical Reports Server (NTRS)

2005-01-01

In 1995, NASA s Jet Propulsion Laboratory (JPL) contracted Redmond, Washington-based Lucidoc Corporation, to design a technology infrastructure to automate the intersection between policy management and operations management with advanced software that automates document workflow, document status, and uniformity of document layout. JPL had very specific parameters for the software. It expected to store and catalog over 8,000 technical and procedural documents integrated with hundreds of processes. The project ended in 2000, but NASA still uses the resulting highly secure document management system, and Lucidoc has managed to help other organizations, large and small, with integrating document flow and operations management to ensure a compliance-ready culture.
JMS: An Open Source Workflow Management System and Web-Based Cluster Front-End for High Performance Computing.

PubMed

Brown, David K; Penkler, David L; Musyoka, Thommas M; Bishop, Özlem Tastan

2015-01-01

Complex computational pipelines are becoming a staple of modern scientific research. Often these pipelines are resource intensive and require days of computing time. In such cases, it makes sense to run them over high performance computing (HPC) clusters where they can take advantage of the aggregated resources of many powerful computers. In addition to this, researchers often want to integrate their workflows into their own web servers. In these cases, software is needed to manage the submission of jobs from the web interface to the cluster and then return the results once the job has finished executing. We have developed the Job Management System (JMS), a workflow management system and web interface for high performance computing (HPC). JMS provides users with a user-friendly web interface for creating complex workflows with multiple stages. It integrates this workflow functionality with the resource manager, a tool that is used to control and manage batch jobs on HPC clusters. As such, JMS combines workflow management functionality with cluster administration functionality. In addition, JMS provides developer tools including a code editor and the ability to version tools and scripts. JMS can be used by researchers from any field to build and run complex computational pipelines and provides functionality to include these pipelines in external interfaces. JMS is currently being used to house a number of bioinformatics pipelines at the Research Unit in Bioinformatics (RUBi) at Rhodes University. JMS is an open-source project and is freely available at https://github.com/RUBi-ZA/JMS.
Workflow efficiency of two 1.5 T MR scanners with and without an automated user interface for head examinations.

PubMed

Moenninghoff, Christoph; Umutlu, Lale; Kloeters, Christian; Ringelstein, Adrian; Ladd, Mark E; Sombetzki, Antje; Lauenstein, Thomas C; Forsting, Michael; Schlamann, Marc

2013-06-01

Workflow efficiency and workload of radiological technologists (RTs) were compared in head examinations performed with two 1.5 T magnetic resonance (MR) scanners equipped with or without an automated user interface called "day optimizing throughput" (Dot) workflow engine. Thirty-four patients with known intracranial pathology were examined with a 1.5 T MR scanner with Dot workflow engine (Siemens MAGNETOM Aera) and with a 1.5 T MR scanner with conventional user interface (Siemens MAGNETOM Avanto) using four standardized examination protocols. The elapsed time for all necessary work steps, which were performed by 11 RTs within the total examination time, was compared for each examination at both MR scanners. The RTs evaluated the user-friendliness of both scanners by a questionnaire. Normality of distribution was checked for all continuous variables by use of the Shapiro-Wilk test. Normally distributed variables were analyzed by Student's paired t-test, otherwise Wilcoxon signed-rank test was used to compare means. Total examination time of MR examinations performed with Dot engine was reduced from 24:53 to 20:01 minutes (P < .001) and the necessary RT intervention decreased by 61% (P < .001). The Dot engine's automated choice of MR protocols was significantly better assessed by the RTs than the conventional user interface (P = .001). According to this preliminary study, the Dot workflow engine is a time-saving user assistance software, which decreases the RTs' effort significantly and may help to automate neuroradiological examinations for a higher workflow efficiency. Copyright © 2013 AUR. Published by Elsevier Inc. All rights reserved.
JMS: An Open Source Workflow Management System and Web-Based Cluster Front-End for High Performance Computing

PubMed Central

Brown, David K.; Penkler, David L.; Musyoka, Thommas M.; Bishop, Özlem Tastan

2015-01-01

Complex computational pipelines are becoming a staple of modern scientific research. Often these pipelines are resource intensive and require days of computing time. In such cases, it makes sense to run them over high performance computing (HPC) clusters where they can take advantage of the aggregated resources of many powerful computers. In addition to this, researchers often want to integrate their workflows into their own web servers. In these cases, software is needed to manage the submission of jobs from the web interface to the cluster and then return the results once the job has finished executing. We have developed the Job Management System (JMS), a workflow management system and web interface for high performance computing (HPC). JMS provides users with a user-friendly web interface for creating complex workflows with multiple stages. It integrates this workflow functionality with the resource manager, a tool that is used to control and manage batch jobs on HPC clusters. As such, JMS combines workflow management functionality with cluster administration functionality. In addition, JMS provides developer tools including a code editor and the ability to version tools and scripts. JMS can be used by researchers from any field to build and run complex computational pipelines and provides functionality to include these pipelines in external interfaces. JMS is currently being used to house a number of bioinformatics pipelines at the Research Unit in Bioinformatics (RUBi) at Rhodes University. JMS is an open-source project and is freely available at https://github.com/RUBi-ZA/JMS. PMID:26280450

PhenoTips: patient phenotyping software for clinical and research use.

PubMed

Girdea, Marta; Dumitriu, Sergiu; Fiume, Marc; Bowdin, Sarah; Boycott, Kym M; Chénier, Sébastien; Chitayat, David; Faghfoury, Hanna; Meyn, M Stephen; Ray, Peter N; So, Joyce; Stavropoulos, Dimitri J; Brudno, Michael

2013-08-01

We have developed PhenoTips: open source software for collecting and analyzing phenotypic information for patients with genetic disorders. Our software combines an easy-to-use interface, compatible with any device that runs a Web browser, with a standardized database back end. The PhenoTips' user interface closely mirrors clinician workflows so as to facilitate the recording of observations made during the patient encounter. Collected data include demographics, medical history, family history, physical and laboratory measurements, physical findings, and additional notes. Phenotypic information is represented using the Human Phenotype Ontology; however, the complexity of the ontology is hidden behind a user interface, which combines simple selection of common phenotypes with error-tolerant, predictive search of the entire ontology. PhenoTips supports accurate diagnosis by analyzing the entered data, then suggesting additional clinical investigations and providing Online Mendelian Inheritance in Man (OMIM) links to likely disorders. By collecting, classifying, and analyzing phenotypic information during the patient encounter, PhenoTips allows for streamlining of clinic workflow, efficient data entry, improved diagnosis, standardization of collected patient phenotypes, and sharing of anonymized patient phenotype data for the study of rare disorders. Our source code and a demo version of PhenoTips are available at http://phenotips.org. © 2013 WILEY PERIODICALS, INC.
Implementation of Systematic Review Tools in IRIS | Science ...

EPA Pesticide Factsheets

Currently, the number of chemicals present in the environment exceeds the ability of public health scientists to efficiently screen the available data in order to produce well-informed human health risk assessments in a timely manner. For this reason, the US EPA’s Integrated Risk Information System (IRIS) program has started implementing new software tools into the hazard characterization workflow. These automated tools aid in multiple phases of the systematic review process including scoping and problem formulation, literature search, and identification and screening of available published studies. The increased availability of these tools lays the foundation for automating or semi-automating multiple phases of the systematic review process. Some of these software tools include modules to facilitate a structured approach to study quality evaluation of human and animal data, although approaches are generally lacking for assessing complex mechanistic information, in particular “omics”-based evidence tools are starting to become available to evaluate these types of studies. We will highlight how new software programs, online tools, and approaches for assessing study quality can be better integrated to allow for a more efficient and transparent workflow of the risk assessment process as well as identify tool gaps that would benefit future risk assessments. Disclaimer: The views expressed here are those of the authors and do not necessarily represent the view
Web Time-Management Tool

NASA Technical Reports Server (NTRS)

2000-01-01

Oak Grove Reactor, developed by Oak Grove Systems, is a new software program that allows users to integrate workflow processes. It can be used with portable communication devices. The software can join e-mail, calendar/scheduling and legacy applications into one interactive system via the web. Priority tasks and due dates are organized and highlighted to keep the user up to date with developments. Reactor works with existing software and few new skills are needed to use it. Using a web browser, a user can can work on something while other users can work on the same procedure or view its status while it is being worked on at another site. The software was developed by the Jet Propulsion Lab and originally put to use at Johnson Space Center.
The eSMAF: a software for the assessment and follow-up of functional autonomy in geriatrics

PubMed Central

Boissy, Patrick; Brière, Simon; Tousignant, Michel; Rousseau, Eric

2007-01-01

Background Functional status or disability forms the core of most assessment instruments used to identify mix and level of resources and services needed by older adults who possess common characteristics. The Functional Autonomy Measurement System (SMAF) is a 29-item scale measuring functional ability in five different areas. It has been recommended for use for home care, for allocation of chronic beds, for developing care plans in institutional settings and for epidemiological and evaluative studies. The SMAF can also be used with a case-mix classification system (Iso-SMAF) to allocate resources based on patients' functional autonomy characteristics. The objective of this project was to develop a software version of the SMAF to facilitate the evaluation of the functional status of older adults in health services research and to optimize the clinical decision-making process. Results The eSMAF was developed over an 24-month period using a modified waterfall software engineering process. Requirements and functional specifications were determined using focus groups of stakeholders. Different versions of the software were iteratively field-tested in clinical and research environments and software adaptations made accordingly. User documentation and online help were created to assist the deployment of the software. The software is available in French or English versions under a 30-day unregistered demonstration license or a free restricted registered academic license. It can be used locally on a Windows-based PC or over a network to input SMAF data into a database, search and aggregate client data according to clinical and/or administrative criteria, and generate summary or detailed reports of selected data sets for print or export to another database. Conclusion In the last year, the software has been successfully deployed in the clinical workflow of different institutions in research and clinical applications. The software performed relatively well in terms of stability and performance. Barriers to implementation included antiquated computer hardware, low computer literacy and access to IT support. Key factors for the deployment of the software included standardization of the workflow, user training and support. PMID:17298673
76 FR 50812 - Seventeenth Meeting: RTCA Special Committee 205/EUROCAE WG-71: Software Considerations in...

Federal Register 2010, 2011, 2012, 2013, 2014

2011-08-16

... Committee 205/EUROCAE WG-71: Software Considerations in Aeronautical Systems AGENCY: Federal Aviation Administration (FAA), DOT. ACTION: Notice of RTCA Special Committee 205/EUROCAE WG-71 meeting: Software... of RTCA Special Committee 205/EUROCAE WG-71: Software Considerations in Aeronautical Systems Agenda...
76 FR 16469 - Sixteenth Meeting: RTCA Special Committee 205/EUROCAE WG-71: Software Considerations in...

Federal Register 2010, 2011, 2012, 2013, 2014

2011-03-23

... Committee 205/EUROCAE WG-71: Software Considerations in Aeronautical Systems AGENCY: Federal Aviation Administration (FAA), DOT. ACTION: Notice of RTCA Special Committee 205/EUROCAE WG-71 meeting: Software... of RTCA Special Committee 205/EUROCAE WG-71: Software Considerations in Aeronautical Systems Agenda...
HydroDesktop: An Open Source GIS-Based Platform for Hydrologic Data Discovery, Visualization, and Analysis

NASA Astrophysics Data System (ADS)

Ames, D. P.; Kadlec, J.; Cao, Y.; Grover, D.; Horsburgh, J. S.; Whiteaker, T.; Goodall, J. L.; Valentine, D. W.

2010-12-01

A growing number of hydrologic information servers are being deployed by government agencies, university networks, and individual researchers using the Consortium of Universities for the Advancement of Hydrologic Science, Inc. (CUAHSI) Hydrologic Information System (HIS). The CUAHSI HIS Project has developed a standard software stack, called HydroServer, for publishing hydrologic observations data. It includes the Observations Data Model (ODM) database and Water Data Service web services, which together enable publication of data on the Internet in a standard format called Water Markup Language (WaterML). Metadata describing available datasets hosted on these servers is compiled within a central metadata catalog called HIS Central at the San Diego Supercomputer Center and is searchable through a set of predefined web services based queries. Together, these servers and central catalog service comprise a federated HIS of a scale and comprehensiveness never previously available. This presentation will briefly review/introduce the CUAHSI HIS system with special focus on a new HIS software tool called "HydroDesktop" and the open source software development web portal, www.HydroDesktop.org, which supports community development and maintenance of the software. HydroDesktop is a client-side, desktop software application that acts as a search and discovery tool for exploring the distributed network of HydroServers, downloading specific data series, visualizing and summarizing data series and exporting these to formats needed for analysis by external software. HydroDesktop is based on the open source DotSpatial GIS developer toolkit which provides it with map-based data interaction and visualization, and a plug-in interface that can be used by third party developers and researchers to easily extend the software using Microsoft .NET programming languages. HydroDesktop plug-ins that are presently available or currently under development within the project and by third party collaborators include functions for data search and discovery, extensive graphing, data editing and export, HydroServer exploration, integration with the OpenMI workflow and modeling system, and an interface for data analysis through the R statistical package.
Options in virtual 3D, optical-impression-based planning of dental implants.

PubMed

Reich, Sven; Kern, Thomas; Ritter, Lutz

2014-01-01

If a 3D radiograph, which in today's dentistry often consists of a CBCT dataset, is available for computerized implant planning, the 3D planning should also consider functional prosthetic aspects. In a conventional workflow, the CBCT is done with a specially produced radiopaque prosthetic setup that makes the desired prosthetic situation visible during virtual implant planning. If an exclusively digital workflow is chosen, intraoral digital impressions are taken. On these digital models, the desired prosthetic suprastructures are designed. The entire datasets are virtually superimposed by a "registration" process on the corresponding structures (teeth) in the CBCTs. Thus, both the osseous and prosthetic structures are visible in one single 3D application and make it possible to consider surgical and prosthetic aspects. After having determined the implant positions on the computer screen, a drilling template is designed digitally. According to this design (CAD), a template is printed or milled in CAM process. This template is the first physically extant product in the entire workflow. The article discusses the options and limitations of this workflow.
CyREST: Turbocharging Cytoscape Access for External Tools via a RESTful API.

PubMed

Ono, Keiichiro; Muetze, Tanja; Kolishovski, Georgi; Shannon, Paul; Demchak, Barry

2015-01-01

As bioinformatic workflows become increasingly complex and involve multiple specialized tools, so does the difficulty of reliably reproducing those workflows. Cytoscape is a critical workflow component for executing network visualization, analysis, and publishing tasks, but it can be operated only manually via a point-and-click user interface. Consequently, Cytoscape-oriented tasks are laborious and often error prone, especially with multistep protocols involving many networks. In this paper, we present the new cyREST Cytoscape app and accompanying harmonization libraries. Together, they improve workflow reproducibility and researcher productivity by enabling popular languages (e.g., Python and R, JavaScript, and C#) and tools (e.g., IPython/Jupyter Notebook and RStudio) to directly define and query networks, and perform network analysis, layouts and renderings. We describe cyREST's API and overall construction, and present Python- and R-based examples that illustrate how Cytoscape can be integrated into large scale data analysis pipelines. cyREST is available in the Cytoscape app store (http://apps.cytoscape.org) where it has been downloaded over 1900 times since its release in late 2014.
CyREST: Turbocharging Cytoscape Access for External Tools via a RESTful API

PubMed Central

Ono, Keiichiro; Muetze, Tanja; Kolishovski, Georgi; Shannon, Paul; Demchak, Barry

2015-01-01

As bioinformatic workflows become increasingly complex and involve multiple specialized tools, so does the difficulty of reliably reproducing those workflows. Cytoscape is a critical workflow component for executing network visualization, analysis, and publishing tasks, but it can be operated only manually via a point-and-click user interface. Consequently, Cytoscape-oriented tasks are laborious and often error prone, especially with multistep protocols involving many networks. In this paper, we present the new cyREST Cytoscape app and accompanying harmonization libraries. Together, they improve workflow reproducibility and researcher productivity by enabling popular languages (e.g., Python and R, JavaScript, and C#) and tools (e.g., IPython/Jupyter Notebook and RStudio) to directly define and query networks, and perform network analysis, layouts and renderings. We describe cyREST’s API and overall construction, and present Python- and R-based examples that illustrate how Cytoscape can be integrated into large scale data analysis pipelines. cyREST is available in the Cytoscape app store (http://apps.cytoscape.org) where it has been downloaded over 1900 times since its release in late 2014. PMID:26672762
Using the CMS threaded framework in a production environment

DOE PAGES

Jones, C. D.; Contreras, L.; Gartung, P.; ...

2015-12-23

During 2014, the CMS Offline and Computing Organization completed the necessary changes to use the CMS threaded framework in the full production environment. We will briefly discuss the design of the CMS Threaded Framework, in particular how the design affects scaling performance. We will then cover the effort involved in getting both the CMSSW application software and the workflow management system ready for using multiple threads for production. Finally, we will present metrics on the performance of the application and workflow system as well as the difficulties which were uncovered. As a result, we will end with CMS' plans formore » using the threaded framework to do production for LHC Run 2.« less
Usability study of clinical exome analysis software: top lessons learned and recommendations.

PubMed

Shyr, Casper; Kushniruk, Andre; Wasserman, Wyeth W

2014-10-01

New DNA sequencing technologies have revolutionized the search for genetic disruptions. Targeted sequencing of all protein coding regions of the genome, called exome analysis, is actively used in research-oriented genetics clinics, with the transition to exomes as a standard procedure underway. This transition is challenging; identification of potentially causal mutation(s) amongst ∼10(6) variants requires specialized computation in combination with expert assessment. This study analyzes the usability of user interfaces for clinical exome analysis software. There are two study objectives: (1) To ascertain the key features of successful user interfaces for clinical exome analysis software based on the perspective of expert clinical geneticists, (2) To assess user-system interactions in order to reveal strengths and weaknesses of existing software, inform future design, and accelerate the clinical uptake of exome analysis. Surveys, interviews, and cognitive task analysis were performed for the assessment of two next-generation exome sequence analysis software packages. The subjects included ten clinical geneticists who interacted with the software packages using the "think aloud" method. Subjects' interactions with the software were recorded in their clinical office within an urban research and teaching hospital. All major user interface events (from the user interactions with the packages) were time-stamped and annotated with coding categories to identify usability issues in order to characterize desired features and deficiencies in the user experience. We detected 193 usability issues, the majority of which concern interface layout and navigation, and the resolution of reports. Our study highlights gaps in specific software features typical within exome analysis. The clinicians perform best when the flow of the system is structured into well-defined yet customizable layers for incorporation within the clinical workflow. The results highlight opportunities to dramatically accelerate clinician analysis and interpretation of patient genomic data. We present the first application of usability methods to evaluate software interfaces in the context of exome analysis. Our results highlight how the study of user responses can lead to identification of usability issues and challenges and reveal software reengineering opportunities for improving clinical next-generation sequencing analysis. While the evaluation focused on two distinctive software tools, the results are general and should inform active and future software development for genome analysis software. As large-scale genome analysis becomes increasingly common in healthcare, it is critical that efficient and effective software interfaces are provided to accelerate clinical adoption of the technology. Implications for improved design of such applications are discussed. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.
A data management and publication workflow for a large-scale, heterogeneous sensor network.

PubMed

Jones, Amber Spackman; Horsburgh, Jeffery S; Reeder, Stephanie L; Ramírez, Maurier; Caraballo, Juan

2015-06-01

It is common for hydrology researchers to collect data using in situ sensors at high frequencies, for extended durations, and with spatial distributions that produce data volumes requiring infrastructure for data storage, management, and sharing. The availability and utility of these data in addressing scientific questions related to water availability, water quality, and natural disasters relies on effective cyberinfrastructure that facilitates transformation of raw sensor data into usable data products. It also depends on the ability of researchers to share and access the data in useable formats. In this paper, we describe a data management and publication workflow and software tools for research groups and sites conducting long-term monitoring using in situ sensors. Functionality includes the ability to track monitoring equipment inventory and events related to field maintenance. Linking this information to the observational data is imperative in ensuring the quality of sensor-based data products. We present these tools in the context of a case study for the innovative Urban Transitions and Aridregion Hydrosustainability (iUTAH) sensor network. The iUTAH monitoring network includes sensors at aquatic and terrestrial sites for continuous monitoring of common meteorological variables, snow accumulation and melt, soil moisture, surface water flow, and surface water quality. We present the overall workflow we have developed for effectively transferring data from field monitoring sites to ultimate end-users and describe the software tools we have deployed for storing, managing, and sharing the sensor data. These tools are all open source and available for others to use.
A Portable Regional Weather and Climate Downscaling System Using GEOS-5, LIS-6, WRF, and the NASA Workflow Tool

NASA Astrophysics Data System (ADS)

Kemp, E. M.; Putman, W. M.; Gurganus, J.; Burns, R. W.; Damon, M. R.; McConaughy, G. R.; Seablom, M. S.; Wojcik, G. S.

2009-12-01

We present a regional downscaling system (RDS) suitable for high-resolution weather and climate simulations in multiple supercomputing environments. The RDS is built on the NASA Workflow Tool, a software framework for configuring, running, and managing computer models on multiple platforms with a graphical user interface. The Workflow Tool is used to run the NASA Goddard Earth Observing System Model Version 5 (GEOS-5), a global atmospheric-ocean model for weather and climate simulations down to 1/4 degree resolution; the NASA Land Information System Version 6 (LIS-6), a land surface modeling system that can simulate soil temperature and moisture profiles; and the Weather Research and Forecasting (WRF) community model, a limited-area atmospheric model for weather and climate simulations down to 1-km resolution. The Workflow Tool allows users to customize model settings to user needs; saves and organizes simulation experiments; distributes model runs across different computer clusters (e.g., the DISCOVER cluster at Goddard Space Flight Center, the Cray CX-1 Desktop Supercomputer, etc.); and handles all file transfers and network communications (e.g., scp connections). Together, the RDS is intended to aid researchers by making simulations as easy as possible to generate on the computer resources available. Initial conditions for LIS-6 and GEOS-5 are provided by Modern Era Retrospective-Analysis for Research and Applications (MERRA) reanalysis data stored on DISCOVER. The LIS-6 is first run for 2-4 years forced by MERRA atmospheric analyses, generating initial conditions for the WRF soil physics. GEOS-5 is then initialized from MERRA data and run for the period of interest. Large-scale atmospheric data, sea-surface temperatures, and sea ice coverage from GEOS-5 are used as boundary conditions for WRF, which is run for the same period of interest. Multiply nested grids are used for both LIS-6 and WRF, with the innermost grid run at a resolution sufficient for typical local weather features (terrain, convection, etc.) All model runs, restarts, and file transfers are coordinated by the Workflow Tool. Two use cases are being pursued. First, the RDS generates regional climate simulations down to 4-km for the Chesapeake Bay region, with WRF output provided as input to more specialized models (e.g., ocean/lake, hydrological, marine biology, and air pollution). This will allow assessment of climate impact on local interests (e.g., changes in Bay water levels and temperatures, innundation, fish kills, etc.) Second, the RDS generates high-resolution hurricane simulations in the tropical North Atlantic. This use case will support Observing System Simulation Experiments (OSSEs) of dynamically-targeted lidar observations as part of the NASA Sensor Web Simulator project. Sample results will be presented at the AGU Fall Meeting.
A workflow to process 3D+time microscopy images of developing organisms and reconstruct their cell lineage

PubMed Central

Faure, Emmanuel; Savy, Thierry; Rizzi, Barbara; Melani, Camilo; Stašová, Olga; Fabrèges, Dimitri; Špir, Róbert; Hammons, Mark; Čúnderlík, Róbert; Recher, Gaëlle; Lombardot, Benoît; Duloquin, Louise; Colin, Ingrid; Kollár, Jozef; Desnoulez, Sophie; Affaticati, Pierre; Maury, Benoît; Boyreau, Adeline; Nief, Jean-Yves; Calvat, Pascal; Vernier, Philippe; Frain, Monique; Lutfalla, Georges; Kergosien, Yannick; Suret, Pierre; Remešíková, Mariana; Doursat, René; Sarti, Alessandro; Mikula, Karol; Peyriéras, Nadine; Bourgine, Paul

2016-01-01

The quantitative and systematic analysis of embryonic cell dynamics from in vivo 3D+time image data sets is a major challenge at the forefront of developmental biology. Despite recent breakthroughs in the microscopy imaging of living systems, producing an accurate cell lineage tree for any developing organism remains a difficult task. We present here the BioEmergences workflow integrating all reconstruction steps from image acquisition and processing to the interactive visualization of reconstructed data. Original mathematical methods and algorithms underlie image filtering, nucleus centre detection, nucleus and membrane segmentation, and cell tracking. They are demonstrated on zebrafish, ascidian and sea urchin embryos with stained nuclei and membranes. Subsequent validation and annotations are carried out using Mov-IT, a custom-made graphical interface. Compared with eight other software tools, our workflow achieved the best lineage score. Delivered in standalone or web service mode, BioEmergences and Mov-IT offer a unique set of tools for in silico experimental embryology. PMID:26912388
Automated metadata--final project report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Schissel, David

This report summarizes the work of the Automated Metadata, Provenance Cataloging, and Navigable Interfaces: Ensuring the Usefulness of Extreme-Scale Data Project (MPO Project) funded by the United States Department of Energy (DOE), Offices of Advanced Scientific Computing Research and Fusion Energy Sciences. Initially funded for three years starting in 2012, it was extended for 6 months with additional funding. The project was a collaboration between scientists at General Atomics, Lawrence Berkley National Laboratory (LBNL), and Massachusetts Institute of Technology (MIT). The group leveraged existing computer science technology where possible, and extended or created new capabilities where required. The MPO projectmore » was able to successfully create a suite of software tools that can be used by a scientific community to automatically document their scientific workflows. These tools were integrated into workflows for fusion energy and climate research illustrating the general applicability of the project’s toolkit. Feedback was very positive on the project’s toolkit and the value of such automatic workflow documentation to the scientific endeavor.« less
Biological Web Service Repositories Review.

PubMed

Urdidiales-Nieto, David; Navas-Delgado, Ismael; Aldana-Montes, José F

2017-05-01

Web services play a key role in bioinformatics enabling the integration of database access and analysis of algorithms. However, Web service repositories do not usually publish information on the changes made to their registered Web services. Dynamism is directly related to the changes in the repositories (services registered or unregistered) and at service level (annotation changes). Thus, users, software clients or workflow based approaches lack enough relevant information to decide when they should review or re-execute a Web service or workflow to get updated or improved results. The dynamism of the repository could be a measure for workflow developers to re-check service availability and annotation changes in the services of interest to them. This paper presents a review on the most well-known Web service repositories in the life sciences including an analysis of their dynamism. Freshness is introduced in this paper, and has been used as the measure for the dynamism of these repositories. © 2017 The Authors. Published by Wiley-VCH Verlag GmbH & Co. KGaA.
An Interactive and Comprehensive Working Environment for High-Energy Physics Software with Python and Jupyter Notebooks

NASA Astrophysics Data System (ADS)

Braun, N.; Hauth, T.; Pulvermacher, C.; Ritter, M.

2017-10-01

Today’s analyses for high-energy physics (HEP) experiments involve processing a large amount of data with highly specialized algorithms. The contemporary workflow from recorded data to final results is based on the execution of small scripts - often written in Python or ROOT macros which call complex compiled algorithms in the background - to perform fitting procedures and generate plots. During recent years interactive programming environments, such as Jupyter, became popular. Jupyter allows to develop Python-based applications, so-called notebooks, which bundle code, documentation and results, e.g. plots. Advantages over classical script-based approaches is the feature to recompute only parts of the analysis code, which allows for fast and iterative development, and a web-based user frontend, which can be hosted centrally and only requires a browser on the user side. In our novel approach, Python and Jupyter are tightly integrated into the Belle II Analysis Software Framework (basf2), currently being developed for the Belle II experiment in Japan. This allows to develop code in Jupyter notebooks for every aspect of the event simulation, reconstruction and analysis chain. These interactive notebooks can be hosted as a centralized web service via jupyterhub with docker and used by all scientists of the Belle II Collaboration. Because of its generality and encapsulation, the setup can easily be scaled to large installations.
Introducing concurrency in the Gaudi data processing framework

NASA Astrophysics Data System (ADS)

Clemencic, Marco; Hegner, Benedikt; Mato, Pere; Piparo, Danilo

2014-06-01

In the past, the increasing demands for HEP processing resources could be fulfilled by the ever increasing clock-frequencies and by distributing the work to more and more physical machines. Limitations in power consumption of both CPUs and entire data centres are bringing an end to this era of easy scalability. To get the most CPU performance per watt, future hardware will be characterised by less and less memory per processor, as well as thinner, more specialized and more numerous cores per die, and rather heterogeneous resources. To fully exploit the potential of the many cores, HEP data processing frameworks need to allow for parallel execution of reconstruction or simulation algorithms on several events simultaneously. We describe our experience in introducing concurrency related capabilities into Gaudi, a generic data processing software framework, which is currently being used by several HEP experiments, including the ATLAS and LHCb experiments at the LHC. After a description of the concurrent framework and the most relevant design choices driving its development, we describe the behaviour of the framework in a more realistic environment, using a subset of the real LHCb reconstruction workflow, and present our strategy and the used tools to validate the physics outcome of the parallel framework against the results of the present, purely sequential LHCb software. We then summarize the measurement of the code performance of the multithreaded application in terms of memory and CPU usage.
Collaborative engineering and design management for the Hobby-Eberly Telescope tracker upgrade

NASA Astrophysics Data System (ADS)

Mollison, Nicholas T.; Hayes, Richard J.; Good, John M.; Booth, John A.; Savage, Richard D.; Jackson, John R.; Rafal, Marc D.; Beno, Joseph H.

2010-07-01

The engineering and design of systems as complex as the Hobby-Eberly Telescope's* new tracker require that multiple tasks be executed in parallel and overlapping efforts. When the design of individual subsystems is distributed among multiple organizations, teams, and individuals, challenges can arise with respect to managing design productivity and coordinating successful collaborative exchanges. This paper focuses on design management issues and current practices for the tracker design portion of the Hobby-Eberly Telescope Wide Field Upgrade project. The scope of the tracker upgrade requires engineering contributions and input from numerous fields including optics, instrumentation, electromechanics, software controls engineering, and site-operations. Successful system-level integration of tracker subsystems and interfaces is critical to the telescope's ultimate performance in astronomical observation. Software and process controls for design information and workflow management have been implemented to assist the collaborative transfer of tracker design data. The tracker system architecture and selection of subsystem interfaces has also proven to be a determining factor in design task formulation and team communication needs. Interface controls and requirements change controls will be discussed, and critical team interactions are recounted (a group-participation Failure Modes and Effects Analysis [FMEA] is one of special interest). This paper will be of interest to engineers, designers, and managers engaging in multi-disciplinary and parallel engineering projects that require coordination among multiple individuals, teams, and organizations.

Inventory-based landscape-scale simulation of management effectiveness and economic feasibility with BioSum

Treesearch

Jeremy S. Fried; Larry D. Potts; Sara M. Loreno; Glenn A. Christensen; R. Jamie Barbour

2017-01-01

The Forest Inventory and Analysis (FIA)-based BioSum (Bioregional Inventory Originated Simulation Under Management) is a free policy analysis framework and workflow management software solution. It addresses complex management questions concerning forest health and vulnerability for large, multimillion acre, multiowner landscapes using FIA plot data as the initial...
Usability Testing and Workflow Analysis of the TRADOC Data Visualization Tool

DTIC Science & Technology

2012-09-01

software such as blink data, saccades, and cognitive load based on pupil contraction. Eye-tracking was only a component of the data evaluated and as...line charts were a pain to read) Yes Yes Projecting the charts directly onto the regions increased clutter on the screen and is a bad stylistic
A Virtual Environment for Process Management. A Step by Step Implementation

ERIC Educational Resources Information Center

Mayer, Sergio Valenzuela

2003-01-01

In this paper it is presented a virtual organizational environment, conceived with the integration of three computer programs: a manufacturing simulation package, an automation of businesses processes (workflows), and business intelligence (Balanced Scorecard) software. It was created as a supporting tool for teaching IE, its purpose is to give…
BpWrapper: BioPerl-based sequence and tree utilities for rapid prototyping of bioinformatics pipelines.

PubMed

Hernández, Yözen; Bernstein, Rocky; Pagan, Pedro; Vargas, Levy; McCaig, William; Ramrattan, Girish; Akther, Saymon; Larracuente, Amanda; Di, Lia; Vieira, Filipe G; Qiu, Wei-Gang

2018-03-02

Automated bioinformatics workflows are more robust, easier to maintain, and results more reproducible when built with command-line utilities than with custom-coded scripts. Command-line utilities further benefit by relieving bioinformatics developers to learn the use of, or to interact directly with, biological software libraries. There is however a lack of command-line utilities that leverage popular Open Source biological software toolkits such as BioPerl ( http://bioperl.org ) to make many of the well-designed, robust, and routinely used biological classes available for a wider base of end users. Designed as standard utilities for UNIX-family operating systems, BpWrapper makes functionality of some of the most popular BioPerl modules readily accessible on the command line to novice as well as to experienced bioinformatics practitioners. The initial release of BpWrapper includes four utilities with concise command-line user interfaces, bioseq, bioaln, biotree, and biopop, specialized for manipulation of molecular sequences, sequence alignments, phylogenetic trees, and DNA polymorphisms, respectively. Over a hundred methods are currently available as command-line options and new methods are easily incorporated. Performance of BpWrapper utilities lags that of precompiled utilities while equivalent to that of other utilities based on BioPerl. BpWrapper has been tested on BioPerl Release 1.6, Perl versions 5.10.1 to 5.25.10, and operating systems including Apple macOS, Microsoft Windows, and GNU/Linux. Release code is available from the Comprehensive Perl Archive Network (CPAN) at https://metacpan.org/pod/Bio::BPWrapper . Source code is available on GitHub at https://github.com/bioperl/p5-bpwrapper . BpWrapper improves on existing sequence utilities by following the design principles of Unix text utilities such including a concise user interface, extensive command-line options, and standard input/output for serialized operations. Further, dozens of novel methods for manipulation of sequences, alignments, and phylogenetic trees, unavailable in existing utilities (e.g., EMBOSS, Newick Utilities, and FAST), are provided. Bioinformaticians should find BpWrapper useful for rapid prototyping of workflows on the command-line without creating custom scripts for comparative genomics and other bioinformatics applications.
Proving Value in Radiology: Experience Developing and Implementing a Shareable Open Source Registry Platform Driven by Radiology Workflow.

PubMed

Gichoya, Judy Wawira; Kohli, Marc D; Haste, Paul; Abigail, Elizabeth Mills; Johnson, Matthew S

2017-10-01

Numerous initiatives are in place to support value based care in radiology including decision support using appropriateness criteria, quality metrics like radiation dose monitoring, and efforts to improve the quality of the radiology report for consumption by referring providers. These initiatives are largely data driven. Organizations can choose to purchase proprietary registry systems, pay for software as a service solution, or deploy/build their own registry systems. Traditionally, registries are created for a single purpose like radiation dosage or specific disease tracking like diabetes registry. This results in a fragmented view of the patient, and increases overhead to maintain such single purpose registry system by requiring an alternative data entry workflow and additional infrastructure to host and maintain multiple registries for different clinical needs. This complexity is magnified in the health care enterprise whereby radiology systems usually are run parallel to other clinical systems due to the different clinical workflow for radiologists. In the new era of value based care where data needs are increasing with demand for a shorter turnaround time to provide data that can be used for information and decision making, there is a critical gap to develop registries that are more adapt to the radiology workflow with minimal overhead on resources for maintenance and setup. We share our experience of developing and implementing an open source registry system for quality improvement and research in our academic institution that is driven by our radiology workflow.
Integrative analysis workflow for the structural and functional classification of C-type lectins

PubMed Central

2011-01-01

Background It is important to understand the roles of C-type lectins in the immune system due to their ubiquity and diverse range of functions in animal cells. It has been observed that currently confirmed C-type lectins share a highly conserved domain known as the C-type carbohydrate recognition domain (CRD). Using the sequence profile of the CRD, an increasing number of putative C-type lectins have been identified. Hence, it is highly needed to develop a systematic framework that enables us to elucidate their carbohydrate (glycan) recognition function, and discover their physiological and pathological roles. Results Presented herein is an integrated workflow for characterizing the sequence and structural features of novel C-type lectins. Our workflow utilizes web-based queries and available software suites to annotate features that can be found on the C-type lectin, given its amino acid sequence. At the same time, it incorporates modeling and analysis of glycans - a major class of ligands that interact with C-type lectins. Thereafter, the results are analyzed together with context-specific knowledge to filter off unlikely predictions. This allows researchers to design their subsequent experiments to confirm the functions of the C-type lectins in a systematic manner. Conclusions The efficacy and usefulness of our proposed immunoinformatics workflow was demonstrated by applying our integrated workflow to a novel C-type lectin -CLEC17A - and we report some of its possible functions that warrants further validation through wet-lab experiments. PMID:22372988
Galaxy tools and workflows for sequence analysis with applications in molecular plant pathology

PubMed Central

Grüning, Björn A.; Paszkiewicz, Konrad; Pritchard, Leighton

2013-01-01

The Galaxy Project offers the popular web browser-based platform Galaxy for running bioinformatics tools and constructing simple workflows. Here, we present a broad collection of additional Galaxy tools for large scale analysis of gene and protein sequences. The motivating research theme is the identification of specific genes of interest in a range of non-model organisms, and our central example is the identification and prediction of “effector” proteins produced by plant pathogens in order to manipulate their host plant. This functional annotation of a pathogen’s predicted capacity for virulence is a key step in translating sequence data into potential applications in plant pathology. This collection includes novel tools, and widely-used third-party tools such as NCBI BLAST+ wrapped for use within Galaxy. Individual bioinformatics software tools are typically available separately as standalone packages, or in online browser-based form. The Galaxy framework enables the user to combine these and other tools to automate organism scale analyses as workflows, without demanding familiarity with command line tools and scripting. Workflows created using Galaxy can be saved and are reusable, so may be distributed within and between research groups, facilitating the construction of a set of standardised, reusable bioinformatic protocols. The Galaxy tools and workflows described in this manuscript are open source and freely available from the Galaxy Tool Shed (http://usegalaxy.org/toolshed or http://toolshed.g2.bx.psu.edu). PMID:24109552
Combining transrectal ultrasound and CT for image-guided adaptive brachytherapy of cervical cancer: Proof of concept.

PubMed

Nesvacil, Nicole; Schmid, Maximilian P; Pötter, Richard; Kronreif, Gernot; Kirisits, Christian

To investigate the feasibility of a treatment planning workflow for three-dimensional image-guided cervix cancer brachytherapy, combining volumetric transrectal ultrasound (TRUS) for target definition with CT for dose optimization to organs at risk (OARs), for settings with no access to MRI. A workflow for TRUS/CT-based volumetric treatment planning was developed, based on a customized system including ultrasound probe, stepper unit, and software for image volume acquisition. A full TRUS/CT-based workflow was simulated in a clinical case and compared with MR- or CT-only delineation. High-risk clinical target volume was delineated on TRUS, and OARs were delineated on CT. Manually defined tandem/ring applicator positions on TRUS and CT were used as a reference for rigid registration of the image volumes. Treatment plan optimization for TRUS target and CT organ volumes was performed and compared to MRI and CT target contours. TRUS/CT-based contouring, applicator reconstruction, image fusion, and treatment planning were feasible, and the full workflow could be successfully demonstrated. The TRUS/CT plan fulfilled all clinical planning aims. Dose-volume histogram evaluation of the TRUS/CT-optimized plan (high-risk clinical target volume D 90 , OARs D 2cm³ for) on different image modalities showed good agreement between dose values reported for TRUS/CT and MRI-only reference contours and large deviations for CT-only target parameters. A TRUS/CT-based workflow for full three-dimensional image-guided cervix brachytherapy treatment planning seems feasible and may be clinically comparable to MRI-based treatment planning. Further development to solve challenges with applicator definition in the TRUS volume is required before systematic applicability of this workflow. Copyright Â© 2016 American Brachytherapy Society. Published by Elsevier Inc. All rights reserved.
Secure Encapsulation and Publication of Biological Services in the Cloud Computing Environment

PubMed Central

Zhang, Weizhe; Wang, Xuehui; Lu, Bo; Kim, Tai-hoon

2013-01-01

Secure encapsulation and publication for bioinformatics software products based on web service are presented, and the basic function of biological information is realized in the cloud computing environment. In the encapsulation phase, the workflow and function of bioinformatics software are conducted, the encapsulation interfaces are designed, and the runtime interaction between users and computers is simulated. In the publication phase, the execution and management mechanisms and principles of the GRAM components are analyzed. The functions such as remote user job submission and job status query are implemented by using the GRAM components. The services of bioinformatics software are published to remote users. Finally the basic prototype system of the biological cloud is achieved. PMID:24078906
Secure encapsulation and publication of biological services in the cloud computing environment.

PubMed

Zhang, Weizhe; Wang, Xuehui; Lu, Bo; Kim, Tai-hoon

2013-01-01

Secure encapsulation and publication for bioinformatics software products based on web service are presented, and the basic function of biological information is realized in the cloud computing environment. In the encapsulation phase, the workflow and function of bioinformatics software are conducted, the encapsulation interfaces are designed, and the runtime interaction between users and computers is simulated. In the publication phase, the execution and management mechanisms and principles of the GRAM components are analyzed. The functions such as remote user job submission and job status query are implemented by using the GRAM components. The services of bioinformatics software are published to remote users. Finally the basic prototype system of the biological cloud is achieved.
Enabling a systems biology knowledgebase with gaggle and firegoose

DOE Office of Scientific and Technical Information (OSTI.GOV)

Baliga, Nitin S.

The overall goal of this project was to extend the existing Gaggle and Firegoose systems to develop an open-source technology that runs over the web and links desktop applications with many databases and software applications. This technology would enable researchers to incorporate workflows for data analysis that can be executed from this interface to other online applications. The four specific aims were to (1) provide one-click mapping of genes, proteins, and complexes across databases and species; (2) enable multiple simultaneous workflows; (3) expand sophisticated data analysis for online resources; and enhance open-source development of the Gaggle-Firegoose infrastructure. Gaggle is anmore » open-source Java software system that integrates existing bioinformatics programs and data sources into a user-friendly, extensible environment to allow interactive exploration, visualization, and analysis of systems biology data. Firegoose is an extension to the Mozilla Firefox web browser that enables data transfer between websites and desktop tools including Gaggle. In the last phase of this funding period, we have made substantial progress on development and application of the Gaggle integration framework. We implemented the workspace to the Network Portal. Users can capture data from Firegoose and save them to the workspace. Users can create workflows to start multiple software components programmatically and pass data between them. Results of analysis can be saved to the cloud so that they can be easily restored on any machine. We also developed the Gaggle Chrome Goose, a plugin for the Google Chrome browser in tandem with an opencpu server in the Amazon EC2 cloud. This allows users to interactively perform data analysis on a single web page using the R packages deployed on the opencpu server. The cloud-based framework facilitates collaboration between researchers from multiple organizations. We have made a number of enhancements to the cmonkey2 application to enable and improve the integration within different environments, and we have created a new tools pipeline for generating EGRIN2 models in a largely automated way.« less
VOLCWORKS: A suite for optimization of hazards mapping

NASA Astrophysics Data System (ADS)

Delgado Granados, H.; Ramírez Guzmán, R.; Villareal Benítez, J. L.; García Sánchez, T.

2012-04-01

Making hazards maps is a process linking basic science, applied science and engineering for the benefit of the society. The methodologies for hazards maps' construction have evolved enormously together with the tools that allow the forecasting of the behavior of the materials produced by different eruptive processes. However, in spite of the development of tools and evolution of methodologies, the utility of hazards maps has not changed: prevention and mitigation of volcanic disasters. Integration of different tools for simulation of different processes for a single volcano is a challenge to be solved using software tools including processing, simulation and visualization techniques, and data structures in order to build up a suit that helps in the construction process starting from the integration of the geological data, simulations and simplification of the output to design a hazards/scenario map. Scientific visualization is a powerful tool to explore and gain insight into complex data from instruments and simulations. The workflow from data collection, quality control and preparation for simulations, to achieve visual and appropriate presentation is a process that is usually disconnected, using in most of the cases different applications for each of the needed processes, because it requires many tools that are not built for the solution of a specific problem, or were developed by research groups to solve particular tasks, but disconnected. In volcanology, due to its complexity, groups typically examine only one aspect of the phenomenon: ash dispersal, laharic flows, pyroclastic flows, lava flows, and ballistic projectile ejection, among others. However, when studying the hazards associated to the activity of a volcano, it is important to analyze all the processes comprehensively, especially for communication of results to the end users: decision makers and planners. In order to solve this problem and connect different parts of a workflow we are developing the suite VOLCWORKS, whose principle is to have a flexible-implementation architecture allowing rapid development of software to the extent specified by the needs including calculations, routines, or algorithms, both new and through redesign of available software in the volcanological community, but especially allowing to include new knowledge, models or software transferring them to software modules. The design is component-oriented platform, which allows incorporating particular solutions (routines, simulations, etc.), which can be concatenated for integration or highlighting information. The platform includes a graphical interface with capabilities for working in different visual environments that can be focused to the particular work of different types of users (researchers, lecturers, students, etc.). This platform aims to integrate simulation and visualization phases, incorporating proven tools (now isolated). VOLCWORKS can be used under different operating systems (Windows, Linux and Mac OS) and fit the context of use automatically and at runtime: in both tasks and their sequence, such as utilization of hardware resources (CPU, GPU, special monitors, etc.). The application has the ability to run on a laptop or even in a virtual reality room with access to supercomputers.
The impact of software quality characteristics on healthcare outcome: a literature review.

PubMed

Aghazadeh, Sakineh; Pirnejad, Habibollah; Moradkhani, Alireza; Aliev, Alvosat

2014-01-01

The aim of this study was to discover the effect of software quality characteristics on healthcare quality and efficiency indicators. Through a systematic literature review, we selected and analyzed 37 original research papers to investigate the impact of the software indicators (coming from the standard ISO 9126 quality characteristics and sub-characteristics) on some of healthcare important outcome indicators and finally ranked these software indicators. The results showed that the software characteristics usability, reliability and efficiency were mostly favored in the studies, indicating their importance. On the other hand, user satisfaction, quality of patient care, clinical workflow efficiency, providers' communication and information exchange, patient satisfaction and care costs were among the healthcare outcome indicators frequently evaluated in relation to the mentioned software characteristics. Regression Logistic Method was the most common assessment methodology, and Confirmatory Factor Analysis and Structural Equation Modeling were performed to test the structural model's fit. The software characteristics were considered to impact the healthcare outcome indicators through other intermediate factors (variables).
48 CFR 227.7205 - Contracts for special works.

Code of Federal Regulations, 2014 CFR

2014-10-01

... Computer Software and Computer Software Documentation 227.7205 Contracts for special works. (a) Use the... a specific need to control the distribution of computer software or computer software documentation..., modification, reproduction, release, performance, display, or disclosure of such software or documentation. Use...
48 CFR 227.7205 - Contracts for special works.

Code of Federal Regulations, 2010 CFR

2010-10-01

... Computer Software and Computer Software Documentation 227.7205 Contracts for special works. (a) Use the... a specific need to control the distribution of computer software or computer software documentation..., modification, reproduction, release, performance, display, or disclosure of such software or documentation. Use...
Data processing workflows from low-cost digital survey to various applications: three case studies of Chinese historic architecture

NASA Astrophysics Data System (ADS)

Sun, Z.; Cao, Y. K.

2015-08-01

The paper focuses on the versatility of data processing workflows ranging from BIM-based survey to structural analysis and reverse modeling. In China nowadays, a large number of historic architecture are in need of restoration, reinforcement and renovation. But the architects are not prepared for the conversion from the booming AEC industry to architectural preservation. As surveyors working with architects in such projects, we have to develop efficient low-cost digital survey workflow robust to various types of architecture, and to process the captured data for architects. Although laser scanning yields high accuracy in architectural heritage documentation and the workflow is quite straightforward, the cost and portability hinder it from being used in projects where budget and efficiency are of prime concern. We integrate Structure from Motion techniques with UAV and total station in data acquisition. The captured data is processed for various purposes illustrated with three case studies: the first one is as-built BIM for a historic building based on registered point clouds according to Ground Control Points; The second one concerns structural analysis for a damaged bridge using Finite Element Analysis software; The last one relates to parametric automated feature extraction from captured point clouds for reverse modeling and fabrication.
The BioExtract Server: a web-based bioinformatic workflow platform

PubMed Central

Lushbough, Carol M.; Jennewein, Douglas M.; Brendel, Volker P.

2011-01-01

The BioExtract Server (bioextract.org) is an open, web-based system designed to aid researchers in the analysis of genomic data by providing a platform for the creation of bioinformatic workflows. Scientific workflows are created within the system by recording tasks performed by the user. These tasks may include querying multiple, distributed data sources, saving query results as searchable data extracts, and executing local and web-accessible analytic tools. The series of recorded tasks can then be saved as a reproducible, sharable workflow available for subsequent execution with the original or modified inputs and parameter settings. Integrated data resources include interfaces to the National Center for Biotechnology Information (NCBI) nucleotide and protein databases, the European Molecular Biology Laboratory (EMBL-Bank) non-redundant nucleotide database, the Universal Protein Resource (UniProt), and the UniProt Reference Clusters (UniRef) database. The system offers access to numerous preinstalled, curated analytic tools and also provides researchers with the option of selecting computational tools from a large list of web services including the European Molecular Biology Open Software Suite (EMBOSS), BioMoby, and the Kyoto Encyclopedia of Genes and Genomes (KEGG). The system further allows users to integrate local command line tools residing on their own computers through a client-side Java applet. PMID:21546552
Differential gene expression in the siphonophore Nanomia bijuga (Cnidaria) assessed with multiple next-generation sequencing workflows.

PubMed

Siebert, Stefan; Robinson, Mark D; Tintori, Sophia C; Goetz, Freya; Helm, Rebecca R; Smith, Stephen A; Shaner, Nathan; Haddock, Steven H D; Dunn, Casey W

2011-01-01

We investigated differential gene expression between functionally specialized feeding polyps and swimming medusae in the siphonophore Nanomia bijuga (Cnidaria) with a hybrid long-read/short-read sequencing strategy. We assembled a set of partial gene reference sequences from long-read data (Roche 454), and generated short-read sequences from replicated tissue samples that were mapped to the references to quantify expression. We collected and compared expression data with three short-read expression workflows that differ in sample preparation, sequencing technology, and mapping tools. These workflows were Illumina mRNA-Seq, which generates sequence reads from random locations along each transcript, and two tag-based approaches, SOLiD SAGE and Helicos DGE, which generate reads from particular tag sites. Differences in expression results across workflows were mostly due to the differential impact of missing data in the partial reference sequences. When all 454-derived gene reference sequences were considered, Illumina mRNA-Seq detected more than twice as many differentially expressed (DE) reference sequences as the tag-based workflows. This discrepancy was largely due to missing tag sites in the partial reference that led to false negatives in the tag-based workflows. When only the subset of reference sequences that unambiguously have tag sites was considered, we found broad congruence across workflows, and they all identified a similar set of DE sequences. Our results are promising in several regards for gene expression studies in non-model organisms. First, we demonstrate that a hybrid long-read/short-read sequencing strategy is an effective way to collect gene expression data when an annotated genome sequence is not available. Second, our replicated sampling indicates that expression profiles are highly consistent across field-collected animals in this case. Third, the impacts of partial reference sequences on the ability to detect DE can be mitigated through workflow choice and deeper reference sequencing.
Differential Gene Expression in the Siphonophore Nanomia bijuga (Cnidaria) Assessed with Multiple Next-Generation Sequencing Workflows

PubMed Central

Siebert, Stefan; Robinson, Mark D.; Tintori, Sophia C.; Goetz, Freya; Helm, Rebecca R.; Smith, Stephen A.; Shaner, Nathan; Haddock, Steven H. D.; Dunn, Casey W.

2011-01-01

We investigated differential gene expression between functionally specialized feeding polyps and swimming medusae in the siphonophore Nanomia bijuga (Cnidaria) with a hybrid long-read/short-read sequencing strategy. We assembled a set of partial gene reference sequences from long-read data (Roche 454), and generated short-read sequences from replicated tissue samples that were mapped to the references to quantify expression. We collected and compared expression data with three short-read expression workflows that differ in sample preparation, sequencing technology, and mapping tools. These workflows were Illumina mRNA-Seq, which generates sequence reads from random locations along each transcript, and two tag-based approaches, SOLiD SAGE and Helicos DGE, which generate reads from particular tag sites. Differences in expression results across workflows were mostly due to the differential impact of missing data in the partial reference sequences. When all 454-derived gene reference sequences were considered, Illumina mRNA-Seq detected more than twice as many differentially expressed (DE) reference sequences as the tag-based workflows. This discrepancy was largely due to missing tag sites in the partial reference that led to false negatives in the tag-based workflows. When only the subset of reference sequences that unambiguously have tag sites was considered, we found broad congruence across workflows, and they all identified a similar set of DE sequences. Our results are promising in several regards for gene expression studies in non-model organisms. First, we demonstrate that a hybrid long-read/short-read sequencing strategy is an effective way to collect gene expression data when an annotated genome sequence is not available. Second, our replicated sampling indicates that expression profiles are highly consistent across field-collected animals in this case. Third, the impacts of partial reference sequences on the ability to detect DE can be mitigated through workflow choice and deeper reference sequencing. PMID:21829563
Integration, acceptance testing, and clinical operation of the Medical Information, Communication and Archive System, phase II.

PubMed

Smith, E M; Wandtke, J; Robinson, A

1999-05-01

The Medical Information, Communication and Archive System (MICAS) is a multivendor incremental approach to picture archiving and communications system (PACS). It is a multimodality integrated image management system that is seamlessly integrated with the radiology information system (RIS). Phase II enhancements of MICAS include a permanent archive, automated workflow, study caches, Microsoft (Redmond, WA) Windows NT diagnostic workstations with all components adhering to Digital Information Communications in Medicine (DICOM) standards. MICAS is designed as an enterprise-wide PACS to provide images and reports throughout the Strong Health healthcare network. Phase II includes the addition of a Cemax-Icon (Fremont, CA) archive, PACS broker (Mitra, Waterloo, Canada), an interface (IDX PACSlink, Burlington, VT) to the RIS (IDXrad) plus the conversion of the UNIX-based redundant array of inexpensive disks (RAID) 5 temporary archives in phase I to NT-based RAID 0 DICOM modality-specific study caches (ImageLabs, Bedford, MA). The phase I acquisition engines and workflow management software was uninstalled and the Cemax archive manager (AM) assumed these functions. The existing ImageLabs UNIX-based viewing software was enhanced and converted to an NT-based DICOM viewer. Installation of phase II hardware and software and integration with existing components began in July 1998. Phase II of MICAS demonstrates that a multivendor open-system incremental approach to PACS is feasible, cost-effective, and has significant advantages over a single-vendor implementation.

A reliable computational workflow for the selection of optimal screening libraries.

PubMed

Gilad, Yocheved; Nadassy, Katalin; Senderowitz, Hanoch

2015-01-01

The experimental screening of compound collections is a common starting point in many drug discovery projects. Successes of such screening campaigns critically depend on the quality of the screened library. Many libraries are currently available from different vendors yet the selection of the optimal screening library for a specific project is challenging. We have devised a novel workflow for the rational selection of project-specific screening libraries. The workflow accepts as input a set of virtual candidate libraries and applies the following steps to each library: (1) data curation; (2) assessment of ADME/T profile; (3) assessment of the number of promiscuous binders/frequent HTS hitters; (4) assessment of internal diversity; (5) assessment of similarity to known active compound(s) (optional); (6) assessment of similarity to in-house or otherwise accessible compound collections (optional). For ADME/T profiling, Lipinski's and Veber's rule-based filters were implemented and a new blood brain barrier permeation model was developed and validated (85 and 74 % success rate for training set and test set, respectively). Diversity and similarity descriptors which demonstrated best performances in terms of their ability to select either diverse or focused sets of compounds from three databases (Drug Bank, CMC and CHEMBL) were identified and used for diversity and similarity assessments. The workflow was used to analyze nine common screening libraries available from six vendors. The results of this analysis are reported for each library providing an assessment of its quality. Furthermore, a consensus approach was developed to combine the results of these analyses into a single score for selecting the optimal library under different scenarios. We have devised and tested a new workflow for the rational selection of screening libraries under different scenarios. The current workflow was implemented using the Pipeline Pilot software yet due to the usage of generic components, it can be easily adapted and reproduced by computational groups interested in rational selection of screening libraries. Furthermore, the workflow could be readily modified to include additional components. This workflow has been routinely used in our laboratory for the selection of libraries in multiple projects and consistently selects libraries which are well balanced across multiple parameters.Graphical abstract.
CellSegm - a MATLAB toolbox for high-throughput 3D cell segmentation

PubMed Central

2013-01-01

The application of fluorescence microscopy in cell biology often generates a huge amount of imaging data. Automated whole cell segmentation of such data enables the detection and analysis of individual cells, where a manual delineation is often time consuming, or practically not feasible. Furthermore, compared to manual analysis, automation normally has a higher degree of reproducibility. CellSegm, the software presented in this work, is a Matlab based command line software toolbox providing an automated whole cell segmentation of images showing surface stained cells, acquired by fluorescence microscopy. It has options for both fully automated and semi-automated cell segmentation. Major algorithmic steps are: (i) smoothing, (ii) Hessian-based ridge enhancement, (iii) marker-controlled watershed segmentation, and (iv) feature-based classfication of cell candidates. Using a wide selection of image recordings and code snippets, we demonstrate that CellSegm has the ability to detect various types of surface stained cells in 3D. After detection and outlining of individual cells, the cell candidates can be subject to software based analysis, specified and programmed by the end-user, or they can be analyzed by other software tools. A segmentation of tissue samples with appropriate characteristics is also shown to be resolvable in CellSegm. The command-line interface of CellSegm facilitates scripting of the separate tools, all implemented in Matlab, offering a high degree of flexibility and tailored workflows for the end-user. The modularity and scripting capabilities of CellSegm enable automated workflows and quantitative analysis of microscopic data, suited for high-throughput image based screening. PMID:23938087
Neurophysiological analytics for all! Free open-source software tools for documenting, analyzing, visualizing, and sharing using electronic notebooks.

PubMed

Rosenberg, David M; Horn, Charles C

2016-08-01

Neurophysiology requires an extensive workflow of information analysis routines, which often includes incompatible proprietary software, introducing limitations based on financial costs, transfer of data between platforms, and the ability to share. An ecosystem of free open-source software exists to fill these gaps, including thousands of analysis and plotting packages written in Python and R, which can be implemented in a sharable and reproducible format, such as the Jupyter electronic notebook. This tool chain can largely replace current routines by importing data, producing analyses, and generating publication-quality graphics. An electronic notebook like Jupyter allows these analyses, along with documentation of procedures, to display locally or remotely in an internet browser, which can be saved as an HTML, PDF, or other file format for sharing with team members and the scientific community. The present report illustrates these methods using data from electrophysiological recordings of the musk shrew vagus-a model system to investigate gut-brain communication, for example, in cancer chemotherapy-induced emesis. We show methods for spike sorting (including statistical validation), spike train analysis, and analysis of compound action potentials in notebooks. Raw data and code are available from notebooks in data supplements or from an executable online version, which replicates all analyses without installing software-an implementation of reproducible research. This demonstrates the promise of combining disparate analyses into one platform, along with the ease of sharing this work. In an age of diverse, high-throughput computational workflows, this methodology can increase efficiency, transparency, and the collaborative potential of neurophysiological research. Copyright © 2016 the American Physiological Society.
CellSegm - a MATLAB toolbox for high-throughput 3D cell segmentation.

PubMed

Hodneland, Erlend; Kögel, Tanja; Frei, Dominik Michael; Gerdes, Hans-Hermann; Lundervold, Arvid

2013-08-09

: The application of fluorescence microscopy in cell biology often generates a huge amount of imaging data. Automated whole cell segmentation of such data enables the detection and analysis of individual cells, where a manual delineation is often time consuming, or practically not feasible. Furthermore, compared to manual analysis, automation normally has a higher degree of reproducibility. CellSegm, the software presented in this work, is a Matlab based command line software toolbox providing an automated whole cell segmentation of images showing surface stained cells, acquired by fluorescence microscopy. It has options for both fully automated and semi-automated cell segmentation. Major algorithmic steps are: (i) smoothing, (ii) Hessian-based ridge enhancement, (iii) marker-controlled watershed segmentation, and (iv) feature-based classfication of cell candidates. Using a wide selection of image recordings and code snippets, we demonstrate that CellSegm has the ability to detect various types of surface stained cells in 3D. After detection and outlining of individual cells, the cell candidates can be subject to software based analysis, specified and programmed by the end-user, or they can be analyzed by other software tools. A segmentation of tissue samples with appropriate characteristics is also shown to be resolvable in CellSegm. The command-line interface of CellSegm facilitates scripting of the separate tools, all implemented in Matlab, offering a high degree of flexibility and tailored workflows for the end-user. The modularity and scripting capabilities of CellSegm enable automated workflows and quantitative analysis of microscopic data, suited for high-throughput image based screening.
Implementation and Challenges of Direct Acoustic Dosing into Cell-Based Assays.

PubMed

Roberts, Karen; Callis, Rowena; Ikeda, Tim; Paunovic, Amalia; Simpson, Carly; Tang, Eric; Turton, Nick; Walker, Graeme

2016-02-01

Since the adoption of Labcyte Echo Acoustic Droplet Ejection (ADE) technology by AstraZeneca in 2005, ADE has become the preferred method for compound dosing into both biochemical and cell-based assays across AstraZeneca research and development globally. The initial implementation of Echos and the direct dosing workflow provided AstraZeneca with a unique set of challenges. In this article, we outline how direct Echo dosing has evolved over the past decade in AstraZeneca. We describe the practical challenges of applying ADE technology to 96-well, 384-well, and 1536-well assays and how AstraZeneca developed and applied software and robotic solutions to generate fully automated and effective cell-based assay workflows. © 2015 Society for Laboratory Automation and Screening.
Music Software for Special Needs.

ERIC Educational Resources Information Center

McCord, Kimberly

2001-01-01

Discusses the use of computer software for students with special needs in the music classroom. Focuses on software programs that are appropriate for children with special needs such as: "Musicshop,""Band-in-a-Box,""Rock Rap'n Roll,""Music Mania,""Music Ace" and "Music Ace 2," and "Children's Songbook." (CMK)
Digital Workflows for Restoration and Management of the Museum Affandi - a Case Study in Challenging Circumstances

NASA Astrophysics Data System (ADS)

Herbig, U.; Styhler-Aydın, G.; Grandits, D.; Stampfer, L.; Pont, U.; Mayer, I.

2017-08-01

The appropriate restoration of architectural heritage needs a careful and comprehensive documentation of the existing structures, which even elaborates, if the function of the building needs special attention, like in museums. In a collaborative project between the Universitas Gadjah Mada, Yogyakarta, Indonesia and two universities in Austria (TU Wien and the Danube University Krems) a restoration and adaptation concept of the Affandi Museum in Yogyakarta is currently in progress. It provides a perfect case study for the development of a workflow to combine data from a building survey, architectural research, indoor climate measurements and the documentation of artwork in a challenging environment, from hot and humid tropical climate to continuous threads by natural hazards like earthquakes or volcanic eruptions. The Affandi Museum houses the collection of Affandi, who is considered to be Indonesia's foremost Expressionist painter and partly designed and constructed the museum by himself. With the spirit of the artist still perceptible in the complex the Affandi Museum is an important part of the Indonesian cultural heritage. Thus its preservation takes special attention and adds to the complexity of the development of a monitoring and maintenance concept. This paper describes the ongoing development of an approach to a workflow from the measurement and research of the objects, both architectural and artwork, to the semantically enriched BIM Model as the base for a sustainable monitoring tool for the Affandi Museum.
Using the Git Software Tool on the Peregrine System | High-Performance

Science.gov Websites

branch workflow. Create a local branch called "experimental" based on the current master... git branch experimental Use your branch (start working on that experimental branch....) git checkout experimental git pull origin experimental # work, work, work, commit.... Send local branch to the repo git push
[Change in process management by implementing RIS, PACS and flat-panel detectors].

PubMed

Imhof, H; Dirisamer, A; Fischer, H; Grampp, S; Heiner, L; Kaderk, M; Krestan, C; Kainberger, F

2002-05-01

Implementation of radiological information systems (RIS) and picture archiving and communicating systems (PACS) results in significant changes of workflow in a radiological department. Additional connection with flat-panel detectors leads to a shortening of the work process. RIS and PACS implementation alone reduces the complete workflow by 21-80%. With flatpanel technology the image production process is further shortened by 25-30%. The workflow-steps are changed from original 17-12 with the implementation of RIS and PACS and to 5 with the integrated use of flatpanels. This clearly recognizable advantages in the workflow need an according financial investment. Several studies could show that the capitalisation-factor calculated over eight years is positive, with a gain range between 5-25%. Whether the additional implementation of flatpanel detectors results also in a positive capitalisation over the years, cannot be estimated exactly, at the moment, because the experiences are too short. Particularly critical are the interfaces, which needs a constant quality control. Our flatpanel detector-system is fixed, special images--as we have them in about 3-5% of all cases--need still conventional filmscreen or phosphorplate-systems. Full-spine and long-leg examinations cannot be performed with sufficient exactness. Without any questions implementation of integrated RIS, PACS and flatpanel detector-system needs excellent training of the employees, because of the changes in workflow etc. The main profits of such an integrated implementation are an increase in quality in image and report datas, easier handling--there are almost no more cassettes necessary--and excessive shortening of workflow.
Managing a Multisite Academic-Private Radiology Practice Reading Environment: Impact of IT Downtimes on Enterprise Efficiency.

PubMed

Becker, Murray; Goldszal, Alberto; Detal, Julie; Gronlund-Jacob, Judith; Epstein, Robert

2015-06-01

The aim of this study was to assess whether the complex radiology IT infrastructures needed for large, geographically diversified, radiology practices are inherently stable with respect to system downtimes, and to characterize the nature of the downtimes to better understand their impact on radiology department workflow. All radiology IT unplanned downtimes over a 12-month period in a hybrid academic-private practice that performs all interpretations in-house (no commercial "nighthawk" services) for approximately 900,000 studies per year, originating at 6 hospitals, 10 outpatient imaging centers, and multiple low-volume off-hours sites, were logged and characterized using 5 downtime metrics: duration, etiology, failure type, extent, and severity. In 12 consecutive months, 117 unplanned downtimes occurred with the following characteristics: duration: median time = 3.5 hours with 34% <1.5 hours and 30% >12 hours; etiology: 87% were due to software malfunctions, and 13% to hardware malfunctions; failure type: 88% were transient component failures, 12% were complete component failures; extent: all sites experienced downtimes, but downtimes were always localized to a subset of sites, and no system-wide downtimes occurred; severity (impact on radiologist workflow): 47% had minimal impact, 50% moderate impact, and 3% severe impact. In the complex radiology IT system that was studied, downtimes were common; they were usually a result of transient software malfunctions; the geographic extent was always localized rather than system wide; and most often, the impacts on radiologist workflow were modest. Copyright © 2015 American College of Radiology. Published by Elsevier Inc. All rights reserved.
Visualizing Cross-sectional Data in a Real-World Context

NASA Astrophysics Data System (ADS)

Van Noten, K.; Lecocq, T.

2016-12-01

If you could fly around your research results in three dimensions, wouldn't you like to do it? Visualizing research results properly during scientific presentations already does half the job of informing the public on the geographic framework of your research. Many scientists use the Google Earth™ mapping service (V7.1.2.2041) because it's a great interactive mapping tool for assigning geographic coordinates to individual data points, localizing a research area, and draping maps of results over Earth's surface for 3D visualization. However, visualizations of research results in vertical cross-sections are often not shown simultaneously with the maps in Google Earth. A few tutorials and programs to display cross-sectional data in Google Earth do exist, and the workflow is rather simple. By importing a cross-sectional figure into in the open software SketchUp Make [Trimble Navigation Limited, 2016], any spatial model can be exported to a vertical figure in Google Earth. In this presentation a clear workflow/tutorial is presented how to image cross-sections manually in Google Earth. No software skills, nor any programming codes are required. It is very easy to use, offers great possibilities for teaching and allows fast figure manipulation in Google Earth. The full workflow can be found in "Van Noten, K. 2016. Visualizing Cross-Sectional Data in a Real-World Context. EOS, Transactions AGU, 97, 16-19".The video tutorial can be found here: https://www.youtube.com/watch?v=Tr8LwFJ4RYU&Figure: Cross-sectional Research Examples Illustrated in Google Earth
Global Relative Quantification with Liquid Chromatography–Matrix-assisted Laser Desorption Ionization Time-of-flight (LC-MALDI-TOF)—Cross–validation with LTQ-Orbitrap Proves Reliability and Reveals Complementary Ionization Preferences*

PubMed Central

Hessling, Bernd; Büttner, Knut; Hecker, Michael; Becher, Dörte

2013-01-01

Quantitative LC-MALDI is an underrepresented method, especially in large-scale experiments. The additional fractionation step that is needed for most MALDI-TOF-TOF instruments, the comparatively long analysis time, and the very limited number of established software tools for the data analysis render LC-MALDI a niche application for large quantitative analyses beside the widespread LC–electrospray ionization workflows. Here, we used LC-MALDI in a relative quantification analysis of Staphylococcus aureus for the first time on a proteome-wide scale. Samples were analyzed in parallel with an LTQ-Orbitrap, which allowed cross-validation with a well-established workflow. With nearly 850 proteins identified in the cytosolic fraction and quantitative data for more than 550 proteins obtained with the MASCOT Distiller software, we were able to prove that LC-MALDI is able to process highly complex samples. The good correlation of quantities determined via this method and the LTQ-Orbitrap workflow confirmed the high reliability of our LC-MALDI approach for global quantification analysis. Because the existing literature reports differences for MALDI and electrospray ionization preferences and the respective experimental work was limited by technical or methodological constraints, we systematically compared biochemical attributes of peptides identified with either instrument. This genome-wide, comprehensive study revealed biases toward certain peptide properties for both MALDI-TOF-TOF- and LTQ-Orbitrap-based approaches. These biases are based on almost 13,000 peptides and result in a general complementarity of the two approaches that should be exploited in future experiments. PMID:23788530
Global relative quantification with liquid chromatography-matrix-assisted laser desorption ionization time-of-flight (LC-MALDI-TOF)--cross-validation with LTQ-Orbitrap proves reliability and reveals complementary ionization preferences.

PubMed

Hessling, Bernd; Büttner, Knut; Hecker, Michael; Becher, Dörte

2013-10-01

Quantitative LC-MALDI is an underrepresented method, especially in large-scale experiments. The additional fractionation step that is needed for most MALDI-TOF-TOF instruments, the comparatively long analysis time, and the very limited number of established software tools for the data analysis render LC-MALDI a niche application for large quantitative analyses beside the widespread LC-electrospray ionization workflows. Here, we used LC-MALDI in a relative quantification analysis of Staphylococcus aureus for the first time on a proteome-wide scale. Samples were analyzed in parallel with an LTQ-Orbitrap, which allowed cross-validation with a well-established workflow. With nearly 850 proteins identified in the cytosolic fraction and quantitative data for more than 550 proteins obtained with the MASCOT Distiller software, we were able to prove that LC-MALDI is able to process highly complex samples. The good correlation of quantities determined via this method and the LTQ-Orbitrap workflow confirmed the high reliability of our LC-MALDI approach for global quantification analysis. Because the existing literature reports differences for MALDI and electrospray ionization preferences and the respective experimental work was limited by technical or methodological constraints, we systematically compared biochemical attributes of peptides identified with either instrument. This genome-wide, comprehensive study revealed biases toward certain peptide properties for both MALDI-TOF-TOF- and LTQ-Orbitrap-based approaches. These biases are based on almost 13,000 peptides and result in a general complementarity of the two approaches that should be exploited in future experiments.
An Architecture for Automated Fire Detection Early Warning System Based on Geoprocessing Service Composition

NASA Astrophysics Data System (ADS)

Samadzadegan, F.; Saber, M.; Zahmatkesh, H.; Joze Ghazi Khanlou, H.

2013-09-01

Rapidly discovering, sharing, integrating and applying geospatial information are key issues in the domain of emergency response and disaster management. Due to the distributed nature of data and processing resources in disaster management, utilizing a Service Oriented Architecture (SOA) to take advantages of workflow of services provides an efficient, flexible and reliable implementations to encounter different hazardous situation. The implementation specification of the Web Processing Service (WPS) has guided geospatial data processing in a Service Oriented Architecture (SOA) platform to become a widely accepted solution for processing remotely sensed data on the web. This paper presents an architecture design based on OGC web services for automated workflow for acquisition, processing remotely sensed data, detecting fire and sending notifications to the authorities. A basic architecture and its building blocks for an automated fire detection early warning system are represented using web-based processing of remote sensing imageries utilizing MODIS data. A composition of WPS processes is proposed as a WPS service to extract fire events from MODIS data. Subsequently, the paper highlights the role of WPS as a middleware interface in the domain of geospatial web service technology that can be used to invoke a large variety of geoprocessing operations and chaining of other web services as an engine of composition. The applicability of proposed architecture by a real world fire event detection and notification use case is evaluated. A GeoPortal client with open-source software was developed to manage data, metadata, processes, and authorities. Investigating feasibility and benefits of proposed framework shows that this framework can be used for wide area of geospatial applications specially disaster management and environmental monitoring.
Parallel workflow manager for non-parallel bioinformatic applications to solve large-scale biological problems on a supercomputer.

PubMed

Suplatov, Dmitry; Popova, Nina; Zhumatiy, Sergey; Voevodin, Vladimir; Švedas, Vytas

2016-04-01

Rapid expansion of online resources providing access to genomic, structural, and functional information associated with biological macromolecules opens an opportunity to gain a deeper understanding of the mechanisms of biological processes due to systematic analysis of large datasets. This, however, requires novel strategies to optimally utilize computer processing power. Some methods in bioinformatics and molecular modeling require extensive computational resources. Other algorithms have fast implementations which take at most several hours to analyze a common input on a modern desktop station, however, due to multiple invocations for a large number of subtasks the full task requires a significant computing power. Therefore, an efficient computational solution to large-scale biological problems requires both a wise parallel implementation of resource-hungry methods as well as a smart workflow to manage multiple invocations of relatively fast algorithms. In this work, a new computer software mpiWrapper has been developed to accommodate non-parallel implementations of scientific algorithms within the parallel supercomputing environment. The Message Passing Interface has been implemented to exchange information between nodes. Two specialized threads - one for task management and communication, and another for subtask execution - are invoked on each processing unit to avoid deadlock while using blocking calls to MPI. The mpiWrapper can be used to launch all conventional Linux applications without the need to modify their original source codes and supports resubmission of subtasks on node failure. We show that this approach can be used to process huge amounts of biological data efficiently by running non-parallel programs in parallel mode on a supercomputer. The C++ source code and documentation are available from http://biokinet.belozersky.msu.ru/mpiWrapper .
Acquiring Data by Mining the Past: Pairing Communities with Environmental Monitoring Methods through Open Online Collaborative Replication

NASA Astrophysics Data System (ADS)

Lippincott, M.; Lewis, E. S.; Gehrke, G. E.; Wise, A.; Pyle, S.; Sinatra, V.; Bland, G.; Bydlowski, D.; Henry, A.; Gilberts, P. A.

2016-12-01

Community groups are interested in low-cost sensors to monitor their environment. However, many new commercial sensors are unknown devices without peer-reviewed evaluations of data quality or pathways to regulatory acceptance, and the time to achieve these outcomes may be beyond a community's patience and attention. Rather than developing a device from scratch or validating a new commercial product, a workflow is presented whereby existing technologies, especially those that are out of patent, are replicated through open online collaboration between communities affected by environmental pollution, volunteers, academic institutions, and existing open hardware and open source software projects. Technology case studies will be presented, focusing primarily on a passive PM monitor based on the UNC Passive Monitor. Stages of the project will be detailed moving from identifying community needs, reviewing existing technology, partnership development, technology replication, IP review and licensing, data quality assurance (in process), and field evaluation with community partners (in process), with special attention to partnership development and technology review. We have leveraged open hardware and open source software to lower the cost and access barriers of existing technologies for PM10-2.5 and other atmospheric measures that have already been validated through peer review. Existing validation of and regulatory familiarity with a technology enables a rapid pathway towards collecting data, shortening the time it takes for communities to leverage data in environmental management decisions. Online collaboration requires rigorous documentation that aids in spreading research methods and promoting deep engagement by interested community researchers outside academia. At the same time, careful choice of technology and the use of small-scale fabrication through laser cutting, 3D printing, and open, shared repositories of plans and software enables educational engagement that broadens a project's reach.
EPOS Data and Service Provision

NASA Astrophysics Data System (ADS)

Bailo, Daniele; Jeffery, Keith G.; Atakan, Kuvvet; Harrison, Matt

2017-04-01

EPOS is now in IP (implementation phase) after a successful PP (preparatory phase). EPOS consists of essentially two components, one ICS (Integrated Core Services) representing the integrating ICT (Information and Communication Technology) and many TCS (Thematic Core Services) representing the scientific domains. The architecture developed, demonstrated and agreed within the project during the PP is now being developed utilising co-design with the TCS teams and agile, spiral methods within the ICS team. The 'heart' of EPOS is the metadata catalog. This provides for the ICS a digital representation of the TCS assets (services, data, software, equipment, expertise…) thus facilitating access, interoperation and (re-)use. A major part of the work has been interactions with the TCS. The original intention to harvest information from the TCS required (and still requires) discussions to understand fully the TCS organisational structures linked with rights, security and privacy; their (meta)data syntax (structure) and semantics (meaning); their workflows and methods of working and the services offered. To complicate matters further the TCS are each at varying stages of development and the ICS design has to accommodate pre-existing, developing and expected future standards for metadata, data, software and processes. Through information documents, questionnaires and interviews/meetings the EPOS ICS team has collected DDSS (Data, Data Products, Software and Services) information from the TCS. The ICS team developed a simplified metadata model for presentation to the TCS and the ICS team will perform the mapping and conversion from this model to the internal detailed technical metadata model using (CERIF: a EU recommendation to Member States maintained, developed and promoted by euroCRIS www.eurocris.org ). At the time of writing the final modifications of the EPOS metadata model are being made, and the mappings to CERIF designed, prior to the main phase of (meta)data collection into the EPOS metadata catalog. In parallel work proceeds on the user interface softsare, the APIs (Application Programming Interfaces) to the TCS services, the harvesting method and software, the AAAI (Authentication, Authorisation, Accounting Infrastructure) and the system manager. The next steps will involve interfaces to ICS-D (Distributed ICS i.e. facilities and services for computing, data storage, detectors and instruments for data collection etc.) to which requests, software and data will be deployed and from which data will be generated. Associated with this will be the development of the workflow system which will assist the end-user in building a workflow to achieve the scientific objectives.
Virtual facebow technique.

PubMed

Solaberrieta, Eneko; Garmendia, Asier; Minguez, Rikardo; Brizuela, Aritza; Pradies, Guillermo

2015-12-01

This article describes a virtual technique for transferring the location of a digitized cast from the patient to a virtual articulator (virtual facebow transfer). Using a virtual procedure, the maxillary digital cast is transferred to a virtual articulator by means of reverse engineering devices. The following devices necessary to carry out this protocol are available in many contemporary practices: an intraoral scanner, a digital camera, and specific software. Results prove the viability of integrating different tools and software and of completely integrating this procedure into a dental digital workflow. Copyright © 2015 Editorial Council for the Journal of Prosthetic Dentistry. Published by Elsevier Inc. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Joseph, A.; Seuntjens, J.; Parker, W.

We describe development of automated, web-based, electronic health record (EHR) auditing software for use within our paperless radiation oncology clinic. By facilitating access to multiple databases within the clinic, each patient's EHR is audited prior to treatment, regularly during treatment, and post treatment. Anomalies such as missing documentation, non-compliant workflow and treatment parameters that differ significantly from the norm may be monitored, flagged and brought to the attention of clinicians. By determining historical trends using existing patient data and by comparing new patient data with the historical, we expect our software to provide a measurable improvement in the quality ofmore » radiotherapy at our centre.« less
Singularity: Scientific containers for mobility of compute.

PubMed

Kurtzer, Gregory M; Sochat, Vanessa; Bauer, Michael W

2017-01-01

Here we present Singularity, software developed to bring containers and reproducibility to scientific computing. Using Singularity containers, developers can work in reproducible environments of their choosing and design, and these complete environments can easily be copied and executed on other platforms. Singularity is an open source initiative that harnesses the expertise of system and software engineers and researchers alike, and integrates seamlessly into common workflows for both of these groups. As its primary use case, Singularity brings mobility of computing to both users and HPC centers, providing a secure means to capture and distribute software and compute environments. This ability to create and deploy reproducible environments across these centers, a previously unmet need, makes Singularity a game changing development for computational science.

Singularity: Scientific containers for mobility of compute

PubMed Central

Kurtzer, Gregory M.; Bauer, Michael W.

2017-01-01

Here we present Singularity, software developed to bring containers and reproducibility to scientific computing. Using Singularity containers, developers can work in reproducible environments of their choosing and design, and these complete environments can easily be copied and executed on other platforms. Singularity is an open source initiative that harnesses the expertise of system and software engineers and researchers alike, and integrates seamlessly into common workflows for both of these groups. As its primary use case, Singularity brings mobility of computing to both users and HPC centers, providing a secure means to capture and distribute software and compute environments. This ability to create and deploy reproducible environments across these centers, a previously unmet need, makes Singularity a game changing development for computational science. PMID:28494014
KNIME4NGS: a comprehensive toolbox for next generation sequencing analysis.

PubMed

Hastreiter, Maximilian; Jeske, Tim; Hoser, Jonathan; Kluge, Michael; Ahomaa, Kaarin; Friedl, Marie-Sophie; Kopetzky, Sebastian J; Quell, Jan-Dominik; Mewes, H Werner; Küffner, Robert

2017-05-15

Analysis of Next Generation Sequencing (NGS) data requires the processing of large datasets by chaining various tools with complex input and output formats. In order to automate data analysis, we propose to standardize NGS tasks into modular workflows. This simplifies reliable handling and processing of NGS data, and corresponding solutions become substantially more reproducible and easier to maintain. Here, we present a documented, linux-based, toolbox of 42 processing modules that are combined to construct workflows facilitating a variety of tasks such as DNAseq and RNAseq analysis. We also describe important technical extensions. The high throughput executor (HTE) helps to increase the reliability and to reduce manual interventions when processing complex datasets. We also provide a dedicated binary manager that assists users in obtaining the modules' executables and keeping them up to date. As basis for this actively developed toolbox we use the workflow management software KNIME. See http://ibisngs.github.io/knime4ngs for nodes and user manual (GPLv3 license). robert.kueffner@helmholtz-muenchen.de. Supplementary data are available at Bioinformatics online.
A preliminary model of work during initial examination and treatment planning appointments.

PubMed

Irwin, J Y; Torres-Urquidy, M H; Schleyer, T; Monaco, V

2009-01-10

Objective This study's objective was to formally describe the work process for charting and treatment planning in general dental practice to inform the design of a new clinical computing environment.Methods Using a process called contextual inquiry, researchers observed 23 comprehensive examination and treatment planning sessions during 14 visits to 12 general US dental offices. For each visit, field notes were analysed and reformulated as formalised models. Subsequently, each model type was consolidated across all offices and visits. Interruptions to the workflow, called breakdowns, were identified.Results Clinical work during dental examination and treatment planning appointments is a highly collaborative activity involving dentists, hygienists and assistants. Personnel with multiple overlapping roles complete complex multi-step tasks supported by a large and varied collection of equipment, artifacts and technology. Most of the breakdowns were related to technology which interrupted the workflow, caused rework and increased the number of steps in work processes.Conclusion Current dental software could be significantly improved with regard to its support for communication and collaboration, workflow, information design and presentation, information content, and data entry.
LipidFinder: A computational workflow for discovery of lipids identifies eicosanoid-phosphoinositides in platelets

PubMed Central

O’Connor, Anne; Brasher, Christopher J.; Slatter, David A.; Meckelmann, Sven W.; Hawksworth, Jade I.; Allen, Stuart M.; O’Donnell, Valerie B.

2017-01-01

Accurate and high-quality curation of lipidomic datasets generated from plasma, cells, or tissues is becoming essential for cell biology investigations and biomarker discovery for personalized medicine. However, a major challenge lies in removing artifacts otherwise mistakenly interpreted as real lipids from large mass spectrometry files (>60 K features), while retaining genuine ions in the dataset. This requires powerful informatics tools; however, available workflows have not been tailored specifically for lipidomics, particularly discovery research. We designed LipidFinder, an open-source Python workflow. An algorithm is included that optimizes analysis based on users’ own data, and outputs are screened against online databases and categorized into LIPID MAPS classes. LipidFinder outperformed three widely used metabolomics packages using data from human platelets. We show a family of three 12-hydroxyeicosatetraenoic acid phosphoinositides (16:0/, 18:1/, 18:0/12-HETE-PI) generated by thrombin-activated platelets, indicating crosstalk between eicosanoid and phosphoinositide pathways in human cells. The software is available on GitHub (https://github.com/cjbrasher/LipidFinder), with full userguides. PMID:28405621
XML schemas for common bioinformatic data types and their application in workflow systems

PubMed Central

Seibel, Philipp N; Krüger, Jan; Hartmeier, Sven; Schwarzer, Knut; Löwenthal, Kai; Mersch, Henning; Dandekar, Thomas; Giegerich, Robert

2006-01-01

Background Today, there is a growing need in bioinformatics to combine available software tools into chains, thus building complex applications from existing single-task tools. To create such workflows, the tools involved have to be able to work with each other's data – therefore, a common set of well-defined data formats is needed. Unfortunately, current bioinformatic tools use a great variety of heterogeneous formats. Results Acknowledging the need for common formats, the Helmholtz Open BioInformatics Technology network (HOBIT) identified several basic data types used in bioinformatics and developed appropriate format descriptions, formally defined by XML schemas, and incorporated them in a Java library (BioDOM). These schemas currently cover sequence, sequence alignment, RNA secondary structure and RNA secondary structure alignment formats in a form that is independent of any specific program, thus enabling seamless interoperation of different tools. All XML formats are available at , the BioDOM library can be obtained at . Conclusion The HOBIT XML schemas and the BioDOM library simplify adding XML support to newly created and existing bioinformatic tools, enabling these tools to interoperate seamlessly in workflow scenarios. PMID:17087823
THE ART OF DATA MINING THE MINEFIELDS OF TOXICITY DATABASES TO LINK CHEMISTRY TO BIOLOGY

EPA Science Inventory

Toxicity databases have a special role in predictive toxicology, providing ready access to historical information throughout the workflow of discovery, development, and product safety processes in drug development as well as in review by regulatory agencies. To provide accurate i...
Metadata Management on the SCEC PetaSHA Project: Helping Users Describe, Discover, Understand, and Use Simulation Data in a Large-Scale Scientific Collaboration

NASA Astrophysics Data System (ADS)

Okaya, D.; Deelman, E.; Maechling, P.; Wong-Barnum, M.; Jordan, T. H.; Meyers, D.

2007-12-01

Large scientific collaborations, such as the SCEC Petascale Cyberfacility for Physics-based Seismic Hazard Analysis (PetaSHA) Project, involve interactions between many scientists who exchange ideas and research results. These groups must organize, manage, and make accessible their community materials of observational data, derivative (research) results, computational products, and community software. The integration of scientific workflows as a paradigm to solve complex computations provides advantages of efficiency, reliability, repeatability, choices, and ease of use. The underlying resource needed for a scientific workflow to function and create discoverable and exchangeable products is the construction, tracking, and preservation of metadata. In the scientific workflow environment there is a two-tier structure of metadata. Workflow-level metadata and provenance describe operational steps, identity of resources, execution status, and product locations and names. Domain-level metadata essentially define the scientific meaning of data, codes and products. To a large degree the metadata at these two levels are separate. However, between these two levels is a subset of metadata produced at one level but is needed by the other. This crossover metadata suggests that some commonality in metadata handling is needed. SCEC researchers are collaborating with computer scientists at SDSC, the USC Information Sciences Institute, and Carnegie Mellon Univ. in order to perform earthquake science using high-performance computational resources. A primary objective of the "PetaSHA" collaboration is to perform physics-based estimations of strong ground motion associated with real and hypothetical earthquakes located within Southern California. Construction of 3D earth models, earthquake representations, and numerical simulation of seismic waves are key components of these estimations. Scientific workflows are used to orchestrate the sequences of scientific tasks and to access distributed computational facilities such as the NSF TeraGrid. Different types of metadata are produced and captured within the scientific workflows. One workflow within PetaSHA ("Earthworks") performs a linear sequence of tasks with workflow and seismological metadata preserved. Downstream scientific codes ingest these metadata produced by upstream codes. The seismological metadata uses attribute-value pairing in plain text; an identified need is to use more advanced handling methods. Another workflow system within PetaSHA ("Cybershake") involves several complex workflows in order to perform statistical analysis of ground shaking due to thousands of hypothetical but plausible earthquakes. Metadata management has been challenging due to its construction around a number of legacy scientific codes. We describe difficulties arising in the scientific workflow due to the lack of this metadata and suggest corrective steps, which in some cases include the cultural shift of domain science programmers coding for metadata.
Evaluation of Interactive Visualization on Mobile Computing Platforms for Selection of Deep Brain Stimulation Parameters

PubMed Central

Butson, Christopher R.; Tamm, Georg; Jain, Sanket; Fogal, Thomas; Krüger, Jens

2012-01-01

In recent years there has been significant growth in the use of patient-specific models to predict the effects of neuromodulation therapies such as deep brain stimulation (DBS). However, translating these models from a research environment to the everyday clinical workflow has been a challenge, primarily due to the complexity of the models and the expertise required in specialized visualization software. In this paper, we deploy the interactive visualization system ImageVis3D Mobile, which has been designed for mobile computing devices such as the iPhone or iPad, in an evaluation environment to visualize models of Parkinson’s disease patients who received DBS therapy. Selection of DBS settings is a significant clinical challenge that requires repeated revisions to achieve optimal therapeutic response, and is often performed without any visual representation of the stimulation system in the patient. We used ImageVis3D Mobile to provide models to movement disorders clinicians and asked them to use the software to determine: 1) which of the four DBS electrode contacts they would select for therapy; and 2) what stimulation settings they would choose. We compared the stimulation protocol chosen from the software versus the stimulation protocol that was chosen via clinical practice (independently of the study). Lastly, we compared the amount of time required to reach these settings using the software versus the time required through standard practice. We found that the stimulation settings chosen using ImageVis3D Mobile were similar to those used in standard of care, but were selected in drastically less time. We show how our visualization system, available directly at the point of care on a device familiar to the clinician, can be used to guide clinical decision making for selection of DBS settings. In our view, the positive impact of the system could also translate to areas other than DBS. PMID:22450824
Maestro Workflow Conductor

DOE Office of Scientific and Technical Information (OSTI.GOV)

Di Natale, Francesco

2017-06-01

MaestroWF is a Python tool and software package for loading YAML study specifications that represents a simulation campaign. The package is capable of parameterizing a study, pulling dependencies automatically, formatting output directories, and managing the flow and execution of the campaign. MaestroWF also provides a set of abstracted objects that can also be used to develop user specific scripts for launching simulation campaigns.
FirebrowseR: an R client to the Broad Institute’s Firehose Pipeline

PubMed Central

Deng, Mario; Brägelmann, Johannes; Kryukov, Ivan; Saraiva-Agostinho, Nuno; Perner, Sven

2017-01-01

With its Firebrowse service (http://firebrowse.org/) the Broad Institute is making large-scale multi-platform omics data analysis results publicly available through a Representational State Transfer (REST) Application Programmable Interface (API). Querying this database through an API client from an arbitrary programming environment is an essential task, allowing other developers and researchers to focus on their analysis and avoid data wrangling. Hence, as a first result, we developed a workflow to automatically generate, test and deploy such clients for rapid response to API changes. Its underlying infrastructure, a combination of free and publicly available web services, facilitates the development of API clients. It decouples changes in server software from the client software by reacting to changes in the RESTful service and removing direct dependencies on a specific implementation of an API. As a second result, FirebrowseR, an R client to the Broad Institute’s RESTful Firehose Pipeline, is provided as a working example, which is built by the means of the presented workflow. The package’s features are demonstrated by an example analysis of cancer gene expression data. Database URL: https://github.com/mariodeng/ PMID:28062517
Updates in metabolomics tools and resources: 2014-2015.

PubMed

Misra, Biswapriya B; van der Hooft, Justin J J

2016-01-01

Data processing and interpretation represent the most challenging and time-consuming steps in high-throughput metabolomic experiments, regardless of the analytical platforms (MS or NMR spectroscopy based) used for data acquisition. Improved machinery in metabolomics generates increasingly complex datasets that create the need for more and better processing and analysis software and in silico approaches to understand the resulting data. However, a comprehensive source of information describing the utility of the most recently developed and released metabolomics resources--in the form of tools, software, and databases--is currently lacking. Thus, here we provide an overview of freely-available, and open-source, tools, algorithms, and frameworks to make both upcoming and established metabolomics researchers aware of the recent developments in an attempt to advance and facilitate data processing workflows in their metabolomics research. The major topics include tools and researches for data processing, data annotation, and data visualization in MS and NMR-based metabolomics. Most in this review described tools are dedicated to untargeted metabolomics workflows; however, some more specialist tools are described as well. All tools and resources described including their analytical and computational platform dependencies are summarized in an overview Table. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Echo™ User Manual

DOE Office of Scientific and Technical Information (OSTI.GOV)

Harvey, Dustin Yewell

Echo™ is a MATLAB-based software package designed for robust and scalable analysis of complex data workflows. An alternative to tedious, error-prone conventional processes, Echo is based on three transformative principles for data analysis: self-describing data, name-based indexing, and dynamic resource allocation. The software takes an object-oriented approach to data analysis, intimately connecting measurement data with associated metadata. Echo operations in an analysis workflow automatically track and merge metadata and computation parameters to provide a complete history of the process used to generate final results, while automated figure and report generation tools eliminate the potential to mislabel those results. History reportingmore » and visualization methods provide straightforward auditability of analysis processes. Furthermore, name-based indexing on metadata greatly improves code readability for analyst collaboration and reduces opportunities for errors to occur. Echo efficiently manages large data sets using a framework that seamlessly allocates resources such that only the necessary computations to produce a given result are executed. Echo provides a versatile and extensible framework, allowing advanced users to add their own tools and data classes tailored to their own specific needs. Applying these transformative principles and powerful features, Echo greatly improves analyst efficiency and quality of results in many application areas.« less
A Workflow to Model Microbial Loadings in Watersheds ...

EPA Pesticide Factsheets

Many watershed models simulate overland and instream microbial fate and transport, but few actually provide loading rates on land surfaces and point sources to the water body network. This paper describes the underlying general equations for microbial loading rates associated with 1) land-applied manure on undeveloped areas from domestic animals; 2) direct shedding on undeveloped lands by domestic animals and wildlife; 3) urban or engineered areas; and 4) point sources that directly discharge to streams from septic systems and shedding by domestic animals. A microbial source module, which houses these formulations, is linked within a workflow containing eight models and a set of databases that form a loosely configured modeling infrastructure which supports watershed-scale microbial source-to-receptor modeling by focusing on animal-impacted catchments. A hypothetical example application – accessing, retrieving, and using real-world data – demonstrates the ability of the infrastructure to automate many of the manual steps associated with a standard watershed assessment, culminating with calibrated flow and microbial densities at the pour point of a watershed. In the Proceedings of the International Environmental Modelling and Software Society (iEMSs), 8th International Congress on Environmental Modelling and Software, Toulouse, France
P7-S Combining Workflow-Based Project Organization with Protein-Dependant Data Retrieval for the Retrieval of Extensive Proteome Information

PubMed Central

Glandorf, J.; Thiele, H.; Macht, M.; Vorm, O.; Podtelejnikov, A.

2007-01-01

In the course of a full-scale proteomics experiment, the handling of the data as well as the retrieval of the relevant information from the results is a major challenge due to the massive amount of generated data (gel images, chromatograms, and spectra) as well as associated result information (sequences, literature, etc.). To obtain meaningful information from these data, one has to filter the results in an easy way. Possibilities to do so can be based on GO terms or structural features such as transmembrane domains, involvement in certain pathways, etc. In this presentation we will show how a combination of a software package with a workflow-based result organization (Bruker ProteinScape) and a protein-centered data-mining software (Proxeon ProteinCenter) can assist in the comparison of the results from large projects, such as comparison of cross-platform results from 2D PAGE/MS with shotgun LC-ESI-MS/MS. We will present differences between different technologies and show how these differences can be easily identified and how they allow us to draw conclusions on the involved technologies.
Implementation of a departmental picture archiving and communication system: a productivity and cost analysis.

PubMed

Macyszyn, Luke; Lega, Brad; Bohman, Leif-Erik; Latefi, Ahmad; Smith, Michelle J; Malhotra, Neil R; Welch, William; Grady, Sean M

2013-09-01

Digital radiology enhances productivity and results in long-term cost savings. However, the viewing, storage, and sharing of outside imaging studies on compact discs at ambulatory offices and hospitals pose a number of unique challenges to a surgeon's efficiency and clinical workflow. To improve the efficiency and clinical workflow of an academic neurosurgical practice when evaluating patients with outside radiological studies. Open-source software and commercial hardware were used to design and implement a departmental picture archiving and communications system (PACS). The implementation of a departmental PACS system significantly improved productivity and enhanced collaboration in a variety of clinical settings. Using published data on the rate of information technology problems associated with outside studies on compact discs, this system produced a cost savings ranging from $6250 to $33600 and from $43200 to $72000 for 2 cohorts, urgent transfer and spine clinic patients, respectively, therefore justifying the costs of the system in less than a year. The implementation of a departmental PACS system using open-source software is straightforward and cost-effective and results in significant gains in surgeon productivity when evaluating patients with outside imaging studies.
FirebrowseR: an R client to the Broad Institute's Firehose Pipeline.

PubMed

Deng, Mario; Brägelmann, Johannes; Kryukov, Ivan; Saraiva-Agostinho, Nuno; Perner, Sven

2017-01-01

With its Firebrowse service (http://firebrowse.org/) the Broad Institute is making large-scale multi-platform omics data analysis results publicly available through a Representational State Transfer (REST) Application Programmable Interface (API). Querying this database through an API client from an arbitrary programming environment is an essential task, allowing other developers and researchers to focus on their analysis and avoid data wrangling. Hence, as a first result, we developed a workflow to automatically generate, test and deploy such clients for rapid response to API changes. Its underlying infrastructure, a combination of free and publicly available web services, facilitates the development of API clients. It decouples changes in server software from the client software by reacting to changes in the RESTful service and removing direct dependencies on a specific implementation of an API. As a second result, FirebrowseR, an R client to the Broad Institute's RESTful Firehose Pipeline, is provided as a working example, which is built by the means of the presented workflow. The package's features are demonstrated by an example analysis of cancer gene expression data.Database URL: https://github.com/mariodeng/. © The Author(s) 2017. Published by Oxford University Press.
PeptideDepot: flexible relational database for visual analysis of quantitative proteomic data and integration of existing protein information.

PubMed

Yu, Kebing; Salomon, Arthur R

2009-12-01

Recently, dramatic progress has been achieved in expanding the sensitivity, resolution, mass accuracy, and scan rate of mass spectrometers able to fragment and identify peptides through MS/MS. Unfortunately, this enhanced ability to acquire proteomic data has not been accompanied by a concomitant increase in the availability of flexible tools allowing users to rapidly assimilate, explore, and analyze this data and adapt to various experimental workflows with minimal user intervention. Here we fill this critical gap by providing a flexible relational database called PeptideDepot for organization of expansive proteomic data sets, collation of proteomic data with available protein information resources, and visual comparison of multiple quantitative proteomic experiments. Our software design, built upon the synergistic combination of a MySQL database for safe warehousing of proteomic data with a FileMaker-driven graphical user interface for flexible adaptation to diverse workflows, enables proteomic end-users to directly tailor the presentation of proteomic data to the unique analysis requirements of the individual proteomics lab. PeptideDepot may be deployed as an independent software tool or integrated directly with our high throughput autonomous proteomic pipeline used in the automated acquisition and post-acquisition analysis of proteomic data.
Clinical Note Creation, Binning, and Artificial Intelligence.

PubMed

Deliberato, Rodrigo Octávio; Celi, Leo Anthony; Stone, David J

2017-08-03

The creation of medical notes in software applications poses an intrinsic problem in workflow as the technology inherently intervenes in the processes of collecting and assembling information, as well as the production of a data-driven note that meets both individual and healthcare system requirements. In addition, the note writing applications in currently available electronic health records (EHRs) do not function to support decision making to any substantial degree. We suggest that artificial intelligence (AI) could be utilized to facilitate the workflows of the data collection and assembly processes, as well as to support the development of personalized, yet data-driven assessments and plans. ©Rodrigo Octávio Deliberato, Leo Anthony Celi, David J Stone. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 03.08.2017.
Full 3-dimensional digital workflow for multicomponent dental appliances: A proof of concept.

PubMed

van der Meer, W Joerd; Vissink, Arjan; Ren, Yijin

2016-04-01

The authors used a 3-dimensional (3D) printer and a bending robot to produce a multicomponent dental appliance to assess whether 3D digital models of the dentition are applicable for a full digital workflow. The authors scanned a volunteer's dentition with an intraoral scanner (Lava Chairside Oral Scanner C.O.S., 3M). A digital impression was used to design 2 multicomponent orthodontic appliances. Biocompatible acrylic baseplates were produced with the aid of a 3D printer. The metal springs and clasps were produced by a bending robot. The fit of the 2 appliances was assessed by 2 experienced orthodontists. The authors assessed both orthodontic appliances with the volunteer's dentition and found the fit to be excellent. Clinicians can fully produce a multicomponent dental appliance consisting of both an acrylic baseplate and other parts, such as clasps, springs, or screws, using a digital workflow process without the need for a physical model of the patient's dentition. Plaster models can be superfluous for orthodontic treatment as digital models can be used in all phases of a full digital workflow in orthodontics. The arduous task of making a multicomponent dental appliance that involves bending wires can possibly be replaced by a computer, design software, a 3D printer, and a bending robot. Copyright © 2016 American Dental Association. Published by Elsevier Inc. All rights reserved.
Routine Digital Pathology Workflow: The Catania Experience

PubMed Central

Fraggetta, Filippo; Garozzo, Salvatore; Zannoni, Gian Franco; Pantanowitz, Liron; Rossi, Esther Diana

2017-01-01

Introduction: Successful implementation of whole slide imaging (WSI) for routine clinical practice has been accomplished in only a few pathology laboratories worldwide. We report the transition to an effective and complete digital surgical pathology workflow in the pathology laboratory at Cannizzaro Hospital in Catania, Italy. Methods: All (100%) permanent histopathology glass slides were digitized at ×20 using Aperio AT2 scanners. Compatible stain and scanning slide racks were employed to streamline operations. eSlide Manager software was bidirectionally interfaced with the anatomic pathology laboratory information system. Virtual slide trays connected to the two-dimensional (2D) barcode tracking system allowed pathologists to confirm that they were correctly assigned slides and that all tissues on these glass slides were scanned. Results: Over 115,000 glass slides were digitized with a scan fail rate of around 1%. Drying glass slides before scanning minimized them sticking to scanner racks. Implementation required introduction of a 2D barcode tracking system and modification of histology workflow processes. Conclusion: Our experience indicates that effective adoption of WSI for primary diagnostic use was more dependent on optimizing preimaging variables and integration with the laboratory information system than on information technology infrastructure and ensuring pathologist buy-in. Implementation of digital pathology for routine practice not only leveraged the benefits of digital imaging but also creates an opportunity for establishing standardization of workflow processes in the pathology laboratory. PMID:29416914

Routine Digital Pathology Workflow: The Catania Experience.

PubMed

Fraggetta, Filippo; Garozzo, Salvatore; Zannoni, Gian Franco; Pantanowitz, Liron; Rossi, Esther Diana

2017-01-01

Successful implementation of whole slide imaging (WSI) for routine clinical practice has been accomplished in only a few pathology laboratories worldwide. We report the transition to an effective and complete digital surgical pathology workflow in the pathology laboratory at Cannizzaro Hospital in Catania, Italy. All (100%) permanent histopathology glass slides were digitized at ×20 using Aperio AT2 scanners. Compatible stain and scanning slide racks were employed to streamline operations. eSlide Manager software was bidirectionally interfaced with the anatomic pathology laboratory information system. Virtual slide trays connected to the two-dimensional (2D) barcode tracking system allowed pathologists to confirm that they were correctly assigned slides and that all tissues on these glass slides were scanned. Over 115,000 glass slides were digitized with a scan fail rate of around 1%. Drying glass slides before scanning minimized them sticking to scanner racks. Implementation required introduction of a 2D barcode tracking system and modification of histology workflow processes. Our experience indicates that effective adoption of WSI for primary diagnostic use was more dependent on optimizing preimaging variables and integration with the laboratory information system than on information technology infrastructure and ensuring pathologist buy-in. Implementation of digital pathology for routine practice not only leveraged the benefits of digital imaging but also creates an opportunity for establishing standardization of workflow processes in the pathology laboratory.
Bridging the provenance gap: opportunities and challenges tracking in and ex silico provenance in sUAS workflows

NASA Astrophysics Data System (ADS)

Thomer, A.

2017-12-01

Data provenance - the record of the varied processes that went into the creation of a dataset, as well as the relationships between resulting data objects - is necessary to support the reusability, reproducibility and reliability of earth science data. In sUAS-based research, capturing provenance can be particularly challenging because of the breadth and distributed nature of the many platforms used to collect, process and analyze data. In any given project, multiple drones, controllers, computers, software systems, sensors, cameras, imaging processing algorithms and data processing workflows are used over sometimes long periods of time. These platforms and processing result in dozens - if not hundreds - of data products in varying stages of readiness-for-analysis and sharing. Provenance tracking mechanisms are needed to make the relationships between these many data products explicit, and therefore more reusable and shareable. In this talk, I discuss opportunities and challenges in tracking provenance in sUAS-based research, and identify gaps in current workflow-capture technologies. I draw on prior work conducted as part of the IMLS-funded Site-Based Data Curation project in which we developed methods of documenting in and ex silico (that is, computational and non-computation) workflows, and demonstrate this approaches applicability to research with sUASes. I conclude with a discussion of ontologies and other semantic technologies that have potential application in sUAS research.
The 2006 NESCent Phyloinformatics Hackathon: A Field Report

PubMed Central

Lapp, Hilmar; Bala, Sendu; Balhoff, James P.; Bouck, Amy; Goto, Naohisa; Holder, Mark; Holland, Richard; Holloway, Alisha; Katayama, Toshiaki; Lewis, Paul O.; Mackey, Aaron J.; Osborne, Brian I.; Piel, William H.; Kosakovsky Pond, Sergei L.; Poon, Art F.Y.; Qiu, Wei-Gang; Stajich, Jason E.; Stoltzfus, Arlin; Thierer, Tobias; Vilella, Albert J.; Vos, Rutger A.; Zmasek, Christian M.; Zwickl, Derrick J.; Vision, Todd J.

2007-01-01

In December, 2006, a group of 26 software developers from some of the most widely used life science programming toolkits and phylogenetic software projects converged on Durham, North Carolina, for a Phyloinformatics Hackathon, an intense five-day collaborative software coding event sponsored by the National Evolutionary Synthesis Center (NESCent). The goal was to help researchers to integrate multiple phylogenetic software tools into automated workflows. Participants addressed deficiencies in interoperability between programs by implementing “glue code” and improving support for phylogenetic data exchange standards (particularly NEXUS) across the toolkits. The work was guided by use-cases compiled in advance by both developers and users, and the code was documented as it was developed. The resulting software is freely available for both users and developers through incorporation into the distributions of several widely-used open-source toolkits. We explain the motivation for the hackathon, how it was organized, and discuss some of the outcomes and lessons learned. We conclude that hackathons are an effective mode of solving problems in software interoperability and usability, and are underutilized in scientific software development.
Querying Provenance Information: Basic Notions and an Example from Paleoclimate Reconstruction

NASA Astrophysics Data System (ADS)

Stodden, V.; Ludaescher, B.; Bocinsky, K.; Kintigh, K.; Kohler, T.; McPhillips, T.; Rush, J.

2016-12-01

Computational models are used to reconstruct and explain past environments and to predict likely future environments. For example, Bocinsky and Kohler have performed a 2,000-year reconstruction of the rain-fed maize agricultural niche in the US Southwest. The resulting academic publications not only contain traditional method descriptions, figures, etc. but also links to code and data for basic transparency and reproducibility. Examples include ResearchCompendia.org and the new project "Merging Science and Cyberinfrastructure Pathways: The Whole Tale." Provenance information provides a further critical element to understand a published study and to possibly extend or challenge the findings of the original authors. We present different notions and uses of provenance information using a computational archaeology example, e.g., the common use of "provenance for others" (for transparency and reproducibility), but also the more elusive but equally important use of "provenance for self". To this end, we distinguish prospective provenance (a.k.a. workflow) from retrospective provenance (a.k.a. data lineage) and show how combinations of both forms of provenance can be used to answer different kinds of important questions about a workflow and its execution. Since many workflows are developed using scripting or special purpose languages such as Python and R, we employ an approach and toolkit called YesWorkflow that brings provenance modeling, capture, and querying into the realm of scripting. YesWorkflow employs the basic W3C PROV standard, as well as the ProvONE extension for sharing and exchanging retrospective and prospective provenance information, respectively. Finally, we argue that the utility of provenance information should be maximized by developing different kinds provenance questions and queries during the early phases of computational workflow design and implementation.
Technical Challenges of Enterprise Imaging: HIMSS-SIIM Collaborative White Paper.

PubMed

Clunie, David A; Dennison, Don K; Cram, Dawn; Persons, Kenneth R; Bronkalla, Mark D; Primo, Henri Rik

2016-10-01

This white paper explores the technical challenges and solutions for acquiring (capturing) and managing enterprise images, particularly those involving visible light applications. The types of acquisition devices used for various general-purpose photography and specialized applications including dermatology, endoscopy, and anatomic pathology are reviewed. The formats and standards used, and the associated metadata requirements and communication protocols for transfer and workflow are considered. Particular emphasis is placed on the importance of metadata capture in both order- and encounter-based workflow. The benefits of using DICOM to provide a standard means of recording and accessing both metadata and image and video data are considered, as is the role of IHE and FHIR.
Antibiogramj: A tool for analysing images from disk diffusion tests.

PubMed

Alonso, C A; Domínguez, C; Heras, J; Mata, E; Pascual, V; Torres, C; Zarazaga, M

2017-05-01

Disk diffusion testing, known as antibiogram, is widely applied in microbiology to determine the antimicrobial susceptibility of microorganisms. The measurement of the diameter of the zone of growth inhibition of microorganisms around the antimicrobial disks in the antibiogram is frequently performed manually by specialists using a ruler. This is a time-consuming and error-prone task that might be simplified using automated or semi-automated inhibition zone readers. However, most readers are usually expensive instruments with embedded software that require significant changes in laboratory design and workflow. Based on the workflow employed by specialists to determine the antimicrobial susceptibility of microorganisms, we have designed a software tool that, from images of disk diffusion tests, semi-automatises the process. Standard computer vision techniques are employed to achieve such an automatisation. We present AntibiogramJ, a user-friendly and open-source software tool to semi-automatically determine, measure and categorise inhibition zones of images from disk diffusion tests. AntibiogramJ is implemented in Java and deals with images captured with any device that incorporates a camera, including digital cameras and mobile phones. The fully automatic procedure of AntibiogramJ for measuring inhibition zones achieves an overall agreement of 87% with an expert microbiologist; moreover, AntibiogramJ includes features to easily detect when the automatic reading is not correct and fix it manually to obtain the correct result. AntibiogramJ is a user-friendly, platform-independent, open-source, and free tool that, up to the best of our knowledge, is the most complete software tool for antibiogram analysis without requiring any investment in new equipment or changes in the laboratory. Copyright © 2017 Elsevier B.V. All rights reserved.
Data management in clinical research: Synthesizing stakeholder perspectives.

PubMed

Johnson, Stephen B; Farach, Frank J; Pelphrey, Kevin; Rozenblit, Leon

2016-04-01

This study assesses data management needs in clinical research from the perspectives of researchers, software analysts and developers. This is a mixed-methods study that employs sublanguage analysis in an innovative manner to link the assessments. We performed content analysis using sublanguage theory on transcribed interviews conducted with researchers at four universities. A business analyst independently extracted potential software features from the transcriptions, which were translated into the sublanguage. This common sublanguage was then used to create survey questions for researchers, analysts and developers about the desirability and difficulty of features. Results were synthesized using the common sublanguage to compare stakeholder perceptions with the original content analysis. Individual researchers exhibited significant diversity of perspectives that did not correlate by role or site. Researchers had mixed feelings about their technologies, and sought improvements in integration, interoperability and interaction as well as engaging with study participants. Researchers and analysts agreed that data integration has higher desirability and mobile technology has lower desirability but disagreed on the desirability of data validation rules. Developers agreed that data integration and validation are the most difficult to implement. Researchers perceive tasks related to study execution, analysis and quality control as highly strategic, in contrast with tactical tasks related to data manipulation. Researchers have only partial technologic support for analysis and quality control, and poor support for study execution. Software for data integration and validation appears critical to support clinical research, but may be expensive to implement. Features to support study workflow, collaboration and engagement have been underappreciated, but may prove to be easy successes. Software developers should consider the strategic goals of researchers with regard to the overall coordination of research projects and teams, workflow connecting data collection with analysis and processes for improving data quality. Copyright © 2016 Elsevier Inc. All rights reserved.
A SOAP Web Service for accessing MODIS land product subsets

DOE Office of Scientific and Technical Information (OSTI.GOV)

SanthanaVannan, Suresh K; Cook, Robert B; Pan, Jerry Yun

2011-01-01

Remote sensing data from satellites have provided valuable information on the state of the earth for several decades. Since March 2000, the Moderate Resolution Imaging Spectroradiometer (MODIS) sensor on board NASA s Terra and Aqua satellites have been providing estimates of several land parameters useful in understanding earth system processes at global, continental, and regional scales. However, the HDF-EOS file format, specialized software needed to process the HDF-EOS files, data volume, and the high spatial and temporal resolution of MODIS data make it difficult for users wanting to extract small but valuable amounts of information from the MODIS record. Tomore » overcome this usability issue, the NASA-funded Distributed Active Archive Center (DAAC) for Biogeochemical Dynamics at Oak Ridge National Laboratory (ORNL) developed a Web service that provides subsets of MODIS land products using Simple Object Access Protocol (SOAP). The ORNL DAAC MODIS subsetting Web service is a unique way of serving satellite data that exploits a fairly established and popular Internet protocol to allow users access to massive amounts of remote sensing data. The Web service provides MODIS land product subsets up to 201 x 201 km in a non-proprietary comma delimited text file format. Users can programmatically query the Web service to extract MODIS land parameters for real time data integration into models, decision support tools or connect to workflow software. Information regarding the MODIS SOAP subsetting Web service is available on the World Wide Web (WWW) at http://daac.ornl.gov/modiswebservice.« less
3D reconstruction of SEM images by use of optical photogrammetry software.

PubMed

Eulitz, Mona; Reiss, Gebhard

2015-08-01

Reconstruction of the three-dimensional (3D) surface of an object to be examined is widely used for structure analysis in science and many biological questions require information about their true 3D structure. For Scanning Electron Microscopy (SEM) there has been no efficient non-destructive solution for reconstruction of the surface morphology to date. The well-known method of recording stereo pair images generates a 3D stereoscope reconstruction of a section, but not of the complete sample surface. We present a simple and non-destructive method of 3D surface reconstruction from SEM samples based on the principles of optical close range photogrammetry. In optical close range photogrammetry a series of overlapping photos is used to generate a 3D model of the surface of an object. We adapted this method to the special SEM requirements. Instead of moving a detector around the object, the object itself was rotated. A series of overlapping photos was stitched and converted into a 3D model using the software commonly used for optical photogrammetry. A rabbit kidney glomerulus was used to demonstrate the workflow of this adaption. The reconstruction produced a realistic and high-resolution 3D mesh model of the glomerular surface. The study showed that SEM micrographs are suitable for 3D reconstruction by optical photogrammetry. This new approach is a simple and useful method of 3D surface reconstruction and suitable for various applications in research and teaching. Copyright © 2015 Elsevier Inc. All rights reserved.
Concept for individualized patient allocation: ReCompare--remote comparison of particle and photon treatment plans.

PubMed

Lühr, Armin; Löck, Steffen; Roth, Klaus; Helmbrecht, Stephan; Jakobi, Annika; Petersen, Jørgen B; Just, Uwe; Krause, Mechthild; Enghardt, Wolfgang; Baumann, Michael

2014-02-18

Identifying those patients who have a higher chance to be cured with fewer side effects by particle beam therapy than by state-of-the-art photon therapy is essential to guarantee a fair and sufficient access to specialized radiotherapy. The individualized identification requires initiatives by particle as well as non-particle radiotherapy centers to form networks, to establish procedures for the decision process, and to implement means for the remote exchange of relevant patient information. In this work, we want to contribute a practical concept that addresses these requirements. We proposed a concept for individualized patient allocation to photon or particle beam therapy at a non-particle radiotherapy institution that bases on remote treatment plan comparison. We translated this concept into the web-based software tool ReCompare (REmote COMparison of PARticlE and photon treatment plans). We substantiated the feasibility of the proposed concept by demonstrating remote exchange of treatment plans between radiotherapy institutions and the direct comparison of photon and particle treatment plans in photon treatment planning systems. ReCompare worked with several tested standard treatment planning systems, ensured patient data protection, and integrated in the clinical workflow. Our concept supports non-particle radiotherapy institutions with the patient-specific treatment decision on the optimal irradiation modality by providing expertise from a particle therapy center. The software tool ReCompare may help to improve and standardize this personalized treatment decision. It will be available from our website when proton therapy is operational at our facility.
Rapid and efficient localization of depth electrodes and cortical labeling using free and open source medical software in epilepsy surgery candidates

PubMed Central

Princich, Juan Pablo; Wassermann, Demian; Latini, Facundo; Oddo, Silvia; Blenkmann, Alejandro Omar; Seifer, Gustavo; Kochen, Silvia

2013-01-01

Depth intracranial electrodes (IEs) placement is one of the most used procedures to identify the epileptogenic zone (EZ) in surgical treatment of drug resistant epilepsy patients, about 20–30% of this population. IEs localization is therefore a critical issue defining the EZ and its relation with eloquent functional areas. That information is then used to target the resective surgery and has great potential to affect outcome. We designed a methodological procedure intended to avoid the need for highly specialized medical resources and reduce time to identify the anatomical location of IEs, during the first instances of intracranial EEG recordings. This workflow is based on established open source software; 3D Slicer and Freesurfer that uses MRI and Post-implant CT fusion for the localization of IEs and its relation with automatic labeled surrounding cortex. To test this hypothesis we assessed the time elapsed between the surgical implantation process and the final anatomical localization of IEs by means of our proposed method compared against traditional visual analysis of raw post-implant imaging in two groups of patients. All IEs were identified in the first 24 H (6–24 H) of implantation using our method in 4 patients of the first group. For the control group; all IEs were identified by experts with an overall time range of 36 h to 3 days using traditional visual analysis. It included (7 patients), 3 patients implanted with IEs and the same 4 patients from the first group. Time to localization was restrained in this group by the specialized personnel and the image quality available. To validate our method; we trained two inexperienced operators to assess the position of IEs contacts on four patients (5 IEs) using the proposed method. We quantified the discrepancies between operators and we also assessed the efficiency of our method to define the EZ comparing the findings against the results of traditional analysis. PMID:24427112
Chipster: user-friendly analysis software for microarray and other high-throughput data.

PubMed

Kallio, M Aleksi; Tuimala, Jarno T; Hupponen, Taavi; Klemelä, Petri; Gentile, Massimiliano; Scheinin, Ilari; Koski, Mikko; Käki, Janne; Korpelainen, Eija I

2011-10-14

The growth of high-throughput technologies such as microarrays and next generation sequencing has been accompanied by active research in data analysis methodology, producing new analysis methods at a rapid pace. While most of the newly developed methods are freely available, their use requires substantial computational skills. In order to enable non-programming biologists to benefit from the method development in a timely manner, we have created the Chipster software. Chipster (http://chipster.csc.fi/) brings a powerful collection of data analysis methods within the reach of bioscientists via its intuitive graphical user interface. Users can analyze and integrate different data types such as gene expression, miRNA and aCGH. The analysis functionality is complemented with rich interactive visualizations, allowing users to select datapoints and create new gene lists based on these selections. Importantly, users can save the performed analysis steps as reusable, automatic workflows, which can also be shared with other users. Being a versatile and easily extendable platform, Chipster can be used for microarray, proteomics and sequencing data. In this article we describe its comprehensive collection of analysis and visualization tools for microarray data using three case studies. Chipster is a user-friendly analysis software for high-throughput data. Its intuitive graphical user interface enables biologists to access a powerful collection of data analysis and integration tools, and to visualize data interactively. Users can collaborate by sharing analysis sessions and workflows. Chipster is open source, and the server installation package is freely available.
Chipster: user-friendly analysis software for microarray and other high-throughput data

PubMed Central

2011-01-01

Background The growth of high-throughput technologies such as microarrays and next generation sequencing has been accompanied by active research in data analysis methodology, producing new analysis methods at a rapid pace. While most of the newly developed methods are freely available, their use requires substantial computational skills. In order to enable non-programming biologists to benefit from the method development in a timely manner, we have created the Chipster software. Results Chipster (http://chipster.csc.fi/) brings a powerful collection of data analysis methods within the reach of bioscientists via its intuitive graphical user interface. Users can analyze and integrate different data types such as gene expression, miRNA and aCGH. The analysis functionality is complemented with rich interactive visualizations, allowing users to select datapoints and create new gene lists based on these selections. Importantly, users can save the performed analysis steps as reusable, automatic workflows, which can also be shared with other users. Being a versatile and easily extendable platform, Chipster can be used for microarray, proteomics and sequencing data. In this article we describe its comprehensive collection of analysis and visualization tools for microarray data using three case studies. Conclusions Chipster is a user-friendly analysis software for high-throughput data. Its intuitive graphical user interface enables biologists to access a powerful collection of data analysis and integration tools, and to visualize data interactively. Users can collaborate by sharing analysis sessions and workflows. Chipster is open source, and the server installation package is freely available. PMID:21999641
Parallel software for lattice N = 4 supersymmetric Yang-Mills theory

NASA Astrophysics Data System (ADS)

Schaich, David; DeGrand, Thomas

2015-05-01

We present new parallel software, SUSY LATTICE, for lattice studies of four-dimensional N = 4 supersymmetric Yang-Mills theory with gauge group SU(N). The lattice action is constructed to exactly preserve a single supersymmetry charge at non-zero lattice spacing, up to additional potential terms included to stabilize numerical simulations. The software evolved from the MILC code for lattice QCD, and retains a similar large-scale framework despite the different target theory. Many routines are adapted from an existing serial code (Catterall and Joseph, 2012), which SUSY LATTICE supersedes. This paper provides an overview of the new parallel software, summarizing the lattice system, describing the applications that are currently provided and explaining their basic workflow for non-experts in lattice gauge theory. We discuss the parallel performance of the code, and highlight some notable aspects of the documentation for those interested in contributing to its future development.
XMM-Newton Remote Interface to Science Analysis Software: First Public Version

NASA Astrophysics Data System (ADS)

Ibarra, A.; Gabriel, C.

2011-07-01

We present the first public beta release of the XMM-Newton Remote Interface to Science Analysis (RISA) software, available through the official XMM-Newton web pages. In a nutshell, RISA is a web based application that encapsulates the XMM-Newton data analysis software. The client identifies observations and creates XMM-Newton workflows. The server processes the client request, creates job templates and sends the jobs to a computer. RISA has been designed to help, at the same time, non-expert and professional XMM-Newton users. Thanks to the predefined threads, non-expert users can easily produce light curves and spectra. And on the other hand, expert user can use the full parameter interface to tune their own analysis. In both cases, the VO compliant client/server design frees the users from having to install any specific software to analyze XMM-Newton data.
The Synthetic Biology Open Language (SBOL) provides a community standard for communicating designs in synthetic biology.

PubMed

Galdzicki, Michal; Clancy, Kevin P; Oberortner, Ernst; Pocock, Matthew; Quinn, Jacqueline Y; Rodriguez, Cesar A; Roehner, Nicholas; Wilson, Mandy L; Adam, Laura; Anderson, J Christopher; Bartley, Bryan A; Beal, Jacob; Chandran, Deepak; Chen, Joanna; Densmore, Douglas; Endy, Drew; Grünberg, Raik; Hallinan, Jennifer; Hillson, Nathan J; Johnson, Jeffrey D; Kuchinsky, Allan; Lux, Matthew; Misirli, Goksel; Peccoud, Jean; Plahar, Hector A; Sirin, Evren; Stan, Guy-Bart; Villalobos, Alan; Wipat, Anil; Gennari, John H; Myers, Chris J; Sauro, Herbert M

2014-06-01

The re-use of previously validated designs is critical to the evolution of synthetic biology from a research discipline to an engineering practice. Here we describe the Synthetic Biology Open Language (SBOL), a proposed data standard for exchanging designs within the synthetic biology community. SBOL represents synthetic biology designs in a community-driven, formalized format for exchange between software tools, research groups and commercial service providers. The SBOL Developers Group has implemented SBOL as an XML/RDF serialization and provides software libraries and specification documentation to help developers implement SBOL in their own software. We describe early successes, including a demonstration of the utility of SBOL for information exchange between several different software tools and repositories from both academic and industrial partners. As a community-driven standard, SBOL will be updated as synthetic biology evolves to provide specific capabilities for different aspects of the synthetic biology workflow.
VRE4EIC: A Reference Architecture and Components for Research Access

NASA Astrophysics Data System (ADS)

Bailo, Daniele; Jeffery, Keith G.; Atakan, Kuvvet; Harrison, Matt

2017-04-01

VRE4EIC (www. Vre4eic.eu) is a EC H2020 project with the objective of providing a reference architecture and components for a VRE (Virtual Research Environment). SGs (Science gateways) in North America and VLs (Virtual Laboratories) in Australasia are similar - but significantly different - concepts. A VRE provides not only access to ICT services, data, software components and equipment but also provides a collaborative working environment for cooperation and supports the research lifecycle from idea to publication. Europe has a large number of RIs (Research infrastructures); the major ones are coordinated and planned through the ESFRI (European Strategy Forum on Research Infrastructures) roadmap. Most RIs - such as EPOS - provide a user interface portal function, ranging from (1) a simple list of assets (such as services, datasets, software components, workflows, equipment, experts.. although many provide only information about data) with URLs upon which the user can click to download; (2) to an end-user facility for constructing queries to find relevant assets and subsets of them more-or-less integrated as a downloaded combined dataset; (3) in a few cases - for constructing workflows to achieve the scientific objective. The portal has the scope of the individual RI. The aim of VRE4EIC is to provide a reference architecture, software components and a prototype implementation VRE which allows user access and all the portal functions (and more) not only to an individual RI - such as EPOS - but across RIs thus encouraging multidisciplinary research. Two RIs: EPOS and ENVRIplus (itself spanning 21 RIs) are represented within the project as requirements stakeholders , validators of the architecture and evaluators of the prototype system developed. The characterisation of many more RIs - and their requirements - has been done to ensure wide applicability. The virtualisation across RIs is achieved by using a rich metadata catalog based on CERIF (Common European Research Information Format: a EU Recommendation to Member States and supported, developed and promoted by euroCRIS www.eurocris.org ). The VRE4EIC catalog system harvests from individual RI catalogs (with conversion since they use many different metadata formats) to give the user of VRE4IC a 'canonical view' over the RIs and their assets. The VRE4IC user interface provides portal functions for each and all RIs but also a workflow construction facility. The project expects the RIs to use middleware developed in other projects to facilitate workflow deployment across the eIs (e-Infrastructures) such as GEANT, EUDAT, EGI, OpenAIRE and will itself use the same mechanisms. After 15 months of the project we have validated the requirements from the RIs, defined the architecture and started work on the metadata mapping and conversion. The intention is to have the prototype at M24 for evaluation by the RI partners (and some external Ris) leading to a refined architecture and software stack for production use after M36.
Lessons in modern digital field geology: Open source software, 3D techniques, and the new world of digital mapping

NASA Astrophysics Data System (ADS)

Pavlis, Terry; Hurtado, Jose; Langford, Richard; Serpa, Laura

2014-05-01

Although many geologists refuse to admit it, it is time to put paper-based geologic mapping into the historical archives and move to the full potential of digital mapping techniques. For our group, flat map digital geologic mapping is now a routine operation in both research and instruction. Several software options are available, and basic proficiency with the software can be learned in a few hours of instruction and practice. The first practical field GIS software, ArcPad, remains a viable, stable option on Windows-based systems. However, the vendor seems to be moving away from ArcPad in favor of mobile software solutions that are difficult to implement without GIS specialists. Thus, we have pursued a second software option based on the open source program QGIS. Our QGIS system uses the same shapefile-centric data structure as our ArcPad system, including similar pop-up data entry forms and generic graphics for easy data management in the field. The advantage of QGIS is that the same software runs on virtually all common platforms except iOS, although the Android version remains unstable as of this writing. A third software option we are experimenting with for flat map-based field work is Fieldmove, a derivative of the 3D-capable program Move developed by Midland Valley. Our initial experiments with Fieldmove are positive, particularly with the new, inexpensive (<300Euros) Windows tablets. However, the lack of flexibility in data structure makes for cumbersome workflows when trying to interface our existing shapefile-centric data structures to Move. Nonetheless, in spring 2014 we will experiment with full-3D immersion in the field using the full Move software package in combination with ground based LiDAR and photogrammetry. One new workflow suggested by our initial experiments is that field geologists should consider using photogrammetry software to capture 3D visualizations of key outcrops. This process is now straightforward in several software packages, and it affords a previously unheard of potential for communicating the complexity of key exposures. For example, in studies of metamorphic structures we often search for days to find "Rosetta Stone" outcrops that display key geometric relationships. While conventional photographs rarely can capture the essence of the field exposure, capturing a true 3D representation of the exposure with multiple photos from many orientations can solve this communication problem. As spatial databases evolve these 3D models should be readily importable into the database.
Cognitive Learning, Monitoring and Assistance of Industrial Workflows Using Egocentric Sensor Networks

PubMed Central

Bleser, Gabriele; Damen, Dima; Behera, Ardhendu; Hendeby, Gustaf; Mura, Katharina; Miezal, Markus; Gee, Andrew; Petersen, Nils; Maçães, Gustavo; Domingues, Hugo; Gorecky, Dominic; Almeida, Luis; Mayol-Cuevas, Walterio; Calway, Andrew; Cohn, Anthony G.; Hogg, David C.; Stricker, Didier

2015-01-01

Today, the workflows that are involved in industrial assembly and production activities are becoming increasingly complex. To efficiently and safely perform these workflows is demanding on the workers, in particular when it comes to infrequent or repetitive tasks. This burden on the workers can be eased by introducing smart assistance systems. This article presents a scalable concept and an integrated system demonstrator designed for this purpose. The basic idea is to learn workflows from observing multiple expert operators and then transfer the learnt workflow models to novice users. Being entirely learning-based, the proposed system can be applied to various tasks and domains. The above idea has been realized in a prototype, which combines components pushing the state of the art of hardware and software designed with interoperability in mind. The emphasis of this article is on the algorithms developed for the prototype: 1) fusion of inertial and visual sensor information from an on-body sensor network (BSN) to robustly track the user’s pose in magnetically polluted environments; 2) learning-based computer vision algorithms to map the workspace, localize the sensor with respect to the workspace and capture objects, even as they are carried; 3) domain-independent and robust workflow recovery and monitoring algorithms based on spatiotemporal pairwise relations deduced from object and user movement with respect to the scene; and 4) context-sensitive augmented reality (AR) user feedback using a head-mounted display (HMD). A distinguishing key feature of the developed algorithms is that they all operate solely on data from the on-body sensor network and that no external instrumentation is needed. The feasibility of the chosen approach for the complete action-perception-feedback loop is demonstrated on three increasingly complex datasets representing manual industrial tasks. These limited size datasets indicate and highlight the potential of the chosen technology as a combined entity as well as point out limitations of the system. PMID:26126116
GeneSeqToFamily: a Galaxy workflow to find gene families based on the Ensembl Compara GeneTrees pipeline.

PubMed

Thanki, Anil S; Soranzo, Nicola; Haerty, Wilfried; Davey, Robert P

2018-03-01

Gene duplication is a major factor contributing to evolutionary novelty, and the contraction or expansion of gene families has often been associated with morphological, physiological, and environmental adaptations. The study of homologous genes helps us to understand the evolution of gene families. It plays a vital role in finding ancestral gene duplication events as well as identifying genes that have diverged from a common ancestor under positive selection. There are various tools available, such as MSOAR, OrthoMCL, and HomoloGene, to identify gene families and visualize syntenic information between species, providing an overview of syntenic regions evolution at the family level. Unfortunately, none of them provide information about structural changes within genes, such as the conservation of ancestral exon boundaries among multiple genomes. The Ensembl GeneTrees computational pipeline generates gene trees based on coding sequences, provides details about exon conservation, and is used in the Ensembl Compara project to discover gene families. A certain amount of expertise is required to configure and run the Ensembl Compara GeneTrees pipeline via command line. Therefore, we converted this pipeline into a Galaxy workflow, called GeneSeqToFamily, and provided additional functionality. This workflow uses existing tools from the Galaxy ToolShed, as well as providing additional wrappers and tools that are required to run the workflow. GeneSeqToFamily represents the Ensembl GeneTrees pipeline as a set of interconnected Galaxy tools, so they can be run interactively within the Galaxy's user-friendly workflow environment while still providing the flexibility to tailor the analysis by changing configurations and tools if necessary. Additional tools allow users to subsequently visualize the gene families produced by the workflow, using the Aequatus.js interactive tool, which has been developed as part of the Aequatus software project.

Pathologists' Computer-Assisted Diagnosis: A Mock-up of a Prototype Information System to Facilitate Automation of Pathology Sign-out.

PubMed

Farahani, Navid; Liu, Zheng; Jutt, Dylan; Fine, Jeffrey L

2017-10-01

- Pathologists' computer-assisted diagnosis (pCAD) is a proposed framework for alleviating challenges through the automation of their routine sign-out work. Currently, hypothetical pCAD is based on a triad of advanced image analysis, deep integration with heterogeneous information systems, and a concrete understanding of traditional pathology workflow. Prototyping is an established method for designing complex new computer systems such as pCAD. - To describe, in detail, a prototype of pCAD for the sign-out of a breast cancer specimen. - Deidentified glass slides and data from breast cancer specimens were used. Slides were digitized into whole-slide images with an Aperio ScanScope XT, and screen captures were created by using vendor-provided software. The advanced workflow prototype was constructed by using PowerPoint software. - We modeled an interactive, computer-assisted workflow: pCAD previews whole-slide images in the context of integrated, disparate data and predefined diagnostic tasks and subtasks. Relevant regions of interest (ROIs) would be automatically identified and triaged by the computer. A pathologist's sign-out work would consist of an interactive review of important ROIs, driven by required diagnostic tasks. The interactive session would generate a pathology report automatically. - Using animations and real ROIs, the pCAD prototype demonstrates the hypothetical sign-out in a stepwise fashion, illustrating various interactions and explaining how steps can be automated. The file is publicly available and should be widely compatible. This mock-up is intended to spur discussion and to help usher in the next era of digitization for pathologists by providing desperately needed and long-awaited automation.
Histostitcher™: An informatics software platform for reconstructing whole-mount prostate histology using the extensible imaging platform framework

PubMed Central

Toth, Robert J.; Shih, Natalie; Tomaszewski, John E.; Feldman, Michael D.; Kutter, Oliver; Yu, Daphne N.; Paulus, John C.; Paladini, Ginaluca; Madabhushi, Anant

2014-01-01

Context: Co-registration of ex-vivo histologic images with pre-operative imaging (e.g., magnetic resonance imaging [MRI]) can be used to align and map disease extent, and to identify quantitative imaging signatures. However, ex-vivo histology images are frequently sectioned into quarters prior to imaging. Aims: This work presents Histostitcher™, a software system designed to create a pseudo whole mount histology section (WMHS) from a stitching of four individual histology quadrant images. Materials and Methods: Histostitcher™ uses user-identified fiducials on the boundary of two quadrants to stitch such quadrants. An original prototype of Histostitcher™ was designed using the Matlab programming languages. However, clinical use was limited due to slow performance, computer memory constraints and an inefficient workflow. The latest version was created using the extensible imaging platform (XIP™) architecture in the C++ programming language. A fast, graphics processor unit renderer was designed to intelligently cache the visible parts of the histology quadrants and the workflow was significantly improved to allow modifying existing fiducials, fast transformations of the quadrants and saving/loading sessions. Results: The new stitching platform yielded significantly more efficient workflow and reconstruction than the previous prototype. It was tested on a traditional desktop computer, a Windows 8 Surface Pro table device and a 27 inch multi-touch display, with little performance difference between the different devices. Conclusions: Histostitcher™ is a fast, efficient framework for reconstructing pseudo WMHS from individually imaged quadrants. The highly modular XIP™ framework was used to develop an intuitive interface and future work will entail mapping the disease extent from the pseudo WMHS onto pre-operative MRI. PMID:24843820
Improving transcriptome de novo assembly by using a reference genome of a related species: Translational genomics from oil palm to coconut.

PubMed

Armero, Alix; Baudouin, Luc; Bocs, Stéphanie; This, Dominique

2017-01-01

The palms are a family of tropical origin and one of the main constituents of the ecosystems of these regions around the world. The two main species of palm represent different challenges: coconut (Cocos nucifera L.) is a source of multiple goods and services in tropical communities, while oil palm (Elaeis guineensis Jacq) is the main protagonist of the oil market. In this study, we present a workflow that exploits the comparative genomics between a target species (coconut) and a reference species (oil palm) to improve the transcriptomic data, providing a proteome useful to answer functional or evolutionary questions. This workflow reduces redundancy and fragmentation, two inherent problems of transcriptomic data, while preserving the functional representation of the target species. Our approach was validated in Arabidopsis thaliana using Arabidopsis lyrata and Capsella rubella as references species. This analysis showed the high sensitivity and specificity of our strategy, relatively independent of the reference proteome. The workflow increased the length of proteins products in A. thaliana by 13%, allowing, often, to recover 100% of the protein sequence length. In addition redundancy was reduced by a factor greater than 3. In coconut, the approach generated 29,366 proteins, 1,246 of these proteins deriving from new contigs obtained with the BRANCH software. The coconut proteome presented a functional profile similar to that observed in rice and an important number of metabolic pathways related to secondary metabolism. The new sequences found with BRANCH software were enriched in functions related to biotic stress. Our strategy can be used as a complementary step to de novo transcriptome assembly to get a representative proteome of a target species. The results of the current analysis are available on the website PalmComparomics (http://palm-comparomics.southgreen.fr/).
Automated Data Submission for the Data Center

NASA Astrophysics Data System (ADS)

Wright, D.; Beaty, T.; Wei, Y.; Shanafield, H.; Santhana Vannan, S. K.

2014-12-01

Data centers struggle with difficulties related to data submission. Data are acquired through many avenues by many people. Many data submission activities involve intensive manual processes. During the submission process, data end up on varied storage devices. The situation can easily become chaotic. Collecting information on the status of pending data sets is arduous. For data providers, the submission process can be inconsistent and confusing. Scientists generally provide data from previous projects, and archival can be a low priority. Incomplete or poor documentation accompanies many data sets. However, complicated questionnaires deter busy data providers. At the ORNL DAAC, we have semi-automated the data set submission process to create a uniform data product and provide a consistent data provider experience. The formalized workflow makes archival faster for the data center and data set submission easier for data providers. Software modules create a flexible, reusable submission package. Formalized data set submission provides several benefits to the data center. A single data upload area provides one point of entry and ensures data are stored in a consistent location. A central dashboard records pending data set submissions in a single table and simplifies reporting. Flexible role management allows team members to readily coordinate and increases efficiency. Data products and metadata become uniform and easily maintained. As data and metadata standards change, modules can be modified or re-written without affecting workflow. While each data center has unique challenges, the data ingestion process is generally the same: get data from the provider, scientist, or project and capture metadata pertinent to that data. The ORNL DAAC data set submission workflow and software modules can be reused entirely or in part by other data centers looking for a data set submission solution. These data set submission modules will be available on NASA's Earthdata Code Collaborative and by request.
Forecasting production in Liquid Rich Shale plays

NASA Astrophysics Data System (ADS)

Nikfarman, Hanieh

Production from Liquid Rich Shale (LRS) reservoirs is taking center stage in the exploration and production of unconventional reservoirs. Production from the low and ultra-low permeability LRS plays is possible only through multi-fractured horizontal wells (MFHW's). There is no existing workflow that is applicable to forecasting multi-phase production from MFHW's in LRS plays. This project presents a practical and rigorous workflow for forecasting multiphase production from MFHW's in LRS reservoirs. There has been much effort in developing workflows and methodology for forecasting in tight/shale plays in recent years. The existing workflows, however, are applicable only to single phase flow, and are primarily used in shale gas plays. These methodologies do not apply to the multi-phase flow that is inevitable in LRS plays. To account for complexities of multiphase flow in MFHW's the only available technique is dynamic modeling in compositional numerical simulators. These are time consuming and not practical when it comes to forecasting production and estimating reserves for a large number of producers. A workflow was developed, and validated by compositional numerical simulation. The workflow honors physics of flow, and is sufficiently accurate while practical so that an analyst can readily apply it to forecast production and estimate reserves in a large number of producers in a short period of time. To simplify the complex multiphase flow in MFHW, the workflow divides production periods into an initial period where large production and pressure declines are expected, and the subsequent period where production decline may converge into a common trend for a number of producers across an area of interest in the field. Initial period assumes the production is dominated by single-phase flow of oil and uses the tri-linear flow model of Erdal Ozkan to estimate the production history. Commercial software readily available can simulate flow and forecast production in this period. In the subsequent Period, dimensionless rate and dimensionless time functions are introduced that help identify transition from initial period into subsequent period. The production trends in terms of the dimensionless parameters converge for a range of rock permeability and stimulation intensity. This helps forecast production beyond transition to the end of life of well. This workflow is applicable to single fluid system.
A big data approach for climate change indicators processing in the CLIP-C project

NASA Astrophysics Data System (ADS)

D'Anca, Alessandro; Conte, Laura; Palazzo, Cosimo; Fiore, Sandro; Aloisio, Giovanni

2016-04-01

Defining and implementing processing chains with multiple (e.g. tens or hundreds of) data analytics operators can be a real challenge in many practical scientific use cases such as climate change indicators. This is usually done via scripts (e.g. bash) on the client side and requires climate scientists to take care of, implement and replicate workflow-like control logic aspects (which may be error-prone too) in their scripts, along with the expected application-level part. Moreover, the big amount of data and the strong I/O demand pose additional challenges related to the performance. In this regard, production-level tools for climate data analysis are mostly sequential and there is a lack of big data analytics solutions implementing fine-grain data parallelism or adopting stronger parallel I/O strategies, data locality, workflow optimization, etc. High-level solutions leveraging on workflow-enabled big data analytics frameworks for eScience could help scientists in defining and implementing the workflows related to their experiments by exploiting a more declarative, efficient and powerful approach. This talk will start introducing the main needs and challenges regarding big data analytics workflow management for eScience and will then provide some insights about the implementation of some real use cases related to some climate change indicators on large datasets produced in the context of the CLIP-C project - a EU FP7 project aiming at providing access to climate information of direct relevance to a wide variety of users, from scientists to policy makers and private sector decision makers. All the proposed use cases have been implemented exploiting the Ophidia big data analytics framework. The software stack includes an internal workflow management system, which coordinates, orchestrates, and optimises the execution of multiple scientific data analytics and visualization tasks. Real-time workflow monitoring execution is also supported through a graphical user interface. In order to address the challenges of the use cases, the implemented data analytics workflows include parallel data analysis, metadata management, virtual file system tasks, maps generation, rolling of datasets, and import/export of datasets in NetCDF format. The use cases have been implemented on a HPC cluster of 8-nodes (16-cores/node) of the Athena Cluster available at the CMCC Supercomputing Centre. Benchmark results will be also presented during the talk.
Improving Clinical Workflow in Ambulatory Care: Implemented Recommendations in an Innovation Prototype for the Veteran’s Health Administration

PubMed Central

Patterson, Emily S.; Lowry, Svetlana Z.; Ramaiah, Mala; Gibbons, Michael C.; Brick, David; Calco, Robert; Matton, Greg; Miller, Anne; Makar, Ellen; Ferrer, Jorge A.

2015-01-01

Introduction: Human factors workflow analyses in healthcare settings prior to technology implemented are recommended to improve workflow in ambulatory care settings. In this paper we describe how insights from a workflow analysis conducted by NIST were implemented in a software prototype developed for a Veteran’s Health Administration (VHA) VAi2 innovation project and associated lessons learned. Methods: We organize the original recommendations and associated stages and steps visualized in process maps from NIST and the VA’s lessons learned from implementing the recommendations in the VAi2 prototype according to four stages: 1) before the patient visit, 2) during the visit, 3) discharge, and 4) visit documentation. NIST recommendations to improve workflow in ambulatory care (outpatient) settings and process map representations were based on reflective statements collected during one-hour discussions with three physicians. The development of the VAi2 prototype was conducted initially independently from the NIST recommendations, but at a midpoint in the process development, all of the implementation elements were compared with the NIST recommendations and lessons learned were documented. Findings: Story-based displays and templates with default preliminary order sets were used to support scheduling, time-critical notifications, drafting medication orders, and supporting a diagnosis-based workflow. These templates enabled customization to the level of diagnostic uncertainty. Functionality was designed to support cooperative work across interdisciplinary team members, including shared documentation sessions with tracking of text modifications, medication lists, and patient education features. Displays were customized to the role and included access for consultants and site-defined educator teams. Discussion: Workflow, usability, and patient safety can be enhanced through clinician-centered design of electronic health records. The lessons learned from implementing NIST recommendations to improve workflow in ambulatory care using an EHR provide a first step in moving from a billing-centered perspective on how to maintain accurate, comprehensive, and up-to-date information about a group of patients to a clinician-centered perspective. These recommendations point the way towards a “patient visit management system,” which incorporates broader notions of supporting workload management, supporting flexible flow of patients and tasks, enabling accountable distributed work across members of the clinical team, and supporting dynamic tracking of steps in tasks that have longer time distributions. PMID:26290887
Use of contextual inquiry to understand anatomic pathology workflow: Implications for digital pathology adoption

PubMed Central

Ho, Jonhan; Aridor, Orly; Parwani, Anil V.

2012-01-01

Background: For decades anatomic pathology (AP) workflow have been a highly manual process based on the use of an optical microscope and glass slides. Recent innovations in scanning and digitizing of entire glass slides are accelerating a move toward widespread adoption and implementation of a workflow based on digital slides and their supporting information management software. To support the design of digital pathology systems and ensure their adoption into pathology practice, the needs of the main users within the AP workflow, the pathologists, should be identified. Contextual inquiry is a qualitative, user-centered, social method designed to identify and understand users’ needs and is utilized for collecting, interpreting, and aggregating in-detail aspects of work. Objective: Contextual inquiry was utilized to document current AP workflow, identify processes that may benefit from the introduction of digital pathology systems, and establish design requirements for digital pathology systems that will meet pathologists’ needs. Materials and Methods: Pathologists were observed and interviewed at a large academic medical center according to contextual inquiry guidelines established by Holtzblatt et al. 1998. Notes representing user-provided data were documented during observation sessions. An affinity diagram, a hierarchal organization of the notes based on common themes in the data, was created. Five graphical models were developed to help visualize the data including sequence, flow, artifact, physical, and cultural models. Results: A total of six pathologists were observed by a team of two researchers. A total of 254 affinity notes were documented and organized using a system based on topical hierarchy, including 75 third-level, 24 second-level, and five main-level categories, including technology, communication, synthesis/preparation, organization, and workflow. Current AP workflow was labor intensive and lacked scalability. A large number of processes that may possibly improve following the introduction of digital pathology systems were identified. These work processes included case management, case examination and review, and final case reporting. Furthermore, a digital slide system should integrate with the anatomic pathologic laboratory information system. Conclusions: To our knowledge, this is the first study that utilized the contextual inquiry method to document AP workflow. Findings were used to establish key requirements for the design of digital pathology systems. PMID:23243553
Making Sense of Complexity with FRE, a Scientific Workflow System for Climate Modeling (Invited)

NASA Astrophysics Data System (ADS)

Langenhorst, A. R.; Balaji, V.; Yakovlev, A.

2010-12-01

A workflow is a description of a sequence of activities that is both precise and comprehensive. Capturing the workflow of climate experiments provides a record which can be queried or compared, and allows reproducibility of the experiments - sometimes even to the bit level of the model output. This reproducibility helps to verify the integrity of the output data, and enables easy perturbation experiments. GFDL's Flexible Modeling System Runtime Environment (FRE) is a production-level software project which defines and implements building blocks of the workflow as command line tools. The scientific, numerical and technical input needed to complete the workflow of an experiment is recorded in an experiment description file in XML format. Several key features add convenience and automation to the FRE workflow: ● Experiment inheritance makes it possible to define a new experiment with only a reference to the parent experiment and the parameters to override. ● Testing is a basic element of the FRE workflow: experiments define short test runs which are verified before the main experiment is run, and a set of standard experiments are verified with new code releases. ● FRE is flexible enough to support short runs with mere megabytes of data, to high-resolution experiments that run on thousands of processors for months, producing terabytes of output data. Experiments run in segments of model time; after each segment, the state is saved and the model can be checkpointed at that level. Segment length is defined by the user, but the number of segments per system job is calculated to fit optimally in the batch scheduler requirements. FRE provides job control across multiple segments, and tools to monitor and alter the state of long-running experiments. ● Experiments are entered into a Curator Database, which stores query-able metadata about the experiment and the experiment's output. ● FRE includes a set of standardized post-processing functions as well as the ability to incorporate user-level functions. FRE post-processing can take us all the way to the preparing of graphical output for a scientific audience, and publication of data on a public portal. ● Recent FRE development includes incorporating a distributed workflow to support remote computing.
76 FR 66350 - Eighteenth Meeting: RTCA Special Committee 205/EUROCAE WG-71: Software Considerations in...

Federal Register 2010, 2011, 2012, 2013, 2014

2011-10-26

... Committee 205/EUROCAE WG-71: Software Considerations in Aeronautical Systems AGENCY: Federal Aviation.../EUROCAE WG-71 meeting: Software Considerations in Aeronautical Systems. SUMMARY: The FAA is issuing this notice to advise the public of a meeting of RTCA Special Committee 205/EUROCAE WG-71: Software...
User Manuals for a Primary Care Electronic Medical Record System: A Mixed Methods Study of User- and Vendor-Generated Documents.

PubMed

Shachak, Aviv; Dow, Rustam; Barnsley, Jan; Tu, Karen; Domb, Sharon; Jadad, Alejandro R; Lemieux-Charles, Louise

2013-06-04

Tutorials and user manuals are important forms of impersonal support for using software applications including electronic medical records (EMRs). Differences between user- and vendor documentation may indicate support needs, which are not sufficiently addressed by the official documentation, and reveal new elements that may inform the design of tutorials and user manuals. What are the differences between user-generated tutorials and manuals for an EMR and the official user manual from the software vendor? Effective design of tutorials and user manuals requires careful packaging of information, balance between declarative and procedural texts, an action and task-oriented approach, support for error recognition and recovery, and effective use of visual elements. No previous research compared these elements between formal and informal documents. We conducted an mixed methods study. Seven tutorials and two manuals for an EMR were collected from three family health teams and compared with the official user manual from the software vendor. Documents were qualitatively analyzed using a framework analysis approach in relation to the principles of technical documentation described above. Subsets of the data were quantitatively analyzed using cross-tabulation to compare the types of error information and visual cues in screen captures between user- and vendor-generated manuals. The user-developed tutorials and manuals differed from the vendor-developed manual in that they contained mostly procedural and not declarative information; were customized to the specific workflow, user roles, and patient characteristics; contained more error information related to work processes than to software usage; and used explicit visual cues on screen captures to help users identify window elements. These findings imply that to support EMR implementation, tutorials and manuals need to be customized and adapted to specific organizational contexts and workflows. The main limitation of the study is its generalizability. Future research should address this limitation and may explore alternative approaches to software documentation, such as modular manuals or participatory design.
LTRsift: a graphical user interface for semi-automatic classification and postprocessing of de novo detected LTR retrotransposons

PubMed Central

2012-01-01

Background Long terminal repeat (LTR) retrotransposons are a class of eukaryotic mobile elements characterized by a distinctive sequence similarity-based structure. Hence they are well suited for computational identification. Current software allows for a comprehensive genome-wide de novo detection of such elements. The obvious next step is the classification of newly detected candidates resulting in (super-)families. Such a de novo classification approach based on sequence-based clustering of transposon features has been proposed before, resulting in a preliminary assignment of candidates to families as a basis for subsequent manual refinement. However, such a classification workflow is typically split across a heterogeneous set of glue scripts and generic software (for example, spreadsheets), making it tedious for a human expert to inspect, curate and export the putative families produced by the workflow. Results We have developed LTRsift, an interactive graphical software tool for semi-automatic postprocessing of de novo predicted LTR retrotransposon annotations. Its user-friendly interface offers customizable filtering and classification functionality, displaying the putative candidate groups, their members and their internal structure in a hierarchical fashion. To ease manual work, it also supports graphical user interface-driven reassignment, splitting and further annotation of candidates. Export of grouped candidate sets in standard formats is possible. In two case studies, we demonstrate how LTRsift can be employed in the context of a genome-wide LTR retrotransposon survey effort. Conclusions LTRsift is a useful and convenient tool for semi-automated classification of newly detected LTR retrotransposons based on their internal features. Its efficient implementation allows for convenient and seamless filtering and classification in an integrated environment. Developed for life scientists, it is helpful in postprocessing and refining the output of software for predicting LTR retrotransposons up to the stage of preparing full-length reference sequence libraries. The LTRsift software is freely available at http://www.zbh.uni-hamburg.de/LTRsift under an open-source license. PMID:23131050
A graphic user interface for efficient 3D photo-reconstruction based on free software

NASA Astrophysics Data System (ADS)

Castillo, Carlos; James, Michael; Gómez, Jose A.

2015-04-01

Recently, different studies have stressed the applicability of 3D photo-reconstruction based on Structure from Motion algorithms in a wide range of geoscience applications. For the purpose of image photo-reconstruction, a number of commercial and freely available software packages have been developed (e.g. Agisoft Photoscan, VisualSFM). The workflow involves typically different stages such as image matching, sparse and dense photo-reconstruction, point cloud filtering and georeferencing. For approaches using open and free software, each of these stages usually require different applications. In this communication, we present an easy-to-use graphic user interface (GUI) developed in Matlab® code as a tool for efficient 3D photo-reconstruction making use of powerful existing software: VisualSFM (Wu, 2015) for photo-reconstruction and CloudCompare (Girardeau-Montaut, 2015) for point cloud processing. The GUI performs as a manager of configurations and algorithms, taking advantage of the command line modes of existing software, which allows an intuitive and automated processing workflow for the geoscience user. The GUI includes several additional features: a) a routine for significantly reducing the duration of the image matching operation, normally the most time consuming stage; b) graphical outputs for understanding the overall performance of the algorithm (e.g. camera connectivity, point cloud density); c) a number of useful options typically performed before and after the photo-reconstruction stage (e.g. removal of blurry images, image renaming, vegetation filtering); d) a manager of batch processing for the automated reconstruction of different image datasets. In this study we explore the advantages of this new tool by testing its performance using imagery collected in several soil erosion applications. References Girardeau-Montaut, D. 2015. CloudCompare documentation accessed at http://cloudcompare.org/ Wu, C. 2015. VisualSFM documentation access at http://ccwu.me/vsfm/doc.html#.
LTRsift: a graphical user interface for semi-automatic classification and postprocessing of de novo detected LTR retrotransposons.

PubMed

Steinbiss, Sascha; Kastens, Sascha; Kurtz, Stefan

2012-11-07

Long terminal repeat (LTR) retrotransposons are a class of eukaryotic mobile elements characterized by a distinctive sequence similarity-based structure. Hence they are well suited for computational identification. Current software allows for a comprehensive genome-wide de novo detection of such elements. The obvious next step is the classification of newly detected candidates resulting in (super-)families. Such a de novo classification approach based on sequence-based clustering of transposon features has been proposed before, resulting in a preliminary assignment of candidates to families as a basis for subsequent manual refinement. However, such a classification workflow is typically split across a heterogeneous set of glue scripts and generic software (for example, spreadsheets), making it tedious for a human expert to inspect, curate and export the putative families produced by the workflow. We have developed LTRsift, an interactive graphical software tool for semi-automatic postprocessing of de novo predicted LTR retrotransposon annotations. Its user-friendly interface offers customizable filtering and classification functionality, displaying the putative candidate groups, their members and their internal structure in a hierarchical fashion. To ease manual work, it also supports graphical user interface-driven reassignment, splitting and further annotation of candidates. Export of grouped candidate sets in standard formats is possible. In two case studies, we demonstrate how LTRsift can be employed in the context of a genome-wide LTR retrotransposon survey effort. LTRsift is a useful and convenient tool for semi-automated classification of newly detected LTR retrotransposons based on their internal features. Its efficient implementation allows for convenient and seamless filtering and classification in an integrated environment. Developed for life scientists, it is helpful in postprocessing and refining the output of software for predicting LTR retrotransposons up to the stage of preparing full-length reference sequence libraries. The LTRsift software is freely available at http://www.zbh.uni-hamburg.de/LTRsift under an open-source license.
Unidata's Vision for Transforming Geoscience by Moving Data Services and Software to the Cloud

NASA Astrophysics Data System (ADS)

Ramamurthy, M. K.; Fisher, W.; Yoksas, T.

2014-12-01

Universities are facing many challenges: shrinking budgets, rapidly evolving information technologies, exploding data volumes, multidisciplinary science requirements, and high student expectations. These changes are upending traditional approaches to accessing and using data and software. It is clear that Unidata's products and services must evolve to support new approaches to research and education. After years of hype and ambiguity, cloud computing is maturing in usability in many areas of science and education, bringing the benefits of virtualized and elastic remote services to infrastructure, software, computation, and data. Cloud environments reduce the amount of time and money spent to procure, install, and maintain new hardware and software, and reduce costs through resource pooling and shared infrastructure. Cloud services aimed at providing any resource, at any time, from any place, using any device are increasingly being embraced by all types of organizations. Given this trend and the enormous potential of cloud-based services, Unidata is taking moving to augment its products, services, data delivery mechanisms and applications to align with the cloud-computing paradigm. Specifically, Unidata is working toward establishing a community-based development environment that supports the creation and use of software services to build end-to-end data workflows. The design encourages the creation of services that can be broken into small, independent chunks that provide simple capabilities. Chunks could be used individually to perform a task, or chained into simple or elaborate workflows. The services will also be portable, allowing their use in researchers' own cloud-based computing environments. In this talk, we present a vision for Unidata's future in a cloud-enabled data services and discuss our initial efforts to deploy a subset of Unidata data services and tools in the Amazon EC2 and Microsoft Azure cloud environments, including the transfer of real-time meteorological data into its cloud instances, product generation using those data, and the deployment of TDS, McIDAS ADDE and AWIPS II data servers and the Integrated Data Server visualization tool.
Using Docker Compose for the Simple Deployment of an Integrated Drug Target Screening Platform.

PubMed

List, Markus

2017-06-10

Docker virtualization allows for software tools to be executed in an isolated and controlled environment referred to as a container. In Docker containers, dependencies are provided exactly as intended by the developer and, consequently, they simplify the distribution of scientific software and foster reproducible research. The Docker paradigm is that each container encapsulates one particular software tool. However, to analyze complex biomedical data sets, it is often necessary to combine several software tools into elaborate workflows. To address this challenge, several Docker containers need to be instantiated and properly integrated, which complicates the software deployment process unnecessarily. Here, we demonstrate how an extension to Docker, Docker compose, can be used to mitigate these problems by providing a unified setup routine that deploys several tools in an integrated fashion. We demonstrate the power of this approach by example of a Docker compose setup for a drug target screening platform consisting of five integrated web applications and shared infrastructure, deployable in just two lines of codes.
An open source workflow for 3D printouts of scientific data volumes

NASA Astrophysics Data System (ADS)

Loewe, P.; Klump, J. F.; Wickert, J.; Ludwig, M.; Frigeri, A.

2013-12-01

As the amount of scientific data continues to grow, researchers need new tools to help them visualize complex data. Immersive data-visualisations are helpful, yet fail to provide tactile feedback and sensory feedback on spatial orientation, as provided from tangible objects. The gap in sensory feedback from virtual objects leads to the development of tangible representations of geospatial information to solve real world problems. Examples are animated globes [1], interactive environments like tangible GIS [2], and on demand 3D prints. The production of a tangible representation of a scientific data set is one step in a line of scientific thinking, leading from the physical world into scientific reasoning and back: The process starts with a physical observation, or from a data stream generated by an environmental sensor. This data stream is turned into a geo-referenced data set. This data is turned into a volume representation which is converted into command sequences for the printing device, leading to the creation of a 3D printout. As a last, but crucial step, this new object has to be documented and linked to the associated metadata, and curated in long term repositories to preserve its scientific meaning and context. The workflow to produce tangible 3D data-prints from science data at the German Research Centre for Geosciences (GFZ) was implemented as a software based on the Free and Open Source Geoinformatics tools GRASS GIS and Paraview. The workflow was successfully validated in various application scenarios at GFZ using a RapMan printer to create 3D specimens of elevation models, geological underground models, ice penetrating radar soundings for planetology, and space time stacks for Tsunami model quality assessment. While these first pilot applications have demonstrated the feasibility of the overall approach [3], current research focuses on the provision of the workflow as Software as a Service (SAAS), thematic generalisation of information content and long term curation. [1] http://www.arcscience.com/systemDetails/omniTechnology.html [2] http://video.esri.com/watch/53/landscape-design-with-tangible-gis [3] Löwe et al. (2013), Geophysical Research Abstracts, Vol. 15, EGU2013-1544-1.
Leveraging the Power of High Performance Computing for Next Generation Sequencing Data Analysis: Tricks and Twists from a High Throughput Exome Workflow

PubMed Central

Wonczak, Stephan; Thiele, Holger; Nieroda, Lech; Jabbari, Kamel; Borowski, Stefan; Sinha, Vishal; Gunia, Wilfried; Lang, Ulrich; Achter, Viktor; Nürnberg, Peter

2015-01-01

Next generation sequencing (NGS) has been a great success and is now a standard method of research in the life sciences. With this technology, dozens of whole genomes or hundreds of exomes can be sequenced in rather short time, producing huge amounts of data. Complex bioinformatics analyses are required to turn these data into scientific findings. In order to run these analyses fast, automated workflows implemented on high performance computers are state of the art. While providing sufficient compute power and storage to meet the NGS data challenge, high performance computing (HPC) systems require special care when utilized for high throughput processing. This is especially true if the HPC system is shared by different users. Here, stability, robustness and maintainability are as important for automated workflows as speed and throughput. To achieve all of these aims, dedicated solutions have to be developed. In this paper, we present the tricks and twists that we utilized in the implementation of our exome data processing workflow. It may serve as a guideline for other high throughput data analysis projects using a similar infrastructure. The code implementing our solutions is provided in the supporting information files. PMID:25942438
Recent developments in software tools for high-throughput in vitro ADME support with high-resolution MS.

PubMed

Paiva, Anthony; Shou, Wilson Z

2016-08-01

The last several years have seen the rapid adoption of the high-resolution MS (HRMS) for bioanalytical support of high throughput in vitro ADME profiling. Many capable software tools have been developed and refined to process quantitative HRMS bioanalysis data for ADME samples with excellent performance. Additionally, new software applications specifically designed for quan/qual soft spot identification workflows using HRMS have greatly enhanced the quality and efficiency of the structure elucidation process for high throughput metabolite ID in early in vitro ADME profiling. Finally, novel approaches in data acquisition and compression, as well as tools for transferring, archiving and retrieving HRMS data, are being continuously refined to tackle the issue of large data file size typical for HRMS analyses.
Flexible Early Warning Systems with Workflows and Decision Tables

NASA Astrophysics Data System (ADS)

Riedel, F.; Chaves, F.; Zeiner, H.

2012-04-01

An essential part of early warning systems and systems for crisis management are decision support systems that facilitate communication and collaboration. Often official policies specify how different organizations collaborate and what information is communicated to whom. For early warning systems it is crucial that information is exchanged dynamically in a timely manner and all participants get exactly the information they need to fulfil their role in the crisis management process. Information technology obviously lends itself to automate parts of the process. We have experienced however that in current operational systems the information logistics processes are hard-coded, even though they are subject to change. In addition, systems are tailored to the policies and requirements of a certain organization and changes can require major software refactoring. We seek to develop a system that can be deployed and adapted to multiple organizations with different dynamic runtime policies. A major requirement for such a system is that changes can be applied locally without affecting larger parts of the system. In addition to the flexibility regarding changes in policies and processes, the system needs to be able to evolve; when new information sources become available, it should be possible to integrate and use these in the decision process. In general, this kind of flexibility comes with a significant increase in complexity. This implies that only IT professionals can maintain a system that can be reconfigured and adapted; end-users are unable to utilise the provided flexibility. In the business world similar problems arise and previous work suggested using business process management systems (BPMS) or workflow management systems (WfMS) to guide and automate early warning processes or crisis management plans. However, the usability and flexibility of current WfMS are limited, because current notations and user interfaces are still not suitable for end-users, and workflows are usually only suited for rigid processes. We show how improvements can be achieved by using decision tables and rule-based adaptive workflows. Decision tables have been shown to be an intuitive tool that can be used by domain experts to express rule sets that can be interpreted automatically at runtime. Adaptive workflows use a rule-based approach to increase the flexibility of workflows by providing mechanisms to adapt workflows based on context changes, human intervention and availability of services. The combination of workflows, decision tables and rule-based adaption creates a framework that opens up new possibilities for flexible and adaptable workflows, especially, for use in early warning and crisis management systems.

PACS for surgery and interventional radiology: features of a Therapy Imaging and Model Management System (TIMMS).

PubMed

Lemke, Heinz U; Berliner, Leonard

2011-05-01

Appropriate use of information and communication technology (ICT) and mechatronic (MT) systems is viewed by many experts as a means to improve workflow and quality of care in the operating room (OR). This will require a suitable information technology (IT) infrastructure, as well as communication and interface standards, such as specialized extensions of DICOM, to allow data interchange between surgical system components in the OR. A design of such an infrastructure, sometimes referred to as surgical PACS, but better defined as a Therapy Imaging and Model Management System (TIMMS), will be introduced in this article. A TIMMS should support the essential functions that enable and advance image guided therapy, and in the future, a more comprehensive form of patient-model guided therapy. Within this concept, the "image-centric world view" of the classical PACS technology is complemented by an IT "model-centric world view". Such a view is founded in the special patient modelling needs of an increasing number of modern surgical interventions as compared to the imaging intensive working mode of diagnostic radiology, for which PACS was originally conceptualised and developed. The modelling aspects refer to both patient information and workflow modelling. Standards for creating and integrating information about patients, equipment, and procedures are vitally needed when planning for an efficient OR. The DICOM Working Group 24 (WG-24) has been established to develop DICOM objects and services related to image and model guided surgery. To determine these standards, it is important to define step-by-step surgical workflow practices and create interventional workflow models per procedures or per variable cases. As the boundaries between radiation therapy, surgery and interventional radiology are becoming less well-defined, precise patient models will become the greatest common denominator for all therapeutic disciplines. In addition to imaging, the focus of WG-24 is to serve the therapeutic disciplines by enabling modelling technology to be based on standards. Copyright © 2011. Published by Elsevier Ireland Ltd.
A lightweight, flow-based toolkit for parallel and distributed bioinformatics pipelines

PubMed Central

2011-01-01

Background Bioinformatic analyses typically proceed as chains of data-processing tasks. A pipeline, or 'workflow', is a well-defined protocol, with a specific structure defined by the topology of data-flow interdependencies, and a particular functionality arising from the data transformations applied at each step. In computer science, the dataflow programming (DFP) paradigm defines software systems constructed in this manner, as networks of message-passing components. Thus, bioinformatic workflows can be naturally mapped onto DFP concepts. Results To enable the flexible creation and execution of bioinformatics dataflows, we have written a modular framework for parallel pipelines in Python ('PaPy'). A PaPy workflow is created from re-usable components connected by data-pipes into a directed acyclic graph, which together define nested higher-order map functions. The successive functional transformations of input data are evaluated on flexibly pooled compute resources, either local or remote. Input items are processed in batches of adjustable size, all flowing one to tune the trade-off between parallelism and lazy-evaluation (memory consumption). An add-on module ('NuBio') facilitates the creation of bioinformatics workflows by providing domain specific data-containers (e.g., for biomolecular sequences, alignments, structures) and functionality (e.g., to parse/write standard file formats). Conclusions PaPy offers a modular framework for the creation and deployment of parallel and distributed data-processing workflows. Pipelines derive their functionality from user-written, data-coupled components, so PaPy also can be viewed as a lightweight toolkit for extensible, flow-based bioinformatics data-processing. The simplicity and flexibility of distributed PaPy pipelines may help users bridge the gap between traditional desktop/workstation and grid computing. PaPy is freely distributed as open-source Python code at http://muralab.org/PaPy, and includes extensive documentation and annotated usage examples. PMID:21352538
Next-Generation Climate Modeling Science Challenges for Simulation, Workflow and Analysis Systems

NASA Astrophysics Data System (ADS)

Koch, D. M.; Anantharaj, V. G.; Bader, D. C.; Krishnan, H.; Leung, L. R.; Ringler, T.; Taylor, M.; Wehner, M. F.; Williams, D. N.

2016-12-01

We will present two examples of current and future high-resolution climate-modeling research that are challenging existing simulation run-time I/O, model-data movement, storage and publishing, and analysis. In each case, we will consider lessons learned as current workflow systems are broken by these large-data science challenges, as well as strategies to repair or rebuild the systems. First we consider the science and workflow challenges to be posed by the CMIP6 multi-model HighResMIP, involving around a dozen modeling groups performing quarter-degree simulations, in 3-member ensembles for 100 years, with high-frequency (1-6 hourly) diagnostics, which is expected to generate over 4PB of data. An example of science derived from these experiments will be to study how resolution affects the ability of models to capture extreme-events such as hurricanes or atmospheric rivers. Expected methods to transfer (using parallel Globus) and analyze (using parallel "TECA" software tools) HighResMIP data for such feature-tracking by the DOE CASCADE project will be presented. A second example will be from the Accelerated Climate Modeling for Energy (ACME) project, which is currently addressing challenges involving multiple century-scale coupled high resolution (quarter-degree) climate simulations on DOE Leadership Class computers. ACME is anticipating production of over 5PB of data during the next 2 years of simulations, in order to investigate the drivers of water cycle changes, sea-level-rise, and carbon cycle evolution. The ACME workflow, from simulation to data transfer, storage, analysis and publication will be presented. Current and planned methods to accelerate the workflow, including implementing run-time diagnostics, and implementing server-side analysis to avoid moving large datasets will be presented.
A lightweight, flow-based toolkit for parallel and distributed bioinformatics pipelines.

PubMed

Cieślik, Marcin; Mura, Cameron

2011-02-25

Bioinformatic analyses typically proceed as chains of data-processing tasks. A pipeline, or 'workflow', is a well-defined protocol, with a specific structure defined by the topology of data-flow interdependencies, and a particular functionality arising from the data transformations applied at each step. In computer science, the dataflow programming (DFP) paradigm defines software systems constructed in this manner, as networks of message-passing components. Thus, bioinformatic workflows can be naturally mapped onto DFP concepts. To enable the flexible creation and execution of bioinformatics dataflows, we have written a modular framework for parallel pipelines in Python ('PaPy'). A PaPy workflow is created from re-usable components connected by data-pipes into a directed acyclic graph, which together define nested higher-order map functions. The successive functional transformations of input data are evaluated on flexibly pooled compute resources, either local or remote. Input items are processed in batches of adjustable size, all flowing one to tune the trade-off between parallelism and lazy-evaluation (memory consumption). An add-on module ('NuBio') facilitates the creation of bioinformatics workflows by providing domain specific data-containers (e.g., for biomolecular sequences, alignments, structures) and functionality (e.g., to parse/write standard file formats). PaPy offers a modular framework for the creation and deployment of parallel and distributed data-processing workflows. Pipelines derive their functionality from user-written, data-coupled components, so PaPy also can be viewed as a lightweight toolkit for extensible, flow-based bioinformatics data-processing. The simplicity and flexibility of distributed PaPy pipelines may help users bridge the gap between traditional desktop/workstation and grid computing. PaPy is freely distributed as open-source Python code at http://muralab.org/PaPy, and includes extensive documentation and annotated usage examples.
DOE Office of Scientific and Technical Information (OSTI.GOV)

McCaskey, Alexander J.

There is a lack of state-of-the-art HPC simulation tools for simulating general quantum computing. Furthermore, there are no real software tools that integrate current quantum computers into existing classical HPC workflows. This product, the Quantum Virtual Machine (QVM), solves this problem by providing an extensible framework for pluggable virtual, or physical, quantum processing units (QPUs). It enables the execution of low level quantum assembly codes and returns the results of such executions.
Integrated Sensitivity Analysis Workflow

DOE Office of Scientific and Technical Information (OSTI.GOV)

Friedman-Hill, Ernest J.; Hoffman, Edward L.; Gibson, Marcus J.

2014-08-01

Sensitivity analysis is a crucial element of rigorous engineering analysis, but performing such an analysis on a complex model is difficult and time consuming. The mission of the DART Workbench team at Sandia National Laboratories is to lower the barriers to adoption of advanced analysis tools through software integration. The integrated environment guides the engineer in the use of these integrated tools and greatly reduces the cycle time for engineering analysis.
Globus | Informatics Technology for Cancer Research (ITCR)

Cancer.gov

Globus software services provide secure cancer research data transfer, synchronization, and sharing in distributed environments at large scale. These services can be integrated into applications and research data gateways, leveraging Globus identity management, single sign-on, search, and authorization capabilities. Globus Genomics integrates Globus with the Galaxy genomics workflow engine and Amazon Web Services to enable cancer genomics analysis that can elastically scale compute resources with demand.
Technical Note: scuda: A software platform for cumulative dose assessment.

PubMed

Park, Seyoun; McNutt, Todd; Plishker, William; Quon, Harry; Wong, John; Shekhar, Raj; Lee, Junghoon

2016-10-01

Accurate tracking of anatomical changes and computation of actually delivered dose to the patient are critical for successful adaptive radiation therapy (ART). Additionally, efficient data management and fast processing are practically important for the adoption in clinic as ART involves a large amount of image and treatment data. The purpose of this study was to develop an accurate and efficient Software platform for CUmulative Dose Assessment (scuda) that can be seamlessly integrated into the clinical workflow. scuda consists of deformable image registration (DIR), segmentation, dose computation modules, and a graphical user interface. It is connected to our image PACS and radiotherapy informatics databases from which it automatically queries/retrieves patient images, radiotherapy plan, beam data, and daily treatment information, thus providing an efficient and unified workflow. For accurate registration of the planning CT and daily CBCTs, the authors iteratively correct CBCT intensities by matching local intensity histograms during the DIR process. Contours of the target tumor and critical structures are then propagated from the planning CT to daily CBCTs using the computed deformations. The actual delivered daily dose is computed using the registered CT and patient setup information by a superposition/convolution algorithm, and accumulated using the computed deformation fields. Both DIR and dose computation modules are accelerated by a graphics processing unit. The cumulative dose computation process has been validated on 30 head and neck (HN) cancer cases, showing 3.5 ± 5.0 Gy (mean±STD) absolute mean dose differences between the planned and the actually delivered doses in the parotid glands. On average, DIR, dose computation, and segmentation take 20 s/fraction and 17 min for a 35-fraction treatment including additional computation for dose accumulation. The authors developed a unified software platform that provides accurate and efficient monitoring of anatomical changes and computation of actually delivered dose to the patient, thus realizing an efficient cumulative dose computation workflow. Evaluation on HN cases demonstrated the utility of our platform for monitoring the treatment quality and detecting significant dosimetric variations that are keys to successful ART.
Technical Note: SCUDA: A software platform for cumulative dose assessment

DOE Office of Scientific and Technical Information (OSTI.GOV)

Park, Seyoun; McNutt, Todd; Quon, Harry

Purpose: Accurate tracking of anatomical changes and computation of actually delivered dose to the patient are critical for successful adaptive radiation therapy (ART). Additionally, efficient data management and fast processing are practically important for the adoption in clinic as ART involves a large amount of image and treatment data. The purpose of this study was to develop an accurate and efficient Software platform for CUmulative Dose Assessment (SCUDA) that can be seamlessly integrated into the clinical workflow. Methods: SCUDA consists of deformable image registration (DIR), segmentation, dose computation modules, and a graphical user interface. It is connected to our imagemore » PACS and radiotherapy informatics databases from which it automatically queries/retrieves patient images, radiotherapy plan, beam data, and daily treatment information, thus providing an efficient and unified workflow. For accurate registration of the planning CT and daily CBCTs, the authors iteratively correct CBCT intensities by matching local intensity histograms during the DIR process. Contours of the target tumor and critical structures are then propagated from the planning CT to daily CBCTs using the computed deformations. The actual delivered daily dose is computed using the registered CT and patient setup information by a superposition/convolution algorithm, and accumulated using the computed deformation fields. Both DIR and dose computation modules are accelerated by a graphics processing unit. Results: The cumulative dose computation process has been validated on 30 head and neck (HN) cancer cases, showing 3.5 ± 5.0 Gy (mean±STD) absolute mean dose differences between the planned and the actually delivered doses in the parotid glands. On average, DIR, dose computation, and segmentation take 20 s/fraction and 17 min for a 35-fraction treatment including additional computation for dose accumulation. Conclusions: The authors developed a unified software platform that provides accurate and efficient monitoring of anatomical changes and computation of actually delivered dose to the patient, thus realizing an efficient cumulative dose computation workflow. Evaluation on HN cases demonstrated the utility of our platform for monitoring the treatment quality and detecting significant dosimetric variations that are keys to successful ART.« less
Building a Generic Virtual Research Environment Framework for Multiple Earth and Space Science Domains and a Diversity of Users.

NASA Astrophysics Data System (ADS)

Wyborn, L. A.; Fraser, R.; Evans, B. J. K.; Friedrich, C.; Klump, J. F.; Lescinsky, D. T.

2017-12-01

Virtual Research Environments (VREs) are now part of academic infrastructures. Online research workflows can be orchestrated whereby data can be accessed from multiple external repositories with processing taking place on public or private clouds, and centralised supercomputers using a mixture of user codes, and well-used community software and libraries. VREs enable distributed members of research teams to actively work together to share data, models, tools, software, workflows, best practices, infrastructures, etc. These environments and their components are increasingly able to support the needs of undergraduate teaching. External to the research sector, they can also be reused by citizen scientists, and be repurposed for industry users to help accelerate the diffusion and hence enable the translation of research innovations. The Virtual Geophysics Laboratory (VGL) in Australia was started in 2012, built using a collaboration between CSIRO, the National Computational Infrastructure (NCI) and Geoscience Australia, with support funding from the Australian Government Department of Education. VGL comprises three main modules that provide an interface to enable users to first select their required data; to choose a tool to process that data; and then access compute infrastructure for execution. VGL was initially built to enable a specific set of researchers in government agencies access to specific data sets and a limited number of tools. Over the years it has evolved into a multi-purpose Earth science platform with access to an increased variety of data (e.g., Natural Hazards, Geochemistry), a broader range of software packages, and an increasing diversity of compute infrastructures. This expansion has been possible because of the approach to loosely couple data, tools and compute resources via interfaces that are built on international standards and accessed as network-enabled services wherever possible. Built originally for researchers that were not fussy about general usability, increasing emphasis on User Interfaces (UIs) and stability will lead to increased uptake in the education and industry sectors. Simultaneously, improvements are being added to facilitate access to data and tools by experienced researchers who want direct access to both data and flexible workflows.
Simulation software: engineer processes before reengineering.

PubMed

Lepley, C J

2001-01-01

People make decisions all the time using intuition. But what happens when you are asked: "Are you sure your predictions are accurate? How much will a mistake cost? What are the risks associated with this change?" Once a new process is engineered, it is difficult to analyze what would have been different if other options had been chosen. Simulating a process can help senior clinical officers solve complex patient flow problems and avoid wasted efforts. Simulation software can give you the data you need to make decisions. The author introduces concepts, methodologies, and applications of computer aided simulation to illustrate their use in making decisions to improve workflow design.
Anatomy of an anesthesia information management system.

PubMed

Shah, Nirav J; Tremper, Kevin K; Kheterpal, Sachin

2011-09-01

Anesthesia information management systems (AIMS) have become more prevalent as more sophisticated hardware and software have increased usability and reliability. National mandates and incentives have driven adoption as well. AIMS can be developed in one of several software models (Web based, client/server, or incorporated into a medical device). Irrespective of the development model, the best AIMS have a feature set that allows for comprehensive management of workflow for an anesthesiologist. Key features include preoperative, intraoperative, and postoperative documentation; quality assurance; billing; compliance and operational reporting; patient and operating room tracking; and integration with hospital electronic medical records. Copyright © 2011 Elsevier Inc. All rights reserved.
Web-based interactive visualization in a Grid-enabled neuroimaging application using HTML5.

PubMed

Siewert, René; Specovius, Svenja; Wu, Jie; Krefting, Dagmar

2012-01-01

Interactive visualization and correction of intermediate results are required in many medical image analysis pipelines. To allow certain interaction in the remote execution of compute- and data-intensive applications, new features of HTML5 are used. They allow for transparent integration of user interaction into Grid- or Cloud-enabled scientific workflows. Both 2D and 3D visualization and data manipulation can be performed through a scientific gateway without the need to install specific software or web browser plugins. The possibilities of web-based visualization are presented along the FreeSurfer-pipeline, a popular compute- and data-intensive software tool for quantitative neuroimaging.
Data Management System

NASA Technical Reports Server (NTRS)

1997-01-01

CENTRA 2000 Inc., a wholly owned subsidiary of Auto-trol technology, obtained permission to use software originally developed at Johnson Space Center for the Space Shuttle and early Space Station projects. To support their enormous information-handling needs, a product data management, electronic document management and work-flow system was designed. Initially, just 33 database tables comprised the original software, which was later expanded to about 100 tables. This system, now called CENTRA 2000, is designed for quick implementation and supports the engineering process from preliminary design through release-to-production. CENTRA 2000 can also handle audit histories and provides a means to ensure new information is distributed. The product has 30 production sites worldwide.
Design Automation in Synthetic Biology.

PubMed

Appleton, Evan; Madsen, Curtis; Roehner, Nicholas; Densmore, Douglas

2017-04-03

Design automation refers to a category of software tools for designing systems that work together in a workflow for designing, building, testing, and analyzing systems with a target behavior. In synthetic biology, these tools are called bio-design automation (BDA) tools. In this review, we discuss the BDA tools areas-specify, design, build, test, and learn-and introduce the existing software tools designed to solve problems in these areas. We then detail the functionality of some of these tools and show how they can be used together to create the desired behavior of two types of modern synthetic genetic regulatory networks. Copyright © 2017 Cold Spring Harbor Laboratory Press; all rights reserved.
Open source tools and toolkits for bioinformatics: significance, and where are we?

PubMed

Stajich, Jason E; Lapp, Hilmar

2006-09-01

This review summarizes important work in open-source bioinformatics software that has occurred over the past couple of years. The survey is intended to illustrate how programs and toolkits whose source code has been developed or released under an Open Source license have changed informatics-heavy areas of life science research. Rather than creating a comprehensive list of all tools developed over the last 2-3 years, we use a few selected projects encompassing toolkit libraries, analysis tools, data analysis environments and interoperability standards to show how freely available and modifiable open-source software can serve as the foundation for building important applications, analysis workflows and resources.
Hidden Uses of Presentation Software--The Ideal Tool for Making Customized Materials for Special Needs Students and Clients.

ERIC Educational Resources Information Center

Gilden, Deborah

This paper discusses how presentation software can be used to design custom materials for a variety of people with special needs, including children and adults with low vision, people with developmental disabilities, and stroke patients with cognitive impairments. Benefits of using presentation software include: (1) presentation software gives the…
RISA: Remote Interface for Science Analysis

NASA Astrophysics Data System (ADS)

Gabriel, C.; Ibarra, A.; de La Calle, I.; Salgado, J.; Osuna, P.; Tapiador, D.

2008-08-01

The Scientific Analysis System (SAS) is the package for interactive and pipeline data reduction of all XMM-Newton data. Freely distributed by ESA to run under many different operating systems, the SAS has been used by almost every one of the 1600 refereed scientific publications obtained so far from the mission. We are developing RISA, the Remote Interface for Science Analysis, which makes it possible to run SAS through fully configurable web service workflows, enabling observers to access and analyse data making use of all of the existing SAS functionalities, without any installation/download of software/data. The workflows run primarily but not exclusively on the ESAC Grid, which offers scalable processing resources, directly connected to the XMM-Newton Science Archive. A first project internal version of RISA was issued in May 2007, a public release is expected already within this year.
Dynamic provisioning of a HEP computing infrastructure on a shared hybrid HPC system

NASA Astrophysics Data System (ADS)

Meier, Konrad; Fleig, Georg; Hauth, Thomas; Janczyk, Michael; Quast, Günter; von Suchodoletz, Dirk; Wiebelt, Bernd

2016-10-01

Experiments in high-energy physics (HEP) rely on elaborate hardware, software and computing systems to sustain the high data rates necessary to study rare physics processes. The Institut fr Experimentelle Kernphysik (EKP) at KIT is a member of the CMS and Belle II experiments, located at the LHC and the Super-KEKB accelerators, respectively. These detectors share the requirement, that enormous amounts of measurement data must be processed and analyzed and a comparable amount of simulated events is required to compare experimental results with theoretical predictions. Classical HEP computing centers are dedicated sites which support multiple experiments and have the required software pre-installed. Nowadays, funding agencies encourage research groups to participate in shared HPC cluster models, where scientist from different domains use the same hardware to increase synergies. This shared usage proves to be challenging for HEP groups, due to their specialized software setup which includes a custom OS (often Scientific Linux), libraries and applications. To overcome this hurdle, the EKP and data center team of the University of Freiburg have developed a system to enable the HEP use case on a shared HPC cluster. To achieve this, an OpenStack-based virtualization layer is installed on top of a bare-metal cluster. While other user groups can run their batch jobs via the Moab workload manager directly on bare-metal, HEP users can request virtual machines with a specialized machine image which contains a dedicated operating system and software stack. In contrast to similar installations, in this hybrid setup, no static partitioning of the cluster into a physical and virtualized segment is required. As a unique feature, the placement of the virtual machine on the cluster nodes is scheduled by Moab and the job lifetime is coupled to the lifetime of the virtual machine. This allows for a seamless integration with the jobs sent by other user groups and honors the fairshare policies of the cluster. The developed thin integration layer between OpenStack and Moab can be adapted to other batch servers and virtualization systems, making the concept also applicable for other cluster operators. This contribution will report on the concept and implementation of an OpenStack-virtualized cluster used for HEP workflows. While the full cluster will be installed in spring 2016, a test-bed setup with 800 cores has been used to study the overall system performance and dedicated HEP jobs were run in a virtualized environment over many weeks. Furthermore, the dynamic integration of the virtualized worker nodes, depending on the workload at the institute's computing system, will be described.
Introducing Python tools for magnetotellurics: MTpy

NASA Astrophysics Data System (ADS)

Krieger, L.; Peacock, J.; Inverarity, K.; Thiel, S.; Robertson, K.

2013-12-01

Within the framework of geophysical exploration techniques, the magnetotelluric method (MT) is relatively immature: It is still not as widely spread as other geophysical methods like seismology, and its processing schemes and data formats are not thoroughly standardized. As a result, the file handling and processing software within the academic community is mainly based on a loose collection of codes, which are sometimes highly adapted to the respective local specifications. Although tools for the estimation of the frequency dependent MT transfer function, as well as inversion and modelling codes, are available, the standards and software for handling MT data are generally not unified throughout the community. To overcome problems that arise from missing standards, and to simplify the general handling of MT data, we have developed the software package "MTpy", which allows the handling, processing, and imaging of magnetotelluric data sets. It is written in Python and the code is open-source. The setup of this package follows the modular approach of successful software packages like GMT or Obspy. It contains sub-packages and modules for various tasks within the standard MT data processing and handling scheme. Besides pure Python classes and functions, MTpy provides wrappers and convenience scripts to call external software, e.g. modelling and inversion codes. Even though still under development, MTpy already contains ca. 250 functions that work on raw and preprocessed data. However, as our aim is not to produce a static collection of software, we rather introduce MTpy as a flexible framework, which will be dynamically extended in the future. It then has the potential to help standardise processing procedures and at same time be a versatile supplement for existing algorithms. We introduce the concept and structure of MTpy, and we illustrate the workflow of MT data processing utilising MTpy on an example data set collected over a geothermal exploration site in South Australia. Workflow of MT data processing. Within the structural diagram, the MTpy sub-packages are shown in red (time series data processing), green (handling of EDI files and impedance tensor data), yellow (connection to modelling/inversion algorithms), black (impedance tensor interpretation, e.g. by Phase Tensor calculations), and blue (generation of visual representations, e.g pseudo sections or resistivity models).

The SCEC Community Modeling Environment(SCEC/CME): A Collaboratory for Seismic Hazard Analysis

NASA Astrophysics Data System (ADS)

Maechling, P. J.; Jordan, T. H.; Minster, J. B.; Moore, R.; Kesselman, C.

2005-12-01

The SCEC Community Modeling Environment (SCEC/CME) Project is an NSF-supported Geosciences/IT partnership that is actively developing an advanced information infrastructure for system-level earthquake science in Southern California. This partnership includes SCEC, USC's Information Sciences Institute (ISI), the San Diego Supercomputer Center (SDSC), the Incorporated Institutions for Research in Seismology (IRIS), and the U.S. Geological Survey. The goal of the SCEC/CME is to develop seismological applications and information technology (IT) infrastructure to support the development of Seismic Hazard Analysis (SHA) programs and other geophysical simulations. The SHA application programs developed on the Project include a Probabilistic Seismic Hazard Analysis system called OpenSHA. OpenSHA computational elements that are currently available include a collection of attenuation relationships, and several Earthquake Rupture Forecasts (ERFs). Geophysicists in the collaboration have also developed Anelastic Wave Models (AWMs) using both finite-difference and finite-element approaches. Earthquake simulations using these codes have been run for a variety of earthquake sources. Rupture Dynamic Model (RDM) codes have also been developed that simulate friction-based fault slip. The SCEC/CME collaboration has also developed IT software and hardware infrastructure to support the development, execution, and analysis of these SHA programs. To support computationally expensive simulations, we have constructed a grid-based scientific workflow system. Using the SCEC grid, project collaborators can submit computations from the SCEC/CME servers to High Performance Computers at USC and TeraGrid High Performance Computing Centers. Data generated and archived by the SCEC/CME is stored in a digital library system, the Storage Resource Broker (SRB). This system provides a robust and secure system for maintaining the association between the data seta and their metadata. To provide an easy-to-use system for constructing SHA computations, a browser-based workflow assembly web portal has been developed. Users can compose complex SHA calculations, specifying SCEC/CME data sets as inputs to calculations, and calling SCEC/CME computational programs to process the data and the output. Knowledge-based software tools have been implemented that utilize ontological descriptions of SHA software and data can validate workflows created with this pathway assembly tool. Data visualization software developed by the collaboration supports analysis and validation of data sets. Several programs have been developed to visualize SCEC/CME data including GMT-based map making software for PSHA codes, 4D wavefield propagation visualization software based on OpenGL, and 3D Geowall-based visualization of earthquakes, faults, and seismic wave propagation. The SCEC/CME Project also helps to sponsor the SCEC UseIT Intern program. The UseIT Intern Program provides research opportunities in both Geosciences and Information Technology to undergraduate students in a variety of fields. The UseIT group has developed a 3D data visualization tool, called SCEC-VDO, as a part of this undergraduate research program.
XML schemas for common bioinformatic data types and their application in workflow systems.

PubMed

Seibel, Philipp N; Krüger, Jan; Hartmeier, Sven; Schwarzer, Knut; Löwenthal, Kai; Mersch, Henning; Dandekar, Thomas; Giegerich, Robert

2006-11-06

Today, there is a growing need in bioinformatics to combine available software tools into chains, thus building complex applications from existing single-task tools. To create such workflows, the tools involved have to be able to work with each other's data--therefore, a common set of well-defined data formats is needed. Unfortunately, current bioinformatic tools use a great variety of heterogeneous formats. Acknowledging the need for common formats, the Helmholtz Open BioInformatics Technology network (HOBIT) identified several basic data types used in bioinformatics and developed appropriate format descriptions, formally defined by XML schemas, and incorporated them in a Java library (BioDOM). These schemas currently cover sequence, sequence alignment, RNA secondary structure and RNA secondary structure alignment formats in a form that is independent of any specific program, thus enabling seamless interoperation of different tools. All XML formats are available at http://bioschemas.sourceforge.net, the BioDOM library can be obtained at http://biodom.sourceforge.net. The HOBIT XML schemas and the BioDOM library simplify adding XML support to newly created and existing bioinformatic tools, enabling these tools to interoperate seamlessly in workflow scenarios.
Scan-layered reconstructions: A pilot study of a nondestructive dental histoanatomical analysis method and digital workflow to create restorations driven by natural dentin and enamel morphology.

PubMed

Malta Barbosa, João; Tovar, Nick; A Tuesta, Pablo; Hirata, Ronaldo; Guimarães, Nuno; Romanini, José C; Moghadam, Marjan; Coelho, Paulo G; Jahangiri, Leila

2017-07-08

This work aims to present a pilot study of a non-destructive dental histo-anatomical analysis technique as well as to push the boundaries of the presently available restorative workflows for the fabrication of highly customized ceramic restorations. An extracted human maxillary central incisor was subject to a micro computed tomography scan and the acquired data was transferred into a workstation, reconstructed, segmented, evaluated and later imported into a Computer-Aided Design/Computer-Aided Manufacturing software for the fabrication of a ceramic resin-bonded prosthesis. The obtained prosthesis presented an encouraging optical behavior and was used clinically as final restoration. The digitally layered restorative replication of natural tooth morphology presents today as a clear possibility. New clinical and laboratory-fabricated, biologically inspired digital restorative protocols are to be expected in the near future. The digitally layered restorative replication of natural tooth morphology presents today as a clear possibility. This pilot study may represent a stimulus for future research and applications of digital imaging as well as digital restorative workflows in service of esthetic dentistry. © 2017 Wiley Periodicals, Inc.
Dynamic Voltage Frequency Scaling Simulator for Real Workflows Energy-Aware Management in Green Cloud Computing

PubMed Central

Cotes-Ruiz, Iván Tomás; Prado, Rocío P.; García-Galán, Sebastián; Muñoz-Expósito, José Enrique; Ruiz-Reyes, Nicolás

2017-01-01

Nowadays, the growing computational capabilities of Cloud systems rely on the reduction of the consumed power of their data centers to make them sustainable and economically profitable. The efficient management of computing resources is at the heart of any energy-aware data center and of special relevance is the adaptation of its performance to workload. Intensive computing applications in diverse areas of science generate complex workload called workflows, whose successful management in terms of energy saving is still at its beginning. WorkflowSim is currently one of the most advanced simulators for research on workflows processing, offering advanced features such as task clustering and failure policies. In this work, an expected power-aware extension of WorkflowSim is presented. This new tool integrates a power model based on a computing-plus-communication design to allow the optimization of new management strategies in energy saving considering computing, reconfiguration and networks costs as well as quality of service, and it incorporates the preeminent strategy for on host energy saving: Dynamic Voltage Frequency Scaling (DVFS). The simulator is designed to be consistent in different real scenarios and to include a wide repertory of DVFS governors. Results showing the validity of the simulator in terms of resources utilization, frequency and voltage scaling, power, energy and time saving are presented. Also, results achieved by the intra-host DVFS strategy with different governors are compared to those of the data center using a recent and successful DVFS-based inter-host scheduling strategy as overlapped mechanism to the DVFS intra-host technique. PMID:28085932
Dynamic Voltage Frequency Scaling Simulator for Real Workflows Energy-Aware Management in Green Cloud Computing.

PubMed

Cotes-Ruiz, Iván Tomás; Prado, Rocío P; García-Galán, Sebastián; Muñoz-Expósito, José Enrique; Ruiz-Reyes, Nicolás

2017-01-01

Nowadays, the growing computational capabilities of Cloud systems rely on the reduction of the consumed power of their data centers to make them sustainable and economically profitable. The efficient management of computing resources is at the heart of any energy-aware data center and of special relevance is the adaptation of its performance to workload. Intensive computing applications in diverse areas of science generate complex workload called workflows, whose successful management in terms of energy saving is still at its beginning. WorkflowSim is currently one of the most advanced simulators for research on workflows processing, offering advanced features such as task clustering and failure policies. In this work, an expected power-aware extension of WorkflowSim is presented. This new tool integrates a power model based on a computing-plus-communication design to allow the optimization of new management strategies in energy saving considering computing, reconfiguration and networks costs as well as quality of service, and it incorporates the preeminent strategy for on host energy saving: Dynamic Voltage Frequency Scaling (DVFS). The simulator is designed to be consistent in different real scenarios and to include a wide repertory of DVFS governors. Results showing the validity of the simulator in terms of resources utilization, frequency and voltage scaling, power, energy and time saving are presented. Also, results achieved by the intra-host DVFS strategy with different governors are compared to those of the data center using a recent and successful DVFS-based inter-host scheduling strategy as overlapped mechanism to the DVFS intra-host technique.
Efforts to integrate CMIP metadata and standards into NOAA-GFDL's climate model workflow

NASA Astrophysics Data System (ADS)

Blanton, C.; Lee, M.; Mason, E. E.; Radhakrishnan, A.

2017-12-01

Modeling centers participating in CMIP6 run model simulations, publish requested model output (conforming to community data standards), and document models and simulations using ES-DOC. GFDL developed workflow software implementing some best practices to meet these metadata and documentation requirements. The CMIP6 Data Request defines the variables that should be archived for each experiment and specifies their spatial and temporal structure. We used the Data Request's dreqPy python library to write GFDL model configuration files as an alternative to hand-crafted tables. There was also a largely successful effort to standardize variable names within the model to reduce the additional overhead of translating "GFDL to CMOR" variables at a later stage in the pipeline. The ES-DOC ecosystem provides tools and standards to create, publish, and view various types of community-defined CIM documents, most notably model and simulation documents. Although ES-DOC will automatically create simulation documents during publishing by harvesting NetCDF global attributes, the information must be collected, stored, and placed in the NetCDF files by the workflow. We propose to develop a GUI to collect the simulation document precursors. In addition, a new MIP for CMIP6-CPMIP, a comparison of computational performance of climate models-is documented using machine and performance CIM documents. We used ES-DOC's pyesdoc python library to automatically create these machine and performance documents. We hope that these and similar efforts will become permanent features of the GFDL workflow to facilitate future participation in CMIP-like activities.
speaq 2.0: A complete workflow for high-throughput 1D NMR spectra processing and quantification.

PubMed

Beirnaert, Charlie; Meysman, Pieter; Vu, Trung Nghia; Hermans, Nina; Apers, Sandra; Pieters, Luc; Covaci, Adrian; Laukens, Kris

2018-03-01

Nuclear Magnetic Resonance (NMR) spectroscopy is, together with liquid chromatography-mass spectrometry (LC-MS), the most established platform to perform metabolomics. In contrast to LC-MS however, NMR data is predominantly being processed with commercial software. Meanwhile its data processing remains tedious and dependent on user interventions. As a follow-up to speaq, a previously released workflow for NMR spectral alignment and quantitation, we present speaq 2.0. This completely revised framework to automatically analyze 1D NMR spectra uses wavelets to efficiently summarize the raw spectra with minimal information loss or user interaction. The tool offers a fast and easy workflow that starts with the common approach of peak-picking, followed by grouping, thus avoiding the binning step. This yields a matrix consisting of features, samples and peak values that can be conveniently processed either by using included multivariate statistical functions or by using many other recently developed methods for NMR data analysis. speaq 2.0 facilitates robust and high-throughput metabolomics based on 1D NMR but is also compatible with other NMR frameworks or complementary LC-MS workflows. The methods are benchmarked using a simulated dataset and two publicly available datasets. speaq 2.0 is distributed through the existing speaq R package to provide a complete solution for NMR data processing. The package and the code for the presented case studies are freely available on CRAN (https://cran.r-project.org/package=speaq) and GitHub (https://github.com/beirnaert/speaq).
speaq 2.0: A complete workflow for high-throughput 1D NMR spectra processing and quantification

PubMed Central

Pieters, Luc; Covaci, Adrian

2018-01-01

Nuclear Magnetic Resonance (NMR) spectroscopy is, together with liquid chromatography-mass spectrometry (LC-MS), the most established platform to perform metabolomics. In contrast to LC-MS however, NMR data is predominantly being processed with commercial software. Meanwhile its data processing remains tedious and dependent on user interventions. As a follow-up to speaq, a previously released workflow for NMR spectral alignment and quantitation, we present speaq 2.0. This completely revised framework to automatically analyze 1D NMR spectra uses wavelets to efficiently summarize the raw spectra with minimal information loss or user interaction. The tool offers a fast and easy workflow that starts with the common approach of peak-picking, followed by grouping, thus avoiding the binning step. This yields a matrix consisting of features, samples and peak values that can be conveniently processed either by using included multivariate statistical functions or by using many other recently developed methods for NMR data analysis. speaq 2.0 facilitates robust and high-throughput metabolomics based on 1D NMR but is also compatible with other NMR frameworks or complementary LC-MS workflows. The methods are benchmarked using a simulated dataset and two publicly available datasets. speaq 2.0 is distributed through the existing speaq R package to provide a complete solution for NMR data processing. The package and the code for the presented case studies are freely available on CRAN (https://cran.r-project.org/package=speaq) and GitHub (https://github.com/beirnaert/speaq). PMID:29494588
Workflows for Full Waveform Inversions

NASA Astrophysics Data System (ADS)

Boehm, Christian; Krischer, Lion; Afanasiev, Michael; van Driel, Martin; May, Dave A.; Rietmann, Max; Fichtner, Andreas

2017-04-01

Despite many theoretical advances and the increasing availability of high-performance computing clusters, full seismic waveform inversions still face considerable challenges regarding data and workflow management. While the community has access to solvers which can harness modern heterogeneous computing architectures, the computational bottleneck has fallen to these often manpower-bounded issues that need to be overcome to facilitate further progress. Modern inversions involve huge amounts of data and require a tight integration between numerical PDE solvers, data acquisition and processing systems, nonlinear optimization libraries, and job orchestration frameworks. To this end we created a set of libraries and applications revolving around Salvus (http://salvus.io), a novel software package designed to solve large-scale full waveform inverse problems. This presentation focuses on solving passive source seismic full waveform inversions from local to global scales with Salvus. We discuss (i) design choices for the aforementioned components required for full waveform modeling and inversion, (ii) their implementation in the Salvus framework, and (iii) how it is all tied together by a usable workflow system. We combine state-of-the-art algorithms ranging from high-order finite-element solutions of the wave equation to quasi-Newton optimization algorithms using trust-region methods that can handle inexact derivatives. All is steered by an automated interactive graph-based workflow framework capable of orchestrating all necessary pieces. This naturally facilitates the creation of new Earth models and hopefully sparks new scientific insights. Additionally, and even more importantly, it enhances reproducibility and reliability of the final results.
CMS Configuration Editor: GUI based application for user analysis job

NASA Astrophysics Data System (ADS)

de Cosa, A.

2011-12-01

We present the user interface and the software architecture of the Configuration Editor for the CMS experiment. The analysis workflow is organized in a modular way integrated within the CMS framework that organizes in a flexible way user analysis code. The Python scripting language is adopted to define the job configuration that drives the analysis workflow. It could be a challenging task for users, especially for newcomers, to develop analysis jobs managing the configuration of many required modules. For this reason a graphical tool has been conceived in order to edit and inspect configuration files. A set of common analysis tools defined in the CMS Physics Analysis Toolkit (PAT) can be steered and configured using the Config Editor. A user-defined analysis workflow can be produced starting from a standard configuration file, applying and configuring PAT tools according to the specific user requirements. CMS users can adopt this tool, the Config Editor, to create their analysis visualizing in real time which are the effects of their actions. They can visualize the structure of their configuration, look at the modules included in the workflow, inspect the dependences existing among the modules and check the data flow. They can visualize at which values parameters are set and change them according to what is required by their analysis task. The integration of common tools in the GUI needed to adopt an object-oriented structure in the Python definition of the PAT tools and the definition of a layer of abstraction from which all PAT tools inherit.
An Adaptable Seismic Data Format for Modern Scientific Workflows

NASA Astrophysics Data System (ADS)

Smith, J. A.; Bozdag, E.; Krischer, L.; Lefebvre, M.; Lei, W.; Podhorszki, N.; Tromp, J.

2013-12-01

Data storage, exchange, and access play a critical role in modern seismology. Current seismic data formats, such as SEED, SAC, and SEG-Y, were designed with specific applications in mind and are frequently a major bottleneck in implementing efficient workflows. We propose a new modern parallel format that can be adapted for a variety of seismic workflows. The Adaptable Seismic Data Format (ASDF) features high-performance parallel read and write support and the ability to store an arbitrary number of traces of varying sizes. Provenance information is stored inside the file so that users know the origin of the data as well as the precise operations that have been applied to the waveforms. The design of the new format is based on several real-world use cases, including earthquake seismology and seismic interferometry. The metadata is based on the proven XML schemas StationXML and QuakeML. Existing time-series analysis tool-kits are easily interfaced with this new format so that seismologists can use robust, previously developed software packages, such as ObsPy and the SAC library. ADIOS, netCDF4, and HDF5 can be used as the underlying container format. At Princeton University, we have chosen to use ADIOS as the container format because it has shown superior scalability for certain applications, such as dealing with big data on HPC systems. In the context of high-performance computing, we have implemented ASDF into the global adjoint tomography workflow on Oak Ridge National Laboratory's supercomputer Titan.
TCGA Workflow: Analyze cancer genomics and epigenomics data using Bioconductor packages

PubMed Central

Bontempi, Gianluca; Ceccarelli, Michele; Noushmehr, Houtan

2016-01-01

Biotechnological advances in sequencing have led to an explosion of publicly available data via large international consortia such as The Cancer Genome Atlas (TCGA), The Encyclopedia of DNA Elements (ENCODE), and The NIH Roadmap Epigenomics Mapping Consortium (Roadmap). These projects have provided unprecedented opportunities to interrogate the epigenome of cultured cancer cell lines as well as normal and tumor tissues with high genomic resolution. The Bioconductor project offers more than 1,000 open-source software and statistical packages to analyze high-throughput genomic data. However, most packages are designed for specific data types (e.g. expression, epigenetics, genomics) and there is no one comprehensive tool that provides a complete integrative analysis of the resources and data provided by all three public projects. A need to create an integration of these different analyses was recently proposed. In this workflow, we provide a series of biologically focused integrative analyses of different molecular data. We describe how to download, process and prepare TCGA data and by harnessing several key Bioconductor packages, we describe how to extract biologically meaningful genomic and epigenomic data. Using Roadmap and ENCODE data, we provide a work plan to identify biologically relevant functional epigenomic elements associated with cancer. To illustrate our workflow, we analyzed two types of brain tumors: low-grade glioma (LGG) versus high-grade glioma (glioblastoma multiform or GBM). This workflow introduces the following Bioconductor packages: AnnotationHub, ChIPSeeker, ComplexHeatmap, pathview, ELMER, GAIA, MINET, RTCGAToolbox, TCGAbiolinks. PMID:28232861
TCGA Workflow: Analyze cancer genomics and epigenomics data using Bioconductor packages.

PubMed

Silva, Tiago C; Colaprico, Antonio; Olsen, Catharina; D'Angelo, Fulvio; Bontempi, Gianluca; Ceccarelli, Michele; Noushmehr, Houtan

2016-01-01

Biotechnological advances in sequencing have led to an explosion of publicly available data via large international consortia such as The Cancer Genome Atlas (TCGA), The Encyclopedia of DNA Elements (ENCODE), and The NIH Roadmap Epigenomics Mapping Consortium (Roadmap). These projects have provided unprecedented opportunities to interrogate the epigenome of cultured cancer cell lines as well as normal and tumor tissues with high genomic resolution. The Bioconductor project offers more than 1,000 open-source software and statistical packages to analyze high-throughput genomic data. However, most packages are designed for specific data types (e.g. expression, epigenetics, genomics) and there is no one comprehensive tool that provides a complete integrative analysis of the resources and data provided by all three public projects. A need to create an integration of these different analyses was recently proposed. In this workflow, we provide a series of biologically focused integrative analyses of different molecular data. We describe how to download, process and prepare TCGA data and by harnessing several key Bioconductor packages, we describe how to extract biologically meaningful genomic and epigenomic data. Using Roadmap and ENCODE data, we provide a work plan to identify biologically relevant functional epigenomic elements associated with cancer. To illustrate our workflow, we analyzed two types of brain tumors: low-grade glioma (LGG) versus high-grade glioma (glioblastoma multiform or GBM). This workflow introduces the following Bioconductor packages: AnnotationHub, ChIPSeeker, ComplexHeatmap, pathview, ELMER, GAIA, MINET, RTCGAToolbox, TCGAbiolinks.
Image Reconstruction is a New Frontier of Machine Learning.

PubMed

Wang, Ge; Ye, Jong Chu; Mueller, Klaus; Fessler, Jeffrey A

2018-06-01

Over past several years, machine learning, or more generally artificial intelligence, has generated overwhelming research interest and attracted unprecedented public attention. As tomographic imaging researchers, we share the excitement from our imaging perspective [item 1) in the Appendix], and organized this special issue dedicated to the theme of "Machine learning for image reconstruction." This special issue is a sister issue of the special issue published in May 2016 of this journal with the theme "Deep learning in medical imaging" [item 2) in the Appendix]. While the previous special issue targeted medical image processing/analysis, this special issue focuses on data-driven tomographic reconstruction. These two special issues are highly complementary, since image reconstruction and image analysis are two of the main pillars for medical imaging. Together we cover the whole workflow of medical imaging: from tomographic raw data/features to reconstructed images and then extracted diagnostic features/readings.
Unipro UGENE: a unified bioinformatics toolkit.

PubMed

Okonechnikov, Konstantin; Golosova, Olga; Fursov, Mikhail

2012-04-15

Unipro UGENE is a multiplatform open-source software with the main goal of assisting molecular biologists without much expertise in bioinformatics to manage, analyze and visualize their data. UGENE integrates widely used bioinformatics tools within a common user interface. The toolkit supports multiple biological data formats and allows the retrieval of data from remote data sources. It provides visualization modules for biological objects such as annotated genome sequences, Next Generation Sequencing (NGS) assembly data, multiple sequence alignments, phylogenetic trees and 3D structures. Most of the integrated algorithms are tuned for maximum performance by the usage of multithreading and special processor instructions. UGENE includes a visual environment for creating reusable workflows that can be launched on local resources or in a High Performance Computing (HPC) environment. UGENE is written in C++ using the Qt framework. The built-in plugin system and structured UGENE API make it possible to extend the toolkit with new functionality. UGENE binaries are freely available for MS Windows, Linux and Mac OS X at http://ugene.unipro.ru/download.html. UGENE code is licensed under the GPLv2; the information about the code licensing and copyright of integrated tools can be found in the LICENSE.3rd_party file provided with the source bundle.
[Development of an ophthalmological clinical information system for inpatient eye clinics].

PubMed

Kortüm, K U; Müller, M; Babenko, A; Kampik, A; Kreutzer, T C

2015-12-01

In times of increased digitalization in healthcare, departments of ophthalmology are faced with the challenge of introducing electronic clinical health records (EHR); however, specialized software for ophthalmology is not available with most major EHR sytems. The aim of this project was to create specific ophthalmological user interfaces for large inpatient eye care providers within a hospitalwide EHR. Additionally the integration of ophthalmic imaging systems, scheduling and surgical documentation should be achieved. The existing EHR i.s.h.med (Siemens, Germany) was modified using advanced business application programming (ABAP) language to create specific ophthalmological user interfaces for reproduction and moreover optimization of the clinical workflow. A user interface for documentation of ambulatory patients with eight tabs was designed. From June 2013 to October 2014 a total of 61,551 patient contact details were documented. For surgical documentation a separate user interface was set up. Digital clinical orders for documentation of registration and scheduling of operations user interfaces were also set up. A direct integration of ophthalmic imaging modalities could be established. An ophthalmologist-orientated EHR for outpatient and surgical documentation for inpatient clinics was created and successfully implemented. By incorporation of imaging procedures the foundation of future smart/big data analyses was created.
Special issue on enabling open and interoperable access to Planetary Science and Heliophysics databases and tools

NASA Astrophysics Data System (ADS)

2018-01-01

The large amount of data generated by modern space missions calls for a change of organization of data distribution and access procedures. Although long term archives exist for telescopic and space-borne observations, high-level functions need to be developed on top of these repositories to make Planetary Science and Heliophysics data more accessible and to favor interoperability. Results of simulations and reference laboratory data also need to be integrated to support and interpret the observations. Interoperable software and interfaces have recently been developed in many scientific domains. The Virtual Observatory (VO) interoperable standards developed for Astronomy by the International Virtual Observatory Alliance (IVOA) can be adapted to Planetary Sciences, as demonstrated by the VESPA (Virtual European Solar and Planetary Access) team within the Europlanet-H2020-RI project. Other communities have developed their own standards: GIS (Geographic Information System) for Earth and planetary surfaces tools, SPASE (Space Physics Archive Search and Extract) for space plasma, PDS4 (NASA Planetary Data System, version 4) and IPDA (International Planetary Data Alliance) for planetary mission archives, etc, and an effort to make them interoperable altogether is starting, including automated workflows to process related data from different sources.
An Open Source Model for Open Access Journal Publication

PubMed Central

Blesius, Carl R.; Williams, Michael A.; Holzbach, Ana; Huntley, Arthur C.; Chueh, Henry

2005-01-01

We describe an electronic journal publication infrastructure that allows a flexible publication workflow, academic exchange around different forms of user submissions, and the exchange of articles between publishers and archives using a common XML based standard. This web-based application is implemented on a freely available open source software stack. This publication demonstrates the Dermatology Online Journal's use of the platform for non-biased independent open access publication. PMID:16779183
Semiautomated Workflow for Clinically Streamlined Glioma Parametric Response Mapping

PubMed Central

Keith, Lauren; Ross, Brian D.; Galbán, Craig J.; Luker, Gary D.; Galbán, Stefanie; Zhao, Binsheng; Guo, Xiaotao; Chenevert, Thomas L.; Hoff, Benjamin A.

2017-01-01

Management of glioblastoma multiforme remains a challenging problem despite recent advances in targeted therapies. Timely assessment of therapeutic agents is hindered by the lack of standard quantitative imaging protocols for determining targeted response. Clinical response assessment for brain tumors is determined by volumetric changes assessed at 10 weeks post-treatment initiation. Further, current clinical criteria fail to use advanced quantitative imaging approaches, such as diffusion and perfusion magnetic resonance imaging. Development of the parametric response mapping (PRM) applied to diffusion-weighted magnetic resonance imaging has provided a sensitive and early biomarker of successful cytotoxic therapy in brain tumors while maintaining a spatial context within the tumor. Although PRM provides an earlier readout than volumetry and sometimes greater sensitivity compared with traditional whole-tumor diffusion statistics, it is not routinely used for patient management; an automated and standardized software for performing the analysis and for the generation of a clinical report document is required for this. We present a semiautomated and seamless workflow for image coregistration, segmentation, and PRM classification of glioblastoma multiforme diffusion-weighted magnetic resonance imaging scans. The software solution can be integrated using local hardware or performed remotely in the cloud while providing connectivity to existing picture archive and communication systems. This is an important step toward implementing PRM analysis of solid tumors in routine clinical practice. PMID:28286871
AtomPy: an open atomic-data curation environment

NASA Astrophysics Data System (ADS)

Bautista, Manuel; Mendoza, Claudio; Boswell, Josiah S; Ajoku, Chukwuemeka

2014-06-01

We present a cloud-computing environment for atomic data curation, networking among atomic data providers and users, teaching-and-learning, and interfacing with spectral modeling software. The system is based on Google-Drive Sheets, Pandas (Python Data Analysis Library) DataFrames, and IPython Notebooks for open community-driven curation of atomic data for scientific and technological applications. The atomic model for each ionic species is contained in a multi-sheet Google-Drive workbook, where the atomic parameters from all known public sources are progressively stored. Metadata (provenance, community discussion, etc.) accompanying every entry in the database are stored through Notebooks. Education tools on the physics of atomic processes as well as their relevance to plasma and spectral modeling are based on IPython Notebooks that integrate written material, images, videos, and active computer-tool workflows. Data processing workflows and collaborative software developments are encouraged and managed through the GitHub social network. Relevant issues this platform intends to address are: (i) data quality by allowing open access to both data producers and users in order to attain completeness, accuracy, consistency, provenance and currentness; (ii) comparisons of different datasets to facilitate accuracy assessment; (iii) downloading to local data structures (i.e. Pandas DataFrames) for further manipulation and analysis by prospective users; and (iv) data preservation by avoiding the discard of outdated sets.

Development of an open source laboratory information management system for 2-D gel electrophoresis-based proteomics workflow

PubMed Central

Morisawa, Hiraku; Hirota, Mikako; Toda, Tosifusa

2006-01-01

Background In the post-genome era, most research scientists working in the field of proteomics are confronted with difficulties in management of large volumes of data, which they are required to keep in formats suitable for subsequent data mining. Therefore, a well-developed open source laboratory information management system (LIMS) should be available for their proteomics research studies. Results We developed an open source LIMS appropriately customized for 2-D gel electrophoresis-based proteomics workflow. The main features of its design are compactness, flexibility and connectivity to public databases. It supports the handling of data imported from mass spectrometry software and 2-D gel image analysis software. The LIMS is equipped with the same input interface for 2-D gel information as a clickable map on public 2DPAGE databases. The LIMS allows researchers to follow their own experimental procedures by reviewing the illustrations of 2-D gel maps and well layouts on the digestion plates and MS sample plates. Conclusion Our new open source LIMS is now available as a basic model for proteome informatics, and is accessible for further improvement. We hope that many research scientists working in the field of proteomics will evaluate our LIMS and suggest ways in which it can be improved. PMID:17018156
TU-E-BRB-03: Overview of Proposed TG-132 Recommendations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brock, K.

2015-06-15

Deformable image registration (DIR) is developing rapidly and is poised to substantially improve dose fusion accuracy for adaptive and retreatment planning and motion management and PET fusion to enhance contour delineation for treatment planning. However, DIR dose warping accuracy is difficult to quantify, in general, and particularly difficult to do so on a patient-specific basis. As clinical DIR options become more widely available, there is an increased need to understand the implications of incorporating DIR into clinical workflow. Several groups have assessed DIR accuracy in clinically relevant scenarios, but no comprehensive review material is yet available. This session will alsomore » discuss aspects of the AAPM Task Group 132 on the Use of Image Registration and Data Fusion Algorithms and Techniques in Radiotherapy Treatment Planning official report, which provides recommendations for DIR clinical use. We will summarize and compare various commercial DIR software options, outline successful clinical techniques, show specific examples with discussion of appropriate and inappropriate applications of DIR, discuss the clinical implications of DIR, provide an overview of current DIR error analysis research, review QA options and research phantom development and present TG-132 recommendations. Learning Objectives: Compare/contrast commercial DIR software and QA options Overview clinical DIR workflow for retreatment To understand uncertainties introduced by DIR Review TG-132 proposed recommendations.« less
PeptideDepot: Flexible Relational Database for Visual Analysis of Quantitative Proteomic Data and Integration of Existing Protein Information

PubMed Central

Yu, Kebing; Salomon, Arthur R.

2010-01-01

Recently, dramatic progress has been achieved in expanding the sensitivity, resolution, mass accuracy, and scan rate of mass spectrometers able to fragment and identify peptides through tandem mass spectrometry (MS/MS). Unfortunately, this enhanced ability to acquire proteomic data has not been accompanied by a concomitant increase in the availability of flexible tools allowing users to rapidly assimilate, explore, and analyze this data and adapt to a variety of experimental workflows with minimal user intervention. Here we fill this critical gap by providing a flexible relational database called PeptideDepot for organization of expansive proteomic data sets, collation of proteomic data with available protein information resources, and visual comparison of multiple quantitative proteomic experiments. Our software design, built upon the synergistic combination of a MySQL database for safe warehousing of proteomic data with a FileMaker-driven graphical user interface for flexible adaptation to diverse workflows, enables proteomic end-users to directly tailor the presentation of proteomic data to the unique analysis requirements of the individual proteomics lab. PeptideDepot may be deployed as an independent software tool or integrated directly with our High Throughput Autonomous Proteomic Pipeline (HTAPP) used in the automated acquisition and post-acquisition analysis of proteomic data. PMID:19834895
LCG/AA build infrastructure

NASA Astrophysics Data System (ADS)

Hodgkins, Alex Liam; Diez, Victor; Hegner, Benedikt

2012-12-01

The Software Process & Infrastructure (SPI) project provides a build infrastructure for regular integration testing and release of the LCG Applications Area software stack. In the past, regular builds have been provided using a system which has been constantly growing to include more features like server-client communication, long-term build history and a summary web interface using present-day web technologies. However, the ad-hoc style of software development resulted in a setup that is hard to monitor, inflexible and difficult to expand. The new version of the infrastructure is based on the Django Python framework, which allows for a structured and modular design, facilitating later additions. Transparency in the workflows and ease of monitoring has been one of the priorities in the design. Formerly missing functionality like on-demand builds or release triggering will support the transition to a more agile development process.
Implementation of Task-Tracking Software for Clinical IT Management.

PubMed

Purohit, Anne-Maria; Brutscheck, Clemens; Prokosch, Hans-Ulrich; Ganslandt, Thomas; Schneider, Martin

2017-01-01

Often in clinical IT departments, many different methods and IT systems are used for task-tracking and project organization. Based on managers' personal preferences and knowledge about project management methods, tools differ from team to team and even from employee to employee. This causes communication problems, especially when tasks need to be done in cooperation with different teams. Monitoring tasks and resources becomes impossible: there are no defined deliverables, which prevents reliable deadlines. Because of these problems, we implemented task-tracking software which is now in use across all seven teams at the University Hospital Erlangen. Over a period of seven months, a working group defined types of tasks (project, routine task, etc.), workflows, and views to monitor the tasks of the 7 divisions, 20 teams and 340 different IT services. The software has been in use since December 2016.
An MRM-based workflow for quantifying cardiac mitochondrial protein phosphorylation in murine and human tissue.

PubMed

Lam, Maggie P Y; Scruggs, Sarah B; Kim, Tae-Young; Zong, Chenggong; Lau, Edward; Wang, Ding; Ryan, Christopher M; Faull, Kym F; Ping, Peipei

2012-08-03

The regulation of mitochondrial function is essential for cardiomyocyte adaptation to cellular stress. While it has long been understood that phosphorylation regulates flux through metabolic pathways, novel phosphorylation sites are continually being discovered in all functionally distinct areas of the mitochondrial proteome. Extracting biologically meaningful information from these phosphorylation sites requires an adaptable, sensitive, specific and robust method for their quantification. Here we report a multiple reaction monitoring-based mass spectrometric workflow for quantifying site-specific phosphorylation of mitochondrial proteins. Specifically, chromatographic and mass spectrometric conditions for 68 transitions derived from 23 murine and human phosphopeptides, and their corresponding unmodified peptides, were optimized. These methods enabled the quantification of endogenous phosphopeptides from the outer mitochondrial membrane protein VDAC, and the inner membrane proteins ANT and ETC complexes I, III and V. The development of this quantitative workflow is a pivotal step for advancing our knowledge and understanding of the regulatory effects of mitochondrial protein phosphorylation in cardiac physiology and pathophysiology. This article is part of a Special Issue entitled: Translational Proteomics. Copyright © 2012 Elsevier B.V. All rights reserved.
MTpy - Python Tools for Magnetotelluric Data Processing and Analysis

NASA Astrophysics Data System (ADS)

Krieger, Lars; Peacock, Jared; Thiel, Stephan; Inverarity, Kent; Kirkby, Alison; Robertson, Kate; Soeffky, Paul; Didana, Yohannes

2014-05-01

We present the Python package MTpy, which provides functions for the processing, analysis, and handling of magnetotelluric (MT) data sets. MT is a relatively immature and not widely applied geophysical method in comparison to other geophysical techniques such as seismology. As a result, the data processing within the academic MT community is not thoroughly standardised and is often based on a loose collection of software, adapted to the respective local specifications. We have developed MTpy to overcome problems that arise from missing standards, and to provide a simplification of the general handling of MT data. MTpy is written in Python, and the open-source code is freely available from a GitHub repository. The setup follows the modular approach of successful geoscience software packages such as GMT or Obspy. It contains sub-packages and modules for the various tasks within the standard work-flow of MT data processing and interpretation. In order to allow the inclusion of already existing and well established software, MTpy does not only provide pure Python classes and functions, but also wrapping command-line scripts to run standalone tools, e.g. modelling and inversion codes. Our aim is to provide a flexible framework, which is open for future dynamic extensions. MTpy has the potential to promote the standardisation of processing procedures and at same time be a versatile supplement for existing algorithms. Here, we introduce the concept and structure of MTpy, and we illustrate the workflow of MT data processing, interpretation, and visualisation utilising MTpy on example data sets collected over different regions of Australia and the USA.
Neurophysiological analytics for all! Free open-source software tools for documenting, analyzing, visualizing, and sharing using electronic notebooks

PubMed Central

2016-01-01

Neurophysiology requires an extensive workflow of information analysis routines, which often includes incompatible proprietary software, introducing limitations based on financial costs, transfer of data between platforms, and the ability to share. An ecosystem of free open-source software exists to fill these gaps, including thousands of analysis and plotting packages written in Python and R, which can be implemented in a sharable and reproducible format, such as the Jupyter electronic notebook. This tool chain can largely replace current routines by importing data, producing analyses, and generating publication-quality graphics. An electronic notebook like Jupyter allows these analyses, along with documentation of procedures, to display locally or remotely in an internet browser, which can be saved as an HTML, PDF, or other file format for sharing with team members and the scientific community. The present report illustrates these methods using data from electrophysiological recordings of the musk shrew vagus—a model system to investigate gut-brain communication, for example, in cancer chemotherapy-induced emesis. We show methods for spike sorting (including statistical validation), spike train analysis, and analysis of compound action potentials in notebooks. Raw data and code are available from notebooks in data supplements or from an executable online version, which replicates all analyses without installing software—an implementation of reproducible research. This demonstrates the promise of combining disparate analyses into one platform, along with the ease of sharing this work. In an age of diverse, high-throughput computational workflows, this methodology can increase efficiency, transparency, and the collaborative potential of neurophysiological research. PMID:27098025
PlantRNA_Sniffer: A SVM-Based Workflow to Predict Long Intergenic Non-Coding RNAs in Plants.

PubMed

Vieira, Lucas Maciel; Grativol, Clicia; Thiebaut, Flavia; Carvalho, Thais G; Hardoim, Pablo R; Hemerly, Adriana; Lifschitz, Sergio; Ferreira, Paulo Cavalcanti Gomes; Walter, Maria Emilia M T

2017-03-04

Non-coding RNAs (ncRNAs) constitute an important set of transcripts produced in the cells of organisms. Among them, there is a large amount of a particular class of long ncRNAs that are difficult to predict, the so-called long intergenic ncRNAs (lincRNAs), which might play essential roles in gene regulation and other cellular processes. Despite the importance of these lincRNAs, there is still a lack of biological knowledge and, currently, the few computational methods considered are so specific that they cannot be successfully applied to other species different from those that they have been originally designed to. Prediction of lncRNAs have been performed with machine learning techniques. Particularly, for lincRNA prediction, supervised learning methods have been explored in recent literature. As far as we know, there are no methods nor workflows specially designed to predict lincRNAs in plants. In this context, this work proposes a workflow to predict lincRNAs on plants, considering a workflow that includes known bioinformatics tools together with machine learning techniques, here a support vector machine (SVM). We discuss two case studies that allowed to identify novel lincRNAs, in sugarcane ( Saccharum spp.) and in maize ( Zea mays ). From the results, we also could identify differentially-expressed lincRNAs in sugarcane and maize plants submitted to pathogenic and beneficial microorganisms.
PlantRNA_Sniffer: A SVM-Based Workflow to Predict Long Intergenic Non-Coding RNAs in Plants

PubMed Central

Vieira, Lucas Maciel; Grativol, Clicia; Thiebaut, Flavia; Carvalho, Thais G.; Hardoim, Pablo R.; Hemerly, Adriana; Lifschitz, Sergio; Ferreira, Paulo Cavalcanti Gomes; Walter, Maria Emilia M. T.

2017-01-01

Non-coding RNAs (ncRNAs) constitute an important set of transcripts produced in the cells of organisms. Among them, there is a large amount of a particular class of long ncRNAs that are difficult to predict, the so-called long intergenic ncRNAs (lincRNAs), which might play essential roles in gene regulation and other cellular processes. Despite the importance of these lincRNAs, there is still a lack of biological knowledge and, currently, the few computational methods considered are so specific that they cannot be successfully applied to other species different from those that they have been originally designed to. Prediction of lncRNAs have been performed with machine learning techniques. Particularly, for lincRNA prediction, supervised learning methods have been explored in recent literature. As far as we know, there are no methods nor workflows specially designed to predict lincRNAs in plants. In this context, this work proposes a workflow to predict lincRNAs on plants, considering a workflow that includes known bioinformatics tools together with machine learning techniques, here a support vector machine (SVM). We discuss two case studies that allowed to identify novel lincRNAs, in sugarcane (Saccharum spp.) and in maize (Zea mays). From the results, we also could identify differentially-expressed lincRNAs in sugarcane and maize plants submitted to pathogenic and beneficial microorganisms. PMID:29657283
HTSstation: a web application and open-access libraries for high-throughput sequencing data analysis.

PubMed

David, Fabrice P A; Delafontaine, Julien; Carat, Solenne; Ross, Frederick J; Lefebvre, Gregory; Jarosz, Yohan; Sinclair, Lucas; Noordermeer, Daan; Rougemont, Jacques; Leleu, Marion

2014-01-01

The HTSstation analysis portal is a suite of simple web forms coupled to modular analysis pipelines for various applications of High-Throughput Sequencing including ChIP-seq, RNA-seq, 4C-seq and re-sequencing. HTSstation offers biologists the possibility to rapidly investigate their HTS data using an intuitive web application with heuristically pre-defined parameters. A number of open-source software components have been implemented and can be used to build, configure and run HTS analysis pipelines reactively. Besides, our programming framework empowers developers with the possibility to design their own workflows and integrate additional third-party software. The HTSstation web application is accessible at http://htsstation.epfl.ch.
HTSstation: A Web Application and Open-Access Libraries for High-Throughput Sequencing Data Analysis

PubMed Central

David, Fabrice P. A.; Delafontaine, Julien; Carat, Solenne; Ross, Frederick J.; Lefebvre, Gregory; Jarosz, Yohan; Sinclair, Lucas; Noordermeer, Daan; Rougemont, Jacques; Leleu, Marion

2014-01-01

The HTSstation analysis portal is a suite of simple web forms coupled to modular analysis pipelines for various applications of High-Throughput Sequencing including ChIP-seq, RNA-seq, 4C-seq and re-sequencing. HTSstation offers biologists the possibility to rapidly investigate their HTS data using an intuitive web application with heuristically pre-defined parameters. A number of open-source software components have been implemented and can be used to build, configure and run HTS analysis pipelines reactively. Besides, our programming framework empowers developers with the possibility to design their own workflows and integrate additional third-party software. The HTSstation web application is accessible at http://htsstation.epfl.ch. PMID:24475057
Performing statistical analyses on quantitative data in Taverna workflows: an example using R and maxdBrowse to identify differentially-expressed genes from microarray data.

PubMed

Li, Peter; Castrillo, Juan I; Velarde, Giles; Wassink, Ingo; Soiland-Reyes, Stian; Owen, Stuart; Withers, David; Oinn, Tom; Pocock, Matthew R; Goble, Carole A; Oliver, Stephen G; Kell, Douglas B

2008-08-07

There has been a dramatic increase in the amount of quantitative data derived from the measurement of changes at different levels of biological complexity during the post-genomic era. However, there are a number of issues associated with the use of computational tools employed for the analysis of such data. For example, computational tools such as R and MATLAB require prior knowledge of their programming languages in order to implement statistical analyses on data. Combining two or more tools in an analysis may also be problematic since data may have to be manually copied and pasted between separate user interfaces for each tool. Furthermore, this transfer of data may require a reconciliation step in order for there to be interoperability between computational tools. Developments in the Taverna workflow system have enabled pipelines to be constructed and enacted for generic and ad hoc analyses of quantitative data. Here, we present an example of such a workflow involving the statistical identification of differentially-expressed genes from microarray data followed by the annotation of their relationships to cellular processes. This workflow makes use of customised maxdBrowse web services, a system that allows Taverna to query and retrieve gene expression data from the maxdLoad2 microarray database. These data are then analysed by R to identify differentially-expressed genes using the Taverna RShell processor which has been developed for invoking this tool when it has been deployed as a service using the RServe library. In addition, the workflow uses Beanshell scripts to reconcile mismatches of data between services as well as to implement a form of user interaction for selecting subsets of microarray data for analysis as part of the workflow execution. A new plugin system in the Taverna software architecture is demonstrated by the use of renderers for displaying PDF files and CSV formatted data within the Taverna workbench. Taverna can be used by data analysis experts as a generic tool for composing ad hoc analyses of quantitative data by combining the use of scripts written in the R programming language with tools exposed as services in workflows. When these workflows are shared with colleagues and the wider scientific community, they provide an approach for other scientists wanting to use tools such as R without having to learn the corresponding programming language to analyse their own data.
Performing statistical analyses on quantitative data in Taverna workflows: An example using R and maxdBrowse to identify differentially-expressed genes from microarray data

PubMed Central

Li, Peter; Castrillo, Juan I; Velarde, Giles; Wassink, Ingo; Soiland-Reyes, Stian; Owen, Stuart; Withers, David; Oinn, Tom; Pocock, Matthew R; Goble, Carole A; Oliver, Stephen G; Kell, Douglas B

2008-01-01

Background There has been a dramatic increase in the amount of quantitative data derived from the measurement of changes at different levels of biological complexity during the post-genomic era. However, there are a number of issues associated with the use of computational tools employed for the analysis of such data. For example, computational tools such as R and MATLAB require prior knowledge of their programming languages in order to implement statistical analyses on data. Combining two or more tools in an analysis may also be problematic since data may have to be manually copied and pasted between separate user interfaces for each tool. Furthermore, this transfer of data may require a reconciliation step in order for there to be interoperability between computational tools. Results Developments in the Taverna workflow system have enabled pipelines to be constructed and enacted for generic and ad hoc analyses of quantitative data. Here, we present an example of such a workflow involving the statistical identification of differentially-expressed genes from microarray data followed by the annotation of their relationships to cellular processes. This workflow makes use of customised maxdBrowse web services, a system that allows Taverna to query and retrieve gene expression data from the maxdLoad2 microarray database. These data are then analysed by R to identify differentially-expressed genes using the Taverna RShell processor which has been developed for invoking this tool when it has been deployed as a service using the RServe library. In addition, the workflow uses Beanshell scripts to reconcile mismatches of data between services as well as to implement a form of user interaction for selecting subsets of microarray data for analysis as part of the workflow execution. A new plugin system in the Taverna software architecture is demonstrated by the use of renderers for displaying PDF files and CSV formatted data within the Taverna workbench. Conclusion Taverna can be used by data analysis experts as a generic tool for composing ad hoc analyses of quantitative data by combining the use of scripts written in the R programming language with tools exposed as services in workflows. When these workflows are shared with colleagues and the wider scientific community, they provide an approach for other scientists wanting to use tools such as R without having to learn the corresponding programming language to analyse their own data. PMID:18687127
Concept for individualized patient allocation: ReCompare—remote comparison of particle and photon treatment plans

PubMed Central

2014-01-01

Background Identifying those patients who have a higher chance to be cured with fewer side effects by particle beam therapy than by state-of-the-art photon therapy is essential to guarantee a fair and sufficient access to specialized radiotherapy. The individualized identification requires initiatives by particle as well as non-particle radiotherapy centers to form networks, to establish procedures for the decision process, and to implement means for the remote exchange of relevant patient information. In this work, we want to contribute a practical concept that addresses these requirements. Methods We proposed a concept for individualized patient allocation to photon or particle beam therapy at a non-particle radiotherapy institution that bases on remote treatment plan comparison. We translated this concept into the web-based software tool ReCompare (REmote COMparison of PARticlE and photon treatment plans). Results We substantiated the feasibility of the proposed concept by demonstrating remote exchange of treatment plans between radiotherapy institutions and the direct comparison of photon and particle treatment plans in photon treatment planning systems. ReCompare worked with several tested standard treatment planning systems, ensured patient data protection, and integrated in the clinical workflow. Conclusions Our concept supports non-particle radiotherapy institutions with the patient-specific treatment decision on the optimal irradiation modality by providing expertise from a particle therapy center. The software tool ReCompare may help to improve and standardize this personalized treatment decision. It will be available from our website when proton therapy is operational at our facility. PMID:24548333
MicroSIFT Courseware Evaluations. [Set 11 (223-259), Set 12 (260-293), and a Special Set of 99 LIBRA Reviews of Junior High School Science Software, Including Subject and Title Indexes Covering Sets 1-12 and Special Set L].

ERIC Educational Resources Information Center

Northwest Regional Educational Lab., Portland, OR.

This document consists of 170 microcomputer software package evaluations prepared by the MicroSIFT (Microcomputer Software and Information for Teachers) Clearinghouse at the Northwest Regional Education Laboratory. Set 11 consists of 37 packages. Set 12 consists of 34 packages. A special unnumbered set, entitled LIBRA Reviews, treats 99 packages…
Optimizing Perioperative Decision Making: Improved Information for Clinical Workflow Planning

PubMed Central

Doebbeling, Bradley N.; Burton, Matthew M.; Wiebke, Eric A.; Miller, Spencer; Baxter, Laurence; Miller, Donald; Alvarez, Jorge; Pekny, Joseph

2012-01-01

Perioperative care is complex and involves multiple interconnected subsystems. Delayed starts, prolonged cases and overtime are common. Surgical procedures account for 40–70% of hospital revenues and 30–40% of total costs. Most planning and scheduling in healthcare is done without modern planning tools, which have potential for improving access by assisting in operations planning support. We identified key planning scenarios of interest to perioperative leaders, in order to examine the feasibility of applying combinatorial optimization software solving some of those planning issues in the operative setting. Perioperative leaders desire a broad range of tools for planning and assessing alternate solutions. Our modeled solutions generated feasible solutions that varied as expected, based on resource and policy assumptions and found better utilization of scarce resources. Combinatorial optimization modeling can effectively evaluate alternatives to support key decisions for planning clinical workflow and improving care efficiency and satisfaction. PMID:23304284
A case study for cloud based high throughput analysis of NGS data using the globus genomics system

DOE PAGES

Bhuvaneshwar, Krithika; Sulakhe, Dinanath; Gauba, Robinder; ...

2015-01-01

Next generation sequencing (NGS) technologies produce massive amounts of data requiring a powerful computational infrastructure, high quality bioinformatics software, and skilled personnel to operate the tools. We present a case study of a practical solution to this data management and analysis challenge that simplifies terabyte scale data handling and provides advanced tools for NGS data analysis. These capabilities are implemented using the “Globus Genomics” system, which is an enhanced Galaxy workflow system made available as a service that offers users the capability to process and transfer data easily, reliably and quickly to address end-to-end NGS analysis requirements. The Globus Genomicsmore » system is built on Amazon's cloud computing infrastructure. The system takes advantage of elastic scaling of compute resources to run multiple workflows in parallel and it also helps meet the scale-out analysis needs of modern translational genomics research.« less
Optimizing perioperative decision making: improved information for clinical workflow planning.

PubMed

Doebbeling, Bradley N; Burton, Matthew M; Wiebke, Eric A; Miller, Spencer; Baxter, Laurence; Miller, Donald; Alvarez, Jorge; Pekny, Joseph

2012-01-01

Perioperative care is complex and involves multiple interconnected subsystems. Delayed starts, prolonged cases and overtime are common. Surgical procedures account for 40-70% of hospital revenues and 30-40% of total costs. Most planning and scheduling in healthcare is done without modern planning tools, which have potential for improving access by assisting in operations planning support. We identified key planning scenarios of interest to perioperative leaders, in order to examine the feasibility of applying combinatorial optimization software solving some of those planning issues in the operative setting. Perioperative leaders desire a broad range of tools for planning and assessing alternate solutions. Our modeled solutions generated feasible solutions that varied as expected, based on resource and policy assumptions and found better utilization of scarce resources. Combinatorial optimization modeling can effectively evaluate alternatives to support key decisions for planning clinical workflow and improving care efficiency and satisfaction.
Virtual Field Reconnaissance to enable multi-site collaboration in geoscience fieldwork in Chile.

NASA Astrophysics Data System (ADS)

Hughes, Leanne; Bateson, Luke; Ford, Jonathan; Napier, Bruce; Creixell, Christian; Contreras, Juan-Pablo; Vallette, Jane

2017-04-01

The unique challenges of geological mapping in remote terrains can make cross-organisation collaboration challenging. Cooperation between the British and Chilean Geological Surveys and the Chilean national mining company used the BGS digital Mapping Workflow and virtual field reconnaissance software (GeoVisionary) to undertake geological mapping in a complex area of Andean Geology. The international team undertook a pre-field evaluation using GeoVisionary to integrate massive volumes of data and interpret high resolution satellite imagery, terrain models and existing geological information to capture, manipulate and understand geological features and re-interpret existing maps. This digital interpretation was then taken into the field and verified using the BGS digital data capture system (SIGMA.mobile). This allowed the production of final geological interpretation and creation of a geological map. This presentation describes the digital mapping workflow used in Chile and highlights the key advantages of increased efficiency and communication to colleagues, stakeholders and funding bodies.

aTRAM 2.0: An Improved, Flexible Locus Assembler for NGS Data

PubMed Central

Allen, Julie M; LaFrance, Raphael; Folk, Ryan A; Johnson, Kevin P; Guralnick, Robert P

2018-01-01

Massive strides have been made in technologies for collecting genome-scale data. However, tools for efficiently and flexibly assembling raw outputs into downstream analytical workflows are still nascent. aTRAM 1.0 was designed to assemble any locus from genome sequencing data but was neither optimized for efficiency nor able to serve as a single toolkit for all assembly needs. We have completely re-implemented aTRAM and redesigned its structure for faster read retrieval while adding a number of key features to improve flexibility and functionality. The software can now (1) assemble single- or paired-end data, (2) utilize both read directions in the database, (3) use an additional de novo assembly module, and (4) leverage new built-in pipelines to automate common workflows in phylogenomics. Owing to reimplementation of databasing strategies, we demonstrate that aTRAM 2.0 is much faster across all applications compared to the previous version. PMID:29881251
A case study for cloud based high throughput analysis of NGS data using the globus genomics system

PubMed Central

Bhuvaneshwar, Krithika; Sulakhe, Dinanath; Gauba, Robinder; Rodriguez, Alex; Madduri, Ravi; Dave, Utpal; Lacinski, Lukasz; Foster, Ian; Gusev, Yuriy; Madhavan, Subha

2014-01-01

Next generation sequencing (NGS) technologies produce massive amounts of data requiring a powerful computational infrastructure, high quality bioinformatics software, and skilled personnel to operate the tools. We present a case study of a practical solution to this data management and analysis challenge that simplifies terabyte scale data handling and provides advanced tools for NGS data analysis. These capabilities are implemented using the “Globus Genomics” system, which is an enhanced Galaxy workflow system made available as a service that offers users the capability to process and transfer data easily, reliably and quickly to address end-to-endNGS analysis requirements. The Globus Genomics system is built on Amazon 's cloud computing infrastructure. The system takes advantage of elastic scaling of compute resources to run multiple workflows in parallel and it also helps meet the scale-out analysis needs of modern translational genomics research. PMID:26925205
An image analysis system for near-infrared (NIR) fluorescence lymph imaging

NASA Astrophysics Data System (ADS)

Zhang, Jingdan; Zhou, Shaohua Kevin; Xiang, Xiaoyan; Rasmussen, John C.; Sevick-Muraca, Eva M.

2011-03-01

Quantitative analysis of lymphatic function is crucial for understanding the lymphatic system and diagnosing the associated diseases. Recently, a near-infrared (NIR) fluorescence imaging system is developed for real-time imaging lymphatic propulsion by intradermal injection of microdose of a NIR fluorophore distal to the lymphatics of interest. However, the previous analysis software3, 4 is underdeveloped, requiring extensive time and effort to analyze a NIR image sequence. In this paper, we develop a number of image processing techniques to automate the data analysis workflow, including an object tracking algorithm to stabilize the subject and remove the motion artifacts, an image representation named flow map to characterize lymphatic flow more reliably, and an automatic algorithm to compute lymph velocity and frequency of propulsion. By integrating all these techniques to a system, the analysis workflow significantly reduces the amount of required user interaction and improves the reliability of the measurement.
Community-driven computational biology with Debian Linux.

PubMed

Möller, Steffen; Krabbenhöft, Hajo Nils; Tille, Andreas; Paleino, David; Williams, Alan; Wolstencroft, Katy; Goble, Carole; Holland, Richard; Belhachemi, Dominique; Plessy, Charles

2010-12-21

The Open Source movement and its technologies are popular in the bioinformatics community because they provide freely available tools and resources for research. In order to feed the steady demand for updates on software and associated data, a service infrastructure is required for sharing and providing these tools to heterogeneous computing environments. The Debian Med initiative provides ready and coherent software packages for medical informatics and bioinformatics. These packages can be used together in Taverna workflows via the UseCase plugin to manage execution on local or remote machines. If such packages are available in cloud computing environments, the underlying hardware and the analysis pipelines can be shared along with the software. Debian Med closes the gap between developers and users. It provides a simple method for offering new releases of software and data resources, thus provisioning a local infrastructure for computational biology. For geographically distributed teams it can ensure they are working on the same versions of tools, in the same conditions. This contributes to the world-wide networking of researchers.
The Five 'R's' for Developing Trusted Software Frameworks to increase confidence in, and maximise reuse of, Open Source Software.

NASA Astrophysics Data System (ADS)

Fraser, Ryan; Gross, Lutz; Wyborn, Lesley; Evans, Ben; Klump, Jens

2015-04-01

Recent investments in HPC, cloud and Petascale data stores, have dramatically increased the scale and resolution that earth science challenges can now be tackled. These new infrastructures are highly parallelised and to fully utilise them and access the large volumes of earth science data now available, a new approach to software stack engineering needs to be developed. The size, complexity and cost of the new infrastructures mean any software deployed has to be reliable, trusted and reusable. Increasingly software is available via open source repositories, but these usually only enable code to be discovered and downloaded. As a user it is hard for a scientist to judge the suitability and quality of individual codes: rarely is there information on how and where codes can be run, what the critical dependencies are, and in particular, on the version requirements and licensing of the underlying software stack. A trusted software framework is proposed to enable reliable software to be discovered, accessed and then deployed on multiple hardware environments. More specifically, this framework will enable those who generate the software, and those who fund the development of software, to gain credit for the effort, IP, time and dollars spent, and facilitate quantification of the impact of individual codes. For scientific users, the framework delivers reviewed and benchmarked scientific software with mechanisms to reproduce results. The trusted framework will have five separate, but connected components: Register, Review, Reference, Run, and Repeat. 1) The Register component will facilitate discovery of relevant software from multiple open source code repositories. The registration process of the code should include information about licensing, hardware environments it can be run on, define appropriate validation (testing) procedures and list the critical dependencies. 2) The Review component is targeting on the verification of the software typically against a set of benchmark cases. This will be achieved by linking the code in the software framework to peer review forums such as Mozilla Science or appropriate Journals (e.g. Geoscientific Model Development Journal) to assist users to know which codes to trust. 3) Referencing will be accomplished by linking the Software Framework to groups such as Figshare or ImpactStory that help disseminate and measure the impact of scientific research, including program code. 4) The Run component will draw on information supplied in the registration process, benchmark cases described in the review and relevant information to instantiate the scientific code on the selected environment. 5) The Repeat component will tap into existing Provenance Workflow engines that will automatically capture information that relate to a particular run of that software, including identification of all input and output artefacts, and all elements and transactions within that workflow. The proposed trusted software framework will enable users to rapidly discover and access reliable code, reduce the time to deploy it and greatly facilitate sharing, reuse and reinstallation of code. Properly designed it could enable an ability to scale out to massively parallel systems and be accessed nationally/ internationally for multiple use cases, including Supercomputer centres, cloud facilities, and local computers.
Cytoscape: the network visualization tool for GenomeSpace workflows.

PubMed

Demchak, Barry; Hull, Tim; Reich, Michael; Liefeld, Ted; Smoot, Michael; Ideker, Trey; Mesirov, Jill P

2014-01-01

Modern genomic analysis often requires workflows incorporating multiple best-of-breed tools. GenomeSpace is a web-based visual workbench that combines a selection of these tools with mechanisms that create data flows between them. One such tool is Cytoscape 3, a popular application that enables analysis and visualization of graph-oriented genomic networks. As Cytoscape runs on the desktop, and not in a web browser, integrating it into GenomeSpace required special care in creating a seamless user experience and enabling appropriate data flows. In this paper, we present the design and operation of the Cytoscape GenomeSpace app, which accomplishes this integration, thereby providing critical analysis and visualization functionality for GenomeSpace users. It has been downloaded over 850 times since the release of its first version in September, 2013.
Cytoscape: the network visualization tool for GenomeSpace workflows

PubMed Central

Demchak, Barry; Hull, Tim; Reich, Michael; Liefeld, Ted; Smoot, Michael; Ideker, Trey; Mesirov, Jill P.

2014-01-01

Modern genomic analysis often requires workflows incorporating multiple best-of-breed tools. GenomeSpace is a web-based visual workbench that combines a selection of these tools with mechanisms that create data flows between them. One such tool is Cytoscape 3, a popular application that enables analysis and visualization of graph-oriented genomic networks. As Cytoscape runs on the desktop, and not in a web browser, integrating it into GenomeSpace required special care in creating a seamless user experience and enabling appropriate data flows. In this paper, we present the design and operation of the Cytoscape GenomeSpace app, which accomplishes this integration, thereby providing critical analysis and visualization functionality for GenomeSpace users. It has been downloaded over 850 times since the release of its first version in September, 2013. PMID:25165537
Enhancing the Breadth and Efficacy of Therapeutic Vaccines for Breast Cancer

DTIC Science & Technology

2014-10-01

sequence data produced by the Slansky team following their single-cell emulsion RT-PCR technique; however, it can be packaged and shared for use...cell emulsion RT-PCR. Additional modifications were made to our epitope discovery workflow to increase efficacy of transcript and neoantigen candidate...the MiTCR [8] open source software package developed by MiLaboratory. MiTCR is a highly efficient and fast approach to CDR3 extraction, clonotype
Implementation of Cyberinfrastructure and Data Management Workflow for a Large-Scale Sensor Network

NASA Astrophysics Data System (ADS)

Jones, A. S.; Horsburgh, J. S.

2014-12-01

Monitoring with in situ environmental sensors and other forms of field-based observation presents many challenges for data management, particularly for large-scale networks consisting of multiple sites, sensors, and personnel. The availability and utility of these data in addressing scientific questions relies on effective cyberinfrastructure that facilitates transformation of raw sensor data into functional data products. It also depends on the ability of researchers to share and access the data in useable formats. In addition to addressing the challenges presented by the quantity of data, monitoring networks need practices to ensure high data quality, including procedures and tools for post processing. Data quality is further enhanced if practitioners are able to track equipment, deployments, calibrations, and other events related to site maintenance and associate these details with observational data. In this presentation we will describe the overall workflow that we have developed for research groups and sites conducting long term monitoring using in situ sensors. Features of the workflow include: software tools to automate the transfer of data from field sites to databases, a Python-based program for data quality control post-processing, a web-based application for online discovery and visualization of data, and a data model and web interface for managing physical infrastructure. By automating the data management workflow, the time from collection to analysis is reduced and sharing and publication is facilitated. The incorporation of metadata standards and descriptions and the use of open-source tools enhances the sustainability and reusability of the data. We will describe the workflow and tools that we have developed in the context of the iUTAH (innovative Urban Transitions and Aridregion Hydrosustainability) monitoring network. The iUTAH network consists of aquatic and climate sensors deployed in three watersheds to monitor Gradients Along Mountain to Urban Transitions (GAMUT). The variety of environmental sensors and the multi-watershed, multi-institutional nature of the network necessitate a well-planned and efficient workflow for acquiring, managing, and sharing sensor data, which should be useful for similar large-scale and long-term networks.
Software for MR image overlay guided needle insertions: the clinical translation process

NASA Astrophysics Data System (ADS)

Ungi, Tamas; U-Thainual, Paweena; Fritz, Jan; Iordachita, Iulian I.; Flammang, Aaron J.; Carrino, John A.; Fichtinger, Gabor

2013-03-01

PURPOSE: Needle guidance software using augmented reality image overlay was translated from the experimental phase to support preclinical and clinical studies. Major functional and structural changes were needed to meet clinical requirements. We present the process applied to fulfill these requirements, and selected features that may be applied in the translational phase of other image-guided surgical navigation systems. METHODS: We used an agile software development process for rapid adaptation to unforeseen clinical requests. The process is based on iterations of operating room test sessions, feedback discussions, and software development sprints. The open-source application framework of 3D Slicer and the NA-MIC kit provided sufficient flexibility and stable software foundations for this work. RESULTS: All requirements were addressed in a process with 19 operating room test iterations. Most features developed in this phase were related to workflow simplification and operator feedback. CONCLUSION: Efficient and affordable modifications were facilitated by an open source application framework and frequent clinical feedback sessions. Results of cadaver experiments show that software requirements were successfully solved after a limited number of operating room tests.
Technical Note: PLASTIMATCH MABS, an open source tool for automatic image segmentation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zaffino, Paolo; Spadea, Maria Francesca

Purpose: Multiatlas based segmentation is largely used in many clinical and research applications. Due to its good performances, it has recently been included in some commercial platforms for radiotherapy planning and surgery guidance. Anyway, to date, a software with no restrictions about the anatomical district and image modality is still missing. In this paper we introduce PLASTIMATCH MABS, an open source software that can be used with any image modality for automatic segmentation. Methods: PLASTIMATCH MABS workflow consists of two main parts: (1) an offline phase, where optimal registration and voting parameters are tuned and (2) an online phase, wheremore » a new patient is labeled from scratch by using the same parameters as identified in the former phase. Several registration strategies, as well as different voting criteria can be selected. A flexible atlas selection scheme is also available. To prove the effectiveness of the proposed software across anatomical districts and image modalities, it was tested on two very different scenarios: head and neck (H&N) CT segmentation for radiotherapy application, and magnetic resonance image brain labeling for neuroscience investigation. Results: For the neurological study, minimum dice was equal to 0.76 (investigated structures: left and right caudate, putamen, thalamus, and hippocampus). For head and neck case, minimum dice was 0.42 for the most challenging structures (optic nerves and submandibular glands) and 0.62 for the other ones (mandible, brainstem, and parotid glands). Time required to obtain the labels was compatible with a real clinical workflow (35 and 120 min). Conclusions: The proposed software fills a gap in the multiatlas based segmentation field, since all currently available tools (both for commercial and for research purposes) are restricted to a well specified application. Furthermore, it can be adopted as a platform for exploring MABS parameters and as a reference implementation for comparing against other segmentation algorithms.« less
SimITK: visual programming of the ITK image-processing library within Simulink.

PubMed

Dickinson, Andrew W L; Abolmaesumi, Purang; Gobbi, David G; Mousavi, Parvin

2014-04-01

The Insight Segmentation and Registration Toolkit (ITK) is a software library used for image analysis, visualization, and image-guided surgery applications. ITK is a collection of C++ classes that poses the challenge of a steep learning curve should the user not have appropriate C++ programming experience. To remove the programming complexities and facilitate rapid prototyping, an implementation of ITK within a higher-level visual programming environment is presented: SimITK. ITK functionalities are automatically wrapped into "blocks" within Simulink, the visual programming environment of MATLAB, where these blocks can be connected to form workflows: visual schematics that closely represent the structure of a C++ program. The heavily templated C++ nature of ITK does not facilitate direct interaction between Simulink and ITK; an intermediary is required to convert respective data types and allow intercommunication. As such, a SimITK "Virtual Block" has been developed that serves as a wrapper around an ITK class which is capable of resolving the ITK data types to native Simulink data types. Part of the challenge surrounding this implementation involves automatically capturing and storing the pertinent class information that need to be refined from an initial state prior to being reflected within the final block representation. The primary result from the SimITK wrapping procedure is multiple Simulink block libraries. From these libraries, blocks are selected and interconnected to demonstrate two examples: a 3D segmentation workflow and a 3D multimodal registration workflow. Compared to their pure-code equivalents, the workflows highlight ITK usability through an alternative visual interpretation of the code that abstracts away potentially confusing technicalities.
Rapid Classification and Identification of Multiple Microorganisms with Accurate Statistical Significance via High-Resolution Tandem Mass Spectrometry

NASA Astrophysics Data System (ADS)

Alves, Gelio; Wang, Guanghui; Ogurtsov, Aleksey Y.; Drake, Steven K.; Gucek, Marjan; Sacks, David B.; Yu, Yi-Kuo

2018-06-01

Rapid and accurate identification and classification of microorganisms is of paramount importance to public health and safety. With the advance of mass spectrometry (MS) technology, the speed of identification can be greatly improved. However, the increasing number of microbes sequenced is complicating correct microbial identification even in a simple sample due to the large number of candidates present. To properly untwine candidate microbes in samples containing one or more microbes, one needs to go beyond apparent morphology or simple "fingerprinting"; to correctly prioritize the candidate microbes, one needs to have accurate statistical significance in microbial identification. We meet these challenges by using peptide-centric representations of microbes to better separate them and by augmenting our earlier analysis method that yields accurate statistical significance. Here, we present an updated analysis workflow that uses tandem MS (MS/MS) spectra for microbial identification or classification. We have demonstrated, using 226 MS/MS publicly available data files (each containing from 2500 to nearly 100,000 MS/MS spectra) and 4000 additional MS/MS data files, that the updated workflow can correctly identify multiple microbes at the genus and often the species level for samples containing more than one microbe. We have also shown that the proposed workflow computes accurate statistical significances, i.e., E values for identified peptides and unified E values for identified microbes. Our updated analysis workflow MiCId, a freely available software for Microorganism Classification and Identification, is available for download at https://www.ncbi.nlm.nih.gov/CBBresearch/Yu/downloads.html.
MAAMD: a workflow to standardize meta-analyses and comparison of affymetrix microarray data

PubMed Central

2014-01-01

Background Mandatory deposit of raw microarray data files for public access, prior to study publication, provides significant opportunities to conduct new bioinformatics analyses within and across multiple datasets. Analysis of raw microarray data files (e.g. Affymetrix CEL files) can be time consuming, complex, and requires fundamental computational and bioinformatics skills. The development of analytical workflows to automate these tasks simplifies the processing of, improves the efficiency of, and serves to standardize multiple and sequential analyses. Once installed, workflows facilitate the tedious steps required to run rapid intra- and inter-dataset comparisons. Results We developed a workflow to facilitate and standardize Meta-Analysis of Affymetrix Microarray Data analysis (MAAMD) in Kepler. Two freely available stand-alone software tools, R and AltAnalyze were embedded in MAAMD. The inputs of MAAMD are user-editable csv files, which contain sample information and parameters describing the locations of input files and required tools. MAAMD was tested by analyzing 4 different GEO datasets from mice and drosophila. MAAMD automates data downloading, data organization, data quality control assesment, differential gene expression analysis, clustering analysis, pathway visualization, gene-set enrichment analysis, and cross-species orthologous-gene comparisons. MAAMD was utilized to identify gene orthologues responding to hypoxia or hyperoxia in both mice and drosophila. The entire set of analyses for 4 datasets (34 total microarrays) finished in ~ one hour. Conclusions MAAMD saves time, minimizes the required computer skills, and offers a standardized procedure for users to analyze microarray datasets and make new intra- and inter-dataset comparisons. PMID:24621103
Rapid Classification and Identification of Multiple Microorganisms with Accurate Statistical Significance via High-Resolution Tandem Mass Spectrometry.

PubMed

Alves, Gelio; Wang, Guanghui; Ogurtsov, Aleksey Y; Drake, Steven K; Gucek, Marjan; Sacks, David B; Yu, Yi-Kuo

2018-06-05

Rapid and accurate identification and classification of microorganisms is of paramount importance to public health and safety. With the advance of mass spectrometry (MS) technology, the speed of identification can be greatly improved. However, the increasing number of microbes sequenced is complicating correct microbial identification even in a simple sample due to the large number of candidates present. To properly untwine candidate microbes in samples containing one or more microbes, one needs to go beyond apparent morphology or simple "fingerprinting"; to correctly prioritize the candidate microbes, one needs to have accurate statistical significance in microbial identification. We meet these challenges by using peptide-centric representations of microbes to better separate them and by augmenting our earlier analysis method that yields accurate statistical significance. Here, we present an updated analysis workflow that uses tandem MS (MS/MS) spectra for microbial identification or classification. We have demonstrated, using 226 MS/MS publicly available data files (each containing from 2500 to nearly 100,000 MS/MS spectra) and 4000 additional MS/MS data files, that the updated workflow can correctly identify multiple microbes at the genus and often the species level for samples containing more than one microbe. We have also shown that the proposed workflow computes accurate statistical significances, i.e., E values for identified peptides and unified E values for identified microbes. Our updated analysis workflow MiCId, a freely available software for Microorganism Classification and Identification, is available for download at https://www.ncbi.nlm.nih.gov/CBBresearch/Yu/downloads.html . Graphical Abstract ᅟ.
Agile Data Management with the Global Change Information System

NASA Astrophysics Data System (ADS)

Duggan, B.; Aulenbach, S.; Tilmes, C.; Goldstein, J.

2013-12-01

We describe experiences applying agile software development techniques to the realm of data management during the development of the Global Change Information System (GCIS), a web service and API for authoritative global change information under development by the US Global Change Research Program. Some of the challenges during system design and implementation have been : (1) balancing the need for a rigorous mechanism for ensuring information quality with the realities of large data sets whose contents are often in flux, (2) utilizing existing data to inform decisions about the scope and nature of new data, and (3) continuously incorporating new knowledge and concepts into a relational data model. The workflow for managing the content of the system has much in common with the development of the system itself. We examine various aspects of agile software development and discuss whether or how we have been able to use them for data curation as well as software development.
Quality Improvement With Discrete Event Simulation: A Primer for Radiologists.

PubMed

Booker, Michael T; O'Connell, Ryan J; Desai, Bhushan; Duddalwar, Vinay A

2016-04-01

The application of simulation software in health care has transformed quality and process improvement. Specifically, software based on discrete-event simulation (DES) has shown the ability to improve radiology workflows and systems. Nevertheless, despite the successful application of DES in the medical literature, the power and value of simulation remains underutilized. For this reason, the basics of DES modeling are introduced, with specific attention to medical imaging. In an effort to provide readers with the tools necessary to begin their own DES analyses, the practical steps of choosing a software package and building a basic radiology model are discussed. In addition, three radiology system examples are presented, with accompanying DES models that assist in analysis and decision making. Through these simulations, we provide readers with an understanding of the theory, requirements, and benefits of implementing DES in their own radiology practices. Copyright © 2016 American College of Radiology. All rights reserved.
RINGMesh: A programming library for developing mesh-based geomodeling applications

NASA Astrophysics Data System (ADS)

Pellerin, Jeanne; Botella, Arnaud; Bonneau, François; Mazuyer, Antoine; Chauvin, Benjamin; Lévy, Bruno; Caumon, Guillaume

2017-07-01

RINGMesh is a C++ open-source programming library for manipulating discretized geological models. It is designed to ease the development of applications and workflows that use discretized 3D models. It is neither a geomodeler, nor a meshing software. RINGMesh implements functionalities to read discretized surface-based or volumetric structural models and to check their validity. The models can be then exported in various file formats. RINGMesh provides data structures to represent geological structural models, either defined by their discretized boundary surfaces, and/or by discretized volumes. A programming interface allows to develop of new geomodeling methods, and to plug in external software. The goal of RINGMesh is to help researchers to focus on the implementation of their specific method rather than on tedious tasks common to many applications. The documented code is open-source and distributed under the modified BSD license. It is available at https://www.ring-team.org/index.php/software/ringmesh.
OpenMS: a flexible open-source software platform for mass spectrometry data analysis.

PubMed

Röst, Hannes L; Sachsenberg, Timo; Aiche, Stephan; Bielow, Chris; Weisser, Hendrik; Aicheler, Fabian; Andreotti, Sandro; Ehrlich, Hans-Christian; Gutenbrunner, Petra; Kenar, Erhan; Liang, Xiao; Nahnsen, Sven; Nilse, Lars; Pfeuffer, Julianus; Rosenberger, George; Rurik, Marc; Schmitt, Uwe; Veit, Johannes; Walzer, Mathias; Wojnar, David; Wolski, Witold E; Schilling, Oliver; Choudhary, Jyoti S; Malmström, Lars; Aebersold, Ruedi; Reinert, Knut; Kohlbacher, Oliver

2016-08-30

High-resolution mass spectrometry (MS) has become an important tool in the life sciences, contributing to the diagnosis and understanding of human diseases, elucidating biomolecular structural information and characterizing cellular signaling networks. However, the rapid growth in the volume and complexity of MS data makes transparent, accurate and reproducible analysis difficult. We present OpenMS 2.0 (http://www.openms.de), a robust, open-source, cross-platform software specifically designed for the flexible and reproducible analysis of high-throughput MS data. The extensible OpenMS software implements common mass spectrometric data processing tasks through a well-defined application programming interface in C++ and Python and through standardized open data formats. OpenMS additionally provides a set of 185 tools and ready-made workflows for common mass spectrometric data processing tasks, which enable users to perform complex quantitative mass spectrometric analyses with ease.
Towards a Scalable and Adaptive Application Support Platform for Large-Scale Distributed E-Sciences in High-Performance Network Environments

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wu, Chase Qishi; Zhu, Michelle Mengxia

The advent of large-scale collaborative scientific applications has demonstrated the potential for broad scientific communities to pool globally distributed resources to produce unprecedented data acquisition, movement, and analysis. System resources including supercomputers, data repositories, computing facilities, network infrastructures, storage systems, and display devices have been increasingly deployed at national laboratories and academic institutes. These resources are typically shared by large communities of users over Internet or dedicated networks and hence exhibit an inherent dynamic nature in their availability, accessibility, capacity, and stability. Scientific applications using either experimental facilities or computation-based simulations with various physical, chemical, climatic, and biological models featuremore » diverse scientific workflows as simple as linear pipelines or as complex as a directed acyclic graphs, which must be executed and supported over wide-area networks with massively distributed resources. Application users oftentimes need to manually configure their computing tasks over networks in an ad hoc manner, hence significantly limiting the productivity of scientists and constraining the utilization of resources. The success of these large-scale distributed applications requires a highly adaptive and massively scalable workflow platform that provides automated and optimized computing and networking services. This project is to design and develop a generic Scientific Workflow Automation and Management Platform (SWAMP), which contains a web-based user interface specially tailored for a target application, a set of user libraries, and several easy-to-use computing and networking toolkits for application scientists to conveniently assemble, execute, monitor, and control complex computing workflows in heterogeneous high-performance network environments. SWAMP will enable the automation and management of the entire process of scientific workflows with the convenience of a few mouse clicks while hiding the implementation and technical details from end users. Particularly, we will consider two types of applications with distinct performance requirements: data-centric and service-centric applications. For data-centric applications, the main workflow task involves large-volume data generation, catalog, storage, and movement typically from supercomputers or experimental facilities to a team of geographically distributed users; while for service-centric applications, the main focus of workflow is on data archiving, preprocessing, filtering, synthesis, visualization, and other application-specific analysis. We will conduct a comprehensive comparison of existing workflow systems and choose the best suited one with open-source code, a flexible system structure, and a large user base as the starting point for our development. Based on the chosen system, we will develop and integrate new components including a black box design of computing modules, performance monitoring and prediction, and workflow optimization and reconfiguration, which are missing from existing workflow systems. A modular design for separating specification, execution, and monitoring aspects will be adopted to establish a common generic infrastructure suited for a wide spectrum of science applications. We will further design and develop efficient workflow mapping and scheduling algorithms to optimize the workflow performance in terms of minimum end-to-end delay, maximum frame rate, and highest reliability. We will develop and demonstrate the SWAMP system in a local environment, the grid network, and the 100Gpbs Advanced Network Initiative (ANI) testbed. The demonstration will target scientific applications in climate modeling and high energy physics and the functions to be demonstrated include workflow deployment, execution, steering, and reconfiguration. Throughout the project period, we will work closely with the science communities in the fields of climate modeling and high energy physics including Spallation Neutron Source (SNS) and Large Hadron Collider (LHC) projects to mature the system for production use.« less

Some links on this page may take you to non-federal websites. Their policies may differ from this site.