parallel tools platform: Topics by Science.gov

Sample records for parallel tools platform

A Comparison of Automatic Parallelization Tools/Compilers on the SGI Origin 2000 Using the NAS Benchmarks

NASA Technical Reports Server (NTRS)

Saini, Subhash; Frumkin, Michael; Hribar, Michelle; Jin, Hao-Qiang; Waheed, Abdul; Yan, Jerry

1998-01-01

Porting applications to new high performance parallel and distributed computing platforms is a challenging task. Since writing parallel code by hand is extremely time consuming and costly, porting codes would ideally be automated by using some parallelization tools and compilers. In this paper, we compare the performance of the hand-written NAS Parallel Benchmarks against three parallel versions generated with the help of tools and compilers: 1) CAPTools: an interactive computer aided parallelization tool that generates message passing code, 2) the Portland Group's HPF compiler and 3) using compiler directives with the native FORTRAN77 compiler on the SGI Origin2000.
A software platform for continuum modeling of ion channels based on unstructured mesh

NASA Astrophysics Data System (ADS)

Tu, B.; Bai, S. Y.; Chen, M. X.; Xie, Y.; Zhang, L. B.; Lu, B. Z.

2014-01-01

Most traditional continuum molecular modeling adopted finite difference or finite volume methods which were based on a structured mesh (grid). Unstructured meshes were only occasionally used, but an increased number of applications emerge in molecular simulations. To facilitate the continuum modeling of biomolecular systems based on unstructured meshes, we are developing a software platform with tools which are particularly beneficial to those approaches. This work describes the software system specifically for the simulation of a typical, complex molecular procedure: ion transport through a three-dimensional channel system that consists of a protein and a membrane. The platform contains three parts: a meshing tool chain for ion channel systems, a parallel finite element solver for the Poisson-Nernst-Planck equations describing the electrodiffusion process of ion transport, and a visualization program for continuum molecular modeling. The meshing tool chain in the platform, which consists of a set of mesh generation tools, is able to generate high-quality surface and volume meshes for ion channel systems. The parallel finite element solver in our platform is based on the parallel adaptive finite element package PHG which was developed by one of the authors [1]. As a featured component of the platform, a new visualization program, VCMM, has specifically been developed for continuum molecular modeling with an emphasis on providing useful facilities for unstructured mesh-based methods and for their output analysis and visualization. VCMM provides a graphical user interface and consists of three modules: a molecular module, a meshing module and a numerical module. A demonstration of the platform is provided with a study of two real proteins, the connexin 26 and hemolysin ion channels.
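For orientation, the Poisson-Nernst-Planck system named in this abstract is commonly written in the form below. This is the textbook form, not necessarily the exact formulation or boundary conditions used by the authors. The symbols are assumptions introduced here: c_i, D_i and z_i are the concentration, diffusion coefficient and valence of ion species i (of K species), phi is the electrostatic potential, epsilon the permittivity, rho_f the fixed charge density of the biomolecule, e the elementary charge, and k_B T the thermal energy.

```latex
\begin{aligned}
\frac{\partial c_i}{\partial t}
  &= \nabla \cdot \left[ D_i \left( \nabla c_i
     + \frac{z_i e}{k_B T}\, c_i \nabla \phi \right) \right],
     \qquad i = 1, \dots, K, \\
-\nabla \cdot \left( \epsilon \nabla \phi \right)
  &= \rho_f + \sum_{i=1}^{K} z_i e\, c_i .
\end{aligned}
```

The first equation (Nernst-Planck) describes diffusion plus drift of each ion species in the electric field. The second (Poisson) couples the potential back to the ion concentrations, which is why the two are solved together.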
Parallel processing implementation for the coupled transport of photons and electrons using OpenMP

NASA Astrophysics Data System (ADS)

Doerner, Edgardo

2016-05-01

In this work the use of OpenMP to implement the parallel processing of the Monte Carlo (MC) simulation of the coupled transport of photons and electrons is presented. This implementation was carried out using a modified EGSnrc platform which enables the use of the Microsoft Visual Studio 2013 (VS2013) environment, together with the development tools available in the Intel Parallel Studio XE 2015 (XE2015). The performance study of this new implementation was carried out on a desktop PC with a multi-core CPU, taking as a reference the performance of the original platform. The results were satisfactory, both in terms of scalability and parallelization efficiency.
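Monte Carlo transport parallelizes naturally because particle histories are independent: each worker can simulate its own share with its own random stream, and the tallies are summed at the end. The following is a minimal Python sketch of that decomposition only. It is not EGSnrc or OpenMP code, and the toy slab-transmission problem, the function names, and the parameters `mu` and `thickness` are all assumptions made for illustration.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def count_transmitted(n_histories, mu, thickness, seed):
    """Count toy 'photons' whose first interaction lies beyond the slab."""
    rng = random.Random(seed)  # independent random stream per worker
    transmitted = 0
    for _ in range(n_histories):
        # Sample a free path length from an exponential distribution.
        path = -math.log(1.0 - rng.random()) / mu
        if path > thickness:
            transmitted += 1
    return transmitted


def transmitted_fraction(n_histories, n_workers, mu=1.0, thickness=1.0):
    """Split the histories across workers and combine their tallies."""
    base, extra = divmod(n_histories, n_workers)
    chunks = [base + (1 if i < extra else 0) for i in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        tallies = pool.map(
            count_transmitted,
            chunks,
            [mu] * n_workers,
            [thickness] * n_workers,
            range(n_workers),  # hypothetical seeding scheme: one seed per worker
        )
        total = sum(tallies)
    return total / n_histories
```

For this toy problem the estimate should approach exp(-mu * thickness) as the number of histories grows. CPython threads will not show the multi-core scaling reported in the abstract; the sketch only illustrates how histories and random streams are divided among workers.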
Parallelization of NAS Benchmarks for Shared Memory Multiprocessors

NASA Technical Reports Server (NTRS)

Waheed, Abdul; Yan, Jerry C.; Saini, Subhash (Technical Monitor)

1998-01-01

This paper presents our experiences of parallelizing the sequential implementation of NAS benchmarks using compiler directives on SGI Origin2000 distributed shared memory (DSM) system. Porting existing applications to new high performance parallel and distributed computing platforms is a challenging task. Ideally, a user develops a sequential version of the application, leaving the task of porting to new generations of high performance computing systems to parallelization tools and compilers. Due to the simplicity of programming shared-memory multiprocessors, compiler developers have provided various facilities to allow the users to exploit parallelism. Native compilers on SGI Origin2000 support multiprocessing directives to allow users to exploit loop-level parallelism in their programs. Additionally, supporting tools can accomplish this process automatically and present the results of parallelization to the users. We experimented with these compiler directives and supporting tools by parallelizing sequential implementation of NAS benchmarks. Results reported in this paper indicate that with minimal effort, the performance gain is comparable with the hand-parallelized, carefully optimized, message-passing implementations of the same benchmarks.
Three-Point Gear/Lead Screw Positioning

NASA Technical Reports Server (NTRS)

Calco, Frank S.

1993-01-01

Triple-ganged-lead-screw positioning mechanism drives movable plate toward or away from fixed plate and keeps plates parallel to each other. Designed for use in tuning microwave resonant cavity. Other potential applications include adjustable bed plates and cantilever tail stocks in machine tools, adjustable platforms for optical equipment, and lifting platforms.
Techniques and Tools for Performance Tuning of Parallel and Distributed Scientific Applications

NASA Technical Reports Server (NTRS)

Sarukkai, Sekhar R.; VanderWijngaart, Rob F.; Castagnera, Karen (Technical Monitor)

1994-01-01

Performance degradation in scientific computing on parallel and distributed computer systems can be caused by numerous factors. In this half-day tutorial we explain what are the important methodological issues involved in obtaining codes that have good performance potential. Then we discuss what are the possible obstacles in realizing that potential on contemporary hardware platforms, and give an overview of the software tools currently available for identifying the performance bottlenecks. Finally, some realistic examples are used to illustrate the actual use and utility of such tools.
A high-performance spatial database based approach for pathology imaging algorithm evaluation

PubMed Central

Wang, Fusheng; Kong, Jun; Gao, Jingjing; Cooper, Lee A.D.; Kurc, Tahsin; Zhou, Zhengwen; Adler, David; Vergara-Niedermayr, Cristobal; Katigbak, Bryan; Brat, Daniel J.; Saltz, Joel H.

2013-01-01

Background: Algorithm evaluation provides a means to characterize variability across image analysis algorithms, validate algorithms by comparison with human annotations, combine results from multiple algorithms for performance improvement, and facilitate algorithm sensitivity studies. The sizes of images and image analysis results in pathology image analysis pose significant challenges in algorithm evaluation. We present an efficient parallel spatial database approach to model, normalize, manage, and query large volumes of analytical image result data. This provides an efficient platform for algorithm evaluation. Our experiments with a set of brain tumor images demonstrate the application, scalability, and effectiveness of the platform. Context: The paper describes an approach and platform for evaluation of pathology image analysis algorithms. The platform facilitates algorithm evaluation through a high-performance database built on the Pathology Analytic Imaging Standards (PAIS) data model. Aims: (1) Develop a framework to support algorithm evaluation by modeling and managing analytical results and human annotations from pathology images; (2) Create a robust data normalization tool for converting, validating, and fixing spatial data from algorithm or human annotations; (3) Develop a set of queries to support data sampling and result comparisons; (4) Achieve high performance computation capacity via a parallel data management infrastructure, parallel data loading and spatial indexing optimizations in this infrastructure. Materials and Methods: We have considered two scenarios for algorithm evaluation: (1) algorithm comparison where multiple result sets from different methods are compared and consolidated; and (2) algorithm validation where algorithm results are compared with human annotations. We have developed a spatial normalization toolkit to validate and normalize spatial boundaries produced by image analysis algorithms or human annotations. 
The validated data were formatted based on the PAIS data model and loaded into a spatial database. To support efficient data loading, we have implemented a parallel data loading tool that takes advantage of multi-core CPUs to accelerate data injection. The spatial database manages both geometric shapes and image features or classifications, and enables spatial sampling, result comparison, and result aggregation through expressive structured query language (SQL) queries with spatial extensions. To provide scalable and efficient query support, we have employed a shared nothing parallel database architecture, which distributes data homogenously across multiple database partitions to take advantage of parallel computation power and implements spatial indexing to achieve high I/O throughput. Results: Our work proposes a high performance, parallel spatial database platform for algorithm validation and comparison. This platform was evaluated by storing, managing, and comparing analysis results from a set of brain tumor whole slide images. The tools we develop are open source and available to download. Conclusions: Pathology image algorithm validation and comparison are essential to iterative algorithm development and refinement. One critical component is the support for queries involving spatial predicates and comparisons. In our work, we develop an efficient data model and parallel database approach to model, normalize, manage and query large volumes of analytical image result data. Our experiments demonstrate that the data partitioning strategy and the grid-based indexing result in good data distribution across database nodes and reduce I/O overhead in spatial join queries through parallel retrieval of relevant data and quick subsetting of datasets. The set of tools in the framework provide a full pipeline to normalize, load, manage and query analytical results for algorithm evaluation. PMID:23599905
Distributed and parallel approach for handle and perform huge datasets

NASA Astrophysics Data System (ADS)

Konopko, Joanna

2015-12-01

Big Data refers to the dynamic, large and disparate volumes of data that come from many different sources (tools, machines, sensors, mobile devices) uncorrelated with each other. It requires new, innovative and scalable technology to collect, host and analytically process the vast amount of data. A proper architecture for a system that processes huge data sets is needed. In this paper, a comparison of distributed and parallel system architectures is presented using the example of the MapReduce (MR) Hadoop platform and a parallel database platform (DBMS). This paper also analyzes the problem of extracting and handling valuable information from petabytes of data. Both paradigms, MapReduce and parallel DBMS, are described and compared. A hybrid architecture approach is also proposed that could be used to solve the analyzed problem of storing and processing Big Data.
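To make the MapReduce paradigm mentioned above concrete, here is a small single-process Python sketch of its three stages (map, shuffle, reduce) applied to word counting. It does not use Hadoop, and the function names are hypothetical. In a real deployment the map and reduce calls would run on different nodes and the shuffle would move data across the network.

```python
from collections import defaultdict


def map_phase(record):
    """Emit a (key, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine the values collected for one key."""
    return key, sum(values)


def map_reduce(records):
    """Run the three stages in sequence on an in-memory list of records."""
    pairs = [pair for record in records for pair in map_phase(record)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

A parallel DBMS would typically express the same aggregation as a declarative GROUP BY query and leave the partitioning to the query planner, which is one of the contrasts such comparisons usually draw.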
Clinical validation of the 50 gene AmpliSeq Cancer Panel V2 for use on a next generation sequencing platform using formalin fixed, paraffin embedded and fine needle aspiration tumour specimens.

PubMed

Rathi, Vivek; Wright, Gavin; Constantin, Diana; Chang, Siok; Pham, Huong; Jones, Kerryn; Palios, Atha; Mclachlan, Sue-Anne; Conron, Matthew; McKelvie, Penny; Williams, Richard

2017-01-01

The advent of massively parallel sequencing has caused a paradigm shift in the ways cancer is treated, as personalised therapy becomes a reality. More and more laboratories are looking to introduce next generation sequencing (NGS) as a tool for mutational analysis, as this technology has many advantages compared to conventional platforms like Sanger sequencing. In Australia all massively parallel sequencing platforms are still considered in-house in vitro diagnostic tools by the National Association of Testing Authorities (NATA) and a comprehensive analytical validation of all assays, and not just mere verification, is a strict requirement before accreditation can be granted for clinical testing on these platforms. Analytical validation of assays on NGS platforms can prove to be extremely challenging for pathology laboratories. Although there are many affordable and easily accessible NGS instruments available, there are no standardised guidelines as yet for clinical validation of NGS assays. We present an accreditation development procedure that was both comprehensive and applicable in a setting of hospital laboratory for NGS services. This approach may also be applied to other NGS applications in service laboratories. Copyright © 2016 Royal College of Pathologists of Australasia. Published by Elsevier B.V. All rights reserved.
Long Read Alignment with Parallel MapReduce Cloud Platform

PubMed Central

Al-Absi, Ahmed Abdulhakim; Kang, Dae-Ki

2015-01-01

Genomic sequence alignment is an important technique to decode genome sequences in bioinformatics. Next-Generation Sequencing technologies produce genomic data of longer reads. Cloud platforms are adopted to address the problems arising from storage and analysis of large genomic data. Existing genes sequencing tools for cloud platforms predominantly consider short read gene sequences and adopt the Hadoop MapReduce framework for computation. However, serial execution of map and reduce phases is a problem in such systems. Therefore, in this paper, we introduce Burrows-Wheeler Aligner's Smith-Waterman Alignment on Parallel MapReduce (BWASW-PMR) cloud platform for long sequence alignment. The proposed cloud platform adopts a widely accepted and accurate BWA-SW algorithm for long sequence alignment. A custom MapReduce platform is developed to overcome the drawbacks of the Hadoop framework. A parallel execution strategy of the MapReduce phases and optimization of Smith-Waterman algorithm are considered. Performance evaluation results exhibit an average speed-up of 6.7 considering BWASW-PMR compared with the state-of-the-art Bwasw-Cloud. An average reduction of 30% in the map phase makespan is reported across all experiments comparing BWASW-PMR with Bwasw-Cloud. Optimization of Smith-Waterman results in reducing the execution time by 91.8%. The experimental study proves the efficiency of BWASW-PMR for aligning long genomic sequences on cloud platforms. PMID:26839887
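The Smith-Waterman step referred to above is a dynamic-programming local alignment. The sketch below computes only the best local alignment score with a linear gap penalty. It is the plain textbook recurrence, not the BWA-SW heuristic or the optimized version described in the paper, and the scoring values (`match`, `mismatch`, `gap`) are arbitrary assumptions.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between sequences a and b."""
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

Each cell depends only on its left, upper, and upper-left neighbours, so the work grows with the product of the two sequence lengths. That cost is the reason long-read aligners look for ways to prune or distribute this computation.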
Long Read Alignment with Parallel MapReduce Cloud Platform.

PubMed

Al-Absi, Ahmed Abdulhakim; Kang, Dae-Ki

2015-01-01

Genomic sequence alignment is an important technique to decode genome sequences in bioinformatics. Next-Generation Sequencing technologies produce genomic data of longer reads. Cloud platforms are adopted to address the problems arising from storage and analysis of large genomic data. Existing genes sequencing tools for cloud platforms predominantly consider short read gene sequences and adopt the Hadoop MapReduce framework for computation. However, serial execution of map and reduce phases is a problem in such systems. Therefore, in this paper, we introduce Burrows-Wheeler Aligner's Smith-Waterman Alignment on Parallel MapReduce (BWASW-PMR) cloud platform for long sequence alignment. The proposed cloud platform adopts a widely accepted and accurate BWA-SW algorithm for long sequence alignment. A custom MapReduce platform is developed to overcome the drawbacks of the Hadoop framework. A parallel execution strategy of the MapReduce phases and optimization of Smith-Waterman algorithm are considered. Performance evaluation results exhibit an average speed-up of 6.7 considering BWASW-PMR compared with the state-of-the-art Bwasw-Cloud. An average reduction of 30% in the map phase makespan is reported across all experiments comparing BWASW-PMR with Bwasw-Cloud. Optimization of Smith-Waterman results in reducing the execution time by 91.8%. The experimental study proves the efficiency of BWASW-PMR for aligning long genomic sequences on cloud platforms.
A Multi-Level Parallelization Concept for High-Fidelity Multi-Block Solvers

NASA Technical Reports Server (NTRS)

Hatay, Ferhat F.; Jespersen, Dennis C.; Guruswamy, Guru P.; Rizk, Yehia M.; Byun, Chansup; Gee, Ken; VanDalsem, William R. (Technical Monitor)

1997-01-01

The integration of high-fidelity Computational Fluid Dynamics (CFD) analysis tools with the industrial design process benefits greatly from the robust implementations that are transportable across a wide range of computer architectures. In the present work, a hybrid domain-decomposition and parallelization concept was developed and implemented into the widely-used NASA multi-block Computational Fluid Dynamics (CFD) packages implemented in ENSAERO and OVERFLOW. The new parallel solver concept, PENS (Parallel Euler Navier-Stokes Solver), employs both fine and coarse granularity in data partitioning as well as data coalescing to obtain the desired load-balance characteristics on the available computer platforms. This multi-level parallelism implementation itself introduces no changes to the numerical results, hence the original fidelity of the packages are identically preserved. The present implementation uses the Message Passing Interface (MPI) library for interprocessor message passing and memory accessing. By choosing an appropriate combination of the available partitioning and coalescing capabilities only during the execution stage, the PENS solver becomes adaptable to different computer architectures from shared-memory to distributed-memory platforms with varying degrees of parallelism. The PENS implementation on the IBM SP2 distributed memory environment at the NASA Ames Research Center obtains 85 percent scalable parallel performance using fine-grain partitioning of single-block CFD domains using up to 128 wide computational nodes. Multi-block CFD simulations of complete aircraft simulations achieve 75 percent perfect load-balanced executions using data coalescing and the two levels of parallelism. SGI PowerChallenge, SGI Origin 2000, and a cluster of workstations are the other platforms where the robustness of the implementation is tested. 
The performance behavior on the other computer platforms with a variety of realistic problems will be included as this on-going study progresses.
MPI, HPF or OpenMP: A Study with the NAS Benchmarks

NASA Technical Reports Server (NTRS)

Jin, Hao-Qiang; Frumkin, Michael; Hribar, Michelle; Waheed, Abdul; Yan, Jerry; Saini, Subhash (Technical Monitor)

1999-01-01

Porting applications to new high performance parallel and distributed platforms is a challenging task. Writing parallel code by hand is time consuming and costly, but the task can be simplified by high level languages and would even better be automated by parallelizing tools and compilers. The definition of HPF (High Performance Fortran, based on data parallel model) and OpenMP (based on shared memory parallel model) standards has offered great opportunity in this respect. Both provide simple and clear interfaces to language like FORTRAN and simplify many tedious tasks encountered in writing message passing programs. In our study we implemented the parallel versions of the NAS Benchmarks with HPF and OpenMP directives. Comparison of their performance with the MPI implementation and pros and cons of different approaches will be discussed along with experience of using computer-aided tools to help parallelize these benchmarks. Based on the study, potentials of applying some of the techniques to realistic aerospace applications will be presented.
MPI, HPF or OpenMP: A Study with the NAS Benchmarks

NASA Technical Reports Server (NTRS)

Jin, H.; Frumkin, M.; Hribar, M.; Waheed, A.; Yan, J.; Saini, Subhash (Technical Monitor)

1999-01-01

Porting applications to new high performance parallel and distributed platforms is a challenging task. Writing parallel code by hand is time consuming and costly, but this task can be simplified by high level languages and would even better be automated by parallelizing tools and compilers. The definition of HPF (High Performance Fortran, based on data parallel model) and OpenMP (based on shared memory parallel model) standards has offered great opportunity in this respect. Both provide simple and clear interfaces to language like FORTRAN and simplify many tedious tasks encountered in writing message passing programs. In our study, we implemented the parallel versions of the NAS Benchmarks with HPF and OpenMP directives. Comparison of their performance with the MPI implementation and pros and cons of different approaches will be discussed along with experience of using computer-aided tools to help parallelize these benchmarks. Based on the study, potentials of applying some of the techniques to realistic aerospace applications will be presented.
Reconstructing evolutionary trees in parallel for massive sequences.

PubMed

Zou, Quan; Wan, Shixiang; Zeng, Xiangxiang; Ma, Zhanshan Sam

2017-12-14

Building evolutionary trees for massive unaligned DNA sequences is challenging and crucial. However, reconstructing an evolutionary tree for ultra-large sequences is hard. Massive multiple sequence alignment is also challenging and time/space consuming. Hadoop and Spark, which were developed recently, bring new hope for classical computational biology problems. In this paper, we tried to solve the multiple sequence alignment and evolutionary reconstruction in parallel. HPTree, which is developed in this paper, can deal with big DNA sequence files quickly. It works well on files larger than 1 GB, and gets better performance than other evolutionary reconstruction tools. Users could use HPTree for reconstructing evolutionary trees on computer clusters or cloud platforms (e.g., Amazon Cloud). HPTree could help with population evolution research and metagenomics analysis. In this paper, we employ the Hadoop and Spark platforms and design an evolutionary tree reconstruction software tool for unaligned massive DNA sequences. Clustering and multiple sequence alignment are done in parallel. The neighbour-joining model was employed for the evolutionary tree building. We have released our software together with source code via http://lab.malab.cn/soft/HPtree/ .
Scaling Support Vector Machines On Modern HPC Platforms

DOE Office of Scientific and Technical Information (OSTI.GOV)

You, Yang; Fu, Haohuan; Song, Shuaiwen

2015-02-01

We designed and implemented MIC-SVM, a highly efficient parallel SVM for x86 based multicore and many-core architectures, such as the Intel Ivy Bridge CPUs and Intel Xeon Phi co-processor (MIC). We propose various novel analysis methods and optimization techniques to fully utilize the multilevel parallelism provided by these architectures and serve as general optimization methods for other machine learning tools.
Aztec user's guide. Version 1

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hutchinson, S.A.; Shadid, J.N.; Tuminaro, R.S.

1995-10-01

Aztec is an iterative library that greatly simplifies the parallelization process when solving the linear systems of equations Ax = b where A is a user supplied n x n sparse matrix, b is a user supplied vector of length n and x is a vector of length n to be computed. Aztec is intended as a software tool for users who want to avoid cumbersome parallel programming details but who have large sparse linear systems which require an efficiently utilized parallel processing system. A collection of data transformation tools are provided that allow for easy creation of distributed sparse unstructured matrices for parallel solution. Once the distributed matrix is created, computation can be performed on any of the parallel machines running Aztec: nCUBE 2, IBM SP2 and Intel Paragon, MPI platforms as well as standard serial and vector platforms. Aztec includes a number of Krylov iterative methods such as conjugate gradient (CG), generalized minimum residual (GMRES) and stabilized biconjugate gradient (BICGSTAB) to solve systems of equations. These Krylov methods are used in conjunction with various preconditioners such as polynomial or domain decomposition methods using LU or incomplete LU factorizations within subdomains. Although the matrix A can be general, the package has been designed for matrices arising from the approximation of partial differential equations (PDEs). In particular, the Aztec package is oriented toward systems arising from PDE applications.
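As an illustration of the kind of Krylov method listed above, the following is a minimal, serial, unpreconditioned conjugate gradient sketch for Ax = b in plain Python. It is not Aztec's API: the function names are hypothetical, the matrix is stored as dense nested lists for brevity, and A is assumed to be symmetric positive definite, which CG requires.

```python
def mat_vec(A, x):
    """Dense matrix-vector product; a sparse library would skip the zeros."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def dot(u, v):
    """Inner product of two vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A, starting from x = 0."""
    x = [0.0] * len(b)
    r = list(b)  # residual b - A x, which equals b for the zero initial guess
    p = list(r)  # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:  # stop once the residual norm is small
            break
        Ap = mat_vec(A, p)
        alpha = rs_old / dot(p, Ap)  # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        # New direction: current residual plus a multiple of the old direction.
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x
```

The only operations on A are matrix-vector products, and the only global operations are inner products. In a distributed setting those are the two places where communication occurs, which is why a library can hide most of the parallel detail behind them.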
The adverse outcome pathway knowledge base

EPA Science Inventory

The rapid advancement of the Adverse Outcome Pathway (AOP) framework has been paralleled by the development of tools to store, analyse, and explore AOPs. The AOP Knowledge Base (AOP-KB) project has brought three independently developed platforms (Effectopedia, AOP-Wiki, and AOP-X...
A high performance scientific cloud computing environment for materials simulations

NASA Astrophysics Data System (ADS)

Jorissen, K.; Vila, F. D.; Rehr, J. J.

2012-09-01

We describe the development of a scientific cloud computing (SCC) platform that offers high performance computation capability. The platform consists of a scientific virtual machine prototype containing a UNIX operating system and several materials science codes, together with essential interface tools (an SCC toolset) that offers functionality comparable to local compute clusters. In particular, our SCC toolset provides automatic creation of virtual clusters for parallel computing, including tools for execution and monitoring performance, as well as efficient I/O utilities that enable seamless connections to and from the cloud. Our SCC platform is optimized for the Amazon Elastic Compute Cloud (EC2). We present benchmarks for prototypical scientific applications and demonstrate performance comparable to local compute clusters. To facilitate code execution and provide user-friendly access, we have also integrated cloud computing capability in a JAVA-based GUI. Our SCC platform may be an alternative to traditional HPC resources for materials science or quantum chemistry applications.
Updating the Micro-Tom TILLING platform.

PubMed

Okabe, Yoshihiro; Ariizumi, Tohru; Ezura, Hiroshi

2013-03-01

The dwarf tomato variety Micro-Tom is regarded as a model system for functional genomics studies in tomato. Various tomato genomic tools in the genetic background of Micro-Tom have been established, such as mutant collections, genome information and a metabolomic database. Recent advances in tomato genome sequencing have brought about a significant need for reverse genetics tools that are accessible to the larger community, because a great number of gene sequences have become available from public databases. To meet the requests from the tomato research community, we have developed the Micro-Tom Targeting-Induced Local Lesions IN Genomes (TILLING) platform, which is comprised of more than 5000 EMS-mutagenized lines. The platform serves as a reverse genetics tool for efficiently identifying mutant alleles in parallel with the development of Micro-Tom mutant collections. The combination of Micro-Tom mutant libraries and the TILLING approach enables researchers to accelerate the isolation of desirable mutants for unraveling gene function or breeding. To upgrade the genomic tool of Micro-Tom, the development of a new mutagenized population is underway. In this paper, the current status of the Micro-Tom TILLING platform and its future prospects are described.

DNA Assembly with De Bruijn Graphs Using an FPGA Platform.

PubMed

Poirier, Carl; Gosselin, Benoit; Fortier, Paul

2018-01-01

This paper presents an FPGA implementation of a DNA assembly algorithm, called Ray, initially developed to run on parallel CPUs. The OpenCL language is used and the focus is placed on modifying and optimizing the original algorithm to better suit the new parallelization tool and the radically different hardware architecture. The results show that the execution time is roughly one fourth that of the CPU and factoring energy consumption yields a tenfold savings.
A review of bioinformatic methods for forensic DNA analyses.

PubMed

Liu, Yao-Yuan; Harbison, SallyAnn

2018-03-01

Short tandem repeats, single nucleotide polymorphisms, and whole mitochondrial analyses are three classes of markers which will play an important role in the future of forensic DNA typing. The arrival of massively parallel sequencing platforms in forensic science reveals new information such as insights into the complexity and variability of the markers that were previously unseen, along with amounts of data too immense for analyses by manual means. Along with the sequencing chemistries employed, bioinformatic methods are required to process and interpret this new and extensive data. As more is learnt about the use of these new technologies for forensic applications, development and standardization of efficient, favourable tools for each stage of data processing is being carried out, and faster, more accurate methods that improve on the original approaches have been developed. As forensic laboratories search for the optimal pipeline of tools, sequencer manufacturers have incorporated pipelines into sequencer software to make analyses convenient. This review explores the current state of bioinformatic methods and tools used for the analyses of forensic markers sequenced on the massively parallel sequencing (MPS) platforms currently most widely used. Copyright © 2017 Elsevier B.V. All rights reserved.
High-Performance Integrated Virtual Environment (HIVE) Tools and Applications for Big Data Analysis.

PubMed

Simonyan, Vahan; Mazumder, Raja

2014-09-30

The High-performance Integrated Virtual Environment (HIVE) is a high-throughput cloud-based infrastructure developed for the storage and analysis of genomic and associated biological data. HIVE consists of a web-accessible interface for authorized users to deposit, retrieve, share, annotate, compute and visualize Next-generation Sequencing (NGS) data in a scalable and highly efficient fashion. The platform contains a distributed storage library and a distributed computational powerhouse linked seamlessly. Resources available through the interface include algorithms, tools and applications developed exclusively for the HIVE platform, as well as commonly used external tools adapted to operate within the parallel architecture of the system. HIVE is composed of a flexible infrastructure, which allows for simple implementation of new algorithms and tools. Currently, available HIVE tools include sequence alignment and nucleotide variation profiling tools, metagenomic analyzers, phylogenetic tree-building tools using NGS data, clone discovery algorithms, and recombination analysis algorithms. In addition to tools, HIVE also provides knowledgebases that can be used in conjunction with the tools for NGS sequence and metadata analysis.
High-Performance Integrated Virtual Environment (HIVE) Tools and Applications for Big Data Analysis

PubMed Central

Simonyan, Vahan; Mazumder, Raja

2014-01-01

The High-performance Integrated Virtual Environment (HIVE) is a high-throughput cloud-based infrastructure developed for the storage and analysis of genomic and associated biological data. HIVE consists of a web-accessible interface for authorized users to deposit, retrieve, share, annotate, compute and visualize Next-generation Sequencing (NGS) data in a scalable and highly efficient fashion. The platform contains a distributed storage library and a distributed computational powerhouse linked seamlessly. Resources available through the interface include algorithms, tools and applications developed exclusively for the HIVE platform, as well as commonly used external tools adapted to operate within the parallel architecture of the system. HIVE is composed of a flexible infrastructure, which allows for simple implementation of new algorithms and tools. Currently, available HIVE tools include sequence alignment and nucleotide variation profiling tools, metagenomic analyzers, phylogenetic tree-building tools using NGS data, clone discovery algorithms, and recombination analysis algorithms. In addition to tools, HIVE also provides knowledgebases that can be used in conjunction with the tools for NGS sequence and metadata analysis. PMID:25271953
PIPER: Performance Insight for Programmers and Exascale Runtimes: Guiding the Development of the Exascale Software Stack

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mellor-Crummey, John

The PIPER project set out to develop methodologies and software for measurement, analysis, attribution, and presentation of performance data for extreme-scale systems. Goals of the project were to support analysis of massive multi-scale parallelism, heterogeneous architectures, multi-faceted performance concerns, and to support both post-mortem performance analysis to identify program features that contribute to problematic performance and on-line performance analysis to drive adaptation. This final report summarizes the research and development activity at Rice University as part of the PIPER project. Producing a complete suite of performance tools for exascale platforms during the course of this project was impossible since both hardware and software for exascale systems are still a moving target. For that reason, the project focused broadly on the development of new techniques for measurement and analysis of performance on modern parallel architectures, enhancements to HPCToolkit’s software infrastructure to support our research goals or use on sophisticated applications, engaging developers of multithreaded runtimes to explore how support for tools should be integrated into their designs, engaging operating system developers with feature requests for enhanced monitoring support, engaging vendors with requests that they add hardware measurement capabilities and software interfaces needed by tools as they design new components of HPC platforms including processors, accelerators and networks, and finally collaborations with partners interested in using HPCToolkit to analyze and tune scalable parallel applications.
A Cross-Platform Infrastructure for Scalable Runtime Application Performance Analysis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jack Dongarra; Shirley Moore; Bart Miller; Jeffrey Hollingsworth

2005-03-15

The purpose of this project was to build an extensible cross-platform infrastructure to facilitate the development of accurate and portable performance analysis tools for current and future high performance computing (HPC) architectures. Major accomplishments include tools and techniques for multidimensional performance analysis, as well as improved support for dynamic performance monitoring of multithreaded and multiprocess applications. Previous performance tool development has been limited by the burden of having to re-write a platform-dependent low-level substrate for each architecture/operating system pair in order to obtain the necessary performance data from the system. Manual interpretation of performance data is not scalable for large-scale, long-running applications. The infrastructure developed by this project provides a foundation for building portable and scalable performance analysis tools, with the end goal being to provide application developers with the information they need to analyze, understand, and tune the performance of terascale applications on HPC architectures. The backend portion of the infrastructure provides runtime instrumentation capability and access to hardware performance counters, with thread-safety for shared memory environments and a communication substrate to support instrumentation of multiprocess and distributed programs. Front-end interfaces provide tool developers with a well-defined, platform-independent set of calls for requesting performance data. End-user tools have been developed that demonstrate runtime data collection, on-line and off-line analysis of performance data, and multidimensional performance analysis. The infrastructure is based on two underlying performance instrumentation technologies. These technologies are the PAPI cross-platform library interface to hardware performance counters and the cross-platform Dyninst library interface for runtime modification of executable images. 
The Paradyn and KOJAK projects have made use of this infrastructure to build performance measurement and analysis tools that scale to long-running programs on large parallel and distributed systems and that automate much of the search for performance bottlenecks.
Allinea Parallel Profiling and Debugging Tools on the Peregrine System

Science.gov Websites

client for your platform. (Mac/Windows/Linux) Configuration to connect to Peregrine: Open the Allinea view it # directly through x11 forwarding just type 'map', # it will open a GUI. $ map # to profile an enable x-forwarding when connecting to # Peregrine. $ map # This will open the GUI Debugging using
Advanced mathematical on-line analysis in nuclear experiments. Usage of parallel computing CUDA routines in standard root analysis

NASA Astrophysics Data System (ADS)

Grzeszczuk, A.; Kowalski, S.

2015-04-01

Compute Unified Device Architecture (CUDA) is a parallel computing platform developed by Nvidia to speed up graphics by carrying out calculations in parallel. The success of this approach has opened General-Purpose Graphics Processing Unit (GPGPU) technology to applications unrelated to graphics. A GPGPU system can be applied as an effective tool for reducing the large volumes of data produced in pulse shape analysis measurements, either by on-line recalculation or by a very fast compression scheme. Our poster contribution presents the simplified structure of the CUDA system and its programming model, using an Nvidia GeForce GTX 580 card as an example, both in a stand-alone version and as a ROOT application.
Geospatial Applications on Different Parallel and Distributed Systems in enviroGRIDS Project

NASA Astrophysics Data System (ADS)

Rodila, D.; Bacu, V.; Gorgan, D.

2012-04-01

The execution of Earth Science applications and services on parallel and distributed systems has become a necessity, especially due to the large amounts of Geospatial data these applications require and the large geographical areas they cover. The parallelization of these applications solves important performance issues and can range from task parallelism to data parallelism. Parallel and distributed architectures such as Grid, Cloud, Multicore, etc. seem to offer the necessary functionalities to solve important problems in the Earth Science domain: storing, distribution, management, processing and security of Geospatial data, execution of complex processing through task and data parallelism, etc. A main goal of the FP7-funded project enviroGRIDS (Black Sea Catchment Observation and Assessment System supporting Sustainable Development) [1] is the development of a Spatial Data Infrastructure targeting this catchment region but also the development of standardized and specialized tools for storing, analyzing, processing and visualizing the Geospatial data concerning this area. To achieve these objectives, enviroGRIDS deals with the execution of different Earth Science applications, such as hydrological models, Geospatial Web services standardized by the Open Geospatial Consortium (OGC) and others, on parallel and distributed architectures to maximize the obtained performance. This presentation analyzes the integration and execution of Geospatial applications on different parallel and distributed architectures and the possibility of choosing among these architectures based on application characteristics and user requirements through a specialized component. Versions of the proposed platform have been used in the enviroGRIDS project on different use cases such as: the execution of Geospatial Web services both on Web and Grid infrastructures [2] and the execution of SWAT hydrological models both on Grid and Multicore architectures [3]. 
The current focus is to integrate in the proposed platform the Cloud infrastructure, which is still a paradigm with critical problems to be solved despite the great efforts and investments. Cloud computing comes as a new way of delivering resources while using a large set of old as well as new technologies and tools for providing the necessary functionalities. The main challenges in the Cloud computing, most of them identified also in the Open Cloud Manifesto 2009, address resource management and monitoring, data and application interoperability and portability, security, scalability, software licensing, etc. We propose a platform able to execute different Geospatial applications on different parallel and distributed architectures such as Grid, Cloud, Multicore, etc. with the possibility of choosing among these architectures based on application characteristics and complexity, user requirements, necessary performances, cost support, etc. The execution redirection on a selected architecture is realized through a specialized component and has the purpose of offering a flexible way in achieving the best performances considering the existing restrictions.
Extending the BEAGLE library to a multi-FPGA platform.

PubMed

Jin, Zheming; Bakos, Jason D

2013-01-19

Maximum Likelihood (ML)-based phylogenetic inference using Felsenstein's pruning algorithm is a standard method for estimating the evolutionary relationships amongst a set of species based on DNA sequence data, and is used in popular applications such as RAxML, PHYLIP, GARLI, BEAST, and MrBayes. The Phylogenetic Likelihood Function (PLF) and its associated scaling and normalization steps comprise the computational kernel for these tools. These computations are data intensive but contain fine grain parallelism that can be exploited by coprocessor architectures such as FPGAs and GPUs. A general purpose API called BEAGLE has recently been developed that includes optimized implementations of Felsenstein's pruning algorithm for various data parallel architectures. In this paper, we extend the BEAGLE API to a multiple Field Programmable Gate Array (FPGA)-based platform called the Convey HC-1. The core calculation of our implementation, which includes both the phylogenetic likelihood function (PLF) and the tree likelihood calculation, has an arithmetic intensity of 130 floating-point operations per 64 bytes of I/O, or 2.03 ops/byte. Its performance can thus be calculated as a function of the host platform's peak memory bandwidth and the implementation's memory efficiency, as 2.03 × peak bandwidth × memory efficiency. Our FPGA-based platform has a peak bandwidth of 76.8 GB/s and our implementation achieves a memory efficiency of approximately 50%, which gives an average throughput of 78 Gflops. This represents a ~40X speedup when compared with BEAGLE's CPU implementation on a dual Xeon 5520 and 3X speedup versus BEAGLE's GPU implementation on a Tesla T10 GPU for very large data sizes. The power consumption is 92 W, yielding a power efficiency of 1.7 Gflops per Watt. The use of data parallel architectures to achieve high performance for likelihood-based phylogenetic inference requires high memory bandwidth and a design methodology that emphasizes high memory efficiency. 
To achieve this objective, we integrated 32 pipelined processing elements (PEs) across four FPGAs. For the design of each PE, we developed a specialized synthesis tool to generate a floating-point pipeline with resource and throughput constraints to match the target platform. We have found that using low-latency floating-point operators can significantly reduce FPGA area and still meet timing requirement on the target platform. We found that this design methodology can achieve performance that exceeds that of a GPU-based coprocessor.
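The throughput figure quoted in this abstract follows from a bandwidth-bound performance model, and the arithmetic can be checked with a few lines of Python. This is only a sketch of the calculation as the abstract states it; the function and constant names are illustrative and do not come from the BEAGLE API or the paper's code.

```python
# Values quoted in the abstract; the names are illustrative assumptions.
FLOPS_PER_BLOCK = 130       # floating-point operations per block of I/O
BYTES_PER_BLOCK = 64        # bytes of I/O per block
PEAK_BANDWIDTH_GBS = 76.8   # peak memory bandwidth of the platform, GB/s
MEMORY_EFFICIENCY = 0.50    # "approximately 50%" in the abstract


def arithmetic_intensity(flops: float, nbytes: float) -> float:
    """Floating-point operations performed per byte moved."""
    return flops / nbytes


def bandwidth_bound_gflops(intensity: float, peak_gbs: float, efficiency: float) -> float:
    """Throughput = intensity x peak bandwidth x memory efficiency."""
    return intensity * peak_gbs * efficiency


intensity = arithmetic_intensity(FLOPS_PER_BLOCK, BYTES_PER_BLOCK)
throughput = bandwidth_bound_gflops(intensity, PEAK_BANDWIDTH_GBS, MEMORY_EFFICIENCY)
print(f"{intensity:.2f} ops/byte, {throughput:.0f} Gflops")
```

With the quoted inputs the model reproduces the 2.03 ops/byte and roughly 78 Gflops reported above; because throughput scales linearly with memory efficiency, efficiency is the quantity the design methodology targets.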
The JASMIN Analysis Platform - bridging the gap between traditional climate data practices and data-centric analysis paradigms

NASA Astrophysics Data System (ADS)

Pascoe, Stephen; Iwi, Alan; Kershaw, Philip; Stephens, Ag; Lawrence, Bryan

2014-05-01

The advent of large-scale data and the consequential analysis problems have led to two new challenges for the research community: how to share such data to get the maximum value and how to carry out efficient analysis. Solving both challenges requires a form of parallelisation: the first is social parallelisation (involving trust and information sharing), the second data parallelisation (involving new algorithms and tools). The JASMIN infrastructure supports both kinds of parallelism by providing a multi-tenant environment with petabyte-scale storage, VM provisioning and batch cluster facilities. The JASMIN Analysis Platform (JAP) is an analysis software layer for JASMIN which emphasises ease of transition from a researcher's local environment to JASMIN. JAP brings together tools traditionally used by multiple communities and configures them to work together, enabling users to move analysis from their local environment to JASMIN without rewriting code. JAP also provides facilities to exploit JASMIN's parallel capabilities whilst maintaining users' familiar analysis environment wherever possible. Modern open-source analysis tools typically have multiple dependent packages, increasing the installation burden on system administrators. When you consider a suite of tools, often with both common and conflicting dependencies, analysis pipelines can become locked to a particular installation simply because of the effort required to reconstruct the dependency tree. JAP addresses this problem by providing a consistent suite of RPMs compatible with RedHat Enterprise Linux and CentOS 6.4. Researchers can install JAP locally, either as RPMs or through a pre-built VM image, giving them confidence that moving analysis to JASMIN will not disrupt their environment. Analysis parallelisation is in its infancy in climate sciences, with few tools capable of exploiting any parallel environment beyond manual scripting of the use of multiple processors. 
JAP begins to bridge this gap through a variety of higher-level tools for parallelisation and job scheduling such as IPython-parallel and MPI support for interactive analysis languages. We find that enabling even simple parallelisation of workflows, together with the state-of-the-art I/O performance of JASMIN storage, provides many users with the large increases in efficiency they need to scale their analyses to contemporary data volumes and tackle new, previously inaccessible, problems.
Biologically driven neural platform invoking parallel electrophoretic separation and urinary metabolite screening.

PubMed

Page, Tessa; Nguyen, Huong Thi Huynh; Hilts, Lindsey; Ramos, Lorena; Hanrahan, Grady

2012-06-01

This work reveals a computational framework for parallel electrophoretic separation of complex biological macromolecules and model urinary metabolites. More specifically, the implementation of a particle swarm optimization (PSO) algorithm on a neural network platform for multiparameter optimization of multiplexed 24-capillary electrophoresis technology with UV detection is highlighted. Two experimental systems were examined: (1) separation of purified rabbit metallothioneins and (2) separation of model toluene urinary metabolites and selected organic acids. Results proved superior to the use of neural networks employing standard back propagation when examining training error, fitting response, and predictive abilities. Simulation runs were obtained as a result of metaheuristic examination of the global search space with experimental responses in good agreement with predicted values. Full separation of selected analytes was realized after employing optimal model conditions. This framework provides guidance for the application of metaheuristic computational tools to aid in future studies involving parallel chemical separation and screening. Adaptable pseudo-code is provided to enable users of varied software packages and modeling framework to implement the PSO algorithm for their desired use.
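For readers unfamiliar with particle swarm optimization, the following is a minimal, generic global-best PSO in Python. It is not the pseudo-code from the paper and does not model the neural network platform or the electrophoresis parameters; the toy objective, the function names and the coefficient values (inertia 0.7, acceleration constants 1.5) are assumptions chosen only for illustration.

```python
import random


def pso_minimize(objective, bounds, n_particles=30, n_iters=200,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Generic global-best PSO; returns (best_position, best_value)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the bounds, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]               # personal best positions
    best_val = [objective(p) for p in pos]       # personal best values
    g = min(range(n_particles), key=best_val.__getitem__)
    gbest_pos, gbest_val = best_pos[g][:], best_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull to personal best + pull to global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (gbest_pos[d] - pos[i][d]))
                # Move the particle and clamp it to the search bounds.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest_pos, gbest_val = pos[i][:], val
    return gbest_pos, gbest_val


def sphere(x):
    """Toy objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best_x, best_f = pso_minimize(sphere, [(-5.0, 5.0), (-5.0, 5.0)])
print(best_x, best_f)
```

In a multiparameter separation study, the objective would instead score a candidate set of operating conditions, for example through a trained model's predicted response; that mapping is specific to the paper and is not reproduced here.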
Planetary-Scale Geospatial Data Analysis Techniques in Google's Earth Engine Platform (Invited)

NASA Astrophysics Data System (ADS)

Hancher, M.

2013-12-01

Geoscientists have more and more access to new tools for large-scale computing. With any tool, some tasks are easy and other tasks hard. It is natural to look to new computing platforms to increase the scale and efficiency of existing techniques, but there is a more exciting opportunity to discover and develop a new vocabulary of fundamental analysis idioms that are made easy and effective by these new tools. Google's Earth Engine platform is a cloud computing environment for earth data analysis that combines a public data catalog with a large-scale computational facility optimized for parallel processing of geospatial data. The data catalog includes a nearly complete archive of scenes from Landsat 4, 5, 7, and 8 that have been processed by the USGS, as well as a wide variety of other remotely-sensed and ancillary data products. Earth Engine supports a just-in-time computation model that enables real-time preview during algorithm development and debugging as well as during experimental data analysis and open-ended data exploration. Data processing operations are performed in parallel across many computers in Google's datacenters. The platform automatically handles many traditionally-onerous data management tasks, such as data format conversion, reprojection, resampling, and associating image metadata with pixel data. Early applications of Earth Engine have included the development of Google's global cloud-free fifteen-meter base map and global multi-decadal time-lapse animations, as well as numerous large and small experimental analyses by scientists from a range of academic, government, and non-governmental institutions, working in a wide variety of application areas including forestry, agriculture, urban mapping, and species habitat modeling. Patterns in the successes and failures of these early efforts have begun to emerge, sketching the outlines of a new set of simple and effective approaches to geospatial data analysis.
Parameters that affect parallel processing for computational electromagnetic simulation codes on high performance computing clusters

NASA Astrophysics Data System (ADS)

Moon, Hongsik

What is the impact of multicore and associated advanced technologies on computational software for science? Most researchers and students have multicore laptops or desktops for their research and they need computing power to run computational software packages. Computing power was initially derived from Central Processing Unit (CPU) clock speed. That changed when increases in clock speed became constrained by power requirements. Chip manufacturers turned to multicore CPU architectures and associated technological advancements to create the CPUs for the future. Most software applications benefited from the increased computing power the same way that increases in clock speed helped applications run faster. However, for Computational ElectroMagnetics (CEM) software developers, this change was not an obvious benefit - it appeared to be a detriment. Developers were challenged to find a way to correctly utilize the advancements in hardware so that their codes could benefit. The solution was parallelization and this dissertation details the investigation to address these challenges. Prior to multicore CPUs, advanced computer technologies were compared using benchmark software, and the metric was FLoating-point Operations Per Second (FLOPS), which indicates system performance for scientific applications that make heavy use of floating-point calculations. Is FLOPS an effective metric for parallelized CEM simulation tools on new multicore systems? Parallel CEM software needs to be benchmarked not only by FLOPS but also by the performance of other parameters related to type and utilization of the hardware, such as CPU, Random Access Memory (RAM), hard disk, network, etc. The codes need to be optimized for more than just FLOPS and new parameters must be included in benchmarking. In this dissertation, the parallel CEM software named High Order Basis Based Integral Equation Solver (HOBBIES) is introduced. 
This code was developed to address the needs of the changing computer hardware platforms in order to provide fast, accurate and efficient solutions to large, complex electromagnetic problems. The research in this dissertation proves that the performance of parallel code is intimately related to the configuration of the computer hardware and can be maximized for different hardware platforms. To benchmark and optimize the performance of parallel CEM software, a variety of large, complex projects are created and executed on a variety of computer platforms. The computer platforms used in this research are detailed in this dissertation. The projects run as benchmarks are also described in detail and results are presented. The parameters that affect parallel CEM software on High Performance Computing Clusters (HPCC) are investigated. This research demonstrates methods to maximize the performance of parallel CEM software code.
Combining Phase Identification and Statistic Modeling for Automated Parallel Benchmark Generation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jin, Ye; Ma, Xiaosong; Liu, Qing Gary

2015-01-01

Parallel application benchmarks are indispensable for evaluating/optimizing HPC software and hardware. However, it is very challenging and costly to obtain high-fidelity benchmarks reflecting the scale and complexity of state-of-the-art parallel applications. Hand-extracted synthetic benchmarks are time- and labor-intensive to create. Real applications themselves, while offering most accurate performance evaluation, are expensive to compile, port, reconfigure, and often plainly inaccessible due to security or ownership concerns. This work contributes APPRIME, a novel tool for trace-based automatic parallel benchmark generation. Taking as input standard communication-I/O traces of an application's execution, it couples accurate automatic phase identification with statistical regeneration of event parameters to create compact, portable, and to some degree reconfigurable parallel application benchmarks. Experiments with four NAS Parallel Benchmarks (NPB) and three real scientific simulation codes confirm the fidelity of APPRIME benchmarks. They retain the original applications' performance characteristics, in particular the relative performance across platforms.
On Designing Multicore-Aware Simulators for Systems Biology Endowed with OnLine Statistics

PubMed Central

Calcagno, Cristina; Coppo, Mario

2014-01-01

This paper discusses enabling methodologies for the design of a fully parallel, online, interactive tool aimed at supporting bioinformatics scientists. In particular, the features of these methodologies, supported by the FastFlow parallel programming framework, are shown on a simulation tool that performs the modeling, the tuning, and the sensitivity analysis of stochastic biological models. A stochastic simulation needs thousands of independent simulation trajectories, turning into big data that should be analysed by statistical and data mining tools. In the considered approach the two stages are pipelined in such a way that the simulation stage streams out the partial results of all simulation trajectories to the analysis stage, which immediately produces a partial result. The simulation-analysis workflow is validated for performance and for the effectiveness of the online analysis in capturing the behavior of biological systems on a multicore platform and representative proof-of-concept biological systems. The exploited methodologies include pattern-based parallel programming and data streaming, which provide key features to software designers such as performance portability and efficient in-memory (big) data management and movement. Two paradigmatic classes of biological systems exhibiting multistable and oscillatory behavior are used as a testbed. PMID:25050327
On designing multicore-aware simulators for systems biology endowed with OnLine statistics.

PubMed

Aldinucci, Marco; Calcagno, Cristina; Coppo, Mario; Damiani, Ferruccio; Drocco, Maurizio; Sciacca, Eva; Spinella, Salvatore; Torquati, Massimo; Troina, Angelo

2014-01-01

This paper discusses enabling methodologies for the design of a fully parallel, online, interactive tool aimed at supporting bioinformatics scientists. In particular, the features of these methodologies, supported by the FastFlow parallel programming framework, are shown on a simulation tool that performs the modeling, the tuning, and the sensitivity analysis of stochastic biological models. A stochastic simulation needs thousands of independent simulation trajectories, turning into big data that should be analysed by statistical and data mining tools. In the considered approach the two stages are pipelined in such a way that the simulation stage streams out the partial results of all simulation trajectories to the analysis stage, which immediately produces a partial result. The simulation-analysis workflow is validated for performance and for the effectiveness of the online analysis in capturing the behavior of biological systems on a multicore platform and representative proof-of-concept biological systems. The exploited methodologies include pattern-based parallel programming and data streaming, which provide key features to software designers such as performance portability and efficient in-memory (big) data management and movement. Two paradigmatic classes of biological systems exhibiting multistable and oscillatory behavior are used as a testbed.
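The pipelining idea in this abstract, where the analysis stage consumes partial simulation results as they are produced, can be illustrated with a small Python sketch. It does not use FastFlow (a C++ framework) and runs sequentially through generators rather than in parallel; the random-walk "simulation", the class and the function names are hypothetical stand-ins. The statistics are updated with Welford's online algorithm so that no trajectory needs to be stored.

```python
import random


class RunningStats:
    """Welford's online algorithm for the mean and sample variance."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0   # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        return self._m2 / (self.n - 1) if self.n > 1 else 0.0


def simulate_trajectories(n_trajectories, n_steps, seed=0):
    """Toy simulation stage: yields each trajectory's end value as soon as it is ready."""
    rng = random.Random(seed)
    for _ in range(n_trajectories):
        x = 0.0
        for _ in range(n_steps):
            x += rng.gauss(0.0, 1.0)   # stand-in for one stochastic simulation step
        yield x


def analyse(stream):
    """Analysis stage: emits an updated partial result for every value received."""
    stats = RunningStats()
    for value in stream:
        stats.update(value)
        yield stats.n, stats.mean, stats.variance


for n, mean, var in analyse(simulate_trajectories(1000, 100)):
    if n % 250 == 0:
        print(n, round(mean, 3), round(var, 3))
```

In a truly parallel version the two stages would run concurrently and exchange values through a queue or stream channel; the generator chain above only shows the data flow and the constant-memory statistics.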
P-Hint-Hunt: a deep parallelized whole genome DNA methylation detection tool.

PubMed

Peng, Shaoliang; Yang, Shunyun; Gao, Ming; Liao, Xiangke; Liu, Jie; Yang, Canqun; Wu, Chengkun; Yu, Wenqiang

2017-03-14

A growing number of studies use whole genome DNA methylation detection, one of the most important parts of epigenetics research, to find significant relationships between DNA methylation and several typical diseases, such as cancers and diabetes. In many of those studies, mapping bisulfite-treated sequences to the whole genome has been the main method for studying DNA cytosine methylation. However, most of today's related tools suffer from inaccuracy and long run times. In our study, we designed a new DNA methylation prediction tool ("Hint-Hunt") to solve these problems. Using an optimal complex alignment computation and Smith-Waterman matrix dynamic programming, Hint-Hunt can analyze and predict DNA methylation status. However, when Hint-Hunt is applied to large-scale datasets, it still suffers from slow speed and low temporal-spatial efficiency. To address the cost of Smith-Waterman dynamic programming and the low temporal-spatial efficiency, we further designed a deeply parallelized whole genome DNA methylation detection tool ("P-Hint-Hunt") on the Tianhe-2 (TH-2) supercomputer. To the best of our knowledge, P-Hint-Hunt is the first parallel DNA methylation detection tool with a high speed-up for processing large-scale datasets, and it can run on both CPUs and Intel Xeon Phi coprocessors. Moreover, we deployed and evaluated Hint-Hunt and P-Hint-Hunt on the TH-2 supercomputer at different scales. The experimental results show that our tools eliminate the deviation caused by bisulfite treatment in the mapping procedure and that the multi-level parallel program yields a 48-fold speed-up with 64 threads. P-Hint-Hunt achieves substantial acceleration on the heterogeneous CPU and Intel Xeon Phi platform, taking full advantage of both multi-core (CPU) and many-core (Phi) processors.
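As background for the Smith-Waterman dynamic programming mentioned in this abstract, here is a minimal textbook version in Python that returns the best local alignment score. It is a generic sketch, not Hint-Hunt's implementation: the scoring values (match 2, mismatch -1, linear gap -2) are assumptions, and no bisulfite-aware handling of C/T conversions is included.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score between a and b with a linear gap penalty."""
    cols = len(b) + 1
    prev = [0] * cols            # previous row of the DP matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * cols
        for j in range(1, cols):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below 0, which is what makes the alignment local.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

Each cell depends only on its left, upper and upper-left neighbours, so cells along one anti-diagonal are independent of each other; that property is what parallel implementations commonly exploit, although the abstract does not say which strategy P-Hint-Hunt uses.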
A novel medical image data-based multi-physics simulation platform for computational life sciences.

PubMed

Neufeld, Esra; Szczerba, Dominik; Chavannes, Nicolas; Kuster, Niels

2013-04-06

Simulating and modelling complex biological systems in computational life sciences requires specialized software tools that can perform medical image data-based modelling, jointly visualize the data and computational results, and handle large, complex, realistic and often noisy anatomical models. The required novel solvers must provide the power to model the physics, biology and physiology of living tissue within the full complexity of the human anatomy (e.g. neuronal activity, perfusion and ultrasound propagation). A multi-physics simulation platform satisfying these requirements has been developed for applications including device development and optimization, safety assessment, basic research, and treatment planning. This simulation platform consists of detailed, parametrized anatomical models, a segmentation and meshing tool, a wide range of solvers and optimizers, a framework for the rapid development of specialized and parallelized finite element method solvers, a visualization toolkit-based visualization engine, a Python scripting interface for customized applications, a coupling framework, and more. Core components are cross-platform compatible and use open formats. Several examples of applications are presented: hyperthermia cancer treatment planning, tumour growth modelling, evaluating the magneto-haemodynamic effect as a biomarker and physics-based morphing of anatomical models.
PRANAS: A New Platform for Retinal Analysis and Simulation.

PubMed

Cessac, Bruno; Kornprobst, Pierre; Kraria, Selim; Nasser, Hassan; Pamplona, Daniela; Portelli, Geoffrey; Viéville, Thierry

2017-01-01

The retina encodes visual scenes by trains of action potentials that are sent to the brain via the optic nerve. In this paper, we describe new free-access end-user software that helps to better understand this coding. It is called PRANAS (https://pranas.inria.fr), standing for Platform for Retinal ANalysis And Simulation. PRANAS targets neuroscientists and modelers by providing a unique set of retina-related tools. PRANAS integrates a retina simulator that allows large-scale simulations while keeping strong biological plausibility, and a toolbox for the analysis of spike train population statistics. The statistical method (entropy maximization under constraints) takes into account both spatial and temporal correlations as constraints, making it possible to analyze the effects of memory on statistics. PRANAS also integrates a tool for computing and representing receptive fields in 3D (time-space). All these tools are accessible through a friendly graphical user interface. The most CPU-costly of them have been implemented to run in parallel.

Accelerating large-scale protein structure alignments with graphics processing units

PubMed Central

2012-01-01

Background: Large-scale protein structure alignment, an indispensable tool for structural bioinformatics, poses a tremendous challenge on computational resources. To ensure structure alignment accuracy and efficiency, efforts have been made to parallelize traditional alignment algorithms in grid environments. However, these solutions are costly and of limited accessibility. Others trade alignment quality for speedup by using high-level characteristics of structure fragments for structure comparisons. Findings: We present ppsAlign, a parallel protein structure Alignment framework designed and optimized to exploit the parallelism of Graphics Processing Units (GPUs). As a general-purpose GPU platform, ppsAlign could take many concurrent methods, such as TM-align and Fr-TM-align, into the parallelized algorithm design. We evaluated ppsAlign on an NVIDIA Tesla C2050 GPU card, and compared it with existing software solutions running on an AMD dual-core CPU. We observed a 36-fold speedup over TM-align, a 65-fold speedup over Fr-TM-align, and a 40-fold speedup over MAMMOTH. Conclusions: ppsAlign is a high-performance protein structure alignment tool designed to tackle the computational complexity issues from protein structural data. The solution presented in this paper allows large-scale structure comparisons to be performed using the massively parallel computing power of GPUs. PMID:22357132
TIA: algorithms for development of identity-linked SNP islands for analysis by massively parallel DNA sequencing.

PubMed

Farris, M Heath; Scott, Andrew R; Texter, Pamela A; Bartlett, Marta; Coleman, Patricia; Masters, David

2018-04-11

Single nucleotide polymorphisms (SNPs) located within the human genome have been shown to have utility as markers of identity in the differentiation of DNA from individual contributors. Massively parallel DNA sequencing (MPS) technologies and human genome SNP databases allow for the design of suites of identity-linked target regions, amenable to sequencing in a multiplexed and massively parallel manner. Therefore, tools are needed for leveraging the genotypic information found within SNP databases for the discovery of genomic targets that can be evaluated on MPS platforms. The SNP island target identification algorithm (TIA) was developed as a user-tunable system to leverage SNP information within databases. Using data within the 1000 Genomes Project SNP database, human genome regions were identified that contain globally ubiquitous identity-linked SNPs and that were responsive to targeted resequencing on MPS platforms. Algorithmic filters were used to exclude target regions that did not conform to user-tunable SNP island target characteristics. To validate the accuracy of TIA for discovering these identity-linked SNP islands within the human genome, SNP island target regions were amplified from 70 contributor genomic DNA samples using the polymerase chain reaction. Multiplexed amplicons were sequenced using the Illumina MiSeq platform, and the resulting sequences were analyzed for SNP variations. 166 putative identity-linked SNPs were targeted in the identified genomic regions. Of the 309 SNPs that provided discerning power across individual SNP profiles, 74 previously undefined SNPs were identified during evaluation of targets from individual genomes. Overall, DNA samples of 70 individuals were uniquely identified using a subset of the suite of identity-linked SNP islands. TIA offers a tunable genome search tool for the discovery of targeted genomic regions that are scalable in the population frequency and numbers of SNPs contained within the SNP island regions. 
It also allows the definition of sequence length and sequence variability of the target region as well as the less variable flanking regions for tailoring to MPS platforms. As shown in this study, TIA can be used to discover identity-linked SNP islands within the human genome, useful for differentiating individuals by targeted resequencing on MPS technologies.
A new tool for supervised classification of satellite images available on web servers: Google Maps as a case study

NASA Astrophysics Data System (ADS)

García-Flores, Agustín.; Paz-Gallardo, Abel; Plaza, Antonio; Li, Jun

2016-10-01

This paper describes a new web platform dedicated to the classification of satellite images called Hypergim. The current implementation of this platform enables users to perform classification of satellite images from any part of the world thanks to the worldwide maps provided by Google Maps. To perform this classification, Hypergim uses unsupervised algorithms like Isodata and K-means. Here, we present an extension of the original platform in which we adapt Hypergim in order to use supervised algorithms to improve the classification results. This involves a significant modification of the user interface, providing the user with a way to obtain samples of classes present in the images to use in the training phase of the classification process. Another main goal of this development is to improve the runtime of the image classification process. To achieve this goal, we use a parallel implementation of the Random Forest classification algorithm. This implementation is a modification of the well-known CURFIL software package. The use of this type of algorithm to perform image classification is widespread today thanks to its precision and ease of training. The actual implementation of Random Forest was developed using the CUDA platform, which enables us to exploit the potential of several models of NVIDIA graphics processing units (GPUs), using them to execute general-purpose computing tasks such as image classification algorithms. In addition to CUDA, we use other parallel libraries such as Intel Boost, taking advantage of the multithreading capabilities of modern CPUs. To ensure the best possible results, the platform is deployed in a cluster of commodity GPUs, so that multiple users can use the tool concurrently. The experimental results indicate that this new algorithm widely outperforms the previous unsupervised algorithms implemented in Hypergim, in both runtime and precision of the actual classification of the images.
MRUniNovo: an efficient tool for de novo peptide sequencing utilizing the hadoop distributed computing framework.

PubMed

Li, Chuang; Chen, Tao; He, Qiang; Zhu, Yunping; Li, Kenli

2017-03-15

Tandem mass spectrometry-based de novo peptide sequencing is a complex and time-consuming process. The current algorithms for de novo peptide sequencing cannot rapidly and thoroughly process large mass spectrometry datasets. In this paper, we propose MRUniNovo, a novel tool for parallel de novo peptide sequencing. MRUniNovo parallelizes UniNovo based on the Hadoop compute platform. Our experimental results demonstrate that MRUniNovo significantly reduces the computation time of de novo peptide sequencing without sacrificing the correctness and accuracy of the results, and thus can process very large datasets that UniNovo cannot. MRUniNovo is an open source software tool implemented in Java. The source code and the parameter settings are available at http://bioinfo.hupo.org.cn/MRUniNovo/index.php. Contact: s131020002@hnu.edu.cn; taochen1019@163.com. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
Parallel tools GUI framework-DOE SBIR phase I final technical report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Galarowicz, James

2013-12-05

Many parallel performance, profiling, and debugging tools require a graphical way of displaying the very large datasets typically gathered from high performance computing (HPC) applications. Most tool projects create their graphical user interfaces (GUI) from scratch, often spending their project resources on simply redeveloping commonly used infrastructure. Our goal was to create a multiplatform GUI framework, based on Nokia/Digia’s popular Qt libraries, which will specifically address the needs of these parallel tools. The Parallel Tools GUI Framework (PTGF) uses a plugin architecture facilitating rapid GUI development and reduced development costs for new and existing tool projects by allowing the reuse of many common GUI elements, called “widgets.” Widgets created include 2D data visualizations, a source code viewer with syntax highlighting, and integrated help and welcome screens. Application programming interface (API) design was focused on minimizing the time to getting a functional tool working. Having a standard, unified, and user-friendly interface which operates on multiple platforms will benefit HPC application developers by reducing training time and allowing users to move between tools rapidly during a single session. However, Argo Navis Technologies LLC will not be submitting a DOE SBIR Phase II proposal and commercialization plan for the PTGF project. Our preliminary estimates for gross income over the next several years were based upon initial customer interest and income generated by similar projects. Unfortunately, as we further assessed the market during Phase I, we grew to realize that there was not enough demand to warrant such a large investment. While we do find that the project is worth our continued investment of time and money, we do not think it worthy of the DOE's investment at this time. We are grateful that the DOE has afforded us the opportunity to make this assessment, and come to this conclusion.
Automated Generation of Message-Passing Programs: An Evaluation Using CAPTools

NASA Technical Reports Server (NTRS)

Hribar, Michelle R.; Jin, Haoqiang; Yan, Jerry C.; Saini, Subhash (Technical Monitor)

1998-01-01

Scientists at NASA Ames Research Center have been developing computational aeroscience applications on highly parallel architectures over the past ten years. During that same time period, a steady transition of hardware and system software also occurred, forcing us to expend great effort on migrating and re-coding our applications. As applications and machine architectures become increasingly complex, the cost and time required for this process will become prohibitive. In this paper, we present the first set of results in our evaluation of interactive parallelization tools. In particular, we evaluate CAPTools' ability to parallelize computational aeroscience applications. CAPTools was tested on serial versions of the NAS Parallel Benchmarks and ARC3D, a computational fluid dynamics application, on two platforms: the SGI Origin 2000 and the Cray T3E. This evaluation includes performance, amount of user interaction required, limitations and portability. Based on these results, a discussion of the feasibility of computer aided parallelization of aerospace applications is presented along with suggestions for future work.
MOLNs: A cloud platform for interactive, reproducible, and scalable spatial stochastic computational experiments in systems biology using PyURDME.

PubMed

Drawert, Brian; Trogdon, Michael; Toor, Salman; Petzold, Linda; Hellander, Andreas

2016-01-01

Computational experiments using spatial stochastic simulations have led to important new biological insights, but they require specialized tools and a complex software stack, as well as large and scalable compute and data analysis resources due to the large computational cost associated with Monte Carlo computational workflows. The complexity of setting up and managing a large-scale distributed computation environment to support productive and reproducible modeling can be prohibitive for practitioners in systems biology. This results in a barrier to the adoption of spatial stochastic simulation tools, effectively limiting the type of biological questions addressed by quantitative modeling. In this paper, we present PyURDME, a new, user-friendly spatial modeling and simulation package, and MOLNs, a cloud computing appliance for distributed simulation of stochastic reaction-diffusion models. MOLNs is based on IPython and provides an interactive programming platform for development of sharable and reproducible distributed parallel computational experiments.
Control mechanism of double-rotator-structure ternary optical computer

NASA Astrophysics Data System (ADS)

Kai, SONG; Liping, YAN

2017-03-01

The double-rotator-structure ternary optical processor (DRSTOP) has two key characteristics: massively parallel data-bit computing and a reconfigurable processor. It can handle thousands of data bits in parallel and can run much faster than computers and other optical computer systems developed so far. In order to put DRSTOP into practical application, this paper establishes a series of methods: a task classification method, a data-bit allocation method, a control information generation method, a control information formatting and sending method, a method for obtaining decoded results, and so on. These methods form the control mechanism of DRSTOP. This control mechanism makes DRSTOP an automated computing platform. Compared with traditional calculation tools, the DRSTOP computing platform can ease the tension between high energy consumption and big data computing by greatly reducing the cost of communications and I/O. Finally, the paper designs a set of experiments for the DRSTOP control mechanism to verify its feasibility and correctness. Experimental results show that the control mechanism is correct, feasible and efficient.
A Hadoop-Based Distributed Framework for Efficient Managing and Processing Big Remote Sensing Images

NASA Astrophysics Data System (ADS)

Wang, C.; Hu, F.; Hu, X.; Zhao, S.; Wen, W.; Yang, C.

2015-07-01

Various sensors from airborne and satellite platforms are producing large volumes of remote sensing images for mapping, environmental monitoring, disaster management, military intelligence, and other uses. However, it is challenging to efficiently store, query and process such big data because the tasks are both data- and computing-intensive. In this paper, a Hadoop-based framework is proposed to manage and process big remote sensing data in a distributed and parallel manner. In particular, remote sensing data can be fetched directly from other data platforms into the Hadoop Distributed File System (HDFS). The Orfeo toolbox, a ready-to-use tool for large image processing, is integrated into MapReduce to provide a rich set of image processing operations. With the integration of HDFS, the Orfeo toolbox and MapReduce, these remote sensing images can be directly processed in parallel in a scalable computing environment. The experimental results show that the proposed framework can efficiently manage and process such big remote sensing data.
Hybrid MPI/OpenMP Implementation of the ORAC Molecular Dynamics Program for Generalized Ensemble and Fast Switching Alchemical Simulations.

PubMed

Procacci, Piero

2016-06-27

We present a new release (6.0β) of the ORAC program [Marsili et al. J. Comput. Chem. 2010, 31, 1106-1116] with a hybrid OpenMP/MPI (open multiprocessing message passing interface) multilevel parallelism tailored for generalized ensemble (GE) and fast switching double annihilation (FS-DAM) nonequilibrium technology aimed at evaluating the binding free energy in drug-receptor systems on high performance computing platforms. The production of the GE or FS-DAM trajectories is handled using a weak scaling parallel approach on the MPI level only, while a strong scaling force decomposition scheme is implemented for intranode computations with shared memory access at the OpenMP level. The efficiency, simplicity, and inherent parallel nature of the ORAC implementation of the FS-DAM algorithm project the code as a possible effective tool for second generation high throughput virtual screening in drug discovery and design. The code, along with documentation, testing, and ancillary tools, is distributed under the provisions of the General Public License and can be freely downloaded at www.chim.unifi.it/orac.
Scalable isosurface visualization of massive datasets on commodity off-the-shelf clusters

PubMed Central

Bajaj, Chandrajit

2009-01-01

Tomographic imaging and computer simulations are increasingly yielding massive datasets. Interactive and exploratory visualizations have rapidly become indispensable tools to study large volumetric imaging and simulation data. Our scalable isosurface visualization framework on commodity off-the-shelf clusters is an end-to-end parallel and progressive platform, from initial data access to the final display. Interactive browsing of extracted isosurfaces is made possible by using parallel isosurface extraction and rendering in conjunction with a new specialized piece of image compositing hardware called Metabuffer. In this paper, we focus on back-end scalability by introducing a fully parallel and out-of-core isosurface extraction algorithm. It achieves scalability by using both parallel and out-of-core processing and parallel disks. It statically partitions the volume data to parallel disks with a balanced workload spectrum, and builds I/O-optimal external interval trees to minimize the number of I/O operations needed to load large data from disk. We also describe an isosurface compression scheme that is efficient for progressive extraction, transmission and storage of isosurfaces. PMID:19756231
An interactive parallel programming environment applied in atmospheric science

NASA Technical Reports Server (NTRS)

von Laszewski, G.

1996-01-01

This article introduces an interactive parallel programming environment (IPPE) that simplifies the generation and execution of parallel programs. One of the tasks of the environment is to generate message-passing parallel programs for homogeneous and heterogeneous computing platforms. The parallel programs are represented by using visual objects. This is accomplished with the help of a graphical programming editor that is implemented in Java and enables portability to a wide variety of computer platforms. In contrast to other graphical programming systems, reusable parts of the programs can be stored in a program library to support rapid prototyping. In addition, runtime performance data on different computing platforms is collected in a database. A selection process determines dynamically the software and the hardware platform to be used to solve the problem in minimal wall-clock time. The environment is currently being tested on a Grand Challenge problem, the NASA four-dimensional data assimilation system.
Parallel and serial computing tools for testing single-locus and epistatic SNP effects of quantitative traits in genome-wide association studies

PubMed Central

Ma, Li; Runesha, H Birali; Dvorkin, Daniel; Garbe, John R; Da, Yang

2008-01-01

Background Genome-wide association studies (GWAS) using single nucleotide polymorphism (SNP) markers provide opportunities to detect epistatic SNPs associated with quantitative traits and to detect the exact mode of an epistasis effect. Computational difficulty is the main bottleneck for epistasis testing in large scale GWAS. Results The EPISNPmpi and EPISNP computer programs were developed for testing single-locus and epistatic SNP effects on quantitative traits in GWAS, including tests of three single-locus effects for each SNP (SNP genotypic effect, additive and dominance effects) and five epistasis effects for each pair of SNPs (two-locus interaction, additive × additive, additive × dominance, dominance × additive, and dominance × dominance) based on the extended Kempthorne model. EPISNPmpi is the parallel computing program for epistasis testing in large scale GWAS and achieved excellent scalability for large scale analysis and portability for various parallel computing platforms. EPISNP is the serial computing program based on the EPISNPmpi code for epistasis testing in small scale GWAS using commonly available operating systems and computer hardware. Three serial computing utility programs were developed for graphical viewing of test results and epistasis networks, and for estimating CPU time and disk space requirements. Conclusion The EPISNPmpi parallel computing program provides an effective computing tool for epistasis testing in large scale GWAS, and the epiSNP serial computing programs are convenient tools for epistasis analysis in small scale GWAS using commonly available computer hardware. PMID:18644146
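The computational bottleneck mentioned in this abstract can be made concrete with a short count of tests. The following is a minimal sketch, assuming every SNP and every unordered SNP pair is tested exactly once; the function name `count_tests` and the example marker count are illustrative and do not come from the paper.

```python
from math import comb

SINGLE_LOCUS_TESTS_PER_SNP = 3   # genotypic, additive, dominance (from the abstract)
EPISTASIS_TESTS_PER_PAIR = 5     # two-locus interaction, AxA, AxD, DxA, DxD (from the abstract)

def count_tests(n_snps):
    """Return (single-locus tests, epistasis tests) for n_snps markers.

    Assumption: all n_snps * (n_snps - 1) / 2 unordered pairs are tested.
    """
    single = SINGLE_LOCUS_TESTS_PER_SNP * n_snps
    pairwise = EPISTASIS_TESTS_PER_PAIR * comb(n_snps, 2)
    return single, pairwise

# Hypothetical marker count, chosen only to show the quadratic growth.
single, pairwise = count_tests(1000)
print(f"single-locus tests: {single}")
print(f"epistasis tests:    {pairwise}")
```

The single-locus count grows linearly with the number of SNPs while the pairwise count grows quadratically, which is why the pairwise tests dominate at genome-wide scale.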
Portability and Cross-Platform Performance of an MPI-Based Parallel Polygon Renderer

NASA Technical Reports Server (NTRS)

Crockett, Thomas W.

1999-01-01

Visualizing the results of computations performed on large-scale parallel computers is a challenging problem, due to the size of the datasets involved. One approach is to perform the visualization and graphics operations in place, exploiting the available parallelism to obtain the necessary rendering performance. Over the past several years, we have been developing algorithms and software to support visualization applications on NASA's parallel supercomputers. Our results have been incorporated into a parallel polygon rendering system called PGL. PGL was initially developed on tightly-coupled distributed-memory message-passing systems, including Intel's iPSC/860 and Paragon, and IBM's SP2. Over the past year, we have ported it to a variety of additional platforms, including the HP Exemplar, SGI Origin2000, Cray T3E, and clusters of Sun workstations. In implementing PGL, we have had two primary goals: cross-platform portability and high performance. Portability is important because (1) our manpower resources are limited, making it difficult to develop and maintain multiple versions of the code, and (2) NASA's complement of parallel computing platforms is diverse and subject to frequent change. Performance is important in delivering adequate rendering rates for complex scenes and ensuring that parallel computing resources are used effectively. Unfortunately, these two goals are often at odds. In this paper we report on our experiences with portability and performance of the PGL polygon renderer across a range of parallel computing platforms.
ParDRe: faster parallel duplicated reads removal tool for sequencing studies.

PubMed

González-Domínguez, Jorge; Schmidt, Bertil

2016-05-15

Current next generation sequencing technologies often generate duplicated or near-duplicated reads that (depending on the application scenario) do not provide any interesting biological information but can increase memory requirements and computational time of downstream analysis. In this work we present ParDRe, a de novo parallel tool to remove duplicated and near-duplicated reads through the clustering of Single-End or Paired-End sequences from fasta or fastq files. It uses a novel bitwise approach to compare the suffixes of DNA strings and employs hybrid MPI/multithreading to reduce runtime on multicore systems. We show that ParDRe is up to 27.29 times faster than Fulcrum (a representative state-of-the-art tool) on a platform with two 8-core Sandy-Bridge processors. Source code in C++ and MPI running on Linux systems as well as a reference manual are available at https://sourceforge.net/projects/pardre/. Contact: jgonzalezd@udc.es.
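The abstract does not describe ParDRe's bitwise scheme in detail, so the following is only an illustrative sketch of the general idea of comparing DNA strings with bit operations. It assumes a 2-bit encoding per base and equal-length sequences over A, C, G, T; the names `pack` and `mismatches` are hypothetical and are not taken from ParDRe.

```python
# Assumed 2-bit encoding; not necessarily the encoding ParDRe uses.
CODES = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}

def pack(seq):
    """Pack a DNA string into an integer, two bits per base."""
    value = 0
    for base in seq:
        value = (value << 2) | CODES[base]
    return value

def mismatches(a, b):
    """Count positions at which two equal-length DNA strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    if not a:
        return 0
    diff = pack(a) ^ pack(b)            # non-zero 2-bit group = differing base
    mask = int("01" * len(a), 2)        # selects the low bit of every group
    collapsed = (diff | (diff >> 1)) & mask
    return bin(collapsed).count("1")

# Two reads could be treated as near-duplicates when the mismatch
# count stays under a user-chosen threshold (hypothetical value here).
print(mismatches("ACGTACGT", "ACGTACGA") <= 1)
```

A single XOR compares all bases of the packed reads at once, which is the kind of saving a bitwise comparison offers over a character-by-character loop.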
Six-degree-of-freedom parallel minimanipulator with three inextensible limbs

NASA Technical Reports Server (NTRS)

Tahmasebi, Farhad (Inventor); Tsai, Lung-Wen (Inventor)

1994-01-01

A Six-Degree-of-Freedom Parallel-Manipulator having three inextensible limbs for manipulating a platform is described. The three inextensible limbs are attached via universal joints to the platform at non-collinear points. Each of the inextensible limbs is also attached via universal joints to a two-degree-of-freedom parallel driver such as a five-bar linkage, a pantograph, or a bidirectional linear stepper motor. The drivers move the lower ends of the limbs parallel to a fixed base and thereby provide manipulation of the platform. The actuators are mounted on the fixed base without using any power transmission devices such as gears or belts.
Extending the BEAGLE library to a multi-FPGA platform

PubMed Central

2013-01-01

Background Maximum Likelihood (ML)-based phylogenetic inference using Felsenstein’s pruning algorithm is a standard method for estimating the evolutionary relationships amongst a set of species based on DNA sequence data, and is used in popular applications such as RAxML, PHYLIP, GARLI, BEAST, and MrBayes. The Phylogenetic Likelihood Function (PLF) and its associated scaling and normalization steps comprise the computational kernel for these tools. These computations are data intensive but contain fine grain parallelism that can be exploited by coprocessor architectures such as FPGAs and GPUs. A general purpose API called BEAGLE has recently been developed that includes optimized implementations of Felsenstein’s pruning algorithm for various data parallel architectures. In this paper, we extend the BEAGLE API to a multiple Field Programmable Gate Array (FPGA)-based platform called the Convey HC-1. Results The core calculation of our implementation, which includes both the phylogenetic likelihood function (PLF) and the tree likelihood calculation, has an arithmetic intensity of 130 floating-point operations per 64 bytes of I/O, or 2.03 ops/byte. Its performance can thus be calculated as a function of the host platform’s peak memory bandwidth and the implementation’s memory efficiency, as 2.03 × peak bandwidth × memory efficiency. Our FPGA-based platform has a peak bandwidth of 76.8 GB/s and our implementation achieves a memory efficiency of approximately 50%, which gives an average throughput of 78 Gflops. This represents a ~40X speedup when compared with BEAGLE’s CPU implementation on a dual Xeon 5520 and 3X speedup versus BEAGLE’s GPU implementation on a Tesla T10 GPU for very large data sizes. The power consumption is 92 W, yielding a power efficiency of 1.7 Gflops per Watt. 
Conclusions The use of data parallel architectures to achieve high performance for likelihood-based phylogenetic inference requires high memory bandwidth and a design methodology that emphasizes high memory efficiency. To achieve this objective, we integrated 32 pipelined processing elements (PEs) across four FPGAs. For the design of each PE, we developed a specialized synthesis tool to generate a floating-point pipeline with resource and throughput constraints to match the target platform. We have found that using low-latency floating-point operators can significantly reduce FPGA area and still meet timing requirement on the target platform. We found that this design methodology can achieve performance that exceeds that of a GPU-based coprocessor. PMID:23331707
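The throughput figure in the Results follows directly from the formula stated there (arithmetic intensity × peak bandwidth × memory efficiency). A minimal sketch of that arithmetic, using only numbers quoted in the abstract; the function name `attainable_gflops` is illustrative.

```python
def attainable_gflops(ops_per_byte, peak_bandwidth_gb_s, memory_efficiency):
    """Bandwidth-bound throughput: intensity x peak bandwidth x efficiency."""
    return ops_per_byte * peak_bandwidth_gb_s * memory_efficiency

# Figures taken from the abstract.
intensity = 130 / 64      # floating-point operations per byte of I/O
peak_bandwidth = 76.8     # GB/s
efficiency = 0.50         # "approximately 50%"

throughput = attainable_gflops(intensity, peak_bandwidth, efficiency)
print(f"arithmetic intensity: {intensity:.2f} ops/byte")
print(f"estimated throughput: {throughput:.0f} Gflops")
```

Because the kernel is bandwidth-bound, the estimate scales linearly with memory efficiency, which is consistent with the abstract's emphasis on a design methodology that maximizes it.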
The NIST SPIDER, A Robot Crane

PubMed Central

Albus, James; Bostelman, Roger; Dagalakis, Nicholas

1992-01-01

The Robot Systems Division of the National Institute of Standards and Technology has been experimenting for several years with new concepts for robot cranes. These concepts utilize the basic idea of the Stewart Platform parallel link manipulator. The unique feature of the NIST approach is to use cables as the parallel links and to use winches as the actuators. So long as the cables are all in tension, the load is kinematically constrained, and the cables resist perturbing forces and moments with equal stiffness to both positive and negative loads. The result is that the suspended load is constrained with a mechanical stiffness determined by the elasticity of the cables, the suspended weight, and the geometry of the mechanism. Based on these concepts, a revolutionary new type of robot crane, the NIST SPIDER (Stewart Platform Instrumented Drive Environmental Robot) has been developed that can control the position, velocity, and force of tools and heavy machinery in all six degrees of freedom (x, y, z, roll, pitch, and yaw). Depending on what is suspended from its work platform, the SPIDER can perform a variety of tasks. Examples are: cutting, excavating and grading, shaping and finishing, lifting and positioning. A 6 m version of the SPIDER has been built and critical performance characteristics analyzed. PMID:28053439
MOLA: a bootable, self-configuring system for virtual screening using AutoDock4/Vina on computer clusters.

PubMed

Abreu, Rui Mv; Froufe, Hugo Jc; Queiroz, Maria João Rp; Ferreira, Isabel Cfr

2010-10-28

Virtual screening of small molecules using molecular docking has become an important tool in drug discovery. However, large scale virtual screening is time demanding and usually requires dedicated computer clusters. There are a number of software tools that perform virtual screening using AutoDock4 but they require access to dedicated Linux computer clusters. Also, no software is available for performing virtual screening with Vina using computer clusters. In this paper we present MOLA, an easy-to-use graphical user interface tool that automates parallel virtual screening using AutoDock4 and/or Vina in bootable non-dedicated computer clusters. MOLA automates several tasks including ligand preparation, parallel AutoDock4/Vina job distribution and result analysis. When the virtual screening project finishes, an OpenOffice spreadsheet file opens with the ligands ranked by binding energy and distance to the active site. All result files can automatically be recorded on a USB flash drive or on the hard-disk drive using VirtualBox. MOLA works inside a customized Live CD GNU/Linux operating system, developed by us, that bypasses the original operating system installed on the computers used in the cluster. This operating system boots from a CD on the master node and then clusters other computers as slave nodes via Ethernet connections. MOLA is an ideal virtual screening tool for non-experienced users, with a limited number of multi-platform heterogeneous computers available and no access to dedicated Linux computer clusters. When a virtual screening project finishes, the computers can just be restarted to their original operating system. The originality of MOLA lies in the fact that any platform-independent computer available can be added to the cluster, without ever using the computer hard-disk drive and without interfering with the installed operating system. 
With a cluster of 10 processors, and a potential maximum speed-up of 10×, the parallel algorithm of MOLA achieved a speed-up of 8.64× using AutoDock4 and 8.60× using Vina.
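The reported speed-ups can be read as parallel efficiencies, that is, measured speed-up divided by processor count. A minimal sketch of that calculation using the figures from the abstract; the function name `parallel_efficiency` is illustrative.

```python
def parallel_efficiency(speedup, processors):
    """Fraction of the ideal linear speed-up that was actually achieved."""
    return speedup / processors

PROCESSORS = 10  # cluster size reported in the abstract
reported = {"AutoDock4": 8.64, "Vina": 8.60}

for tool, speedup in reported.items():
    eff = parallel_efficiency(speedup, PROCESSORS)
    print(f"{tool}: {eff:.1%} of the ideal {PROCESSORS}x speed-up")
```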

Bioinformatics training: selecting an appropriate learning content management system--an example from the European Bioinformatics Institute.

PubMed

Wright, Victoria Ann; Vaughan, Brendan W; Laurent, Thomas; Lopez, Rodrigo; Brooksbank, Cath; Schneider, Maria Victoria

2010-11-01

Today's molecular life scientists are well educated in the emerging experimental tools of their trade, but when it comes to training on the myriad of resources and tools for dealing with biological data, a less ideal situation emerges. Often bioinformatics users receive no formal training on how to make the most of the bioinformatics resources and tools available in the public domain. The European Bioinformatics Institute, which is part of the European Molecular Biology Laboratory (EMBL-EBI), holds the world's most comprehensive collection of molecular data, and training the research community to exploit this information is embedded in the EBI's mission. We have evaluated eLearning, in parallel with face-to-face courses, as a means of training users of our data resources and tools. We anticipate that eLearning will become an increasingly important vehicle for delivering training to our growing user base, so we have undertaken an extensive review of Learning Content Management Systems (LCMSs). Here, we describe the process that we used, which considered the requirements of trainees, trainers and systems administrators, as well as taking into account our organizational values and needs. This review describes the literature survey, user discussions and scripted platform testing that we performed to narrow down our choice of platform from 36 to a single platform. We hope that it will serve as guidance for others who are seeking to incorporate eLearning into their bioinformatics training programmes.
Developing parallel GeoFEST(P) using the PYRAMID AMR library

NASA Technical Reports Server (NTRS)

Norton, Charles D.; Lyzenga, Greg; Parker, Jay; Tisdale, Robert E.

2004-01-01

The PYRAMID parallel unstructured adaptive mesh refinement (AMR) library has been coupled with the GeoFEST geophysical finite element simulation tool to support parallel active tectonics simulations. Specifically, we have demonstrated modeling of coseismic and postseismic surface displacement due to a simulated earthquake for the Landers system of interacting faults in Southern California. The new software demonstrated a 25-times resolution improvement and a 4-times reduction in time to solution over the sequential baseline milestone case. Simulations on workstations using a few tens of thousands of stress displacement finite elements can now be expanded to multiple millions of elements with greater than 98% scaled efficiency on various parallel platforms over many hundreds of processors. Our most recent work has demonstrated that we can dynamically adapt the computational grid as stress grows on a fault. In this paper, we will describe the major issues and challenges associated with coupling these two programs to create GeoFEST(P). Performance and visualization results will also be described.
Conceptual design of a hybrid parallel mechanism for mask exchanging of TMT

NASA Astrophysics Data System (ADS)

Wang, Jianping; Zhou, Hongfei; Li, Kexuan; Zhou, Zengxiang; Zhai, Chao

2015-10-01

The mask exchange system is an important part of the Multi-Object Broadband Imaging Echellette (MOBIE) on the Thirty Meter Telescope (TMT). To solve the problem of the stiffness of the mask exchange system in the MOBIE changing with the gravity vector, the hybrid parallel mechanism design method was introduced. By using the high stiffness and precision of a parallel structure, combined with the large moving range of a serial structure, a conceptual design of a hybrid parallel mask exchange system based on a 3-RPS parallel mechanism was presented. According to the position requirements of the MOBIE, the SolidWorks structure model of the hybrid parallel mask exchange robot was established and the appropriate installation position without interfering with the related components and light path in the MOBIE of TMT was analyzed. Simulation results in SolidWorks suggested that the 3-RPS parallel platform had good stiffness properties in different gravity vector directions. Furthermore, through the research of the mechanism theory, the inverse kinematics solution of the 3-RPS parallel platform was calculated and the mathematical relationship between the attitude angle of the moving platform and the angle of the ball-hinges on the moving platform was established, in order to analyze the attitude adjustment ability of the hybrid parallel mask exchange robot. The proposed conceptual design has some guiding significance for the design of the mask exchange system of the MOBIE on TMT.
Archaeology in the Mississippi River Floodplain at Sand Run Slough, Iowa,

DTIC Science & Technology

1987-06-01

28 Contracting 36/. 40 51/.45 87/.43 Parallel 36/. 40 23/.20 59/.29 N=87 N=115 N=202 End Flake 75/.86 85/.74 160/.79 Side Flake 12/.14 30 /.26 42/.21...13 12/.09 TECHNOLOGY AND MORPHOLOGY .* Platform Types N=39 N=10 N=44 N=89 Bifacial Tool Edge Unfaceted 18/.51 3/. 30 19/.43 40 /.45 Faceted 2/.06 2/.20 7...25/.45 10/.53 26/. 40 61/.44 " Bifacial Wear 30 /.54 9/.47 37/.57 76/.54 Table 4. 10. Chert types and unhafted flake tool types from Stratum I, 13LA38
Coupled Ocean/Atmospheric Mesoscale Prediction System (COAMPS), Version 5.0 (User’s Guide)

DTIC Science & Technology

2010-03-30

provides tools for common modeling functions, as well as regridding, data decomposition, and communication on parallel computers. NRL/MR/7320--10...specified gncomDir. If running COAMPS at the DSRC (e.g. BABBAGE, DAVINCI, or EINSTEIN), the global NCOM files will be copied to /scr/[user]/COAMPS/data...the site (DSRC or local) and the platform (BABBAGE, DAVINCI, EINSTEIN, or local machine) on which COAMPS is being run. site=navy_dsrc (for DSRC
Declarative language design for interactive visualization.

PubMed

Heer, Jeffrey; Bostock, Michael

2010-01-01

We investigate the design of declarative, domain-specific languages for constructing interactive visualizations. By separating specification from execution, declarative languages can simplify development, enable unobtrusive optimization, and support retargeting across platforms. We describe the design of the Protovis specification language and its implementation within an object-oriented, statically-typed programming language (Java). We demonstrate how to support rich visualizations without requiring a toolkit-specific data model and extend Protovis to enable declarative specification of animated transitions. To support cross-platform deployment, we introduce rendering and event-handling infrastructures decoupled from the runtime platform, letting designers retarget visualization specifications (e.g., from desktop to mobile phone) with reduced effort. We also explore optimizations such as runtime compilation of visualization specifications, parallelized execution, and hardware-accelerated rendering. We present benchmark studies measuring the performance gains provided by these optimizations and compare performance to existing Java-based visualization tools, demonstrating scalability improvements exceeding an order of magnitude.
Implementation of a parallel protein structure alignment service on cloud.

PubMed

Hung, Che-Lun; Lin, Yaw-Ling

2013-01-01

Protein structure alignment has become an important strategy by which to identify evolutionary relationships between protein sequences. Several alignment tools are currently available for online comparison of protein structures. In this paper, we propose a parallel protein structure alignment service based on the Hadoop distribution framework. This service includes a protein structure alignment algorithm, a refinement algorithm, and a MapReduce programming model. The refinement algorithm refines the result of alignment. To process vast numbers of protein structures in parallel, the alignment and refinement algorithms are implemented using MapReduce. We analyzed and compared the structure alignments produced by different methods using a dataset randomly selected from the PDB database. The experimental results verify that the proposed algorithm refines the resulting alignments more accurately than existing algorithms. Meanwhile, the computational performance of the proposed service is proportional to the number of processors used in our cloud platform.
Implementation of a Parallel Protein Structure Alignment Service on Cloud

PubMed Central

Hung, Che-Lun; Lin, Yaw-Ling

2013-01-01

Protein structure alignment has become an important strategy by which to identify evolutionary relationships between protein sequences. Several alignment tools are currently available for online comparison of protein structures. In this paper, we propose a parallel protein structure alignment service based on the Hadoop distribution framework. This service includes a protein structure alignment algorithm, a refinement algorithm, and a MapReduce programming model. The refinement algorithm refines the result of alignment. To process vast numbers of protein structures in parallel, the alignment and refinement algorithms are implemented using MapReduce. We analyzed and compared the structure alignments produced by different methods using a dataset randomly selected from the PDB database. The experimental results verify that the proposed algorithm refines the resulting alignments more accurately than existing algorithms. Meanwhile, the computational performance of the proposed service is proportional to the number of processors used in our cloud platform. PMID:23671842
Using the Eclipse Parallel Tools Platform to Assist Earth Science Model Development and Optimization on High Performance Computers

NASA Astrophysics Data System (ADS)

Alameda, J. C.

2011-12-01

Development and optimization of computational science models, particularly on high performance computers, and with the advent of ubiquitous multicore processor systems, practically on every system, has been accomplished with basic software tools, typically command-line-based compilers, debuggers, and performance tools that have not changed substantially from the days of serial and early vector computers. However, model complexity, including the complexity added by modern message passing libraries such as MPI, and the need for hybrid code models (such as OpenMP and MPI) to be able to take full advantage of high performance computers with an increasing core count per shared memory node, has made development and optimization of such codes an increasingly arduous task. Additional architectural developments, such as many-core processors, only complicate the situation further. In this paper, we describe how our NSF-funded project, "SI2-SSI: A Productive and Accessible Development Workbench for HPC Applications Using the Eclipse Parallel Tools Platform" (WHPC) seeks to improve the Eclipse Parallel Tools Platform, an environment designed to support scientific code development targeted at a diverse set of high performance computing systems. Our WHPC project takes an application-centric view to improve PTP. We are using a set of scientific applications, each with a variety of challenges, and using PTP to drive further improvements to the scientific applications, as well as to understand shortcomings in Eclipse PTP from an application developer perspective, which drives the list of improvements we seek to make. We are also partnering with performance tool providers to drive higher quality performance tool integration.
We have partnered with the Cactus group at Louisiana State University to improve Eclipse's ability to work with computational frameworks and extremely complex build systems, as well as to develop educational materials to incorporate into computational science and engineering codes. Finally, we are partnering with the lead PTP developers at IBM, to ensure we are as effective as possible within the Eclipse community development. We are also conducting training and outreach to our user community, including conference BOF sessions, monthly user calls, and an annual user meeting, so that we can best inform the improvements we make to Eclipse PTP. With these activities we endeavor to encourage use of modern software engineering practices, as enabled through the Eclipse IDE, with computational science and engineering applications. These practices include proper use of source code repositories, tracking and rectifying issues, measuring and monitoring code performance changes against both optimizations as well as ever-changing software stacks and configurations on HPC systems, as well as ultimately encouraging development and maintenance of testing suites -- things that have become commonplace in many software endeavors, but have lagged in the development of science applications. We view that the challenge with the increased complexity of both HPC systems and science applications demands the use of better software engineering methods, preferably enabled by modern tools such as Eclipse PTP, to help the computational science community thrive as we evolve the HPC landscape.
MOLNs: A CLOUD PLATFORM FOR INTERACTIVE, REPRODUCIBLE, AND SCALABLE SPATIAL STOCHASTIC COMPUTATIONAL EXPERIMENTS IN SYSTEMS BIOLOGY USING PyURDME

PubMed Central

Drawert, Brian; Trogdon, Michael; Toor, Salman; Petzold, Linda; Hellander, Andreas

2017-01-01

Computational experiments using spatial stochastic simulations have led to important new biological insights, but they require specialized tools and a complex software stack, as well as large and scalable compute and data analysis resources due to the large computational cost associated with Monte Carlo computational workflows. The complexity of setting up and managing a large-scale distributed computation environment to support productive and reproducible modeling can be prohibitive for practitioners in systems biology. This results in a barrier to the adoption of spatial stochastic simulation tools, effectively limiting the type of biological questions addressed by quantitative modeling. In this paper, we present PyURDME, a new, user-friendly spatial modeling and simulation package, and MOLNs, a cloud computing appliance for distributed simulation of stochastic reaction-diffusion models. MOLNs is based on IPython and provides an interactive programming platform for development of sharable and reproducible distributed parallel computational experiments. PMID:28190948
Software Engineering for Scientific Computer Simulations

NASA Astrophysics Data System (ADS)

Post, Douglass E.; Henderson, Dale B.; Kendall, Richard P.; Whitney, Earl M.

2004-11-01

Computer simulation is becoming a very powerful tool for analyzing and predicting the performance of fusion experiments. Simulation efforts are evolving from including only a few effects to many effects, from small teams with a few people to large teams, and from workstations and small processor count parallel computers to massively parallel platforms. Successfully making this transition requires attention to software engineering issues. We report on the conclusions drawn from a number of case studies of large scale scientific computing projects within DOE, academia and the DoD. The major lessons learned include attention to sound project management including setting reasonable and achievable requirements, building a good code team, enforcing customer focus, carrying out verification and validation and selecting the optimum computational mathematics approaches.
User’s Guide for the Coupled Ocean/Atmospheric Mesoscale Prediction System (COAMPS) Version 5.0

DTIC Science & Technology

2010-03-30

provides tools for common modeling functions, as well as regridding, data decomposition, and communication on parallel computers. NRL/MR/7320...specified gncomDir. If running COAMPS at the DSRC (e.g. BABBAGE, DAVINCI, or EINSTEIN), the global NCOM files will be copied to /scr/[user]/COAMPS/data...the site (DSRC or local) and the platform (BABBAGE, DAVINCI, EINSTEIN, or local machine) on which COAMPS is being run. site=navy_dsrc (for DSRC
The VISPA Internet Platform for Students

NASA Astrophysics Data System (ADS)

Asseldonk, D. v.; Erdmann, M.; Fischer, R.; Glaser, C.; Müller, G.; Quast, T.; Rieger, M.; Urban, M.

2016-04-01

The VISPA internet platform enables users to remotely run Python scripts and view resulting plots or inspect their output data. With a standard web browser as the only user requirement on the client-side, the system becomes suitable for blended learning approaches for university physics students. VISPA was used in two consecutive years each by approx. 100 third-year physics students at the RWTH Aachen University for their homework assignments. For example, in one exercise students gained a deeper understanding of Einstein's mass-energy relation by analyzing experimental data of electron-positron pairs revealing J/ψ and Z particles. Because the students were free to choose their working hours, only a few users accessed the platform simultaneously. The positive feedback from students and the stability of the platform led to further development of the concept. This year, students accessed the platform in parallel while they analyzed the data recorded by demonstrated experiments live in the lecture hall. The platform is based on experience in the development of professional analysis tools. It combines core technologies from previous projects: an object-oriented C++ library, a modular data-driven analysis flow, and visual analysis steering. We present the platform and discuss its benefits in the context of teaching based on surveys that are conducted each semester.
Interactive Parallel Data Analysis within Data-Centric Cluster Facilities using the IPython Notebook

NASA Astrophysics Data System (ADS)

Pascoe, S.; Lansdowne, J.; Iwi, A.; Stephens, A.; Kershaw, P.

2012-12-01

The data deluge is making traditional analysis workflows for many researchers obsolete. Support for parallelism within popular tools such as MATLAB, IDL and NCO is not well developed and rarely used. However, parallelism is necessary for processing modern data volumes on a timescale conducive to curiosity-driven analysis. Furthermore, for peta-scale datasets such as the CMIP5 archive, it is no longer practical to bring an entire dataset to a researcher's workstation for analysis, or even to their institutional cluster. Therefore, there is an increasing need to develop new analysis platforms which both enable processing at the point of data storage and provide parallelism. Such an environment should, where possible, maintain the convenience and familiarity of our current analysis environments to encourage curiosity-driven research. We describe how we are combining the interactive Python shell (IPython) with our JASMIN data-cluster infrastructure. IPython has been specifically designed to bridge the gap between the HPC-style parallel workflows and the opportunistic curiosity-driven analysis usually carried out using domain specific languages and scriptable tools. IPython offers a web-based interactive environment, the IPython notebook, and a cluster engine for parallelism, all underpinned by the well-respected Python/SciPy scientific programming stack. JASMIN is designed to support the data analysis requirements of the UK and European climate and earth system modeling community. JASMIN, with its sister facility CEMS focusing on the earth observation community, has 4.5 PB of fast parallel disk storage alongside over 370 computing cores that provide local computation. Through the IPython interface to JASMIN, users can make efficient use of JASMIN's multi-core virtual machines to perform interactive analysis on all cores simultaneously or can configure IPython clusters across multiple VMs. Larger-scale clusters can be provisioned through JASMIN's batch scheduling system.
Outputs can be summarised and visualised using the full power of Python's many scientific tools, including SciPy, Matplotlib, pandas and CDAT. This rich user experience is delivered through the user's web browser, maintaining the interactive feel of a workstation-based environment with the parallel power of a remote data-centric processing facility.
A Parallel Point Matching Algorithm for Landmark Based Image Registration Using Multicore Platform

PubMed Central

Yang, Lin; Gong, Leiguang; Zhang, Hong; Nosher, John L.; Foran, David J.

2013-01-01

Point matching is crucial for many computer vision applications. Establishing the correspondence between a large number of data points is a computationally intensive process. Some point matching related applications, such as medical image registration, require real time or near real time performance if applied to critical clinical applications like image assisted surgery. In this paper, we report a new multicore platform based parallel algorithm for fast point matching in the context of landmark based medical image registration. We introduced a non-regular data partition algorithm which utilizes the K-means clustering algorithm to group the landmarks based on the number of available processing cores, which optimizes the memory usage and data transfer. We have tested our method using the IBM Cell Broadband Engine (Cell/B.E.) platform. The results demonstrated a significant speed-up over its sequential implementation. The proposed data partition and parallelization algorithm, though tested only on one multicore platform, is generic by its design. Therefore the parallel algorithm can be extended to other computing platforms, as well as other point matching related applications. PMID:24308014
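The abstract does not give the details of its non-regular data partition, so the following is only a minimal sketch of the general idea it names: running plain K-means (Lloyd's algorithm) with the number of clusters set to the number of processing cores, so that each core receives one spatially compact group of landmarks. The function name `kmeans_partition`, the sample `LANDMARKS`, and the core count are all hypothetical.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_partition(points, n_groups, iterations=20, seed=0):
    """Split points into n_groups clusters with plain Lloyd's K-means."""
    rng = random.Random(seed)
    centers = rng.sample(points, n_groups)  # initial centers drawn from the data
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest center.
        labels = [
            min(range(n_groups), key=lambda k: squared_distance(p, centers[k]))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for k in range(n_groups):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:  # keep the old center if a cluster became empty
                centers[k] = tuple(sum(c) / len(members) for c in zip(*members))
    groups = [[] for _ in range(n_groups)]
    for p, label in zip(points, labels):
        groups[label].append(p)
    return groups


# Hypothetical 2-D landmarks and a hypothetical count of 3 worker cores.
LANDMARKS = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.1, 4.9), (9.0, 0.0), (9.2, 0.1)]
for core, group in enumerate(kmeans_partition(LANDMARKS, n_groups=3)):
    print(f"core {core}: {len(group)} landmarks")
```

Because K-means clusters can differ in size, the groups are not guaranteed to be balanced; how the paper handles load balance and data transfer on the Cell/B.E. is not stated in the abstract.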
Enhancing knowledge discovery from cancer genomics data with Galaxy

PubMed Central

Albuquerque, Marco A.; Grande, Bruno M.; Ritch, Elie J.; Pararajalingam, Prasath; Jessa, Selin; Krzywinski, Martin; Grewal, Jasleen K.; Shah, Sohrab P.; Boutros, Paul C.

2017-01-01

The field of cancer genomics has demonstrated the power of massively parallel sequencing techniques to inform on the genes and specific alterations that drive tumor onset and progression. Although large comprehensive sequence data sets continue to be made increasingly available, data analysis remains an ongoing challenge, particularly for laboratories lacking dedicated resources and bioinformatics expertise. To address this, we have produced a collection of Galaxy tools that represent many popular algorithms for detecting somatic genetic alterations from cancer genome and exome data. We developed new methods for parallelization of these tools within Galaxy to accelerate runtime and have demonstrated their usability and summarized their runtimes on multiple cloud service providers. Some tools represent extensions or refinement of existing toolkits to yield visualizations suited to cohort-wide cancer genomic analysis. For example, we present Oncocircos and Oncoprintplus, which generate data-rich summaries of exome-derived somatic mutation. Workflows that integrate these to achieve data integration and visualizations are demonstrated on a cohort of 96 diffuse large B-cell lymphomas and enabled the discovery of multiple candidate lymphoma-related genes. Our toolkit is available from our GitHub repository as Galaxy tool and dependency definitions and has been deployed using virtualization on multiple platforms including Docker. PMID:28327945
Enhancing knowledge discovery from cancer genomics data with Galaxy.

PubMed

Albuquerque, Marco A; Grande, Bruno M; Ritch, Elie J; Pararajalingam, Prasath; Jessa, Selin; Krzywinski, Martin; Grewal, Jasleen K; Shah, Sohrab P; Boutros, Paul C; Morin, Ryan D

2017-05-01

The field of cancer genomics has demonstrated the power of massively parallel sequencing techniques to inform on the genes and specific alterations that drive tumor onset and progression. Although large comprehensive sequence data sets continue to be made increasingly available, data analysis remains an ongoing challenge, particularly for laboratories lacking dedicated resources and bioinformatics expertise. To address this, we have produced a collection of Galaxy tools that represent many popular algorithms for detecting somatic genetic alterations from cancer genome and exome data. We developed new methods for parallelization of these tools within Galaxy to accelerate runtime and have demonstrated their usability and summarized their runtimes on multiple cloud service providers. Some tools represent extensions or refinement of existing toolkits to yield visualizations suited to cohort-wide cancer genomic analysis. For example, we present Oncocircos and Oncoprintplus, which generate data-rich summaries of exome-derived somatic mutation. Workflows that integrate these to achieve data integration and visualizations are demonstrated on a cohort of 96 diffuse large B-cell lymphomas and enabled the discovery of multiple candidate lymphoma-related genes. Our toolkit is available from our GitHub repository as Galaxy tool and dependency definitions and has been deployed using virtualization on multiple platforms including Docker. © The Author 2017. Published by Oxford University Press.
LLVM Infrastructure and Tools Project Summary

DOE Office of Scientific and Technical Information (OSTI.GOV)

McCormick, Patrick Sean

2017-11-06

This project works with the open source LLVM Compiler Infrastructure (http://llvm.org) to provide tools and capabilities that address needs and challenges faced by the ECP community (applications, libraries, and other components of the software stack). Our focus is on providing a more productive development environment that enables (i) improved compilation times and code generation for parallelism, (ii) additional features/capabilities within the design and implementations of LLVM components for improved platform/performance portability and (iii) improved aspects related to composition of the underlying implementation details of the programming environment, capturing resource utilization, overheads, etc. -- including runtime systems that are often not easily addressed by application and library developers.
Concurrent Collections (CnC): A new approach to parallel programming

DOE Office of Scientific and Technical Information (OSTI.GOV)

Knobe, Kathleen

2010-05-07

A common approach in designing parallel languages is to provide some high level handles to manipulate the use of the parallel platform. This exposes some aspects of the target platform, for example, shared vs. distributed memory. It may expose some but not all types of parallelism, for example, data parallelism but not task parallelism. This approach must find a balance between the desire to provide a simple view for the domain expert and provide sufficient power for tuning. This is hard for any given architecture and harder if the language is to apply to a range of architectures. Either simplicity or power is lost. Instead of viewing the language design problem as one of providing the programmer with high level handles, we view the problem as one of designing an interface. On one side of this interface is the programmer (domain expert) who knows the application but needs no knowledge of any aspects of the platform. On the other side of the interface is the performance expert (programmer or program) who demands maximal flexibility for optimizing the mapping to a wide range of target platforms (parallel / serial, shared / distributed, homogeneous / heterogeneous, etc.) but needs no knowledge of the domain. Concurrent Collections (CnC) is based on this separation of concerns. The talk will present CnC and its benefits. About the speaker. Kathleen Knobe has focused throughout her career on parallelism, especially compiler technology, runtime system design and language design. She worked at Compass (aka Massachusetts Computer Associates) from 1980 to 1991 designing compilers for a wide range of parallel platforms for Thinking Machines, MasPar, Alliant, Numerix, and several government projects. In 1991 she decided to finish her education. After graduating from MIT in 1997, she joined Digital Equipment’s Cambridge Research Lab (CRL). She stayed through the DEC/Compaq/HP mergers and when CRL was acquired and absorbed by Intel.
She currently works in the Software and Services Group / Technology Pathfinding and Innovation.
Concurrent Collections (CnC): A new approach to parallel programming

ScienceCinema

Knobe, Kathleen

2018-04-16

A common approach in designing parallel languages is to provide some high level handles to manipulate the use of the parallel platform. This exposes some aspects of the target platform, for example, shared vs. distributed memory. It may expose some but not all types of parallelism, for example, data parallelism but not task parallelism. This approach must find a balance between the desire to provide a simple view for the domain expert and provide sufficient power for tuning. This is hard for any given architecture and harder if the language is to apply to a range of architectures. Either simplicity or power is lost. Instead of viewing the language design problem as one of providing the programmer with high level handles, we view the problem as one of designing an interface. On one side of this interface is the programmer (domain expert) who knows the application but needs no knowledge of any aspects of the platform. On the other side of the interface is the performance expert (programmer or program) who demands maximal flexibility for optimizing the mapping to a wide range of target platforms (parallel / serial, shared / distributed, homogeneous / heterogeneous, etc.) but needs no knowledge of the domain. Concurrent Collections (CnC) is based on this separation of concerns. The talk will present CnC and its benefits. About the speaker. Kathleen Knobe has focused throughout her career on parallelism, especially compiler technology, runtime system design and language design. She worked at Compass (aka Massachusetts Computer Associates) from 1980 to 1991 designing compilers for a wide range of parallel platforms for Thinking Machines, MasPar, Alliant, Numerix, and several government projects. In 1991 she decided to finish her education. After graduating from MIT in 1997, she joined Digital Equipment’s Cambridge Research Lab (CRL). She stayed through the DEC/Compaq/HP mergers and when CRL was acquired and absorbed by Intel.
She currently works in the Software and Services Group / Technology Pathfinding and Innovation.

PARAMO: A Parallel Predictive Modeling Platform for Healthcare Analytic Research using Electronic Health Records

PubMed Central

Ng, Kenney; Ghoting, Amol; Steinhubl, Steven R.; Stewart, Walter F.; Malin, Bradley; Sun, Jimeng

2014-01-01

Objective: Healthcare analytics research increasingly involves the construction of predictive models for disease targets across varying patient cohorts using electronic health records (EHRs). To facilitate this process, it is critical to support a pipeline of tasks: 1) cohort construction, 2) feature construction, 3) cross-validation, 4) feature selection, and 5) classification. To develop an appropriate model, it is necessary to compare and refine models derived from a diversity of cohorts, patient-specific features, and statistical frameworks. The goal of this work is to develop and evaluate a predictive modeling platform that can be used to simplify and expedite this process for health data. Methods: To support this goal, we developed a PARAllel predictive MOdeling (PARAMO) platform which 1) constructs a dependency graph of tasks from specifications of predictive modeling pipelines, 2) schedules the tasks in a topological ordering of the graph, and 3) executes those tasks in parallel. We implemented this platform using Map-Reduce to enable independent tasks to run in parallel in a cluster computing environment. Different task scheduling preferences are also supported. Results: We assess the performance of PARAMO on various workloads using three datasets derived from the EHR systems in place at Geisinger Health System and Vanderbilt University Medical Center and an anonymous longitudinal claims database. We demonstrate significant gains in computational efficiency against a standard approach. In particular, PARAMO can build 800 different models on a 300,000 patient data set in 3 hours in parallel compared to 9 days if running sequentially. Conclusion: This work demonstrates that an efficient parallel predictive modeling platform can be developed for EHR data. This platform can facilitate large-scale modeling endeavors and speed up the research workflow and reuse of health information.
This platform is only a first step and provides the foundation for our ultimate goal of building analytic pipelines that are specialized for health data researchers. PMID:24370496
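The three steps named in the abstract (build a dependency graph, schedule in topological order, run independent tasks in parallel) can be sketched with the Python standard library. This is not PARAMO's implementation, which the abstract says uses Map-Reduce on a cluster; here a local thread pool stands in for the cluster, and the task names in `PIPELINE` are hypothetical.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
PIPELINE = {
    "cohort_construction": set(),
    "features_diagnoses": {"cohort_construction"},
    "features_labs": {"cohort_construction"},
    "cross_validation": {"features_diagnoses", "features_labs"},
    "feature_selection": {"cross_validation"},
    "classification": {"feature_selection"},
}


def run_pipeline(graph, run_task, max_workers=4):
    """Run every task once, starting a task only after its dependencies finish."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    finished = []
    pending = {}  # maps each in-flight future to its task name
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Submit every task whose dependencies are all done.
            for task in sorter.get_ready():
                pending[pool.submit(run_task, task)] = task
            done, _ = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                future.result()  # re-raise any exception from the task
                task = pending.pop(future)
                finished.append(task)
                sorter.done(task)  # may unlock dependent tasks
    return finished


order = run_pipeline(PIPELINE, run_task=lambda name: None)
print(order)
```

In this example the two feature-construction tasks have no dependency on each other, so they can run concurrently, while cross-validation waits for both.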
PARAMO: a PARAllel predictive MOdeling platform for healthcare analytic research using electronic health records.

PubMed

Ng, Kenney; Ghoting, Amol; Steinhubl, Steven R; Stewart, Walter F; Malin, Bradley; Sun, Jimeng

2014-04-01

Healthcare analytics research increasingly involves the construction of predictive models for disease targets across varying patient cohorts using electronic health records (EHRs). To facilitate this process, it is critical to support a pipeline of tasks: (1) cohort construction, (2) feature construction, (3) cross-validation, (4) feature selection, and (5) classification. To develop an appropriate model, it is necessary to compare and refine models derived from a diversity of cohorts, patient-specific features, and statistical frameworks. The goal of this work is to develop and evaluate a predictive modeling platform that can be used to simplify and expedite this process for health data. To support this goal, we developed a PARAllel predictive MOdeling (PARAMO) platform which (1) constructs a dependency graph of tasks from specifications of predictive modeling pipelines, (2) schedules the tasks in a topological ordering of the graph, and (3) executes those tasks in parallel. We implemented this platform using Map-Reduce to enable independent tasks to run in parallel in a cluster computing environment. Different task scheduling preferences are also supported. We assess the performance of PARAMO on various workloads using three datasets derived from the EHR systems in place at Geisinger Health System and Vanderbilt University Medical Center and an anonymous longitudinal claims database. We demonstrate significant gains in computational efficiency against a standard approach. In particular, PARAMO can build 800 different models on a 300,000 patient data set in 3 h in parallel compared to 9 days if running sequentially. This work demonstrates that an efficient parallel predictive modeling platform can be developed for EHR data. This platform can facilitate large-scale modeling endeavors and speed up the research workflow and reuse of health information.
This platform is only a first step and provides the foundation for our ultimate goal of building analytic pipelines that are specialized for health data researchers. Copyright © 2013 Elsevier Inc. All rights reserved.
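The three steps named in the abstract (build a dependency graph, order it topologically, run independent tasks in parallel) can be sketched as follows. This is a minimal illustration, not PARAMO's implementation: the abstract states that PARAMO uses Map-Reduce on a cluster, whereas this sketch assumes a local thread pool, and the function name `run_pipeline`, the `deps` mapping, and the `run_task` callback are hypothetical.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter  # Python 3.9+


def run_pipeline(deps, run_task, max_workers=4):
    """Run tasks in dependency order, executing independent tasks concurrently.

    deps maps each task name to the set of task names it depends on
    (hypothetical input format). Returns the tasks in completion order.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    completed = []
    pending = {}  # future -> task name
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Submit every task whose dependencies have all finished.
            for task in sorter.get_ready():
                pending[pool.submit(run_task, task)] = task
            # Block until at least one running task finishes.
            done, _ = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                task = pending.pop(future)
                future.result()  # re-raise any exception from the task
                completed.append(task)
                sorter.done(task)  # may unlock dependent tasks
    return completed
```

A pipeline such as cohort construction, then feature construction, then cross-validation would be passed as `{"features": {"cohort"}, "cross_validation": {"features"}}`; tasks that share no dependency path (for example, feature construction for two different cohorts) become ready at the same time and run concurrently.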
Supporting Building Portfolio Investment and Policy Decision Making through an Integrated Building Utility Data Platform

DOE Office of Scientific and Technical Information (OSTI.GOV)

Aziz, Azizan; Lasternas, Bertrand; Alschuler, Elena

The American Recovery and Reinvestment Act stimulus funding of 2009 for smart grid projects resulted in the tripling of smart meter deployment. In 2012, the Green Button initiative provided utility customers with access to their real-time energy usage. The availability of finely granular data provides an enormous potential for energy data analytics and energy benchmarking. The sheer volume of time-series utility data from a large number of buildings also poses challenges in data collection, quality control, and database management for rigorous and meaningful analyses. In this paper, we will describe a building portfolio-level data analytics tool for operational optimization, business investment and policy assessment using utility data at 15-minute to monthly intervals. The analytics tool is developed on top of the U.S. Department of Energy’s Standard Energy Efficiency Data (SEED) platform, an open source software application that manages energy performance data of large groups of buildings. To support the significantly large volume of granular interval data, we integrated a parallel time-series database with the existing relational database. The time-series database improves on the current utility data input, focusing on real-time data collection, storage, analytics and data quality control. The fully integrated data platform supports APIs for utility apps development by third party software developers. These apps will provide actionable intelligence for building owners and facilities managers. Unlike a commercial system, this platform is an open source platform funded by the U.S. Government, accessible to the public, researchers and other developers, to support initiatives in reducing building energy consumption.
Parallel Implementation of Triangular Cellular Automata for Computing Two-Dimensional Elastodynamic Response on Arbitrary Domains

NASA Astrophysics Data System (ADS)

Leamy, Michael J.; Springer, Adam C.

In this research we report a parallel implementation of a Cellular Automata-based simulation tool for computing elastodynamic response on complex, two-dimensional domains. Elastodynamic simulation using Cellular Automata (CA) has recently been presented as an alternative, inherently object-oriented technique for accurately and efficiently computing linear and nonlinear wave propagation in arbitrarily-shaped geometries. The local, autonomous nature of the method should lead to straightforward and efficient parallelization. We address this notion on symmetric multiprocessor (SMP) hardware using a Java-based object-oriented CA code implementing triangular state machines (i.e., automata) and the MPI bindings written in Java (MPJ Express). We use MPJ Express to reconfigure our existing CA code to distribute a domain's automata to cores present on a dual quad-core shared-memory system (eight total processors). We note that this message passing parallelization strategy is directly applicable to cluster computing, which will be the focus of follow-on research. Results on the shared memory platform indicate nearly-ideal, linear speed-up. We conclude that the CA-based elastodynamic simulator is easily configured to run in parallel, and yields excellent speed-up on SMP hardware.
The Automated Instrumentation and Monitoring System (AIMS) reference manual

NASA Technical Reports Server (NTRS)

Yan, Jerry; Hontalas, Philip; Listgarten, Sherry

1993-01-01

Whether a researcher is designing the 'next parallel programming paradigm,' another 'scalable multiprocessor' or investigating resource allocation algorithms for multiprocessors, a facility that enables parallel program execution to be captured and displayed is invaluable. Careful analysis of execution traces can help computer designers and software architects to uncover system behavior and to take advantage of specific application characteristics and hardware features. A software tool kit that facilitates performance evaluation of parallel applications on multiprocessors is described. The Automated Instrumentation and Monitoring System (AIMS) has four major software components: a source code instrumentor which automatically inserts active event recorders into the program's source code before compilation; a run-time performance-monitoring library, which collects performance data; a trace file animation and analysis tool kit which reconstructs program execution from the trace file; and a trace post-processor which compensates for data collection overhead. Besides being used as a prototype for developing new techniques for instrumenting, monitoring, and visualizing parallel program execution, AIMS is also being incorporated into the run-time environments of various hardware test beds to evaluate their impact on user productivity. Currently, AIMS instrumentors accept FORTRAN and C parallel programs written for Intel's NX operating system on the iPSC family of multicomputers. A run-time performance-monitoring library for the iPSC/860 is included in this release. We plan to release monitors for other platforms (such as PVM and TMC's CM-5) in the near future. Performance data collected can be graphically displayed on workstations (e.g. Sun Sparc and SGI) supporting X-Windows (in particular, X11R5, Motif 1.1.3).
Separation and parallel sequencing of the genomes and transcriptomes of single cells using G&T-seq.

PubMed

Macaulay, Iain C; Teng, Mabel J; Haerty, Wilfried; Kumar, Parveen; Ponting, Chris P; Voet, Thierry

2016-11-01

Parallel sequencing of a single cell's genome and transcriptome provides a powerful tool for dissecting genetic variation and its relationship with gene expression. Here we present a detailed protocol for G&T-seq, a method for separation and parallel sequencing of genomic DNA and full-length polyA(+) mRNA from single cells. We provide step-by-step instructions for the isolation and lysis of single cells; the physical separation of polyA(+) mRNA from genomic DNA using a modified oligo-dT bead capture and the respective whole-transcriptome and whole-genome amplifications; and library preparation and sequence analyses of these amplification products. The method allows the detection of thousands of transcripts in parallel with the genetic variants captured by the DNA-seq data from the same single cell. G&T-seq differs from other currently available methods for parallel DNA and RNA sequencing from single cells, as it involves physical separation of the DNA and RNA and does not require bespoke microfluidics platforms. The process can be implemented manually or through automation. When performed manually, paired genome and transcriptome sequencing libraries from eight single cells can be produced in ∼3 d by researchers experienced in molecular laboratory work. For users with experience in the programming and operation of liquid-handling robots, paired DNA and RNA libraries from 96 single cells can be produced in the same time frame. Sequence analysis and integration of single-cell G&T-seq DNA and RNA data requires a high level of bioinformatics expertise and familiarity with a wide range of informatics tools.
Performance Analysis and Scaling Behavior of the Terrestrial Systems Modeling Platform TerrSysMP in Large-Scale Supercomputing Environments

NASA Astrophysics Data System (ADS)

Kollet, S. J.; Goergen, K.; Gasper, F.; Shresta, P.; Sulis, M.; Rihani, J.; Simmer, C.; Vereecken, H.

2013-12-01

In studies of the terrestrial hydrologic, energy and biogeochemical cycles, integrated multi-physics simulation platforms take a central role in characterizing non-linear interactions, variances and uncertainties of system states and fluxes in reciprocity with observations. Recently developed integrated simulation platforms attempt to honor the complexity of the terrestrial system across multiple time and space scales from the deeper subsurface including groundwater dynamics into the atmosphere. Technically, this requires the coupling of atmospheric, land surface, and subsurface-surface flow models in supercomputing environments, while ensuring a high degree of efficiency in the utilization of, e.g., standard Linux clusters and massively parallel resources. A systematic performance analysis including profiling and tracing in such an application is crucial in the understanding of the runtime behavior, to identify optimum model settings, and is an efficient way to distinguish potential parallel deficiencies. On sophisticated leadership-class supercomputers, such as the 28-rack 5.9 petaFLOP IBM Blue Gene/Q 'JUQUEEN' of the Jülich Supercomputing Centre (JSC), this is a challenging task, but all the more important when complex coupled component models are to be analysed. Here we present our experience from coupling, application tuning (e.g. 5-times speedup through compiler optimizations), parallel scaling and performance monitoring of the parallel Terrestrial Systems Modeling Platform TerrSysMP. The modeling platform consists of the weather prediction system COSMO of the German Weather Service; the Community Land Model, CLM of NCAR; and the variably saturated surface-subsurface flow code ParFlow. The model system relies on the Multiple Program Multiple Data (MPMD) execution model where the external Ocean-Atmosphere-Sea-Ice-Soil coupler (OASIS3) links the component models.
TerrSysMP has been instrumented with the performance analysis tool Scalasca and analyzed on JUQUEEN with processor counts on the order of 10,000. The instrumentation is used in weak and strong scaling studies with real data cases and hypothetical idealized numerical experiments for detailed profiling and tracing analysis. The profiling is not only useful in identifying wait states that are due to the MPMD execution model, but also in fine-tuning resource allocation to the component models in search of the most suitable load balancing. This is especially necessary, as with numerical experiments that cover multiple (high resolution) spatial scales, the time stepping, coupling frequencies, and communication overheads are constantly shifting, which makes it necessary to re-determine the model setup with each new experimental design.
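For readers unfamiliar with how a strong-scaling study is summarized, the following sketch computes speedup and parallel efficiency relative to the smallest measured processor count. It is a generic illustration, not the TerrSysMP or Scalasca workflow; the function name `strong_scaling` and the timing values in the usage note are hypothetical and do not come from the abstract.

```python
def strong_scaling(timings):
    """Summarize a strong-scaling experiment (fixed problem size).

    timings maps processor count -> wall-clock time in seconds.
    The run with the fewest processors is used as the baseline.
    Returns a list of (processors, speedup, efficiency) tuples.
    """
    base_procs = min(timings)
    base_time = timings[base_procs]
    rows = []
    for procs in sorted(timings):
        speedup = base_time / timings[procs]
        # Ideal speedup equals the factor by which the processor count grew.
        efficiency = speedup / (procs / base_procs)
        rows.append((procs, speedup, efficiency))
    return rows
```

With made-up timings `{1: 100.0, 2: 50.0, 4: 40.0}`, the 4-processor run has a speedup of 2.5 and an efficiency of 0.625, which is the kind of drop that profiling for wait states and load imbalance is meant to explain.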
Volume-rendering on a 3D hyperwall: A molecular visualization platform for research, education and outreach.

PubMed

MacDougall, Preston J; Henze, Christopher E; Volkov, Anatoliy

2016-11-01

We present a unique platform for molecular visualization and design that uses novel subatomic feature detection software in tandem with 3D hyperwall visualization technology. We demonstrate the fleshing-out of pharmacophores in drug molecules, as well as reactive sites in catalysts, focusing on subatomic features. Topological analysis with picometer resolution, in conjunction with interactive volume-rendering of the Laplacian of the electronic charge density, leads to new insight into docking and catalysis. Visual data-mining is done efficiently and in parallel using a 4×4 3D hyperwall (a tiled array of 3D monitors driven independently by slave GPUs but displaying high-resolution, synchronized and functionally-related images). The visual texture of images for a wide variety of molecular systems is intuitive to experienced chemists but also appealing to neophytes, making the platform simultaneously useful as a tool for advanced research as well as for pedagogical and STEM education outreach purposes. Copyright © 2016. Published by Elsevier Inc.
Approaching the exa-scale: a real-world evaluation of rendering extremely large data sets

DOE Office of Scientific and Technical Information (OSTI.GOV)

Patchett, John M; Ahrens, James P; Lo, Li-Ta

2010-10-15

Extremely large scale analysis is becoming increasingly important as supercomputers and their simulations move from petascale to exascale. The lack of dedicated hardware acceleration for rendering on today's supercomputing platforms motivates our detailed evaluation of the possibility of interactive rendering on the supercomputer. In order to facilitate our understanding of rendering on the supercomputing platform, we focus on scalability of rendering algorithms and architecture envisioned for exascale datasets. To understand tradeoffs for dealing with extremely large datasets, we compare three different rendering algorithms for large polygonal data: software based ray tracing, software based rasterization and hardware accelerated rasterization. We present a case study of strong and weak scaling of rendering extremely large data on both GPU and CPU based parallel supercomputers using ParaView, a parallel visualization tool. We use three different data sets: two synthetic and one from a scientific application. At an extreme scale, algorithmic rendering choices make a difference and should be considered while approaching exascale computing, visualization, and analysis. We find software based ray-tracing offers a viable approach for scalable rendering of the projected future massive data sizes.
CONRAD—A software framework for cone-beam imaging in radiology

DOE Office of Scientific and Technical Information (OSTI.GOV)

Maier, Andreas; Choi, Jang-Hwan; Riess, Christian

2013-11-15

Purpose: In the community of x-ray imaging, there is a multitude of tools and applications that are used in scientific practice. Many of these tools are proprietary and can only be used within a certain lab. Often the same algorithm is implemented multiple times by different groups in order to enable comparison. In an effort to tackle this problem, the authors created CONRAD, a software framework that provides many of the tools that are required to simulate basic processes in x-ray imaging and perform image reconstruction with consideration of nonlinear physical effects. Methods: CONRAD is a Java-based state-of-the-art software platform with extensive documentation. It is based on platform-independent technologies. Special libraries offer access to hardware acceleration such as OpenCL. There is an easy-to-use interface for parallel processing. The software package includes different simulation tools that are able to generate up to 4D projection and volume data and respective vector motion fields. Well known reconstruction algorithms such as FBP, DBP, and ART are included. All algorithms in the package are referenced to a scientific source. Results: A total of 13 different phantoms and 30 processing steps have already been integrated into the platform at the time of writing. The platform comprises 74,000 nonblank lines of code out of which 19% are used for documentation. The software package is available for download at http://conrad.stanford.edu. To demonstrate the use of the package, the authors reconstructed images from two different scanners, a table top system and a clinical C-arm system. Runtimes were evaluated using the RabbitCT platform and demonstrate state-of-the-art runtimes with 2.5 s for the 256 problem size and 12.4 s for the 512 problem size. Conclusions: As a common software framework, CONRAD enables the medical physics community to share algorithms and develop new ideas.
In particular this offers new opportunities for scientific collaboration and quantitative performance comparison between the methods of different groups.
Using Performance Tools to Support Experiments in HPC Resilience

DOE Office of Scientific and Technical Information (OSTI.GOV)

Naughton, III, Thomas J; Boehm, Swen; Engelmann, Christian

2014-01-01

The high performance computing (HPC) community is working to address fault tolerance and resilience concerns for current and future large scale computing platforms. This is driving enhancements in the programming environments, specifically research on enhancing message passing libraries to support fault tolerant computing capabilities. The community has also recognized that tools for resilience experimentation are greatly lacking. However, we argue that there are several parallels between performance tools and resilience tools. As such, we believe the rich set of HPC performance-focused tools can be extended (repurposed) to benefit the resilience community. In this paper, we describe the initial motivation to leverage standard HPC performance analysis techniques to aid in developing diagnostic tools to assist fault tolerance experiments for HPC applications. These diagnosis procedures help to provide context for the system when the errors (failures) occurred. We describe our initial work in leveraging an MPI performance trace tool to assist in providing global context during fault injection experiments. Such tools will assist the HPC resilience community as they extend existing and new application codes to support fault tolerance.
Modern Computational Techniques for the HMMER Sequence Analysis

PubMed Central

2013-01-01

This paper focuses on the latest research and critical reviews on modern computing architectures and on software- and hardware-accelerated algorithms for bioinformatics data analysis, with an emphasis on one of the most important sequence analysis applications: hidden Markov models (HMM). We show a detailed performance comparison of sequence analysis tools on various computing platforms recently developed in the bioinformatics community. The characteristics of sequence analysis, such as its data- and compute-intensive nature, make it very attractive to optimize and parallelize using both traditional software approaches and innovative hardware acceleration technologies. PMID:25937944
MerCat: a versatile k-mer counter and diversity estimator for database-independent property analysis obtained from metagenomic and/or metatranscriptomic sequencing data

DOE Office of Scientific and Technical Information (OSTI.GOV)

White, Richard A.; Panyala, Ajay R.; Glass, Kevin A.

MerCat is a parallel, highly scalable and modular property software package for robust analysis of features in next-generation sequencing data. MerCat inputs include assembled contigs and raw sequence reads from any platform resulting in feature abundance counts tables. MerCat allows for direct analysis of data properties without reference sequence database dependency commonly used by search tools such as BLAST and/or DIAMOND for compositional analysis of whole community shotgun sequencing (e.g. metagenomes and metatranscriptomes).
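To make the notion of a k-mer abundance table concrete, the sketch below counts overlapping k-mers across a set of sequences. It is a sequential illustration only and is not MerCat's code or interface; the function name `count_kmers` and its arguments are hypothetical.

```python
from collections import Counter


def count_kmers(sequences, k):
    """Count every overlapping substring of length k in the given sequences.

    sequences is an iterable of strings (e.g. reads or contigs);
    sequences shorter than k contribute nothing.
    """
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()  # treat lower-case (soft-masked) bases the same
        for start in range(len(seq) - k + 1):
            counts[seq[start:start + k]] += 1
    return counts
```

Because `Counter` objects can be added together, counts computed independently on separate chunks of the input can be merged afterwards, which is one simple way such a computation could be split across workers.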
Xyce parallel electronic simulator users guide, version 6.1

DOE Office of Scientific and Technical Information (OSTI.GOV)

Keiter, Eric R; Mei, Ting; Russo, Thomas V.

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers; A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to develop new types of analysis without requiring the implementation of analysis-specific device models; Device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only); and Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase - a message passing parallel implementation - which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.
Xyce parallel electronic simulator users' guide, Version 6.0.1.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Keiter, Eric R; Mei, Ting; Russo, Thomas V.

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers. A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to develop new types of analysis without requiring the implementation of analysis-specific device models. Device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only). Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase - a message passing parallel implementation - which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.
Xyce parallel electronic simulator users guide, version 6.0.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Keiter, Eric R; Mei, Ting; Russo, Thomas V.

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers. A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to develop new types of analysis without requiring the implementation of analysis-specific device models. Device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only). Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase - a message passing parallel implementation - which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.
Parallel k-means++ for Multiple Shared-Memory Architectures

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mackey, Patrick S.; Lewis, Robert R.

2016-09-22

In recent years k-means++ has become a popular initialization technique for improved k-means clustering. To date, most of the work done to improve its performance has involved parallelizing algorithms that are only approximations of k-means++. In this paper we present a parallelization of the exact k-means++ algorithm, with a proof of its correctness. We develop implementations for three distinct shared-memory architectures: multicore CPU, high performance GPU, and the massively multithreaded Cray XMT platform. We demonstrate the scalability of the algorithm on each platform. In addition we present a visual approach for showing which platform performed k-means++ the fastest for varying data sizes.
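For reference, the exact (sequential) k-means++ seeding rule that the abstract refers to can be sketched as below: the first center is drawn uniformly, and each later center is drawn with probability proportional to its squared distance from the nearest center chosen so far. This is a plain single-threaded sketch, not the authors' parallel implementation; the function names are hypothetical, and it assumes the input contains at least k distinct points.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeanspp_init(points, k, rng=None):
    """Choose k initial centers from points using exact k-means++ seeding.

    Assumes points holds at least k distinct points; otherwise all
    weights can become zero and random.choices raises ValueError.
    """
    rng = rng or random.Random()
    centers = [rng.choice(points)]  # first center: uniform at random
    # nearest[i] = squared distance from points[i] to its closest center
    nearest = [squared_distance(p, centers[0]) for p in points]
    while len(centers) < k:
        # Sample an index with probability proportional to nearest[i].
        idx = rng.choices(range(len(points)), weights=nearest, k=1)[0]
        centers.append(points[idx])
        # Update each point's distance to its closest center.
        nearest = [
            min(old, squared_distance(p, points[idx]))
            for old, p in zip(nearest, points)
        ]
    return centers
```

The distance computations and the per-point update are independent across points, and the weighted draw depends on a running sum over those distances, so these are the natural places to look when thinking about where parallelism could be introduced.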
GISpark: A Geospatial Distributed Computing Platform for Spatiotemporal Big Data

NASA Astrophysics Data System (ADS)

Wang, S.; Zhong, E.; Wang, E.; Zhong, Y.; Cai, W.; Li, S.; Gao, S.

2016-12-01

Geospatial data are growing exponentially because of the proliferation of cost effective and ubiquitous positioning technologies such as global remote-sensing satellites and location-based devices. Analyzing large amounts of geospatial data can provide great value for both industrial and scientific applications. The data- and compute-intensive characteristics inherent in geospatial big data increasingly pose great challenges to technologies for data storage, computing and analysis. Such challenges require a scalable and efficient architecture that can store, query, analyze, and visualize large-scale spatiotemporal data. Therefore, we developed GISpark - a geospatial distributed computing platform for processing large-scale vector, raster and stream data. GISpark is constructed based on the latest virtualized computing infrastructures and distributed computing architecture. OpenStack and Docker are used to build a multi-user hosting cloud computing infrastructure for GISpark. Virtual storage systems such as HDFS, Ceph, and MongoDB are combined and adopted for spatiotemporal data storage management. A Spark-based algorithm framework is developed for efficient parallel computing. Within this framework, SuperMap GIScript and various open-source GIS libraries can be integrated into GISpark. GISpark can also be integrated with scientific computing environments (e.g., Anaconda), interactive computing web applications (e.g., Jupyter notebook), and machine learning tools (e.g., TensorFlow/Orange). The associated geospatial facilities of GISpark, in conjunction with the scientific computing environment, exploratory spatial data analysis tools, and temporal data management and analysis systems, make up a powerful geospatial computing tool. GISpark not only provides spatiotemporal big data processing capacity in the geospatial field, but also provides a spatiotemporal computational model and advanced geospatial visualization tools that deal with other domains related to spatial properties.
We tested the performance of the platform based on taxi trajectory analysis. Results suggested that GISpark achieves excellent run time performance in spatiotemporal big data applications.
An Evaluation of Architectural Platforms for Parallel Navier-Stokes Computations

NASA Technical Reports Server (NTRS)

Jayasimha, D. N.; Hayder, M. E.; Pillay, S. K.

1996-01-01

We study the computational, communication, and scalability characteristics of a computational fluid dynamics application, which solves the time accurate flow field of a jet using the compressible Navier-Stokes equations, on a variety of parallel architecture platforms. The platforms chosen for this study are a cluster of workstations (the LACE experimental testbed at NASA Lewis), a shared memory multiprocessor (the Cray YMP), and distributed memory multiprocessors with different topologies - the IBM SP and the Cray T3D. We investigate the impact of various networks connecting the cluster of workstations on the performance of the application and the overheads induced by popular message passing libraries used for parallelization. The work also highlights the importance of matching the memory bandwidth to the processor speed for good single processor performance. By studying the performance of an application on a variety of architectures, we are able to point out the strengths and weaknesses of each of the example computing platforms.
Parallelizing Navier-Stokes Computations on a Variety of Architectural Platforms

NASA Technical Reports Server (NTRS)

Jayasimha, D. N.; Hayder, M. E.; Pillay, S. K.

1997-01-01

We study the computational, communication, and scalability characteristics of a Computational Fluid Dynamics application, which solves the time accurate flow field of a jet using the compressible Navier-Stokes equations, on a variety of parallel architectural platforms. The platforms chosen for this study are a cluster of workstations (the LACE experimental testbed at NASA Lewis), a shared memory multiprocessor (the Cray YMP), and distributed memory multiprocessors with different topologies - the IBM SP and the Cray T3D. We investigate the impact of various networks connecting the cluster of workstations on the performance of the application and the overheads induced by popular message passing libraries used for parallelization. The work also highlights the importance of matching the memory bandwidth to the processor speed for good single processor performance. By studying the performance of an application on a variety of architectures, we are able to point out the strengths and weaknesses of each of the example computing platforms.

Using Intel's Knight Landing Processor to Accelerate Global Nested Air Quality Prediction Modeling System (GNAQPMS) Model

NASA Astrophysics Data System (ADS)

Wang, H.; Chen, H.; Chen, X.; Wu, Q.; Wang, Z.

2016-12-01

The Global Nested Air Quality Prediction Modeling System for Hg (GNAQPMS-Hg) is a global chemical transport model coupled with a Hg transport module to investigate mercury pollution. In this study, we present our work on porting the GNAQPMS model to the Intel Xeon Phi processor Knights Landing (KNL) to accelerate the model. KNL is the second-generation product adopting the Many Integrated Core (MIC) architecture. Compared with the first-generation Knights Corner (KNC), KNL has more new hardware features and can be used as a standalone processor as well as a coprocessor alongside another CPU. Using the VTune tool, the high-overhead modules in the GNAQPMS model were identified, including the CBMZ gas chemistry, the advection and convection module, and the wet deposition module. These high-overhead modules were accelerated by optimizing the code and using new techniques of KNL. The following optimization measures were applied: 1) changing the pure MPI parallel mode to a hybrid parallel mode with MPI and OpenMP; 2) vectorizing the code to use the 512-bit wide vector computation unit; 3) reducing unnecessary memory access and calculation; 4) reducing Thread Local Storage (TLS) for common variables with each OpenMP thread in CBMZ; 5) changing the way of global communication from file writing and reading to MPI functions. After optimization, the performance of GNAQPMS increased greatly on both the CPU and KNL platforms. The single-node test showed that the optimized version has a 2.6x speedup on the two-socket CPU platform and a 3.3x speedup on the one-socket KNL platform compared with the baseline version of the code, which means that KNL has a 1.29x speedup compared with the two-socket CPU platform.
Performance Analysis and Portability of the PLUM Load Balancing System

NASA Technical Reports Server (NTRS)

Oliker, Leonid; Biswas, Rupak; Gabow, Harold N.

1998-01-01

The ability to dynamically adapt an unstructured mesh is a powerful tool for solving computational problems with evolving physical features; however, an efficient parallel implementation is rather difficult. To address this problem, we have developed PLUM, an automatic portable framework for performing adaptive numerical computations in a message-passing environment. PLUM requires that all data be globally redistributed after each mesh adaption to achieve load balance. We present an algorithm for minimizing this remapping overhead by guaranteeing an optimal processor reassignment. We also show that the data redistribution cost can be significantly reduced by applying our heuristic processor reassignment algorithm to the default mapping of the parallel partitioner. Portability is examined by comparing performance on an SP2, an Origin2000, and a T3E. Results show that PLUM can be successfully ported to different platforms without any code modifications.
Xyce parallel electronic simulator : users' guide.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mei, Ting; Rankin, Eric Lamont; Thornquist, Heidi K.

2011-05-01

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: (1) Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). Note that this includes support for most popular parallel and serial computers; (2) Improved performance for all numerical kernels (e.g., time integrator, nonlinear and linear solvers) through state-of-the-art algorithms and novel techniques; (3) Device models which are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only); and (4) Object-oriented code design and implementation using modern coding practices that ensure that the Xyce Parallel Electronic Simulator will be maintainable and extensible far into the future. Xyce is a parallel code in the most general sense of the phrase - a message passing parallel implementation - which allows it to run efficiently on the widest possible number of computing platforms. These include serial, shared-memory and distributed-memory parallel as well as heterogeneous platforms. Careful attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. The development of Xyce provides a platform for computational research and development aimed specifically at the needs of the Laboratory. With Xyce, Sandia has an 'in-house' capability with which both new electrical (e.g., device model development) and algorithmic (e.g., faster time-integration methods, parallel solver algorithms) research and development can be performed. As a result, Xyce is a unique electrical simulation capability, designed to meet the unique needs of the laboratory.
Development, Verification and Validation of Parallel, Scalable Volume of Fluid CFD Program for Propulsion Applications

NASA Technical Reports Server (NTRS)

West, Jeff; Yang, H. Q.

2014-01-01

There are many instances involving liquid/gas interfaces and their dynamics in the design of liquid engine powered rockets such as the Space Launch System (SLS). Some examples of these applications are: propellant tank draining and slosh, subcritical condition injector analysis for gas generators, preburners and thrust chambers, water deluge mitigation for launch induced environments and even solid rocket motor liquid slag dynamics. Commercially available CFD programs simulating gas/liquid interfaces using the Volume of Fluid approach are currently limited in their parallel scalability. In 2010, for instance, an internal NASA/MSFC review of three commercial tools revealed that parallel scalability was seriously compromised at 8 cpus and no additional speedup was possible after 32 cpus. Other non-interface CFD applications at the time were demonstrating useful parallel scalability up to 4,096 processors or more. Based on this review, NASA/MSFC initiated an effort to implement the Volume of Fluid method within the unstructured mesh, pressure-based algorithm CFD program, Loci-STREAM. After verification was achieved by comparing results to the commercial CFD program CFD-Ace+, and validation by direct comparison with data, Loci-STREAM-VoF is now the production CFD tool for propellant slosh force and slosh damping rate simulations at NASA/MSFC. On these applications, good parallel scalability has been demonstrated for problem sizes of tens of millions of cells and thousands of cpu cores. Ongoing efforts are focused on the application of Loci-STREAM-VoF to predict the transient flow patterns of water on the SLS Mobile Launch Platform in order to support the phasing of water for launch environment mitigation so that detrimental effects on the vehicle are not realized.
New Techniques for Deep Learning with Geospatial Data using TensorFlow, Earth Engine, and Google Cloud Platform

NASA Astrophysics Data System (ADS)

Hancher, M.

2017-12-01

Recent years have seen promising results from many research teams applying deep learning techniques to geospatial data processing. In that same timeframe, TensorFlow has emerged as the most popular framework for deep learning in general, and Google has assembled petabytes of Earth observation data from a wide variety of sources and made them available in analysis-ready form in the cloud through Google Earth Engine. Nevertheless, developing and applying deep learning to geospatial data at scale has been somewhat cumbersome to date. We present a new set of tools and techniques that simplify this process. Our approach combines the strengths of several underlying tools: TensorFlow for its expressive deep learning framework; Earth Engine for data management, preprocessing, postprocessing, and visualization; and other tools in Google Cloud Platform to train TensorFlow models at scale, perform additional custom parallel data processing, and drive the entire process from a single familiar Python development environment. These tools can be used to easily apply standard deep neural networks, convolutional neural networks, and other custom model architectures to a variety of geospatial data structures. We discuss our experiences applying these and related tools to a range of machine learning problems, including classic problems like cloud detection, building detection, land cover classification, as well as more novel problems like illegal fishing detection. Our improved tools will make it easier for geospatial data scientists to apply modern deep learning techniques to their own problems, and will also make it easier for machine learning researchers to advance the state of the art of those techniques.
Xyce Parallel Electronic Simulator Users' Guide Version 6.8

DOE Office of Scientific and Technical Information (OSTI.GOV)

Keiter, Eric R.; Aadithya, Karthik Venkatraman; Mei, Ting

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: (1) Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers; (2) A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to develop new types of analysis without requiring the implementation of analysis-specific device models; (3) Device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only); and (4) Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase - a message passing parallel implementation - which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.
Understanding the Cray X1 System

NASA Technical Reports Server (NTRS)

Cheung, Samson

2004-01-01

This paper helps the reader understand the characteristics of the Cray X1 vector supercomputer system, and provides hints and information to enable the reader to port codes to the system. It provides a comparison between the basic performance of the X1 platform and other platforms that are available at NASA Ames Research Center. A set of codes, solving the Laplacian equation with different parallel paradigms, is used to understand some features of the X1 compiler. An example code from the NAS Parallel Benchmarks is used to demonstrate performance optimization on the X1 platform.
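As a rough illustration of the kind of kernel such test codes are built around, the sketch below solves the Laplace equation on a small square grid with Jacobi iteration. It is a generic serial textbook example in Python, not one of the codes used in the paper; the grid size, boundary values and iteration count are assumptions chosen for the example. Each interior update depends only on values from the previous sweep, which is why the same loop nest lends itself to vector, shared-memory or message-passing variants.

```python
def jacobi_laplace(grid, iterations):
    """Run Jacobi sweeps on a 2D grid whose edge cells hold fixed
    (Dirichlet) boundary values; only interior cells are updated."""
    rows, cols = len(grid), len(grid[0])
    for _ in range(iterations):
        new = [row[:] for row in grid]  # copy so every read sees the old sweep
        for i in range(1, rows - 1):
            for j in range(1, cols - 1):
                new[i][j] = 0.25 * (
                    grid[i - 1][j] + grid[i + 1][j]
                    + grid[i][j - 1] + grid[i][j + 1]
                )
        grid = new
    return grid


# 5x5 grid: top edge held at 1.0, the other three edges at 0.0.
n = 5
grid = [[0.0] * n for _ in range(n)]
grid[0] = [1.0] * n
solution = jacobi_laplace(grid, 200)
center = solution[n // 2][n // 2]
print(round(center, 6))
```

By symmetry the converged value at the centre of this grid is one quarter of the top-edge value, which gives a quick sanity check on any ported version.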
Dynamics and control of cable-suspended parallel robots for giant telescopes

NASA Astrophysics Data System (ADS)

Zhuang, Peng; Yao, Zhengqiu

2006-06-01

A cable-suspended parallel robot utilizes the basic idea of the Stewart platform but replaces parallel links with cables and linear actuators with winches. It has many advantages over a conventional crane. The concept of applying a cable-suspended parallel robot to the construction and maintenance of giant telescopes is presented in this paper. Compared with the mass and travel of the moving platform of the robot, the mass and deformation of the cables can be disregarded. Based on these premises, the kinematic and dynamic models of the robot are built. Through simulation, the inertia and gravity of the moving platform are found to have a dominant effect on the dynamic characteristics of the robot, while the dynamics of the actuators can be disregarded, so a simplified dynamic model applicable to real-time control is obtained. Moreover, according to the control-law partitioning approach and optimization theory, a workspace model-based controller is proposed that accounts for the fact that the cables can only pull but not push. The simulation results indicate that the controller possesses good accuracy in pose and speed tracking, and keeps the cables in reliable tension by maintaining the minimum strain above a certain given value, thus ensuring smooth motion and accurate localization of the moving platform.
ms2: A molecular simulation tool for thermodynamic properties

NASA Astrophysics Data System (ADS)

Deublein, Stephan; Eckl, Bernhard; Stoll, Jürgen; Lishchuk, Sergey V.; Guevara-Carrion, Gabriela; Glass, Colin W.; Merker, Thorsten; Bernreuther, Martin; Hasse, Hans; Vrabec, Jadran

2011-11-01

This work presents the molecular simulation program ms2 that is designed for the calculation of thermodynamic properties of bulk fluids in equilibrium consisting of small electro-neutral molecules. ms2 features the two main molecular simulation techniques, molecular dynamics (MD) and Monte-Carlo. It supports the calculation of vapor-liquid equilibria of pure fluids and multi-component mixtures described by rigid molecular models on the basis of the grand equilibrium method. Furthermore, it is capable of sampling various classical ensembles and yields numerous thermodynamic properties. To evaluate the chemical potential, Widom's test molecule method and gradual insertion are implemented. Transport properties are determined by equilibrium MD simulations following the Green-Kubo formalism. ms2 is designed to meet the requirements of academia and industry, particularly achieving short response times and straightforward handling. It is written in Fortran90 and optimized for a fast execution on a broad range of computer architectures, spanning from single processor PCs over PC-clusters and vector computers to high-end parallel machines. The standard Message Passing Interface (MPI) is used for parallelization and ms2 is therefore easily portable to different computing platforms. Feature tools facilitate the interaction with the code and the interpretation of input and output files. The accuracy and reliability of ms2 has been shown for a large variety of fluids in preceding work.
Program summary
Program title: ms2
Catalogue identifier: AEJF_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEJF_v1_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: Special Licence supplied by the authors
No. of lines in distributed program, including test data, etc.: 82 794
No. of bytes in distributed program, including test data, etc.: 793 705
Distribution format: tar.gz
Programming language: Fortran90
Computer: The simulation tool ms2 is usable on a wide variety of platforms, from single processor machines over PC-clusters and vector computers to vector-parallel architectures. (Tested with Fortran compilers: gfortran, Intel, PathScale, Portland Group and Sun Studio.)
Operating system: Unix/Linux, Windows
Has the code been vectorized or parallelized?: Yes. Message Passing Interface (MPI) protocol
Scalability: Excellent scalability up to 16 processors for molecular dynamics and >512 processors for Monte-Carlo simulations.
RAM: ms2 runs on single processors with 512 MB RAM. The memory demand rises with increasing number of processors used per node and increasing number of molecules.
Classification: 7.7, 7.9, 12
External routines: Message Passing Interface (MPI)
Nature of problem: Calculation of application oriented thermodynamic properties for rigid electro-neutral molecules: vapor-liquid equilibria, thermal and caloric data as well as transport properties of pure fluids and multi-component mixtures.
Solution method: Molecular dynamics, Monte-Carlo, various classical ensembles, grand equilibrium method, Green-Kubo formalism.
Restrictions: No. The system size is user-defined. Typical problems addressed by ms2 can be solved by simulating systems containing typically 2000 molecules or less.
Unusual features: Feature tools are available for creating input files, analyzing simulation results and visualizing molecular trajectories.
Additional comments: Sample makefiles for multiple operation platforms are provided. Documentation is provided with the installation package and is available at http://www.ms-2.de.
Running time: The running time of ms2 depends on the problem set, the system size and the number of processes used in the simulation. Running four processes on a "Nehalem" processor, simulations calculating VLE data take between two and twelve hours, calculating transport properties between six and 24 hours.
Turbine airfoil to shroud attachment method

DOE Office of Scientific and Technical Information (OSTI.GOV)

Campbell, Christian X; Kulkarni, Anand A; James, Allister W

2014-12-23

Bi-casting a platform (50) onto an end portion (42) of a turbine airfoil (31) after forming a coating of a fugitive material (56) on the end portion. After bi-casting the platform, the coating is dissolved and removed to relieve differential thermal shrinkage stress between the airfoil and platform. The thickness of the coating is varied around the end portion in proportion to varying amounts of local differential process shrinkage. The coating may be sprayed (76A, 76B) onto the end portion in opposite directions parallel to a chord line (41) of the airfoil or parallel to a mid-platform length (80) of the platform to form respective layers tapering in thickness from the leading (32) and trailing (34) edges along the suction side (36) of the airfoil.
Targeted Capture and High-Throughput Sequencing Using Molecular Inversion Probes (MIPs).

PubMed

Cantsilieris, Stuart; Stessman, Holly A; Shendure, Jay; Eichler, Evan E

2017-01-01

Molecular inversion probes (MIPs) in combination with massively parallel DNA sequencing represent a versatile, yet economical tool for targeted sequencing of genomic DNA. Several thousand genomic targets can be selectively captured using long oligonucleotides containing unique targeting arms and universal linkers. The ability to append sequencing adaptors and sample-specific barcodes allows large-scale pooling and subsequent high-throughput sequencing at relatively low cost per sample. Here, we describe a "wet bench" protocol detailing the capture and subsequent sequencing of >2000 genomic targets from 192 samples, representative of a single lane on the Illumina HiSeq 2000 platform.
SU-E-T-157: CARMEN: A MatLab-Based Research Platform for Monte Carlo Treatment Planning (MCTP) and Customized System for Planning Evaluation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Baeza, J.A.; Ureba, A.; Jimenez-Ortega, E.

Purpose: Although several radiotherapy research platforms exist, such as CERR, the most widely used and referenced; SlicerRT, which allows treatment plan comparison from various sources; and MMCTP, a full MCTP system, there is still a need for a full MCTP toolset that gives users complete control of calculation grids, interpolation methods and filters in order to “fairly” compare results from different TPSs, supporting verification with experimental measurements. Methods: This work presents CARMEN, a MatLab-based platform including multicore and GPGPU accelerated functions for loading RT data; designing treatment plans; and evaluating dose matrices and experimental data. CARMEN supports anatomic and functional imaging in DICOM format, as well as RTSTRUCT, RTPLAN and RTDOSE. In addition, it contains numerous tools to accomplish the MCTP process, managing egs4phant and phase space files. The CARMEN planning mode assists in designing IMRT, VMAT and MERT treatments via both inverse and direct optimization. The evaluation mode contains a comprehensive toolset (e.g. 2D/3D gamma evaluation, difference matrices, profiles, DVH, etc.) to compare datasets from commercial TPS, MC simulations (i.e. 3ddose) and radiochromic film in a user-controlled manner. Results: CARMEN has been validated against commercial RTPs and well-established evaluation tools, showing coherent behavior of its multiple algorithms. Furthermore, the CARMEN platform has been used to generate competitive complex treatments that have been published in comparative studies. Conclusion: A new research-oriented MCTP platform with a customized validation toolset has been presented. Despite being coded in a high-level programming language, CARMEN is agile due to the use of parallel algorithms. The widespread use of MatLab provides straightforward access to CARMEN’s algorithms to most researchers. Similarly, our platform can benefit from the MatLab community's scientific developments such as filters, registration algorithms, etc. Finally, CARMEN highlights the importance of grid and filtering control in treatment plan comparison.
Imaging electric field dynamics with graphene optoelectronics.

PubMed

Horng, Jason; Balch, Halleh B; McGuire, Allister F; Tsai, Hsin-Zon; Forrester, Patrick R; Crommie, Michael F; Cui, Bianxiao; Wang, Feng

2016-12-16

The use of electric fields for signalling and control in liquids is widespread, spanning bioelectric activity in cells to electrical manipulation of microstructures in lab-on-a-chip devices. However, an appropriate tool to resolve the spatio-temporal distribution of electric fields over a large dynamic range has yet to be developed. Here we present a label-free method to image local electric fields in real time and under ambient conditions. Our technique combines the unique gate-variable optical transitions of graphene with a critically coupled planar waveguide platform that enables highly sensitive detection of local electric fields with a voltage sensitivity of a few microvolts, a spatial resolution of tens of micrometres and a frequency response over tens of kilohertz. Our imaging platform enables parallel detection of electric fields over a large field of view and can be tailored to broad applications spanning lab-on-a-chip device engineering to analysis of bioelectric phenomena.
Launching genomics into the cloud: deployment of Mercury, a next generation sequence analysis pipeline.

PubMed

Reid, Jeffrey G; Carroll, Andrew; Veeraraghavan, Narayanan; Dahdouli, Mahmoud; Sundquist, Andreas; English, Adam; Bainbridge, Matthew; White, Simon; Salerno, William; Buhay, Christian; Yu, Fuli; Muzny, Donna; Daly, Richard; Duyk, Geoff; Gibbs, Richard A; Boerwinkle, Eric

2014-01-29

Massively parallel DNA sequencing generates staggering amounts of data. Decreasing cost, increasing throughput, and improved annotation have expanded the diversity of genomics applications in research and clinical practice. This expanding scale creates analytical challenges: accommodating peak compute demand, coordinating secure access for multiple analysts, and sharing validated tools and results. To address these challenges, we have developed the Mercury analysis pipeline and deployed it in local hardware and the Amazon Web Services cloud via the DNAnexus platform. Mercury is an automated, flexible, and extensible analysis workflow that provides accurate and reproducible genomic results at scales ranging from individuals to large cohorts. By taking advantage of cloud computing and with Mercury implemented on the DNAnexus platform, we have demonstrated a powerful combination of a robust and fully validated software pipeline and a scalable computational resource that, to date, we have applied to more than 10,000 whole genome and whole exome samples.
DOVIS: an implementation for high-throughput virtual screening using AutoDock.

PubMed

Zhang, Shuxing; Kumar, Kamal; Jiang, Xiaohui; Wallqvist, Anders; Reifman, Jaques

2008-02-27

Molecular-docking-based virtual screening is an important tool in drug discovery that is used to significantly reduce the number of possible chemical compounds to be investigated. In addition to the selection of a sound docking strategy with appropriate scoring functions, another technical challenge is to in silico screen millions of compounds in a reasonable time. To meet this challenge, it is necessary to use high performance computing (HPC) platforms and techniques. However, the development of an integrated HPC system that makes efficient use of its elements is not trivial. We have developed an application termed DOVIS that uses AutoDock (version 3) as the docking engine and runs in parallel on a Linux cluster. DOVIS can efficiently dock large numbers (millions) of small molecules (ligands) to a receptor, screening 500 to 1,000 compounds per processor per day. Furthermore, in DOVIS, the docking session is fully integrated and automated in that the inputs are specified via a graphical user interface, the calculations are fully integrated with a Linux cluster queuing system for parallel processing, and the results can be visualized and queried. DOVIS removes most of the complexities and organizational problems associated with large-scale high-throughput virtual screening, and provides a convenient and efficient solution for AutoDock users to use this software in a Linux cluster platform.
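The quoted throughput translates directly into cluster sizing. The short sketch below is a back-of-the-envelope calculation only: it assumes a hypothetical library of two million ligands and a one-week deadline, neither of which comes from the abstract, and it assumes that throughput scales linearly with processor count.

```python
import math


def processors_needed(num_compounds, per_proc_per_day, deadline_days):
    """Smallest processor count that finishes the screen by the deadline,
    assuming throughput scales linearly with the number of processors."""
    return math.ceil(num_compounds / (per_proc_per_day * deadline_days))


# Hypothetical campaign: 2 million ligands, results wanted within 7 days.
library_size = 2_000_000
deadline_days = 7

# The abstract quotes 500 to 1,000 compounds per processor per day.
worst_case = processors_needed(library_size, 500, deadline_days)
best_case = processors_needed(library_size, 1000, deadline_days)
print(worst_case, best_case)
```

Under these assumptions the campaign needs a few hundred processors, which is consistent with the abstract's point that screens of this size call for HPC platforms.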
Parallelization of an Object-Oriented Unstructured Aeroacoustics Solver

NASA Technical Reports Server (NTRS)

Baggag, Abdelkader; Atkins, Harold; Oezturan, Can; Keyes, David

1999-01-01

A computational aeroacoustics code based on the discontinuous Galerkin method is ported to several parallel platforms using MPI. The discontinuous Galerkin method is a compact high-order method that retains its accuracy and robustness on non-smooth unstructured meshes. In its semi-discrete form, the discontinuous Galerkin method can be combined with explicit time marching methods, making it well suited to time-accurate computations. The compact nature of the discontinuous Galerkin method also makes it well suited for distributed memory parallel platforms. The original serial code was written using an object-oriented approach and was previously optimized for cache-based machines. The port to parallel platforms was achieved simply by treating partition boundaries as a type of boundary condition. Code modifications were minimal because boundary conditions were abstractions in the original program. Scalability results are presented for the SGI Origin, IBM SP2, and clusters of SGI and Sun workstations. Slightly superlinear speedup is achieved on a fixed-size problem on the Origin, due to cache effects.
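The design idea of treating a partition boundary as just another boundary condition can be sketched in a few lines. The example below is a toy one-dimensional smoother in Python, not the discontinuous Galerkin solver from the paper; the class names are invented. The "neighbour" lookup stands in for what would be an MPI message exchange in a real distributed code.

```python
class FixedValue:
    """Physical boundary: the ghost value is a constant."""

    def __init__(self, value):
        self.value = value

    def ghost(self):
        return self.value


class PartitionBoundary:
    """Partition boundary: the ghost value is the neighbour's edge cell.
    A distributed code would receive this value in a message; here the
    neighbour lives in the same process (an assumption of this sketch)."""

    def __init__(self, neighbour, side):
        self.neighbour = neighbour
        self.side = side  # which edge of the neighbour to read: "left" or "right"

    def ghost(self):
        cells = self.neighbour.cells
        return cells[0] if self.side == "left" else cells[-1]


class Subdomain:
    def __init__(self, cells):
        self.cells = list(cells)
        self.left_bc = None
        self.right_bc = None

    def smoothed(self):
        """Three-point average; the solver never asks what kind of
        boundary condition supplies the ghost values."""
        padded = [self.left_bc.ghost()] + self.cells + [self.right_bc.ghost()]
        return [
            (padded[i - 1] + padded[i] + padded[i + 1]) / 3.0
            for i in range(1, len(padded) - 1)
        ]


def step(subdomains):
    # Compute every update first so all ghost values come from the old state.
    updates = [s.smoothed() for s in subdomains]
    for s, new_cells in zip(subdomains, updates):
        s.cells = new_cells


data = [0.0, 3.0, 6.0, 9.0, 12.0, 15.0]

# Serial run on the whole domain.
whole = Subdomain(data)
whole.left_bc, whole.right_bc = FixedValue(0.0), FixedValue(15.0)

# Same domain split in two; only the boundary objects change.
left, right = Subdomain(data[:3]), Subdomain(data[3:])
left.left_bc, left.right_bc = FixedValue(0.0), PartitionBoundary(right, "left")
right.left_bc, right.right_bc = PartitionBoundary(left, "right"), FixedValue(15.0)

step([whole])
step([left, right])
print(left.cells + right.cells == whole.cells)
```

Because the solver only ever calls `ghost()`, splitting the domain required no change to the update routine, which mirrors the abstract's remark that code modifications were minimal.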
A Lightweight Remote Parallel Visualization Platform for Interactive Massive Time-varying Climate Data Analysis

NASA Astrophysics Data System (ADS)

Li, J.; Zhang, T.; Huang, Q.; Liu, Q.

2014-12-01

Today's climate datasets are characterized by large volume, a high degree of spatiotemporal complexity, and rapid evolution over time. Because visualizing large, distributed climate datasets is computationally intensive, traditional desktop-based visualization applications cannot handle the computational load. Recently, scientists have developed remote visualization techniques to address this issue. Remote visualization techniques usually leverage server-side parallel computing capabilities to perform visualization tasks and deliver the results to clients through the network. In this research, we aim to build a remote parallel visualization platform for visualizing and analyzing massive climate data. Our visualization platform is built on ParaView, one of the most popular open source remote visualization and analysis applications. To further enhance the scalability and stability of the platform, we have employed cloud computing techniques to support its deployment. In this platform, all climate datasets are regular grid data stored in NetCDF format. Three types of data access methods are supported: accessing remote datasets provided by OpenDAP servers, accessing datasets hosted on the web visualization server, and accessing local datasets. Regardless of the data access method, all visualization tasks are completed on the server side to reduce the workload of clients. As a proof of concept, we have implemented a set of scientific visualization methods to show the feasibility of the platform. Preliminary results indicate that the framework can address the computational limitations of desktop-based visualization applications.
DInSAR time series generation within a cloud computing environment: from ERS to Sentinel-1 scenario

NASA Astrophysics Data System (ADS)

Casu, Francesco; Elefante, Stefano; Imperatore, Pasquale; Lanari, Riccardo; Manunta, Michele; Zinno, Ivana; Mathot, Emmanuel; Brito, Fabrice; Farres, Jordi; Lengert, Wolfgang

2013-04-01

One of the techniques that will strongly benefit from the advent of the Sentinel-1 system is Differential SAR Interferometry (DInSAR), which has been successfully demonstrated to be an effective tool for detecting and monitoring ground displacements with centimetre accuracy. The geoscience communities (volcanology, seismicity, …), as well as those related to hazard monitoring and risk mitigation, make extensive use of the DInSAR technique and will take advantage of the huge amount of SAR data acquired by Sentinel-1. Indeed, such information will permit the generation of Earth's surface displacement maps and time series both over large areas and long time spans. However, the issue of managing, processing and analysing the large Sentinel data stream is envisaged by the scientific community to be a major bottleneck, particularly during crisis phases. The emerging need to create a common ecosystem in which data, results and processing tools are shared is envisaged to be a successful way to address this problem and to contribute to the spreading of information and knowledge. The Supersites initiative as well as the ESA SuperSites Exploitation Platform (SSEP) and the ESA Cloud Computing Operational Pilot (CIOP) projects provide effective answers to this need and are pushing towards the development of such an ecosystem. It is clear that all the existing tools for querying, processing and analysing SAR data need to be not only updated to manage the large data stream of the Sentinel-1 satellite, but also reorganized to respond quickly to simultaneous and highly demanding user requests, mainly during emergency situations. This translates into the automatic and unsupervised processing of large amounts of data as well as the availability of scalable, widely accessible and high performance computing capabilities. The cloud computing environment makes it possible to achieve all of these objectives, particularly in the case of spikes and peaks in requests for processing resources linked to disaster events. This work presents a parallel computational model for the widely used DInSAR algorithm named Small BAseline Subset (SBAS), which has been implemented within the cloud computing environment provided by the ESA-CIOP platform. This activity has resulted in a scalable, unsupervised, portable, and widely accessible (through a web portal) parallel DInSAR computational tool. The SBAS application algorithm has been rewritten and developed within a parallel system environment, i.e., in a form that allows us to benefit from multiple processing units. This requires devising a parallel version of the SBAS algorithm and subsequently implementing it, which implies additional complexity in algorithm design and efficient multiprocessor programming, with the final aim of optimizing parallel performance. Although the presented algorithm has been designed to work with Sentinel-1 data, it can also process other satellite SAR data (ERS, ENVISAT, CSK, TSX, ALOS). Indeed, the performance of the implemented SBAS parallel version has been tested on the full ASAR archive (64 acquisitions) acquired over the Napoli Bay, a volcanic and densely urbanized area in Southern Italy. The full processing, from the raw data download to the generation of DInSAR time series, has been carried out by engaging 4 nodes, each one with 2 cores and 16 GB of RAM, and has taken about 36 hours, compared with about 135 hours for the sequential version. Extensive analysis on other test areas that are significant from the DInSAR and geophysical viewpoints will be presented. Finally, a preliminary performance evaluation of the presented approach within the Sentinel-1 scenario will be provided.
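The reported timings can be turned into the usual speedup and parallel efficiency figures. The sketch below uses only the approximate numbers quoted in the abstract (about 135 hours sequential, about 36 hours on 4 nodes with 2 cores each); treating all 8 cores as equal workers is a simplifying assumption, since the measured time also includes data download.

```python
def speedup_and_efficiency(t_sequential, t_parallel, workers):
    """Speedup = sequential time / parallel time;
    efficiency = speedup / number of workers."""
    speedup = t_sequential / t_parallel
    return speedup, speedup / workers


cores = 4 * 2  # 4 nodes, 2 cores each, as quoted in the abstract
speedup, efficiency = speedup_and_efficiency(135.0, 36.0, cores)
print(f"speedup {speedup:.2f}x on {cores} cores, efficiency {efficiency:.0%}")
```

With these figures the speedup is 3.75x, i.e. a little under half of the ideal 8x for the core count used.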
Programming with BIG data in R: Scaling analytics from one to thousands of nodes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Schmidt, Drew; Chen, Wei -Chen; Matheson, Michael A.

Here, we present a tutorial overview showing how one can achieve scalable performance with R. We do so by utilizing several package extensions, including those from the pbdR project. These packages consist of high performance, high-level interfaces to and extensions of MPI, PBLAS, ScaLAPACK, I/O libraries, profiling libraries, and more. While these libraries shine brightest on large distributed platforms, they also work rather well on small clusters and often, surprisingly, even on a laptop with only two cores. Our tutorial begins with recommendations on how to get more performance out of your R code before considering parallel implementations. Because R is a high-level language, a function can have a deep hierarchy of operations. For big data, this can easily lead to inefficiency. Profiling is an important tool to understand the performance of an R code for both serial and parallel improvements.
Portable Parallel Programming for the Dynamic Load Balancing of Unstructured Grid Applications

NASA Technical Reports Server (NTRS)

Biswas, Rupak; Das, Sajal K.; Harvey, Daniel; Oliker, Leonid

1999-01-01

The ability to dynamically adapt an unstructured grid (or mesh) is a powerful tool for solving computational problems with evolving physical features; however, an efficient parallel implementation is rather difficult, particularly from the viewpoint of portability on various multiprocessor platforms. We address this problem by developing PLUM, an automatic and architecture-independent framework for adaptive numerical computations in a message-passing environment. Portability is demonstrated by comparing performance on an SP2, an Origin2000, and a T3E, without any code modifications. We also present a general-purpose load balancer that utilizes symmetric broadcast networks (SBN) as the underlying communication pattern, with the goal of providing a global view of system loads across processors. Experiments on an SP2 and an Origin2000 demonstrate the portability of our approach, which achieves superb load balance at the cost of minimal extra overhead.

Programming with BIG data in R: Scaling analytics from one to thousands of nodes

DOE PAGES

Schmidt, Drew; Chen, Wei -Chen; Matheson, Michael A.; ...

2016-11-09

Here, we present a tutorial overview showing how one can achieve scalable performance with R. We do so by utilizing several package extensions, including those from the pbdR project. These packages consist of high performance, high-level interfaces to and extensions of MPI, PBLAS, ScaLAPACK, I/O libraries, profiling libraries, and more. While these libraries shine brightest on large distributed platforms, they also work rather well on small clusters and often, surprisingly, even on a laptop with only two cores. Our tutorial begins with recommendations on how to get more performance out of your R code before considering parallel implementations. Because R is a high-level language, a function can have a deep hierarchy of operations. For big data, this can easily lead to inefficiency. Profiling is an important tool to understand the performance of an R code for both serial and parallel improvements.
Xyce™ Parallel Electronic Simulator Users' Guide, Version 6.5.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Keiter, Eric R.; Aadithya, Karthik V.; Mei, Ting

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers. A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to develop new types of analysis without requiring the implementation of analysis-specific device models. Device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only). Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase -- a message passing parallel implementation -- which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. The information herein is subject to change without notice. Copyright © 2002-2016 Sandia Corporation. All rights reserved.
Ordering Traces Logically to Identify Lateness in Message Passing Programs

DOE PAGES

Isaacs, Katherine E.; Gamblin, Todd; Bhatele, Abhinav; ...

2015-03-30

Event traces are valuable for understanding the behavior of parallel programs. However, automatically analyzing a large parallel trace is difficult, especially without a specific objective. We aid this endeavor by extracting a trace's logical structure, an ordering of trace events derived from happened-before relationships, while taking into account developer intent. Using this structure, we can calculate an operation's delay relative to its peers on other processes. The logical structure also serves as a platform for comparing and clustering processes as well as highlighting communication patterns in a trace visualization. We present an algorithm for determining this idealized logical structure from traces of message passing programs, and we develop metrics to quantify delays and differences among processes. We implement our techniques in Ravel, a parallel trace visualization tool that displays both logical and physical timelines. Rather than showing the duration of each operation, we display where delays begin and end, and how they propagate. As a result, we apply our approach to the traces of several message passing applications, demonstrating the accuracy of our extracted structure and its utility in analyzing these codes.
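
The idea of ordering events by happened-before relationships and then measuring delay against peers can be illustrated with a minimal sketch. This is not Ravel's algorithm: the trace layout, the function names `logical_steps` and `lateness`, and the definition of lateness as "time behind the earliest event in the same logical step" are all assumptions made for illustration, and developer intent is not modelled at all.

```python
from collections import defaultdict

# Hypothetical trace: (process, kind, message_id, physical_time), with the
# events of each process listed in program order. Illustrative data only.
TRACE = [
    (0, "send", "m1", 1.0),
    (0, "recv", "m2", 9.0),
    (1, "recv", "m1", 2.0),
    (1, "send", "m2", 8.0),
    (2, "send", "m3", 1.5),
    (2, "recv", "m4", 3.0),
    (3, "recv", "m3", 2.5),
    (3, "send", "m4", 2.75),
]


def logical_steps(trace):
    """Assign each event a logical step from its happened-before edges.

    An event depends on the previous event of the same process and, if it
    is a receive, on the matching send. Assumes every receive has a
    matching send and that the dependencies contain no cycle.
    """
    per_proc = defaultdict(list)
    for idx, (proc, _kind, _msg, _time) in enumerate(trace):
        per_proc[proc].append(idx)

    # Program-order edge: previous event on the same process.
    prev_on_proc = {}
    for events in per_proc.values():
        for earlier, later in zip(events, events[1:]):
            prev_on_proc[later] = earlier

    # Message edge: index of the send for every message id.
    send_of = {msg: i for i, (_p, kind, msg, _t) in enumerate(trace)
               if kind == "send"}

    steps = {}

    def step(i):
        if i not in steps:
            deps = []
            if i in prev_on_proc:
                deps.append(prev_on_proc[i])
            if trace[i][1] == "recv":
                deps.append(send_of[trace[i][2]])
            # One step after the latest dependency; step 0 if there is none.
            steps[i] = 1 + max((step(d) for d in deps), default=-1)
        return steps[i]

    return [step(i) for i in range(len(trace))]


def lateness(trace):
    """Physical time of each event minus the earliest time in its step."""
    steps = logical_steps(trace)
    earliest = {}
    for s, (_p, _k, _m, t) in zip(steps, trace):
        earliest[s] = min(t, earliest.get(s, t))
    return [t - earliest[s] for s, (_p, _k, _m, t) in zip(steps, trace)]
```

In this toy trace, processes 0/1 and 2/3 perform the same exchange pattern, so their events land on the same logical steps even though the first pair runs several time units behind; `lateness(TRACE)` makes that gap visible per event.
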
SNAVA-A real-time multi-FPGA multi-model spiking neural network simulation architecture.

PubMed

Sripad, Athul; Sanchez, Giovanny; Zapata, Mireya; Pirrone, Vito; Dorta, Taho; Cambria, Salvatore; Marti, Albert; Krishnamourthy, Karthikeyan; Madrenas, Jordi

2018-01-01

The Spiking Neural Networks (SNN) for Versatile Applications (SNAVA) simulation platform is a scalable and programmable parallel architecture that supports real-time, large-scale, multi-model SNN computation. This parallel architecture is implemented in modern Field-Programmable Gate Array (FPGA) devices to provide high performance execution and flexibility to support large-scale SNN models. Flexibility is defined in terms of programmability, which allows easy synapse and neuron implementation. This has been achieved by using special-purpose Processing Elements (PEs) for computing SNNs, and by analyzing and customizing the instruction set according to the processing needs to achieve maximum performance with minimum resources. The parallel architecture is interfaced with customized Graphical User Interfaces (GUIs) to configure the SNN's connectivity, to compile the neuron-synapse model and to monitor the SNN's activity. Our contribution intends to provide a tool that allows SNNs to be prototyped faster than on CPU/GPU architectures but significantly more cheaply than fabricating a customized neuromorphic chip. This could be potentially valuable to the computational neuroscience and neuromorphic engineering communities. Copyright © 2017 Elsevier Ltd. All rights reserved.
Trajectory Tracking of a Planer Parallel Manipulator by Using Computed Force Control Method

NASA Astrophysics Data System (ADS)

Bayram, Atilla

2017-03-01

Despite their small workspace, parallel manipulators have some advantages over their serial counterparts in terms of higher speed, acceleration, rigidity, accuracy, manufacturing cost and payload. Accordingly, this type of manipulator can be used in many applications such as in high-speed machine tools, tuning machine for feeding, sensitive cutting, assembly and packaging. This paper presents a special type of planar parallel manipulator with three degrees of freedom. It is constructed as a variable geometry truss, generally known as a planar Stewart platform. The reachable and orientation workspaces are obtained for this manipulator. The inverse kinematic analysis is solved for the trajectory tracking according to the redundancy and joint limit avoidance. Then, the dynamics model of the manipulator is established by using the Virtual Work method. The simulations are performed to follow the given planar trajectories by using the dynamic equations of the variable geometry truss manipulator and the computed force control method. In the computed force control method, the feedback gain matrices for PD control are tuned with fixed matrices by trial and error and with variable ones by means of optimization with a genetic algorithm.
Dispel4py: An Open-Source Python library for Data-Intensive Seismology

NASA Astrophysics Data System (ADS)

Filgueira, Rosa; Krause, Amrey; Spinuso, Alessandro; Klampanos, Iraklis; Danecek, Peter; Atkinson, Malcolm

2015-04-01

Scientific workflows are a necessary tool for many scientific communities as they enable easy composition and execution of applications on computing resources while scientists can focus on their research without being distracted by the computation management. Nowadays, scientific communities (e.g. seismology) have access to a large variety of computing resources and their computational problems are best addressed using parallel computing technology. However, successful use of these technologies requires a lot of additional machinery whose use is not straightforward for non-experts: different parallel frameworks (MPI, Storm, multiprocessing, etc.) must be used depending on the computing resources (local machines, grids, clouds, clusters) where applications are run. This implies that to achieve the best application performance, users usually have to change their codes depending on the features of the platform selected for running them. This work presents dispel4py, a new open-source Python library for describing abstract stream-based workflows for distributed data-intensive applications. Special care has been taken to provide dispel4py with the ability to map abstract workflows to different platforms dynamically at run-time. Currently dispel4py has four mappings: Apache Storm, MPI, multi-threading and sequential. The main goal of dispel4py is to provide an easy-to-use tool to develop and test workflows on local resources by using the sequential mode with a small dataset. Later, once a workflow is ready for long runs, it can be automatically executed on different parallel resources. dispel4py takes care of the underlying mappings by performing an efficient parallelisation. Processing Elements (PE) represent the basic computational activities of any dispel4py workflow, which can be a seismological algorithm or a data transformation process.
For creating a dispel4py workflow, users only have to write very few lines of code to describe their PEs and how they are connected by using Python, which is widely supported on many platforms and is popular in many scientific domains, such as in geosciences. Once a dispel4py workflow is written, a user only has to select which mapping they would like to use, and everything else (parallelisation, distribution of data) is carried out by dispel4py without any cost to the user. Among all dispel4py features we would like to highlight the following: * The PEs are connected by streams and not by writing to and reading from intermediate files, avoiding many IO operations. * The PEs can be stored into a registry. Therefore, different users can recombine PEs in many different workflows. * dispel4py has been enriched with a provenance mechanism to support runtime provenance analysis. We have adopted the W3C-PROV data model, which is accessible via a prototype browser-based user interface and a web API. It supports the users with the visualisation of graphical products and offers combined operations to access and download the data, which may be selectively stored at runtime, into dedicated data archives. dispel4py has already been used by seismologists in the VERCE project to develop different seismic workflows. One of them is the Seismic Ambient Noise Cross-Correlation workflow, which preprocesses and cross-correlates traces from several stations. First, this workflow was tested on a local machine by using a small number of stations as input data. Later, it was executed on different parallel platforms (SuperMUC cluster, and Terracorrelator machine), automatically scaling up by using MPI and multiprocessing mappings and up to 1000 stations as input data. The results show that dispel4py achieves scalable performance in both mappings tested on different parallel platforms.
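
The notion of processing elements connected by streams rather than intermediate files, run first in a sequential mode on a small dataset, can be sketched in plain Python. The sketch below deliberately does not use the dispel4py API: the generator-based PEs, the `run_sequential` helper, the station data, and the zero-lag normalised correlation are all hypothetical stand-ins chosen to keep the example self-contained.

```python
import math

# Hypothetical input: a handful of short traces keyed by station name.
STATIONS = {
    "A": [1.0, 2.0, 3.0],
    "B": [2.0, 4.0, 6.0],
    "C": [3.0, 2.0, 1.0],
}


def read_traces(stations):
    """Producer PE: emit one (station, samples) item per station."""
    for name, samples in stations.items():
        yield name, samples


def demean(stream):
    """Preprocessing PE: subtract the mean from every incoming trace."""
    for name, samples in stream:
        mean = sum(samples) / len(samples)
        yield name, [x - mean for x in samples]


def cross_correlate(stream):
    """Pairing PE: zero-lag normalised correlation for each station pair.

    Keeps the traces seen so far in memory and pairs each new trace with
    them. Assumes equal-length, non-constant traces (no zero denominator).
    """
    seen = []
    for name, samples in stream:
        for other, other_samples in seen:
            num = sum(a * b for a, b in zip(samples, other_samples))
            den = math.sqrt(sum(a * a for a in samples)
                            * sum(b * b for b in other_samples))
            yield (other, name), num / den
        seen.append((name, samples))


def run_sequential(source, *pes):
    """Sequential 'mapping': chain the PEs lazily and drain the stream."""
    stream = source
    for pe in pes:
        stream = pe(stream)
    return list(stream)


RESULT = run_sequential(read_traces(STATIONS), demean, cross_correlate)
```

Because each PE only consumes and yields items, the same workflow description could in principle be handed to a different executor (for instance one that places each PE in its own process); that separation between workflow and mapping is the design point the abstract emphasises.
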
Optimized Hypervisor Scheduler for Parallel Discrete Event Simulations on Virtual Machine Platforms

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yoginath, Srikanth B; Perumalla, Kalyan S

2013-01-01

With the advent of virtual machine (VM)-based platforms for parallel computing, it is now possible to execute parallel discrete event simulations (PDES) over multiple virtual machines, in contrast to executing in native mode directly over hardware as has traditionally been done over the past decades. While mature VM-based parallel systems now offer new, compelling benefits such as serviceability, dynamic reconfigurability and overall cost effectiveness, the runtime performance of parallel applications can be significantly affected. In particular, most VM-based platforms are optimized for general workloads, but PDES execution exhibits unique dynamics significantly different from other workloads. Here we first present results from experiments that highlight the gross deterioration of the runtime performance of VM-based PDES simulations when executed using traditional VM schedulers, quantitatively showing the bad scaling properties of the scheduler as the number of VMs is increased. The mismatch is fundamental in nature in the sense that any fairness-based VM scheduler implementation would exhibit this mismatch with PDES runs. We also present a new scheduler optimized specifically for PDES applications, and describe its design and implementation. Experimental results obtained from running PDES benchmarks (PHOLD and vehicular traffic simulations) over VMs show over an order of magnitude improvement in the run time of the PDES-optimized scheduler relative to the regular VM scheduler, with over 20× reduction in run time of simulations using up to 64 VMs. The observations and results are timely in the context of emerging systems such as cloud platforms and VM-based high performance computing installations, highlighting to the community the need for PDES-specific support, and the feasibility of significantly reducing the runtime overhead for scalable PDES on VM platforms.
The TeraShake Computational Platform for Large-Scale Earthquake Simulations

NASA Astrophysics Data System (ADS)

Cui, Yifeng; Olsen, Kim; Chourasia, Amit; Moore, Reagan; Maechling, Philip; Jordan, Thomas

Geoscientific and computer science researchers with the Southern California Earthquake Center (SCEC) are conducting a large-scale, physics-based, computationally demanding earthquake system science research program with the goal of developing predictive models of earthquake processes. The computational demands of this program continue to increase rapidly as these researchers seek to perform physics-based numerical simulations of earthquake processes for larger meet the needs of this research program, a multiple-institution team coordinated by SCEC has integrated several scientific codes into a numerical modeling-based research tool we call the TeraShake computational platform (TSCP). A central component in the TSCP is a highly scalable earthquake wave propagation simulation program called the TeraShake anelastic wave propagation (TS-AWP) code. In this chapter, we describe how we extended an existing, stand-alone, well-validated, finite-difference, anelastic wave propagation modeling code into the highly scalable and widely used TS-AWP and then integrated this code into the TeraShake computational platform that provides end-to-end (initialization to analysis) research capabilities. We also describe the techniques used to enhance the TS-AWP parallel performance on TeraGrid supercomputers, as well as the TeraShake simulation phases, including input preparation, run time, data archive management, and visualization. As a result of our efforts to improve its parallel efficiency, the TS-AWP has now shown highly efficient strong scaling on over 40K processors on IBM’s BlueGene/L Watson computer. In addition, the TSCP has developed into a computational system that is useful to many members of the SCEC community for performing large-scale earthquake simulations.
Mapper: high throughput maskless lithography

NASA Astrophysics Data System (ADS)

Kuiper, V.; Kampherbeek, B. J.; Wieland, M. J.; de Boer, G.; ten Berge, G. F.; Boers, J.; Jager, R.; van de Peut, T.; Peijster, J. J. M.; Slot, E.; Steenbrink, S. W. H. K.; Teepen, T. F.; van Veen, A. H. V.

2009-01-01

Maskless electron beam lithography, or electron beam direct write, has been around for a long time in the semiconductor industry and was pioneered from the mid-1960s onwards. This technique has been used for mask writing applications as well as device engineering and in some cases chip manufacturing. However, because of its relatively low throughput compared to optical lithography, electron beam lithography has never been the mainstream lithography technology. To extend optical lithography, double patterning (as a bridging technology) and EUV lithography are currently being explored. Irrespective of the technical viability of both approaches, one thing seems clear: they will be expensive [1]. MAPPER Lithography is developing a maskless lithography technology based on massively-parallel electron-beam writing with high speed optical data transport for switching the electron beams. In this way optical columns can be made with a throughput of 10-20 wafers per hour. By clustering several of these columns together, high throughputs can be realized in a small footprint. This enables a highly cost-competitive alternative to double patterning and EUV lithography. In 2007 MAPPER obtained its Proof of Lithography milestone by exposing in its Demonstrator 45 nm half pitch structures with 110 electron beams in parallel, where all the beams were individually switched on and off [2]. In 2008 MAPPER took the next step in its development by building several tools. A new platform has been designed and built which contains a 300 mm wafer stage, a wafer handler and an electron beam column with 110 parallel electron beams. This manuscript describes the first patterning results with this 300 mm platform.
CAMPAIGN: an open-source library of GPU-accelerated data clustering algorithms.

PubMed

Kohlhoff, Kai J; Sosnick, Marc H; Hsu, William T; Pande, Vijay S; Altman, Russ B

2011-08-15

Data clustering techniques are an essential component of a good data analysis toolbox. Many current bioinformatics applications are inherently compute-intensive and work with very large datasets. Sequential algorithms are inadequate for providing the necessary performance. For this reason, we have created Clustering Algorithms for Massively Parallel Architectures, Including GPU Nodes (CAMPAIGN), a central resource for data clustering algorithms and tools that are implemented specifically for execution on massively parallel processing architectures. CAMPAIGN is a library of data clustering algorithms and tools, written in 'C for CUDA' for Nvidia GPUs. The library provides up to two orders of magnitude speed-up over respective CPU-based clustering algorithms and is intended as an open-source resource. New modules from the community will be accepted into the library, and its layout is such that it can easily be extended to promising future platforms such as OpenCL. Releases of the CAMPAIGN library are freely available for download under the LGPL from https://simtk.org/home/campaign. Source code can also be obtained through anonymous subversion access as described on https://simtk.org/scm/?group_id=453. Contact: kjk33@cantab.net.
A Simple Tool for the Design and Analysis of Multiple-Reflector Antennas in a Multi-Disciplinary Environment

NASA Technical Reports Server (NTRS)

Katz, Daniel S.; Cwik, Tom; Fu, Chuigang; Imbriale, William A.; Jamnejad, Vahraz; Springer, Paul L.; Borgioli, Andrea

2000-01-01

The process of designing and analyzing a multiple-reflector system has traditionally been time-intensive, requiring large amounts of both computational and human time. At many frequencies, a discrete approximation of the radiation integral may be used to model the system. The code which implements this physical optics (PO) algorithm was developed at the Jet Propulsion Laboratory. It analyzes systems of antennas in pairs, and for each pair, the analysis can be computationally time-consuming. Additionally, the antennas must be described using a local coordinate system for each antenna, which makes it difficult to integrate the design into a multi-disciplinary framework in which there is traditionally one global coordinate system, even before considering deforming the antenna as prescribed by external structural and/or thermal factors. Finally, setting up the code to correctly analyze all the antenna pairs in the system can take a fair amount of time, and introduces possible human error. The use of parallel computing to reduce the computational time required for the analysis of a given pair of antennas has been previously discussed. This paper focuses on the other problems mentioned above. It will present a methodology and examples of use of an automated tool that performs the analysis of a complete multiple-reflector system in an integrated multi-disciplinary environment (including CAD modeling, and structural and thermal analysis) at the click of a button. This tool, named MOD Tool (Millimeter-wave Optics Design Tool), has been designed and implemented as a distributed tool, with a client that runs almost identically on Unix, Mac, and Windows platforms, and a server that runs primarily on a Unix workstation and can interact with parallel supercomputers with simple instruction from the user interacting with the client.
STARNET 2: a web-based tool for accelerating discovery of gene regulatory networks using microarray co-expression data

PubMed Central

Jupiter, Daniel; Chen, Hailin; VanBuren, Vincent

2009-01-01

Background: Although expression microarrays have become a standard tool used by biologists, analysis of data produced by microarray experiments may still present challenges. Comparison of data from different platforms, organisms, and labs may involve complicated data processing, and inferring relationships between genes remains difficult. Results: STARNET 2 is a new web-based tool that allows post hoc visual analysis of correlations that are derived from expression microarray data. STARNET 2 facilitates user discovery of putative gene regulatory networks in a variety of species (human, rat, mouse, chicken, zebrafish, Drosophila, C. elegans, S. cerevisiae, Arabidopsis and rice) by graphing networks of genes that are closely co-expressed across a large heterogeneous set of preselected microarray experiments. For each of the represented organisms, raw microarray data were retrieved from NCBI's Gene Expression Omnibus for a selected Affymetrix platform. All pairwise Pearson correlation coefficients were computed for expression profiles measured on each platform, respectively. These precompiled results were stored in a MySQL database, and supplemented by additional data retrieved from NCBI. A web-based tool allows user-specified queries of the database, centered at a gene of interest. The result of a query includes graphs of correlation networks, graphs of known interactions involving genes and gene products that are present in the correlation networks, and initial statistical analyses. Two analyses may be performed in parallel to compare networks, which is facilitated by the new HEATSEEKER module. Conclusion: STARNET 2 is a useful tool for developing new hypotheses about regulatory relationships between genes and gene products, and has coverage for 10 species.
Interpretation of the correlation networks is supported with a database of previously documented interactions, a test for enrichment of Gene Ontology terms, and heat maps of correlation distances that may be used to compare two networks. The list of genes in a STARNET network may be useful in developing a list of candidate genes to use for the inference of causal networks. The tool is freely available at , and does not require user registration. PMID:19828039
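
The precomputation described above (all pairwise Pearson correlation coefficients, then a query centered at a gene of interest) can be sketched as follows. This is an illustrative reconstruction, not STARNET 2 code: the gene names, expression values, the `threshold` parameter and the in-memory dictionary standing in for the MySQL table are all assumptions.

```python
import math
from itertools import combinations

# Hypothetical expression profiles: one value per microarray experiment.
EXPRESSION = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 2.0, 4.0],
}


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def precompute(profiles):
    """Compute the coefficient once for every unordered gene pair."""
    return {
        frozenset((g1, g2)): pearson(profiles[g1], profiles[g2])
        for g1, g2 in combinations(sorted(profiles), 2)
    }


def neighbours(table, gene, threshold):
    """Query centered at `gene`: partners with |r| at or above threshold."""
    result = {}
    for pair, r in table.items():
        if gene in pair and abs(r) >= threshold:
            (partner,) = pair - {gene}
            result[partner] = r
    return result


TABLE = precompute(EXPRESSION)
```

Storing one coefficient per unordered pair keeps the table at n(n-1)/2 entries, which is why precomputing and then querying by gene is attractive for a web tool; `neighbours(TABLE, "geneA", 0.9)` returns only the strongly correlated and anti-correlated partners of the chosen gene.
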
Software platform for simulation of a prototype proton CT scanner.

PubMed

Giacometti, Valentina; Bashkirov, Vladimir A; Piersimoni, Pierluigi; Guatelli, Susanna; Plautz, Tia E; Sadrozinski, Hartmut F-W; Johnson, Robert P; Zatserklyaniy, Andriy; Tessonnier, Thomas; Parodi, Katia; Rosenfeld, Anatoly B; Schulte, Reinhard W

2017-03-01

Proton computed tomography (pCT) is a promising imaging technique to substitute or at least complement x-ray CT for more accurate proton therapy treatment planning, as it allows proton relative stopping power to be calculated directly from proton energy loss measurements. A proton CT scanner with a silicon-based particle tracking system and a five-stage scintillating energy detector has been completed. In parallel, a modular software platform was developed to characterize the performance of the proposed pCT. The modular pCT software platform consists of (1) a Geant4-based simulation modeling the Loma Linda proton therapy beam line and the prototype proton CT scanner, (2) water equivalent path length (WEPL) calibration of the scintillating energy detector, and (3) an image reconstruction algorithm for the reconstruction of the relative stopping power (RSP) of the scanned object. In this work, each component of the modular pCT software platform is described and validated with respect to experimental data and benchmarked against theoretical predictions. In particular, the RSP reconstruction was validated with experimental scans, water column measurements, and theoretical calculations. The results show that the pCT software platform accurately reproduces the performance of the existing prototype pCT scanner, with an RSP agreement between experimental and simulated values to better than 1.5%. The validated platform is a versatile tool for clinical proton CT performance and application studies in a virtual setting. The platform is flexible and can be modified to simulate not yet existing versions of pCT scanners and higher proton energies than those currently clinically available. © 2017 American Association of Physicists in Medicine.
DOVIS 2.0: an efficient and easy to use parallel virtual screening tool based on AutoDock 4.0.

PubMed

Jiang, Xiaohui; Kumar, Kamal; Hu, Xin; Wallqvist, Anders; Reifman, Jaques

2008-09-08

Small-molecule docking is an important tool in studying receptor-ligand interactions and in identifying potential drug candidates. Previously, we developed a software tool (DOVIS) to perform large-scale virtual screening of small molecules in parallel on Linux clusters, using AutoDock 3.05 as the docking engine. DOVIS enables the seamless screening of millions of compounds on high-performance computing platforms. In this paper, we report significant advances in the software implementation of DOVIS 2.0, including enhanced screening capability, improved file system efficiency, and extended usability. To keep DOVIS up-to-date, we upgraded the software's docking engine to the more accurate AutoDock 4.0 code. We developed a new parallelization scheme to improve runtime efficiency and modified the AutoDock code to reduce excessive file operations during large-scale virtual screening jobs. We also implemented an algorithm to output docked ligands in an industry standard format, sd-file format, which can be easily interfaced with other modeling programs. Finally, we constructed a wrapper-script interface to enable automatic rescoring of docked ligands by arbitrarily selected third-party scoring programs. The significance of the new DOVIS 2.0 software compared with the previous version lies in its improved performance and usability. The new version makes the computation highly efficient by automating load balancing, significantly reducing excessive file operations by more than 95%, providing outputs that conform to industry standard sd-file format, and providing a general wrapper-script interface for rescoring of docked ligands. The new DOVIS 2.0 package is freely available to the public under the GNU General Public License.
Publishing Platform for Scientific Software - Lessons Learned

NASA Astrophysics Data System (ADS)

Hammitzsch, Martin; Fritzsch, Bernadette; Reusser, Dominik; Brembs, Björn; Deinzer, Gernot; Loewe, Peter; Fenner, Martin; van Edig, Xenia; Bertelmann, Roland; Pampel, Heinz; Klump, Jens; Wächter, Joachim

2015-04-01

Scientific software has become an indispensable commodity for the production, processing and analysis of empirical data but also for modelling and simulation of complex processes. Software has a significant influence on the quality of research results. For strengthening the recognition of the academic performance of scientific software development, for increasing its visibility and for promoting the reproducibility of research results, concepts for the publication of scientific software have to be developed, tested, evaluated, and then transferred into operations. For this, the publication and citability of scientific software have to fulfil scientific criteria by means of defined processes and the use of persistent identifiers, similar to data publications. The SciForge project is addressing these challenges. Based on interviews, a blueprint for a scientific software publishing platform and a systematic implementation plan have been designed. In addition, the potential of journals, software repositories and persistent identifiers to improve the publication and dissemination of reusable software solutions has been evaluated. It is important that procedures for publishing software as well as methods and tools for software engineering are reflected in the architecture of the platform, in order to improve the quality of the software and the results of research. In addition, it is necessary to work continuously on improving specific conditions that promote the adoption and sustainable utilization of scientific software publications. Among others, this would include policies for the development and publication of scientific software in the institutions but also policies for establishing the necessary competencies and skills of scientists and IT personnel. To implement the concepts developed in SciForge, a combined bottom-up / top-down approach is considered that will be implemented in parallel in different scientific domains, e.g. in earth sciences, climate research and the life sciences. Based on the developed blueprints, a scientific software publishing platform will be iteratively implemented, tested, and evaluated. Thus the platform should be developed continuously on the basis of gained experiences and results. The platform services will be extended one by one corresponding to the requirements of the communities. Thus the implemented platform for the publication of scientific software can be improved and stabilized incrementally as a tool with software, science, publishing, and user oriented features.
myBrain: a novel EEG embedded system for epilepsy monitoring.

PubMed

Pinho, Francisco; Cerqueira, João; Correia, José; Sousa, Nuno; Dias, Nuno

2017-10-01

The World Health Organisation has pointed out that successful health care delivery requires effective medical devices as tools for prevention, diagnosis, treatment and rehabilitation. Several studies have concluded that longer monitoring periods and outpatient settings might increase diagnosis accuracy and success rate of treatment selection. The long-term monitoring of epileptic patients through electroencephalography (EEG) has been considered a powerful tool to improve the diagnosis, disease classification, and treatment of patients with this condition. This work presents the development of a wireless and wearable EEG acquisition platform suitable for both long-term and short-term monitoring in inpatient and outpatient settings. The developed platform features 32 passive dry electrodes, analogue-to-digital signal conversion with 24-bit resolution and a variable sampling frequency from 250 Hz to 1000 Hz per channel, embedded in a stand-alone module. A computer-on-module embedded system runs a Linux® operating system that governs the interface between two software frameworks, which interact to satisfy the real-time constraints of signal acquisition as well as parallel recording, processing and wireless data transmission. A textile structure was developed to accommodate all components. Platform performance was evaluated in terms of hardware, software and signal quality. The electrodes were characterised through electrochemical impedance spectroscopy and the operating system performance running an epileptic discrimination algorithm was evaluated. Signal quality was thoroughly assessed in two different approaches: playback of EEG reference signals and benchmarking with a clinical-grade EEG system in alpha-wave replacement and steady-state visual evoked potential paradigms.
The proposed platform seems to efficiently monitor epileptic patients in both inpatient and outpatient settings and paves the way to new ambulatory clinical regimens as well as non-clinical EEG applications.
A Roadmap to Continuous Integration for ATLAS Software Development

NASA Astrophysics Data System (ADS)

Elmsheuser, J.; Krasznahorkay, A.; Obreshkov, E.; Undrus, A.; ATLAS Collaboration

2017-10-01

The ATLAS software infrastructure facilitates the efforts of more than 1000 developers working on a code base of 2200 packages with 4 million lines of C++ and 1.4 million lines of Python code. The ATLAS offline code management system is a powerful, flexible framework for processing requests for new package versions, probing code changes in the Nightly Build System, migrating to new platforms and compilers, deploying production releases for worldwide access and supporting physicists with tools and interfaces for efficient software use. It maintains a multi-stream, parallel development environment with about 70 multi-platform branches of nightly releases and provides vast opportunities for testing new packages, for verifying patches to existing software and for migrating to new platforms and compilers. The system's evolution is currently aimed at the adoption of modern continuous integration (CI) practices focused on building nightly releases early and often, with rigorous unit and integration testing. This paper describes the CI incorporation program for the ATLAS software infrastructure. It brings modern open source tools such as Jenkins and GitLab into the ATLAS Nightly System, rationalizes hardware resource allocation and administrative operations, and provides improved feedback and the means for developers to fix broken builds promptly. Once adopted, ATLAS CI practices will improve and accelerate innovation cycles and result in increased confidence in new software deployments. The paper reports the status of Jenkins integration with the ATLAS Nightly System as well as short- and long-term plans for the incorporation of CI practices.
Massively Parallel, Molecular Analysis Platform Developed Using a CMOS Integrated Circuit With Biological Nanopores

PubMed Central

Roever, Stefan

2012-01-01

A massively parallel, low cost molecular analysis platform will dramatically change the nature of protein, molecular and genomics research, DNA sequencing, and ultimately, molecular diagnostics. An integrated circuit (IC) with 264 sensors was fabricated using standard CMOS semiconductor processing technology. Each of these sensors is individually controlled with precision analog circuitry and is capable of single molecule measurements. Under electronic and software control, the IC was used to demonstrate the feasibility of creating and detecting lipid bilayers and biological nanopores using wild type α-hemolysin. The ability to dynamically create bilayers over each of the sensors will greatly accelerate pore development and pore mutation analysis. In addition, the noise performance of the IC was measured to be 30fA(rms). With this noise performance, single base detection of DNA was demonstrated using α-hemolysin. The data shows that a single molecule, electrical detection platform using biological nanopores can be operationalized and can ultimately scale to millions of sensors. Such a massively parallel platform will revolutionize molecular analysis and will completely change the field of molecular diagnostics in the future.
The Forest Method as a New Parallel Tree Method with the Sectional Voronoi Tessellation

NASA Astrophysics Data System (ADS)

Yahagi, Hideki; Mori, Masao; Yoshii, Yuzuru

1999-09-01

We have developed a new parallel tree method, hereafter called the forest method. This new method uses the sectional Voronoi tessellation (SVT) for the domain decomposition. The SVT decomposes the whole space into polyhedra and allows their flat borders to move by assigning different weights. The forest method determines these weights based on the load balancing among processors by means of the overload diffusion (OLD). Moreover, since all the borders are flat, before receiving the data from other processors, each processor can collect enough data to calculate the gravity force with precision. Both the SVT and the OLD are coded in a highly vectorizable manner to run on vector parallel processors. The parallel code based on the forest method with the Message Passing Interface runs on various platforms, so that wide portability is guaranteed. Extensive calculations with 15 processors of a Fujitsu VPP300/16R indicate that the code can calculate the gravity force exerted on 10^5 particles each second for some ideal dark halo. This code is found to enable an N-body simulation with 10^7 or more particles over a wide dynamic range and is therefore a very powerful tool for the study of galaxy formation and large-scale structure in the universe.
Conceptual design and kinematic analysis of a novel parallel robot for high-speed pick-and-place operations

NASA Astrophysics Data System (ADS)

Meng, Qizhi; Xie, Fugui; Liu, Xin-Jun

2018-06-01

This paper deals with the conceptual design, kinematic analysis and workspace identification of a novel four degrees-of-freedom (DOFs) high-speed spatial parallel robot for pick-and-place operations. The proposed spatial parallel robot consists of a base, four arms and a 1½ mobile platform. The mobile platform is a major innovation that avoids output singularity and offers the advantages of both single and double platforms. To investigate the characteristics of the robot's DOFs, a line graph method based on Grassmann line geometry is adopted in mobility analysis. In addition, the inverse kinematics is derived, and the constraint conditions to identify the correct solution are also provided. On the basis of the proposed concept, the workspace of the robot is identified using a set of presupposed parameters by taking input and output transmission index as the performance evaluation criteria.

Massively parallel simulator of optical coherence tomography of inhomogeneous turbid media.

PubMed

Malektaji, Siavash; Lima, Ivan T; Escobar I, Mauricio R; Sherif, Sherif S

2017-10-01

An accurate and practical simulator for Optical Coherence Tomography (OCT) could be an important tool to study the underlying physical phenomena in OCT such as multiple light scattering. Recently, many researchers have investigated simulation of OCT of turbid media, e.g., tissue, using Monte Carlo methods. The main drawback of these earlier simulators is the long computational time required to produce accurate results. We developed a massively parallel simulator of OCT of inhomogeneous turbid media that obtains both Class I diffusive reflectivity, due to ballistic and quasi-ballistic scattered photons, and Class II diffusive reflectivity due to multiply scattered photons. This Monte Carlo-based simulator is implemented on graphic processing units (GPUs), using the Compute Unified Device Architecture (CUDA) platform and programming model, to exploit the parallel nature of propagation of photons in tissue. It models an arbitrary shaped sample medium as a tetrahedron-based mesh and uses an advanced importance sampling scheme. This new simulator speeds up simulations of OCT of inhomogeneous turbid media by about two orders of magnitude. To demonstrate this result, we have compared the computation times of our new parallel simulator and its serial counterpart using two samples of inhomogeneous turbid media. We have shown that our parallel implementation reduced simulation time of OCT of the first sample medium from 407 min to 92 min by using a single GPU card, to 12 min by using 8 GPU cards and to 7 min by using 16 GPU cards. For the second sample medium, the OCT simulation time was reduced from 209 h to 35.6 h by using a single GPU card, and to 4.65 h by using 8 GPU cards, and to only 2 h by using 16 GPU cards. Therefore our new parallel simulator is considerably more practical to use than its central processing unit (CPU)-based counterpart. 
Our new parallel OCT simulator could be a practical tool to study the different physical phenomena underlying OCT, or to design OCT systems with improved performance. Copyright © 2017 Elsevier B.V. All rights reserved.
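The run times quoted in the abstract above can be turned into speedup and multi-GPU scaling figures with a few lines of arithmetic. The following is a minimal sketch, not part of the original work: the dictionary layout and the function name `speedups` are illustrative assumptions, and "efficiency" here means scaling relative to the single-GPU run, which is one common convention among several.

```python
# Run times quoted in the abstract (serial CPU run first, then GPU runs).
# Keys other than "serial" are the number of GPU cards used.
SAMPLE_1_MIN = {"serial": 407.0, 1: 92.0, 8: 12.0, 16: 7.0}   # minutes
SAMPLE_2_H = {"serial": 209.0, 1: 35.6, 8: 4.65, 16: 2.0}     # hours


def speedups(times):
    """Return {gpu_count: (speedup vs. serial, efficiency vs. one GPU)}."""
    serial = times["serial"]
    one_gpu = times[1]
    result = {}
    for gpus, t in times.items():
        if gpus == "serial":
            continue
        speedup = serial / t                  # how much faster than the CPU run
        efficiency = (one_gpu / t) / gpus     # 1.0 would be ideal linear scaling
        result[gpus] = (speedup, efficiency)
    return result


for name, times in (("sample 1", SAMPLE_1_MIN), ("sample 2", SAMPLE_2_H)):
    for gpus, (s, e) in speedups(times).items():
        print(f"{name}: {gpus:2d} GPU(s)  speedup {s:6.1f}x  efficiency {e:4.2f}")
```

On these figures the 16-GPU speedup is roughly 58x for the first sample and roughly 104x for the second, so the "about two orders of magnitude" statement corresponds to the second sample. Because the quoted times are rounded, the derived efficiencies are only approximate.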
Parallel Domain Decomposition Formulation and Software for Large-Scale Sparse Symmetrical/Unsymmetrical Aeroacoustic Applications

NASA Technical Reports Server (NTRS)

Nguyen, D. T.; Watson, Willie R. (Technical Monitor)

2005-01-01

The overall objectives of this research work are to formulate and validate efficient parallel algorithms, and to efficiently design and implement computer software for solving large-scale acoustic problems arising from the unified frameworks of the finite element procedures. The adopted parallel Finite Element (FE) Domain Decomposition (DD) procedures should take full advantage of the multiple processing capabilities offered by most modern high performance computing platforms for efficient parallel computation. To achieve this objective, the formulation needs to integrate efficient sparse (and dense) assembly techniques, hybrid (or mixed) direct and iterative equation solvers, proper preconditioning strategies, unrolling strategies, and effective inter-processor communication schemes. Finally, the numerical performance of the developed parallel finite element procedures will be evaluated by solving a series of structural and acoustic (symmetrical and unsymmetrical) problems on different computing platforms. Comparisons with existing "commercialized" and/or "public domain" software are also included, whenever possible.
Missile signal processing common computer architecture for rapid technology upgrade

NASA Astrophysics Data System (ADS)

Rabinkin, Daniel V.; Rutledge, Edward; Monticciolo, Paul

2004-10-01

Interceptor missiles process IR images to locate an intended target and guide the interceptor towards it. Signal processing requirements have increased as the sensor bandwidth increases and interceptors operate against more sophisticated targets. A typical interceptor signal processing chain is composed of two parts. Front-end video processing operates on all pixels of the image and performs such operations as non-uniformity correction (NUC), image stabilization, frame integration and detection. Back-end target processing, which tracks and classifies targets detected in the image, performs such algorithms as Kalman tracking, spectral feature extraction and target discrimination. In the past, video processing was implemented using ASIC components or FPGAs because computation requirements exceeded the throughput of general-purpose processors. Target processing was performed using hybrid architectures that included ASICs, DSPs and general-purpose processors. The resulting systems tended to be function-specific and required custom software development. They were developed using non-integrated toolsets, and test equipment was developed along with the processor platform. The lifespan of a system utilizing the signal processing platform often spans decades, while the specialized nature of processor hardware and software makes it difficult and costly to upgrade. As a result, the signal processing systems often run on outdated technology, algorithms are difficult to update, and system effectiveness is impaired by the inability to rapidly respond to new threats. A new design approach is made possible by three developments: Moore's Law-driven improvement in computational throughput, a newly introduced vector computing capability in general-purpose processors, and a modern set of open interface software standards. Today's multiprocessor commercial-off-the-shelf (COTS) platforms have sufficient throughput to support interceptor signal processing requirements. 
This application may be programmed under existing real-time operating systems using parallel processing software libraries, resulting in highly portable code that can be rapidly migrated to new platforms as processor technology evolves. The use of standardized development tools and third-party software upgrades is enabled, as is rapid upgrade of processing components as improved algorithms are developed. The resulting weapon system will have a superior processing capability over a custom approach at the time of deployment as a result of shorter development cycles and the use of newer technology. The signal processing computer may be upgraded over the lifecycle of the weapon system, and can migrate between weapon system variants because it is simple to modify. This paper presents a reference design using the new approach that utilizes an Altivec PowerPC parallel COTS platform. It uses a VxWorks-based real-time operating system (RTOS), and application code developed using an efficient parallel vector library (PVL). A quantification of computing requirements and a demonstration of interceptor algorithms operating on this real-time platform are provided.
Scalable and massively parallel Monte Carlo photon transport simulations for heterogeneous computing platforms

NASA Astrophysics Data System (ADS)

Yu, Leiming; Nina-Paravecino, Fanny; Kaeli, David; Fang, Qianqian

2018-01-01

We present a highly scalable Monte Carlo (MC) three-dimensional photon transport simulation platform designed for heterogeneous computing systems. Through the development of a massively parallel MC algorithm using the Open Computing Language framework, this research extends our existing graphics processing unit (GPU)-accelerated MC technique to a highly scalable vendor-independent heterogeneous computing environment, achieving significantly improved performance and software portability. A number of parallel computing techniques are investigated to achieve portable performance over a wide range of computing hardware. Furthermore, multiple thread-level and device-level load-balancing strategies are developed to obtain efficient simulations using multiple central processing units and GPUs.
Kinematics of an in-parallel actuated manipulator based on the Stewart platform mechanism

NASA Technical Reports Server (NTRS)

Williams, Robert L., II

1992-01-01

This paper presents kinematic equations and solutions for an in-parallel actuated robotic mechanism based on Stewart's platform. These equations are required for inverse position and resolved rate (inverse velocity) platform control. NASA LaRC has a Vehicle Emulator System (VES) platform designed by MIT which is based on Stewart's platform. The inverse position solution is straightforward and computationally inexpensive. Given the desired position and orientation of the moving platform with respect to the base, the lengths of the prismatic leg actuators are calculated. The forward position solution is more complicated and theoretically has 16 solutions. The position and orientation of the moving platform with respect to the base are calculated given the leg actuator lengths. Two methods are pursued in this paper to solve this problem. The resolved rate (inverse velocity) solution is derived. Given the desired Cartesian velocity of the end-effector, the required leg actuator rates are calculated. The Newton-Raphson Jacobian matrix resulting from the second forward position kinematics solution is a modified inverse Jacobian matrix. Examples and simulations are given for the VES.
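The inverse position step described in the abstract above has a compact textbook form: for each leg, the length is the distance between the base joint and the platform joint after the platform joint has been moved by the desired pose. The sketch below illustrates that general formula only. The joint layout, the roll-pitch-yaw convention and the function names are assumptions made for illustration and are not taken from the VES or from the paper.

```python
import math


def rotation_rpy(roll, pitch, yaw):
    """Rotation matrix R = Rz(yaw) * Ry(pitch) * Rx(roll); angles in radians."""
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]


def leg_lengths(position, rpy, base_joints, platform_joints):
    """Inverse position kinematics: leg i has length |p + R*b_i - a_i|.

    position        -- platform origin p in the base frame
    rpy             -- (roll, pitch, yaw) of the platform relative to the base
    base_joints     -- joint locations a_i in the base frame
    platform_joints -- joint locations b_i in the platform frame
    """
    R = rotation_rpy(*rpy)
    lengths = []
    for a, b in zip(base_joints, platform_joints):
        # Platform joint expressed in the base frame.
        world = [position[i] + sum(R[i][j] * b[j] for j in range(3))
                 for i in range(3)]
        lengths.append(math.dist(world, a))
    return lengths


def ring(radius, angles_deg):
    """Hypothetical joint layout: points on a circle in the z = 0 plane."""
    return [(radius * math.cos(math.radians(t)),
             radius * math.sin(math.radians(t)), 0.0) for t in angles_deg]


# Illustrative geometry only (not the VES dimensions).
BASE = ring(1.0, [0, 60, 120, 180, 240, 300])
PLATFORM = ring(0.6, [30, 90, 150, 210, 270, 330])

print(leg_lengths((0.0, 0.0, 0.8), (0.0, 0.05, 0.1), BASE, PLATFORM))
```

Each leg is computed independently with a fixed, small number of operations, which is consistent with the abstract's remark that the inverse position solution is computationally inexpensive. The forward problem has no such closed form and is typically solved iteratively.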
Ion beam figuring of highly steep mirrors with a 5-axis hybrid machine tool

NASA Astrophysics Data System (ADS)

Yin, Xiaolin; Tang, Wa; Hu, Haixiang; Zeng, Xuefeng; Wang, Dekang; Xue, Donglin; Zhang, Feng; Deng, Weijie; Zhang, Xuejun

2018-02-01

Ion beam figuring (IBF) is an advanced and deterministic method for optical mirror surface processing. The removal function of IBF varies with the incident angle of the ion beam. Therefore, for a curved surface, especially a highly steep one, the Ion Beam Source (IBS) should be equipped with 5-axis machining capability to remove the material along the normal direction of the mirror surface, so as to ensure the stability of the removal function. Based on the 3-RPS parallel mechanism and a two-dimensional displacement platform, a new type of 5-axis hybrid machine tool for IBF is presented. The figuring process of a highly steep fused silica spherical mirror with the hybrid machine tool is then described. The R/# of the mirror is 0.96 and the aperture is 104 mm. The figuring result shows that the PV value of the mirror surface error converged from 121.1 nm to 32.3 nm, and the RMS value from 23.6 nm to 3.4 nm.
Role of Open Source Tools and Resources in Virtual Screening for Drug Discovery.

PubMed

Karthikeyan, Muthukumarasamy; Vyas, Renu

2015-01-01

Advances in chemoinformatics research, in parallel with the availability of high performance computing platforms, have made it easier to handle large-scale multi-dimensional scientific data for high-throughput drug discovery. In this study we have explored publicly available molecular databases with the help of open-source based integrated in-house molecular informatics tools for virtual screening. The virtual screening literature of the past decade has been extensively investigated and thoroughly analyzed to reveal interesting patterns with respect to the drug, target, scaffold and disease space. The review also focuses on the integrated chemoinformatics tools that are capable of harvesting chemical data from textual literature information and transforming them into truly computable chemical structures, identification of unique fragments and scaffolds from a class of compounds, automatic generation of focused virtual libraries, computation of molecular descriptors for structure-activity relationship studies, and application of conventional filters used in lead discovery along with in-house developed exhaustive PTC (Pharmacophore, Toxicophores and Chemophores) filters and machine learning tools for the design of potential disease-specific inhibitors. A case study on kinase inhibitors is provided as an example.
Developing Web-based Tools for Collaborative Science and Public Outreach

NASA Astrophysics Data System (ADS)

Friedman, A.; Pizarro, O.; Williams, S. B.

2016-02-01

With the advances in high bandwidth communications and the proliferation of social media tools, education & outreach activities have become commonplace on ocean-bound research cruises. In parallel, advances in underwater robotics & other data collecting platforms have made it possible to collect copious amounts of oceanographic data. This data then typically undergoes laborious, manual processing to transform it into quantitative information, which normally occurs post cruise, resulting in significant lags between collecting data and using it for scientific discovery. This presentation discusses how appropriately designed software systems can be used to fulfill multiple objectives and attempt to leverage public engagement in order to complement science goals. We will present two software platforms: the first is a web browser based tool that was developed for real-time tracking of multiple underwater robots and ships. It was designed to allow anyone on board to view or control it on any device with a web browser. It opens up the possibility of remote teleoperation & engagement and was easily adapted to enable live streaming over the internet for public outreach. While the tracking system provided context and engaged people in real-time, it also directed interested participants to Squidle, another online system. Developed for scientists, Squidle supports data management, exploration & analysis and enables direct access to survey data, reducing the lag in data processing. It provides a user-friendly streamlined interface that integrates advanced data management & online annotation tools. This system was adapted to provide a simplified user interface, tutorial instructions and a gamified ranking system to encourage "citizen science" participation. 
These examples show that through a flexible design approach, it is possible to leverage the development effort of creating science tools to facilitate outreach goals, opening up the possibility for acquiring large volumes of crowd-sourced data without compromising science objectives.
Robustness of Massively Parallel Sequencing Platforms

PubMed Central

Kavak, Pınar; Yüksel, Bayram; Aksu, Soner; Kulekci, M. Oguzhan; Güngör, Tunga; Hach, Faraz; Şahinalp, S. Cenk; Alkan, Can; Sağıroğlu, Mahmut Şamil

2015-01-01

The improvements in high throughput sequencing technologies (HTS) made clinical sequencing projects such as ClinSeq and Genomics England feasible. Although there are significant improvements in accuracy and reproducibility of HTS based analyses, the usability of these types of data for diagnostic and prognostic applications necessitates near-perfect data generation. To assess the usability of a widely used HTS platform for accurate and reproducible clinical applications in terms of robustness, we generated whole genome shotgun (WGS) sequence data from the genomes of two human individuals in two different genome sequencing centers. After analyzing the data to characterize SNPs and indels using the same tools (BWA, SAMtools, and GATK), we observed a significant number of discrepancies in the call sets. As expected, most of the disagreements between the call sets were found within genomic regions containing common repeats and segmental duplications, albeit only a small fraction of the discordant variants were within the exons and other functionally relevant regions such as promoters. We conclude that although HTS platforms are sufficiently powerful for providing data for first-pass clinical tests, the variant predictions still need to be confirmed using orthogonal methods before being used in clinical applications. PMID:26382624
A Programming Model Performance Study Using the NAS Parallel Benchmarks

DOE PAGES

Shan, Hongzhang; Blagojević, Filip; Min, Seung-Jai; ...

2010-01-01

Harnessing the power of multicore platforms is challenging due to the additional levels of parallelism present. In this paper we use the NAS Parallel Benchmarks to study three programming models, MPI, OpenMP and PGAS to understand their performance and memory usage characteristics on current multicore architectures. To understand these characteristics we use the Integrated Performance Monitoring tool and other ways to measure communication versus computation time, as well as the fraction of the run time spent in OpenMP. The benchmarks are run on two different Cray XT5 systems and an Infiniband cluster. Our results show that in general the three programming models exhibit very similar performance characteristics. In a few cases, OpenMP is significantly faster because it explicitly avoids communication. For these particular cases, we were able to re-write the UPC versions and achieve equal performance to OpenMP. Using OpenMP was also the most advantageous in terms of memory usage. Also we compare performance differences between the two Cray systems, which have quad-core and hex-core processors. We show that at scale the performance is almost always slower on the hex-core system because of increased contention for network resources.
Parallel Index and Query for Large Scale Data Analysis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chou, Jerry; Wu, Kesheng; Ruebel, Oliver

2011-07-18

Modern scientific datasets present numerous data management and analysis challenges. State-of-the-art index and query technologies are critical for facilitating interactive exploration of large datasets, but numerous challenges remain in terms of designing a system for processing general scientific datasets. The system needs to be able to run on distributed multi-core platforms, efficiently utilize underlying I/O infrastructure, and scale to massive datasets. We present FastQuery, a novel software framework that addresses these challenges. FastQuery utilizes a state-of-the-art index and query technology (FastBit) and is designed to process massive datasets on modern supercomputing platforms. We apply FastQuery to processing of a massive 50TB dataset generated by a large scale accelerator modeling code. We demonstrate the scalability of the tool to 11,520 cores. Motivated by the scientific need to search for interesting particles in this dataset, we use our framework to reduce search time from hours to tens of seconds.
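FastBit, the index technology named in the abstract above, is built around bitmap indexes. The sketch below is a toy illustration of that general idea and is not FastBit's or FastQuery's API: the class name `BinnedBitmapIndex`, its methods and the sample values are all hypothetical, and Python integers stand in for the compressed bitmaps a production index would use.

```python
from bisect import bisect_right


class BinnedBitmapIndex:
    """Toy binned bitmap index over one numeric column (illustrative only)."""

    def __init__(self, values, bin_edges):
        self.values = list(values)
        self.edges = sorted(bin_edges)
        # One bitmap per bin; bit r of a bitmap is set when row r falls in
        # that bin. Bin k holds values v with edges[k-1] <= v < edges[k].
        self.bitmaps = [0] * (len(self.edges) + 1)
        for row, v in enumerate(self.values):
            self.bitmaps[bisect_right(self.edges, v)] |= 1 << row

    def query_greater(self, threshold):
        """Return the sorted row numbers whose value is > threshold."""
        k = bisect_right(self.edges, threshold)
        hits = 0
        # Bins entirely above the threshold: answer from the bitmaps alone.
        for bitmap in self.bitmaps[k + 1:]:
            hits |= bitmap
        # Boundary bin: only these candidate rows need a raw-value check.
        candidates = self.bitmaps[k]
        row = 0
        while candidates:
            if candidates & 1 and self.values[row] > threshold:
                hits |= 1 << row
            candidates >>= 1
            row += 1
        return [r for r in range(len(self.values)) if (hits >> r) & 1]


# Hypothetical particle energies; the query looks for "interesting" rows.
energies = [0.5, 3.2, 7.9, 1.1, 9.4, 4.0]
index = BinnedBitmapIndex(energies, bin_edges=[2.0, 4.0, 6.0, 8.0])
print(index.query_greater(3.5))
```

The point of the design is that most rows are accepted or rejected by combining bitmaps, and only rows in the single boundary bin require reading the raw data. That property is what makes such indexes attractive for selective queries over very large datasets.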
Imaging electric field dynamics with graphene optoelectronics

DOE PAGES

Horng, Jason; Balch, Halleh B.; McGuire, Allister F.; ...

2016-12-16

The use of electric fields for signalling and control in liquids is widespread, spanning bioelectric activity in cells to electrical manipulation of microstructures in lab-on-a-chip devices. However, an appropriate tool to resolve the spatio-temporal distribution of electric fields over a large dynamic range has yet to be developed. Here we present a label-free method to image local electric fields in real time and under ambient conditions. Our technique combines the unique gate-variable optical transitions of graphene with a critically coupled planar waveguide platform that enables highly sensitive detection of local electric fields with a voltage sensitivity of a few microvolts, a spatial resolution of tens of micrometres and a frequency response over tens of kilohertz. Our imaging platform enables parallel detection of electric fields over a large field of view and can be tailored to broad applications spanning lab-on-a-chip device engineering to analysis of bioelectric phenomena.
Launching genomics into the cloud: deployment of Mercury, a next generation sequence analysis pipeline

PubMed Central

2014-01-01

Background: Massively parallel DNA sequencing generates staggering amounts of data. Decreasing cost, increasing throughput, and improved annotation have expanded the diversity of genomics applications in research and clinical practice. This expanding scale creates analytical challenges: accommodating peak compute demand, coordinating secure access for multiple analysts, and sharing validated tools and results. Results: To address these challenges, we have developed the Mercury analysis pipeline and deployed it in local hardware and the Amazon Web Services cloud via the DNAnexus platform. Mercury is an automated, flexible, and extensible analysis workflow that provides accurate and reproducible genomic results at scales ranging from individuals to large cohorts. Conclusions: By taking advantage of cloud computing and with Mercury implemented on the DNAnexus platform, we have demonstrated a powerful combination of a robust and fully validated software pipeline and a scalable computational resource that, to date, we have applied to more than 10,000 whole genome and whole exome samples. PMID:24475911
Imaging electric field dynamics with graphene optoelectronics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Horng, Jason; Balch, Halleh B.; McGuire, Allister F.

The use of electric fields for signalling and control in liquids is widespread, spanning bioelectric activity in cells to electrical manipulation of microstructures in lab-on-a-chip devices. However, an appropriate tool to resolve the spatio-temporal distribution of electric fields over a large dynamic range has yet to be developed. Here we present a label-free method to image local electric fields in real time and under ambient conditions. Our technique combines the unique gate-variable optical transitions of graphene with a critically coupled planar waveguide platform that enables highly sensitive detection of local electric fields with a voltage sensitivity of a few microvolts, a spatial resolution of tens of micrometres and a frequency response over tens of kilohertz. Our imaging platform enables parallel detection of electric fields over a large field of view and can be tailored to broad applications spanning lab-on-a-chip device engineering to analysis of bioelectric phenomena.
A scalable double-barcode sequencing platform for characterization of dynamic protein-protein interactions.

PubMed

Schlecht, Ulrich; Liu, Zhimin; Blundell, Jamie R; St Onge, Robert P; Levy, Sasha F

2017-05-25

Several large-scale efforts have systematically catalogued protein-protein interactions (PPIs) of a cell in a single environment. However, little is known about how the protein interactome changes across environmental perturbations. Current technologies, which assay one PPI at a time, are too low throughput to make it practical to study protein interactome dynamics. Here, we develop a highly parallel protein-protein interaction sequencing (PPiSeq) platform that uses a novel double barcoding system in conjunction with the dihydrofolate reductase protein-fragment complementation assay in Saccharomyces cerevisiae. PPiSeq detects PPIs at a rate that is on par with current assays and, in contrast with current methods, quantitatively scores PPIs with enough accuracy and sensitivity to detect changes across environments. Both PPI scoring and the bulk of strain construction can be performed with cell pools, making the assay scalable and easily reproduced across environments. PPiSeq is therefore a powerful new tool for large-scale investigations of dynamic PPIs.
1001 Ways to run AutoDock Vina for virtual screening

NASA Astrophysics Data System (ADS)

Jaghoori, Mohammad Mahdi; Bleijlevens, Boris; Olabarriaga, Silvia D.

2016-03-01

Large-scale computing technologies have enabled high-throughput virtual screening involving thousands to millions of drug candidates. It is not trivial, however, for biochemical scientists to evaluate the technical alternatives and their implications for running such large experiments. Besides experience with the molecular docking tool itself, the scientist needs to learn how to run it on high-performance computing (HPC) infrastructures, and understand the impact of the choices made. Here, we review such considerations for a specific tool, AutoDock Vina, and use experimental data to illustrate the following points: (1) an additional level of parallelization increases virtual screening throughput on a multi-core machine; (2) capturing of the random seed is not enough (though necessary) for reproducibility on heterogeneous distributed computing systems; (3) the overall time spent on the screening of a ligand library can be improved by analysis of factors affecting execution time per ligand, including number of active torsions, heavy atoms and exhaustiveness. We also illustrate differences among four common HPC infrastructures: grid, Hadoop, small cluster and multi-core (virtual machine on the cloud). Our analysis shows that these platforms are suitable for screening experiments of different sizes. These considerations can guide scientists when choosing the best computing platform and set-up for their future large virtual screening experiments.
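Points (1) and (2) of the abstract above can be illustrated with a small driver that runs several single-threaded docking jobs side by side and records the random seed passed to each one. This is a minimal sketch under stated assumptions rather than the authors' set-up: the file names, the helper names (`build_vina_command`, `run_command`, `screen`) and the worker count are hypothetical, and the command line uses only the standard AutoDock Vina options `--receptor`, `--ligand`, `--config`, `--out`, `--seed`, `--cpu` and `--exhaustiveness`.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_vina_command(receptor, ligand, config, out, seed,
                       cpu=1, exhaustiveness=8):
    """Build one Vina command line; the seed is explicit so it can be logged."""
    return ["vina",
            "--receptor", receptor,
            "--ligand", ligand,
            "--config", config,          # search box etc. (hypothetical file)
            "--out", out,
            "--seed", str(seed),
            "--cpu", str(cpu),           # threads used inside this one job
            "--exhaustiveness", str(exhaustiveness)]


def run_command(cmd):
    """Run one docking job and return its exit code (requires vina on PATH)."""
    return subprocess.run(cmd, capture_output=True, text=True).returncode


def screen(ligands, receptor, config, seed, workers, runner=run_command):
    """Dock every ligand, running up to `workers` jobs at the same time."""
    commands = [
        build_vina_command(receptor, lig, config,
                           lig.replace(".pdbqt", "_out.pdbqt"), seed)
        for lig in ligands
    ]
    # Threads are enough here: each worker only waits on an external process.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(runner, commands))   # results keep ligand order


# Example call (not executed here, since it needs Vina and input files):
# codes = screen(["lig1.pdbqt", "lig2.pdbqt"], "receptor.pdbqt",
#                "box.txt", seed=42, workers=4)
```

Passing a stand-in `runner` makes it possible to check the generated command lines without Vina installed. As the abstract notes, fixing the seed is necessary but not sufficient for reproducible results across heterogeneous machines, so a real set-up would also record the software version and platform for each job.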
1001 Ways to run AutoDock Vina for virtual screening.

PubMed

Jaghoori, Mohammad Mahdi; Bleijlevens, Boris; Olabarriaga, Silvia D

2016-03-01

Large-scale computing technologies have enabled high-throughput virtual screening involving thousands to millions of drug candidates. It is not trivial, however, for biochemical scientists to evaluate the technical alternatives and their implications for running such large experiments. Besides experience with the molecular docking tool itself, the scientist needs to learn how to run it on high-performance computing (HPC) infrastructures, and understand the impact of the choices made. Here, we review such considerations for a specific tool, AutoDock Vina, and use experimental data to illustrate the following points: (1) an additional level of parallelization increases virtual screening throughput on a multi-core machine; (2) capturing of the random seed is not enough (though necessary) for reproducibility on heterogeneous distributed computing systems; (3) the overall time spent on the screening of a ligand library can be improved by analysis of factors affecting execution time per ligand, including number of active torsions, heavy atoms and exhaustiveness. We also illustrate differences among four common HPC infrastructures: grid, Hadoop, small cluster and multi-core (virtual machine on the cloud). Our analysis shows that these platforms are suitable for screening experiments of different sizes. These considerations can guide scientists when choosing the best computing platform and set-up for their future large virtual screening experiments.
Establishing a Novel Modeling Tool: A Python-Based Interface for a Neuromorphic Hardware System

PubMed Central

Brüderle, Daniel; Müller, Eric; Davison, Andrew; Muller, Eilif; Schemmel, Johannes; Meier, Karlheinz

2008-01-01

Neuromorphic hardware systems provide new possibilities for the neuroscience modeling community. Due to the intrinsic parallelism of the micro-electronic emulation of neural computation, such models are highly scalable without a loss of speed. However, the communities of software simulator users and neuromorphic engineering in neuroscience are rather disjoint. We present a software concept that provides the possibility to establish such hardware devices as valuable modeling tools. It is based on the integration of the hardware interface into a simulator-independent language which allows for unified experiment descriptions that can be run on various simulation platforms without modification, implying experiment portability and a huge simplification of the quantitative comparison of hardware and simulator results. We introduce an accelerated neuromorphic hardware device and describe the implementation of the proposed concept for this system. An example setup and results acquired by utilizing both the hardware system and a software simulator are demonstrated. PMID:19562085
Establishing a novel modeling tool: a python-based interface for a neuromorphic hardware system.

PubMed

Brüderle, Daniel; Müller, Eric; Davison, Andrew; Muller, Eilif; Schemmel, Johannes; Meier, Karlheinz

2009-01-01

Neuromorphic hardware systems provide new possibilities for the neuroscience modeling community. Due to the intrinsic parallelism of the micro-electronic emulation of neural computation, such models are highly scalable without a loss of speed. However, the communities of software simulator users and neuromorphic engineering in neuroscience are rather disjoint. We present a software concept that provides the possibility to establish such hardware devices as valuable modeling tools. It is based on the integration of the hardware interface into a simulator-independent language which allows for unified experiment descriptions that can be run on various simulation platforms without modification, implying experiment portability and a huge simplification of the quantitative comparison of hardware and simulator results. We introduce an accelerated neuromorphic hardware device and describe the implementation of the proposed concept for this system. An example setup and results acquired by utilizing both the hardware system and a software simulator are demonstrated.
Durham extremely large telescope adaptive optics simulation platform.

PubMed

Basden, Alastair; Butterley, Timothy; Myers, Richard; Wilson, Richard

2007-03-01

Adaptive optics systems are essential on all large telescopes for which image quality is important. These are complex systems with many design parameters requiring optimization before good performance can be achieved. The simulation of adaptive optics systems is therefore necessary to categorize the expected performance. We describe an adaptive optics simulation platform, developed at Durham University, which can be used to simulate adaptive optics systems on the largest proposed future extremely large telescopes as well as on current systems. This platform is modular, object oriented, and has the benefit of hardware application acceleration that can be used to improve the simulation performance, essential for ensuring that the run time of a given simulation is acceptable. The simulation platform described here can be highly parallelized using parallelization techniques suited for adaptive optics simulation, while still offering the user complete control while the simulation is running. The results from the simulation of a ground layer adaptive optics system are provided as an example to demonstrate the flexibility of this simulation platform.

Xyce

DOE Office of Scientific and Technical Information (OSTI.GOV)

Thornquist, Heidi K.; Fixel, Deborah A.; Fett, David Brian

The Xyce Parallel Electronic Simulator simulates electronic circuit behavior in DC, AC, HB, MPDE and transient mode using standard analog (DAE) and/or device (PDE) models, including several age- and radiation-aware devices. It supports a variety of computing platforms, both serial and parallel. Lastly, it uses a variety of modern solution algorithms, dynamic parallel load-balancing and iterative solvers.
Parallel k-means++

DOE Office of Scientific and Technical Information (OSTI.GOV)

A parallelization of the k-means++ seed selection algorithm on three distinct hardware platforms: GPU, multicore CPU, and multithreaded architecture. K-means++ was developed by David Arthur and Sergei Vassilvitskii in 2007 as an extension of the k-means data clustering technique. These algorithms allow people to cluster multidimensional data by attempting to minimize the mean distance of data points within a cluster. K-means++ improved upon traditional k-means by using a more intelligent approach to selecting the initial seeds for the clustering process. While k-means++ has become a popular alternative to traditional k-means clustering, little work has been done to parallelize this technique. We have developed original C++ code for parallelizing the algorithm on three unique hardware architectures: GPU using NVIDIA's CUDA/Thrust framework, multicore CPU using OpenMP, and the Cray XMT multithreaded architecture. By parallelizing the process for these platforms, we are able to perform k-means++ clustering much more quickly than it could be done before.
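For orientation, the seed selection step that the record above parallelizes can be sketched serially. This is a plain-Python illustration of the standard k-means++ seeding rule (each new seed is drawn with probability proportional to its squared distance from the nearest seed already chosen), not the C++/CUDA/OpenMP code described in the record. The function names are illustrative.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeanspp_seeds(points, k, rng=None):
    """Choose k initial seeds from `points` using k-means++ (D^2) sampling."""
    if rng is None:
        rng = random.Random()
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    # The first seed is drawn uniformly at random.
    seeds = [points[rng.randrange(len(points))]]
    # d2[i] = squared distance from points[i] to its nearest chosen seed.
    d2 = [squared_distance(p, seeds[0]) for p in points]

    while len(seeds) < k:
        if sum(d2) == 0.0:
            # Every point coincides with a seed: fall back to a uniform draw.
            idx = rng.randrange(len(points))
        else:
            # Weighted draw; points with zero weight are never selected.
            idx = rng.choices(range(len(points)), weights=d2, k=1)[0]
        seeds.append(points[idx])
        # This update is independent per point, which makes it the natural
        # candidate for a data-parallel implementation; here it runs serially.
        d2 = [min(old, squared_distance(p, points[idx]))
              for old, p in zip(d2, points)]
    return seeds
```

Passing a seeded generator, for example `kmeanspp_seeds(data, 3, rng=random.Random(0))`, makes the selection repeatable on a given Python version.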
Use of Parallel Micro-Platform for the Simulation the Space Exploration

NASA Astrophysics Data System (ADS)

Velasco Herrera, Victor Manuel; Velasco Herrera, Graciela; Rosano, Felipe Lara; Rodriguez Lozano, Salvador; Lucero Roldan Serrato, Karen

The purpose of this work is to create a parallel micro-platform that simulates the virtual movements of a space exploration in 3D. One of the innovations presented in this design is the application of a lever mechanism for the transmission of movement. The development of such a robot is a challenging task, very different from that of industrial manipulators because of a totally different set of target requirements. This work presents the computer-aided study and simulation of the movement of this parallel manipulator. The model was developed using the computer-aided design platform Unigraphics, in which we carried out the geometric modeling of each component and of the final assembly (CAD), the generation of files for the computer-aided manufacture (CAM) of each piece, and the kinematic simulation of the system, evaluating different driving schemes. We used the MATLAB aerospace toolbox and created an adaptive control module to simulate the system.
Parallel manipulation of individual magnetic microbeads for lab-on-a-chip applications

NASA Astrophysics Data System (ADS)

Peng, Zhengchun

Many scientists and engineers are turning to lab-on-a-chip systems for faster and cheaper analysis of chemical reactions and biomolecular interactions. A common approach that facilitates the handling of reagents and biomolecules in these systems utilizes micro/nano beads as the solid carrier. Physical manipulation, such as assembly, transport, sorting, and tweezing, of beads on a chip represents an essential step for fully utilizing their potential in a wide spectrum of bead-based analysis. Previous work demonstrated manipulation of either an ensemble of beads without individual control or single beads without the capability for parallel operation. Parallel manipulation of individual beads is required to meet the demand for high-throughput and location-specific analysis. In this work, we introduced two methods for parallel manipulation of individual magnetic microbeads, which can serve as effective lab-on-a-chip platforms and/or efficient analytic tools. The first method employs arrays of soft ferromagnetic patterns fabricated inside a microfluidic channel and subjected to an external magnetic field. We demonstrated that the system can be used to assemble individual beads (1-3 μm) from a flow of suspended beads into a regular array on the chip, hence improving the integrated electrochemical detection of biomolecules bound to the bead surface. By rotating the external field, the assembled microbeads can be remotely controlled with synchronized, high-speed circular motion around individual soft magnets on the chip. We employed this manipulation mode for efficient sample mixing in continuous microflow. Furthermore, we discovered a simple but effective way of transporting the microbeads on the chip by varying the strength of the local bias field within a revolution of the external field.
In addition, selective transport of microbeads of different sizes was realized, providing a platform for effective on-chip sample separation and offering the potential for multiplexing capability. The second method integrates magnetic and dielectrophoretic manipulations of the same microbeads. The device combines tapered conducting wires and fingered electrodes to generate desirable magnetic and electric fields, respectively. By externally programming the magnetic attraction and dielectrophoretic repulsion forces, out-of-plane oscillation of the microbeads across the channel height was realized. This manipulation mode can facilitate the interaction between the beads and multiple layers of sample fluid inside the channel. We further demonstrated the tweezing of microbeads in liquid with high spatial resolutions, i.e., from submicrometer to nanometer range, by fine-tuning the net force from magnetic attraction and dielectrophoretic repulsion of the beads. The high-resolution control of the out-of-plane motion of the microbeads led to the invention of massively parallel biomolecular tweezers. We believe the maturation of bead-based microtweezers will revolutionize the state-of-the-art tools currently used for single cell and single molecule studies.
Space Situational Awareness Data Processing Scalability Utilizing Google Cloud Services

NASA Astrophysics Data System (ADS)

Greenly, D.; Duncan, M.; Wysack, J.; Flores, F.

Space Situational Awareness (SSA) is a fundamental and critical component of current space operations. The term SSA encompasses the awareness, understanding and predictability of all objects in space. As the population of orbital space objects and debris increases, the number of collision avoidance maneuvers grows and prompts the need for accurate and timely process measures. The SSA mission continually evolves toward near real-time assessment and analysis, demanding higher processing capabilities. By conventional methods, meeting these demands requires the integration of new hardware to keep pace with the growing complexity of maneuver planning algorithms. SpaceNav has implemented a highly scalable architecture that will track satellites and debris by utilizing powerful virtual machines on the Google Cloud Platform. SpaceNav algorithms for processing CDMs outpace conventional means. A robust processing environment for tracking data, collision avoidance maneuvers and various other aspects of SSA can be created and deleted on demand. The migration of SpaceNav tools and algorithms to the Google Cloud Platform will be discussed, along with the trials and tribulations involved. Information will be shared on how and why certain cloud products were used as well as integration techniques that were implemented. Key items to be presented are: 1. Scientific algorithms and SpaceNav tools integrated into a scalable architecture: a) Maneuver Planning; b) Parallel Processing; c) Monte Carlo Simulations; d) Optimization Algorithms; e) SW Application Development/Integration into the Google Cloud Platform. 2. Compute Engine Processing: a) Application Engine Automated Processing; b) Performance testing and Performance Scalability; c) Cloud MySQL databases and Database Scalability; d) Cloud Data Storage; e) Redundancy and Availability.
A survey of parallel programming tools

NASA Technical Reports Server (NTRS)

Cheng, Doreen Y.

1991-01-01

This survey examines 39 parallel programming tools. Focus is placed on those tool capabilities needed for parallel scientific programming rather than for general computer science. The tools are classified with current and future needs of the Numerical Aerodynamic Simulator (NAS) in mind: existing and anticipated NAS supercomputers and workstations; operating systems; programming languages; and applications. They are divided into four categories: suggested acquisitions; tools already brought in; tools worth tracking; and tools eliminated from further consideration at this time.
Field Programmable Gate Array Based Parallel Strapdown Algorithm Design for Strapdown Inertial Navigation Systems

PubMed Central

Li, Zong-Tao; Wu, Tie-Jun; Lin, Can-Long; Ma, Long-Hua

2011-01-01

A new generalized optimum strapdown algorithm with coning and sculling compensation is presented, in which the position, velocity and attitude updating operations are carried out based on a single-speed structure: all computations are executed at a single updating rate that is sufficiently high to accurately account for high-frequency angular rate and acceleration rectification effects. Unlike in existing algorithms, the updating rates of the coning and sculling compensations are independent of the number of gyro incremental angle samples and the number of accelerometer incremental velocity samples. When the output sampling rate of the inertial sensors remains constant, this algorithm allows the updating rate of the coning and sculling compensation to be increased while using more gyro incremental angle and accelerometer incremental velocity samples, in order to improve the accuracy of the system. Then, in order to implement the new strapdown algorithm in a single FPGA chip, the parallelization of the algorithm is designed and its computational complexity is analyzed. The performance of the proposed parallel strapdown algorithm is tested on the Xilinx ISE 12.3 software platform and the FPGA device XC6VLX550T hardware platform on the basis of fighter data. It is shown that this parallel strapdown algorithm on the FPGA platform greatly decreases the execution time of the algorithm relative to the existing implementation on the DSP platform, meeting the real-time and high-precision requirements of the system in a highly dynamic environment. PMID:22164058
Optimizing CyberShake Seismic Hazard Workflows for Large HPC Resources

NASA Astrophysics Data System (ADS)

Callaghan, S.; Maechling, P. J.; Juve, G.; Vahi, K.; Deelman, E.; Jordan, T. H.

2014-12-01

The CyberShake computational platform is a well-integrated collection of scientific software and middleware that calculates 3D simulation-based probabilistic seismic hazard curves and hazard maps for the Los Angeles region. Currently each CyberShake model comprises about 235 million synthetic seismograms from about 415,000 rupture variations computed at 286 sites. CyberShake integrates large-scale parallel and high-throughput serial seismological research codes into a processing framework in which early stages produce files used as inputs by later stages. Scientific workflow tools are used to manage the jobs, data, and metadata. The Southern California Earthquake Center (SCEC) developed the CyberShake platform using USC High Performance Computing and Communications systems and open-science NSF resources. CyberShake calculations were migrated to the NSF Track 1 system NCSA Blue Waters when it became operational in 2013, via an interdisciplinary team approach including domain scientists, computer scientists, and middleware developers. Due to the excellent performance of Blue Waters and CyberShake software optimizations, we reduced the makespan (a measure of wallclock time-to-solution) of a CyberShake study from 1467 to 342 hours. We will describe the technical enhancements behind this improvement, including judicious introduction of new GPU software, improved scientific software components, increased workflow-based automation, and Blue Waters-specific workflow optimizations. Our CyberShake performance improvements highlight the benefits of scientific workflow tools. The CyberShake workflow software stack includes the Pegasus Workflow Management System (Pegasus-WMS, which includes Condor DAGMan), HTCondor, and Globus GRAM, with Pegasus-mpi-cluster managing the high-throughput tasks on the HPC resources.
The workflow tools handle data management, automatically transferring about 13 TB back to SCEC storage. We will present performance metrics from the most recent CyberShake study, executed on Blue Waters. We will compare the performance of CPU and GPU versions of our large-scale parallel wave propagation code, AWP-ODC-SGT. Finally, we will discuss how these enhancements have enabled SCEC to move forward with plans to increase the CyberShake simulation frequency to 1.0 Hz.
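As a quick check on the makespan figures quoted in the abstract above (1467 hours reduced to 342 hours), the implied speed-up and relative reduction can be computed directly. The function names are illustrative. The only inputs are the two numbers from the abstract.

```python
def speedup(before_hours, after_hours):
    """Ratio of the old makespan to the new one."""
    return before_hours / after_hours


def relative_reduction(before_hours, after_hours):
    """Fraction of the old makespan that was eliminated."""
    return (before_hours - after_hours) / before_hours


before, after = 1467.0, 342.0  # makespans in hours, as quoted in the abstract
print(f"speed-up: {speedup(before, after):.2f}x")                       # 4.29x
print(f"makespan reduced by {relative_reduction(before, after):.1%}")   # 76.7%
```

In other words, the reported optimizations correspond to roughly a 4.3-fold shorter time-to-solution for a full study.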
Pressure-constrained, reduced-DOF, interconnected parallel manipulators with applications to space suit design

NASA Astrophysics Data System (ADS)

Jacobs, Shane Earl

This dissertation presents the concept of a Morphing Upper Torso, an innovative pressure suit design that incorporates robotic elements to enable a resizable, highly mobile and easy to don/doff spacesuit. The torso is modeled as a system of interconnected, pressure-constrained, reduced-DOF, wire-actuated parallel manipulators that enable the dimensions of the suit to be reconfigured to match the wearer. The kinematics, dynamics and control of wire-actuated manipulators are derived and simulated, along with the Jacobian transforms, which relate the total twist vector of the system to the vector of actuator velocities. Tools are developed that allow calculation of the workspace for both single and interconnected reduced-DOF robots of this type, using knowledge of the link lengths. The forward kinematics and statics equations are combined and solved to produce the pose of the platforms along with the link tensions. These tools allow analysis of the full Morphing Upper Torso design, in which the back hatch of a rear-entry torso is interconnected with the waist ring, helmet ring and two scye bearings. Half-scale and full-scale experimental models are used along with analytical models to examine the feasibility of this novel space suit concept. The analytical and experimental results demonstrate that the torso could be expanded to facilitate donning and doffing, and then contracted to match different wearers' body dimensions. Using the system of interconnected parallel manipulators, suit components can be accurately repositioned to different desired configurations. The demonstrated feasibility of the Morphing Upper Torso concept makes it an exciting candidate for inclusion in a future planetary suit architecture.
Perspectives on the Future of CFD

NASA Technical Reports Server (NTRS)

Kwak, Dochan

2000-01-01

This viewgraph presentation gives an overview of the future of computational fluid dynamics (CFD), which in the past has pioneered the field of flow simulation. Over time, CFD has progressed as computing power has grown. Numerical methods have advanced as CPU and memory capacity have increased. Complex configurations are routinely computed now, and direct numerical simulations (DNS) and large eddy simulations (LES) are used to study turbulence. As computing resources changed to parallel and distributed platforms, computer science aspects such as scalability (algorithmic and implementation), portability and transparent codings have advanced. Examples of potential future (or current) challenges include risk assessment, limitations of the heuristic model, and the development of CFD and information technology (IT) tools.
Climbing with adhesion: from bioinspiration to biounderstanding

PubMed Central

Cutkosky, Mark R.

2015-01-01

Bioinspiration is an increasingly popular design paradigm, especially as robots venture out of the laboratory and into the world. Animals are adept at coping with the variability that the world imposes. With advances in scientific tools for understanding biological structures in detail, we are increasingly able to identify design features that account for animals' robust performance. In parallel, advances in fabrication methods and materials are allowing us to engineer artificial structures with similar properties. The resulting robots become useful platforms for testing hypotheses about which principles are most important. Taking gecko-inspired climbing as an example, we show that the process of extracting principles from animals and adapting them to robots provides insights for both robotics and biology. PMID:26464786
NARMER-1: a photon point-kernel code with build-up factors

NASA Astrophysics Data System (ADS)

Visonneau, Thierry; Pangault, Laurence; Malouch, Fadhel; Malvagi, Fausto; Dolci, Florence

2017-09-01

This paper presents an overview of NARMER-1, the new generation of photon point-kernel code developed by the Reactor Studies and Applied Mathematics Unit (SERMA) at CEA Saclay Center. After a short introduction covering the history and the current context of development of the code, the paper sets out the principles implemented in the calculation and the physical quantities computed, and surveys the generic features: programming language, computer platforms, geometry package, source description, etc. Specific and recent features are also detailed: exclusion sphere, tetrahedral meshes, parallel operations. Then some points about verification and validation are presented. Finally we present some tools that can help the user with operations such as visualization and pre-treatment.
Xyce Parallel Electronic Simulator : users' guide, version 2.0.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hoekstra, Robert John; Waters, Lon J.; Rankin, Eric Lamont

2004-06-01

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator capable of simulating electrical circuits at a variety of abstraction levels. Primarily, Xyce has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: (1) Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). Note that this includes support for most popular parallel and serial computers. (2) Improved performance for all numerical kernels (e.g., time integrator, nonlinear and linear solvers) through state-of-the-art algorithms and novel techniques. (3) Device models which are specifically tailored to meet Sandia's needs, including many radiation-aware devices. (4) A client-server or multi-tiered operating model wherein the numerical kernel can operate independently of the graphical user interface (GUI). (5) Object-oriented code design and implementation using modern coding practices that ensure that the Xyce Parallel Electronic Simulator will be maintainable and extensible far into the future. Xyce is a parallel code in the most general sense of the phrase - a message passing parallel implementation - which allows it to run efficiently on the widest possible number of computing platforms. These include serial, shared-memory and distributed-memory parallel as well as heterogeneous platforms. Careful attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. One feature required by designers is the ability to add device models, many specific to the needs of Sandia, to the code.
To this end, the device package in the Xyce Parallel Electronic Simulator is designed to support a variety of device model inputs. These input formats include standard analytical models, behavioral models, look-up tables, and mesh-level PDE device models. Combined with this flexible interface is an architectural design that greatly simplifies the addition of circuit models. One of the most important features of Xyce is that it provides a platform for computational research and development aimed specifically at the needs of the Laboratory. With Xyce, Sandia now has an 'in-house' capability with which both new electrical (e.g., device model development) and algorithmic (e.g., faster time-integration methods) research and development can be performed. Ultimately, these capabilities are migrated to end users.
Parallel processing of genomics data

NASA Astrophysics Data System (ADS)

Agapito, Giuseppe; Guzzi, Pietro Hiram; Cannataro, Mario

2016-10-01

The availability of high-throughput experimental platforms for the analysis of biological samples, such as mass spectrometry, microarrays and Next Generation Sequencing, has made it possible to analyze a whole genome in a single experiment. Such platforms produce an enormous volume of data per experiment, so the analysis of this flow of data poses several challenges in terms of data storage, preprocessing, and analysis. To face those issues, efficient, possibly parallel, bioinformatics software needs to be used to preprocess and analyze data, for instance to highlight genetic variation associated with complex diseases. In this paper we present a parallel algorithm for the preprocessing and statistical analysis of genomics data, able to handle high-dimensional data while achieving good response times. The proposed system is able to find statistically significant biological markers that discriminate classes of patients who respond to drugs in different ways. Experiments performed on real and synthetic genomic datasets show good speed-up and scalability.
Global and local waveform simulations using the VERCE platform

NASA Astrophysics Data System (ADS)

Garth, Thomas; Saleh, Rafiq; Spinuso, Alessandro; Gemund, Andre; Casarotti, Emanuele; Magnoni, Federica; Krischer, Lion; Igel, Heiner; Schlichtweg, Horst; Frank, Anton; Michelini, Alberto; Vilotte, Jean-Pierre; Rietbrock, Andreas

2017-04-01

In recent years the potential to increase resolution of seismic imaging by full waveform inversion has been demonstrated on a range of scales from basin to continental scales. These techniques rely on harnessing the computational power of large supercomputers, and running large parallel codes to simulate the seismic wave field in a three-dimensional geological setting. The VERCE platform is designed to make these full waveform techniques accessible to a far wider spectrum of the seismological community. The platform supports the two widely used spectral element simulation programs SPECFEM3D Cartesian, and SPECFEM3D globe, allowing users to run a wide range of simulations. In the SPECFEM3D Cartesian implementation the user can run waveform simulations on a range of pre-loaded meshes and velocity models for specific areas, or upload their own velocity model and mesh. In the new SPECFEM3D globe implementation, the user will be able to select from a number of continent scale model regions, or perform waveform simulations for the whole earth. Earthquake focal mechanisms can be downloaded within the platform, for example from the GCMT catalogue, or users can upload their own focal mechanism catalogue through the platform. The simulations can be run on a range of European supercomputers in the PRACE network. Once a job has been submitted and run through the platform, the simulated waveforms can be manipulated or downloaded for further analysis. The misfit between the simulated and recorded waveforms can then be calculated through the platform through three interoperable workflows, for raw-data access (FDSN) and caching, pre-processing and finally misfit. The last workflow makes use of the Pyflex analysis software. In addition, the VERCE platform can be used to produce animations of waveform propagation through the velocity model, and synthetic shakemaps. All these data-products are made discoverable and re-usable thanks to the VERCE data and metadata management layer. 
We demonstrate the functionality of the VERCE platform with two use cases, one using the pre-loaded velocity model and mesh for the Maule area of Chile using the SPECFEM3D Cartesian workflow, and one showing the output of a global simulation using the SPECFEM3D globe workflow. It is envisioned that this tool will allow a much greater range of seismologists to access these full waveform inversion tools, and aid full waveform tomographic and source inversion, synthetic shakemap production and other full waveform applications, in a wide range of tectonic settings.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Li, C.; Yu, G.; Wang, K.

The physical designs of new-concept reactors, which have complex structures, various materials and varied neutron energy spectra, have greatly raised the requirements on calculation methods and the corresponding computing hardware. Along with the widely used parallel algorithms, heterogeneous platform architectures have been introduced into numerical computations in reactor physics. Because of its naturally parallel characteristics, the CPU-FPGA architecture is often used to accelerate numerical computation. This paper studies the application and features of this kind of heterogeneous platform in numerical calculations of reactor physics through practical examples. The neutron diffusion module designed on the CPU-FPGA architecture achieves a speed-up factor of 11.2, demonstrating that it is feasible to apply this kind of heterogeneous platform to reactor physics. (authors)
Parallel workflow tools to facilitate human brain MRI post-processing

PubMed Central

Cui, Zaixu; Zhao, Chenxi; Gong, Gaolang

2015-01-01

Multi-modal magnetic resonance imaging (MRI) techniques are widely applied in human brain studies. To obtain specific brain measures of interest from MRI datasets, a number of complex image post-processing steps are typically required. Parallel workflow tools have recently been developed, concatenating individual processing steps and enabling fully automated processing of raw MRI data to obtain the final results. These workflow tools are also designed to make optimal use of available computational resources and to support the parallel processing of different subjects or of independent processing steps for a single subject. Automated, parallel MRI post-processing tools can greatly facilitate relevant brain investigations and are being increasingly applied. In this review, we briefly summarize these parallel workflow tools and discuss relevant issues. PMID:26029043
Automatic Generation of OpenMP Directives and Its Application to Computational Fluid Dynamics Codes

NASA Technical Reports Server (NTRS)

Yan, Jerry; Jin, Haoqiang; Frumkin, Michael; Yan, Jerry (Technical Monitor)

2000-01-01

The shared-memory programming model is a very effective way to achieve parallelism on shared memory parallel computers. As great progress was made in hardware and software technologies, performance of parallel programs with compiler directives has demonstrated large improvement. The introduction of OpenMP directives, the industrial standard for shared-memory programming, has minimized the issue of portability. In this study, we have extended CAPTools, a computer-aided parallelization toolkit, to automatically generate OpenMP-based parallel programs with nominal user assistance. We outline techniques used in the implementation of the tool and discuss the application of this tool on the NAS Parallel Benchmarks and several computational fluid dynamics codes. This work demonstrates the great potential of using the tool to quickly port parallel programs and also achieve good performance that exceeds some of the commercial tools.
Heterogeneous scalable framework for multiphase flows

DOE Office of Scientific and Technical Information (OSTI.GOV)

Morris, Karla Vanessa

2013-09-01

Two categories of challenges confront the developer of computational spray models: those related to the computation and those related to the physics. Regarding the computation, the trend towards heterogeneous, multi- and many-core platforms will require considerable re-engineering of codes written for the current supercomputing platforms. Regarding the physics, accurate methods for transferring mass, momentum and energy from the dispersed phase onto the carrier fluid grid have so far eluded modelers. Significant challenges also lie at the intersection between these two categories. To be competitive, any physics model must be expressible in a parallel algorithm that performs well on evolving computer platforms. This work created an application based on a software architecture where the physics and software concerns are separated in a way that adds flexibility to both. The developed spray-tracking package includes an application programming interface (API) that abstracts away the platform-dependent parallelization concerns, enabling the scientific programmer to write serial code that the API resolves into parallel processes and threads of execution. The project also developed the infrastructure required to provide similar APIs to other applications. The API allows object-oriented Fortran applications direct interaction with Trilinos to support memory management of distributed objects in central processing unit (CPU) and graphics processing unit (GPU) nodes for applications using C++.
Parallel Agent-Based Simulations on Clusters of GPUs and Multi-Core Processors

DOE Office of Scientific and Technical Information (OSTI.GOV)

Aaby, Brandon G; Perumalla, Kalyan S; Seal, Sudip K

2010-01-01

An effective latency-hiding mechanism is presented in the parallelization of agent-based model simulations (ABMS) with millions of agents. The mechanism is designed to accommodate the hierarchical organization as well as heterogeneity of current state-of-the-art parallel computing platforms. We use it to explore the computation vs. communication trade-off continuum available with the deep computational and memory hierarchies of extant platforms and present a novel analytical model of the tradeoff. We describe our implementation and report preliminary performance results on two distinct parallel platforms suitable for ABMS: CUDA threads on multiple, networked graphical processing units (GPUs), and pthreads on multi-core processors. Message Passing Interface (MPI) is used for inter-GPU as well as inter-socket communication on a cluster of multiple GPUs and multi-core processors. Results indicate the benefits of our latency-hiding scheme, delivering over 100-fold improvement in runtime for certain benchmark ABMS application scenarios with several million agents. This speed improvement is obtained on our system that is already two to three orders of magnitude faster on one GPU than an equivalent CPU-based execution in a popular simulator in Java. Thus, the overall execution of our current work is over four orders of magnitude faster when executed on multiple GPUs.

Thiolene and SIFEL-based Microfluidic Platforms for Liquid-Liquid Extraction

PubMed Central

Goyal, Sachit; Desai, Amit V.; Lewis, Robert W.; Ranganathan, David R.; Li, Hairong; Zeng, Dexing; Reichert, David E.; Kenis, Paul J.A.

2014-01-01

Microfluidic platforms provide several advantages for liquid-liquid extraction (LLE) processes over conventional methods, for example with respect to lower consumption of solvents and enhanced extraction efficiencies due to the inherent shorter diffusional distances. Here, we report the development of polymer-based parallel-flow microfluidic platforms for LLE. To date, parallel-flow microfluidic platforms have predominantly been made out of silicon or glass due to their compatibility with most organic solvents used for LLE. Fabrication of silicon and glass-based LLE platforms typically requires extensive use of photolithography, plasma or laser-based etching, high temperature (anodic) bonding, and/or wet etching with KOH or HF solutions. In contrast, polymeric microfluidic platforms can be fabricated using less involved processes, typically photolithography in combination with replica molding, hot embossing, and/or bonding at much lower temperatures. Here we report the fabrication and testing of microfluidic LLE platforms comprised of thiolene or a perfluoropolyether-based material, SIFEL, where the choice of materials was mainly guided by the need for solvent compatibility and fabrication amenability. Suitable designs for polymer-based LLE platforms that maximize extraction efficiencies within the constraints of the fabrication methods and feasible operational conditions were obtained using analytical modeling. To optimize the performance of the polymer-based LLE platforms, we systematically studied the effect of surface functionalization and of microstructures on the stability of the liquid-liquid interface and on the ability to separate the phases. As demonstrative examples, we report (i) a thiolene-based platform to determine the lipophilicity of caffeine, and (ii) a SIFEL-based platform to extract radioactive copper from an acidic aqueous solution. PMID:25246730
Integrated nanoscale tools for interrogating living cells

NASA Astrophysics Data System (ADS)

Jorgolli, Marsela

The development of next-generation, nanoscale technologies that interface biological systems will pave the way towards new understanding of such complex systems. Nanowires -- one-dimensional nanoscale structures -- have shown unique potential as an ideal physical interface to biological systems. Herein, we focus on the development of nanowire-based devices that can enable a wide variety of biological studies. First, we built upon standard nanofabrication techniques to optimize nanowire devices, resulting in perfectly ordered arrays of both opaque (Silicon) and transparent (Silicon dioxide) nanowires with user defined structural profile, densities, and overall patterns, as well as high sample consistency and large scale production. The high-precision and well-controlled fabrication method in conjunction with additional technologies laid the foundation for the generation of highly specialized platforms for imaging, electrochemical interrogation, and molecular biology. Next, we utilized nanowires as the fundamental structure in the development of integrated nanoelectronic platforms to directly interrogate the electrical activity of biological systems. Initially, we generated a scalable intracellular electrode platform based on vertical nanowires that allows for parallel electrical interfacing to multiple mammalian neurons. Our prototype device consisted of 16 individually addressable stimulation/recording sites, each containing an array of 9 electrically active silicon nanowires. We showed that these vertical nanowire electrode arrays could intracellularly record and stimulate neuronal activity in dissociated cultures of rat cortical neurons similar to patch clamp electrodes. In addition, we used our intracellular electrode platform to measure multiple individual synaptic connections, which enables the reconstruction of the functional connectivity maps of neuronal circuits. 
In order to expand and improve the capability of this functional prototype device we designed and fabricated a new hybrid chip that combines a front-side nanowire-based interface for neuronal recording with backside complementary metal oxide semiconductor (CMOS) circuits for on-chip multiplexing, voltage control for stimulation, signal amplification, and signal processing. Individual chips contain 1024 stimulation/recording sites enabling large-scale interfacing of neuronal networks with single cell resolution. Through electrical and electrochemical characterization of the devices, we demonstrated their enhanced functionality at a massively parallel scale. In our initial cell experiments, we achieved intracellular stimulations and recordings of changes in the membrane potential in a variety of cells including: HEK293T, cardiomyocytes, and rat cortical neurons. This demonstrated the device capability for single-cell-resolution recording/stimulation which when extended to a large number of neurons in a massively parallel fashion will enable the functional mapping of a complex neuronal network.
Self-Locking Optoelectronic Tweezers for Single-Cell and Microparticle Manipulation across a Large Area in High Conductivity Media

PubMed Central

Yang, Yajia; Mao, Yufei; Shin, Kyeong-Sik; Chui, Chi On; Chiou, Pei-Yu

2016-01-01

Optoelectronic tweezers (OET) has advanced within the past decade to become a promising tool for cell and microparticle manipulation. Its incompatibility with high conductivity media and limited throughput remain two major technical challenges. Here a novel manipulation concept and corresponding platform called Self-Locking Optoelectronic Tweezers (SLOT) are proposed and demonstrated to tackle these challenges concurrently. The SLOT platform comprises a periodic array of optically tunable phototransistor traps above which randomly dispersed single cells and microparticles are self-aligned to and retained without light illumination. Light beam illumination on a phototransistor turns off the trap and releases the trapped cell, which is then transported downstream via a background flow. The cell trapping and releasing functions in SLOT are decoupled, which is a unique feature that enables SLOT’s stepper-mode function to overcome the small field-of-view issue that all prior OET technologies encountered in manipulation with single-cell resolution across a large area. Massively parallel trapping of more than 100,000 microparticles has been demonstrated in high conductivity media. Even larger scale trapping and manipulation can be achieved by linearly scaling up the number of phototransistors and device area. Cells after manipulation on the SLOT platform maintain high cell viability and normal multi-day divisibility. PMID:26940301
Quantum Monte Carlo for large chemical systems: implementing efficient strategies for petascale platforms and beyond.

PubMed

Scemama, Anthony; Caffarel, Michel; Oseret, Emmanuel; Jalby, William

2013-04-30

Various strategies to implement efficiently quantum Monte Carlo (QMC) simulations for large chemical systems are presented. These include: (i) the introduction of an efficient algorithm to calculate the computationally expensive Slater matrices. This novel scheme is based on the use of the highly localized character of atomic Gaussian basis functions (not the molecular orbitals as usually done), (ii) the possibility of keeping the memory footprint minimal, (iii) the important enhancement of single-core performance when efficient optimization tools are used, and (iv) the definition of a universal, dynamic, fault-tolerant, and load-balanced framework adapted to all kinds of computational platforms (massively parallel machines, clusters, or distributed grids). These strategies have been implemented in the QMC=Chem code developed at Toulouse and illustrated with numerical applications on small peptides of increasing sizes (158, 434, 1056, and 1731 electrons). Using 10-80 k computing cores of the Curie machine (GENCI-TGCC-CEA, France), QMC=Chem has been shown to be capable of running at the petascale level, thus demonstrating that for this machine a large part of the peak performance can be achieved. Implementation of large-scale QMC simulations for future exascale platforms with a comparable level of efficiency is expected to be feasible. Copyright © 2013 Wiley Periodicals, Inc.
Proba-V Mission Exploitation Platform

NASA Astrophysics Data System (ADS)

Goor, Erwin; Dries, Jeroen

2017-04-01

VITO and partners developed the Proba-V Mission Exploitation Platform (MEP) as an end-to-end solution to drastically improve the exploitation of the Proba-V (a Copernicus contributing mission) EO-data archive (http://proba-v.vgt.vito.be/), the past mission SPOT-VEGETATION and derived vegetation parameters by researchers, service providers and end-users. The analysis of time series of data (+1PB) is addressed, as well as the large scale on-demand processing of near real-time data on a powerful and scalable processing environment. Furthermore, data from the Copernicus Global Land Service is in scope of the platform. From November 2015 an operational Proba-V MEP environment, as an ESA operation service, is gradually deployed at the VITO data center with direct access to the complete data archive. Since autumn 2016 the platform has been operational and several applications have already been released to the users, e.g.
- A time series viewer, showing the evolution of Proba-V bands and derived vegetation parameters from the Copernicus Global Land Service for any area of interest.
- Full-resolution viewing services for the complete data archive.
- On-demand processing chains on a powerful Hadoop/Spark backend, e.g. for the calculation of N-daily composites.
- Virtual Machines can be provided with access to the data archive and tools to work with this data, e.g. various toolboxes (GDAL, QGIS, GrassGIS, SNAP toolbox, …) and support for R and Python. This allows users to immediately work with the data without having to install tools or download data, and also to design, debug and test applications on the platform.
- A prototype of Jupyter notebooks is available with some examples worked out to show the potential of the data.
Today the platform is used by several third party projects to perform R&D activities on the data, and to develop/host data analysis toolboxes. In parallel the platform is further improved and extended.
Access to Sentinel-2 and Landsat data will also soon be available from the Proba-V MEP. Users can make use of powerful Web based tools and can self-manage virtual machines to perform their work on the infrastructure at VITO with access to the complete data archive. To realise this, private cloud technology (OpenStack) is used and a distributed processing environment is built based on Hadoop. The Hadoop ecosystem offers a lot of technologies (Spark, Yarn, Accumulo, etc.) which we integrate with several open-source components (e.g. Geotrellis). The impact of this MEP on the user community will be high and will completely change the way of working with the data and hence open the large time series to a larger community of users. The presentation will address these benefits for the users and discuss the technical challenges in implementing this MEP. Furthermore, demonstrations will be given. Platform URL: https://proba-v-mep.esa.int/
Multi-mode sensor processing on a dynamically reconfigurable massively parallel processor array

NASA Astrophysics Data System (ADS)

Chen, Paul; Butts, Mike; Budlong, Brad; Wasson, Paul

2008-04-01

This paper introduces a novel computing architecture that can be reconfigured in real time to adapt on demand to multi-mode sensor platforms' dynamic computational and functional requirements. This 1 teraOPS reconfigurable Massively Parallel Processor Array (MPPA) has 336 32-bit processors. The programmable 32-bit communication fabric provides streamlined inter-processor connections with deterministically high performance. Software programmability, scalability, ease of use, and fast reconfiguration time (ranging from microseconds to milliseconds) are the most significant advantages over FPGAs and DSPs. This paper introduces the MPPA architecture, its programming model, and methods of reconfigurability. An MPPA platform for reconfigurable computing is based on a structural object programming model. Objects are software programs running concurrently on hundreds of 32-bit RISC processors and memories. They exchange data and control through a network of self-synchronizing channels. A common application design pattern on this platform, called a work farm, is a parallel set of worker objects, with one input and one output stream. Statically configured work farms with homogeneous and heterogeneous sets of workers have been used in video compression and decompression, network processing, and graphics applications.
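The work farm pattern described in this abstract can be illustrated in ordinary software. The following is a minimal sketch only, assuming Python threads and queues as stand-ins for the MPPA's processor objects and self-synchronizing channels; the names `work_farm`, `worker_fn` and `n_workers` are hypothetical and do not come from the paper or from any MPPA toolchain.

```python
import queue
import threading

# Sentinel placed on the input stream to tell a worker to shut down.
_STOP = object()


def work_farm(items, worker_fn, n_workers=4):
    """Process items with a farm of identical workers.

    All workers read from one shared input stream and write to one shared
    output stream, mirroring the single-input, single-output farm shape.
    """
    in_q = queue.Queue()   # the single input stream
    out_q = queue.Queue()  # the single output stream

    def worker():
        while True:
            task = in_q.get()
            if task is _STOP:
                break
            index, item = task
            out_q.put((index, worker_fn(item)))

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()

    # Feed the farm, tagging each item so the output order can be restored.
    count = 0
    for index, item in enumerate(items):
        in_q.put((index, item))
        count += 1
    for _ in threads:
        in_q.put(_STOP)
    for t in threads:
        t.join()

    # Workers finish in arbitrary order; reassemble by the original index.
    results = [None] * count
    for _ in range(count):
        index, value = out_q.get()
        results[index] = value
    return results


if __name__ == "__main__":
    print(work_farm(range(8), lambda x: x * x, n_workers=3))
```

Tagging each item with its index is one design choice among several: it lets the workers complete out of order while the caller still receives results in input order. A heterogeneous farm, as mentioned in the abstract, would correspond to workers running different functions rather than one shared `worker_fn`.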
Fabrication of heterogeneous nanomaterial array by programmable heating and chemical supply within microfluidic platform towards multiplexed gas sensing application

PubMed Central

Yang, Daejong; Kang, Kyungnam; Kim, Donghwan; Li, Zhiyong; Park, Inkyu

2015-01-01

A facile top-down/bottom-up hybrid nanofabrication process based on programmable temperature control and parallel chemical supply within a microfluidic platform has been developed for the all liquid-phase synthesis of heterogeneous nanomaterial arrays. The synthesized materials and locations can be controlled by local heating with integrated microheaters and guided liquid chemical flow within the microfluidic platform. As proofs-of-concept, we have demonstrated the synthesis of two types of nanomaterial arrays: (i) a parallel array of TiO2 nanotubes, CuO nanospikes and ZnO nanowires, and (ii) a parallel array of ZnO nanowire/CuO nanospike hybrid nanostructures, CuO nanospikes and ZnO nanowires. The laminar flow with negligible ionic diffusion between different precursor solutions, as well as the localized heating, was verified by numerical calculation and by the experimental results of nanomaterial array synthesis. The devices made of heterogeneous nanomaterial arrays were utilized as a multiplexed sensor for toxic gases such as NO2 and CO. This method would be very useful for the facile fabrication of functional nanodevices based on highly integrated arrays of heterogeneous nanomaterials. PMID:25634814
A practical approach to portability and performance problems on massively parallel supercomputers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Beazley, D.M.; Lomdahl, P.S.

1994-12-08

We present an overview of the tactics we have used to achieve a high level of performance while improving portability for a large-scale molecular dynamics code SPaSM. SPaSM was originally implemented in ANSI C with message passing for the Connection Machine 5 (CM-5). In 1993, SPaSM was selected as one of the winners in the IEEE Gordon Bell Prize competition for sustaining 50 Gflops on the 1024 node CM-5 at Los Alamos National Laboratory. Achieving this performance on the CM-5 required rewriting critical sections of code in CDPEAC assembler language. In addition, the code made extensive use of CM-5 parallel I/O and the CMMD message passing library. Given this highly specialized implementation, we describe how we have ported the code to the Cray T3D and high performance workstations. In addition we will describe how it has been possible to do this using a single version of source code that runs on all three platforms without sacrificing any performance. Sound too good to be true? We hope to demonstrate that one can realize both code performance and portability without relying on the latest and greatest prepackaged tool or parallelizing compiler.
Development of novel microfluidic platforms for neural stem cell research

NASA Astrophysics Data System (ADS)

Chung, Bonggeun

This dissertation describes the development and characterization of novel microfluidic platforms to study proliferation, differentiation, migration, and apoptosis of neural stem cells (NSCs). NSCs hold tremendous promise for fundamental biological studies and cell-based therapies in human disorders. NSCs are defined as cells that can self-renew yet maintain the ability to generate the three principal cell types of the central nervous system such as neurons, astrocytes, and oligodendrocytes. NSCs therefore have therapeutic possibilities in multiple neurodevelopmental and neurodegenerative diseases. Despite their promise, cell-based therapies are limited by the inability to precisely control their behavior in culture. Compared to traditional culture tools, microfluidic platforms can provide much greater control over cell microenvironments and optimize proliferation and differentiation conditions of cells exposed to combinatorial mixtures of growth factors. Human NSCs were cultured for more than 1 week in the microfluidic device while constantly exposed to a continuous gradient of a growth factor mixture. NSCs proliferated and differentiated in a graded and proportional fashion that varied directly with growth factor concentration. In parallel to the study of growth and differentiation of NSCs, we are interested in proliferation and apoptosis of mouse NSCs exposed to morphogen gradients. Morphogen gradients are fundamental to animal brain development. Nonetheless, much controversy remains about the mechanisms by which morphogen gradients act on the developing brain. To overcome limitations of in-vitro models of gradients, we have developed a hybrid microfluidic platform that can mimic morphogen gradient profiles. Bone morphogenetic protein (BMP) activity in the developing cortex is graded and cortical NSC responses to BMPs are highly dependent on concentration and gradient slope of BMPs. 
To make novel microfluidic devices integrated with multiple functions, we have also developed a microfluidic multi-injector (MMI) that can generate temporal and spatial concentration gradients. The MMI consists of fluidic channels and control channels with pneumatically actuated on-chip barrier valves. Repetitive actuations of on-chip valves control pulsatile release of solution that establishes microscopic chemical gradients. The development of novel gradient-generating microfluidic platforms will help in advancing our understanding of brain development and provide a versatile tool for basic and applied studies in stem cell biology.
Virtual earthquake engineering laboratory with physics-based degrading materials on parallel computers

NASA Astrophysics Data System (ADS)

Cho, In Ho

For the last few decades, we have obtained tremendous insight into underlying microscopic mechanisms of degrading quasi-brittle materials from persistent and near-saintly efforts in laboratories, and at the same time we have seen unprecedented evolution in computational technology such as massively parallel computers. Thus, time is ripe to embark on a novel approach to settle unanswered questions, especially for the earthquake engineering community, by harmoniously combining the microphysics mechanisms with advanced parallel computing technology. To begin with, it should be stressed that we placed a great deal of emphasis on preserving clear meaning and physical counterparts of all the microscopic material models proposed herein, since it is directly tied to the belief that by doing so, the more physical mechanisms we incorporate, the better prediction we can obtain. We departed from reviewing representative microscopic analysis methodologies, selecting out "fixed-type" multidirectional smeared crack model as the base framework for nonlinear quasi-brittle materials, since it is widely believed to best retain the physical nature of actual cracks. Microscopic stress functions are proposed by integrating well-received existing models to update normal stresses on the crack surfaces (three orthogonal surfaces are allowed to initiate herein) under cyclic loading. Unlike the normal stress update, special attention had to be paid to the shear stress update on the crack surfaces, due primarily to the well-known pathological nature of the fixed-type smeared crack model---spurious large stress transfer over the open crack under nonproportional loading. In hopes of exploiting physical mechanism to resolve this deleterious nature of the fixed crack model, a tribology-inspired three-dimensional (3d) interlocking mechanism has been proposed. 
Following the main trend of tribology (i.e., the science and engineering of interacting surfaces), we introduced the base fabric of solid particle-soft matrix to explain realistic interlocking over rough crack surfaces, and the adopted Gaussian distribution feeds random particle sizes to the entire domain. Validation against a well-documented rough crack experiment reveals promising accuracy of the proposed 3d interlocking model. A consumed energy-based damage model has been proposed for the weak correlation between the normal and shear stresses on the crack surfaces, and also for describing the nature of irrecoverable damage. Since the evaluation of the consumed energy is directly linked to the microscopic deformation, which can be efficiently tracked on the crack surfaces, the proposed damage model is believed to provide a more physical interpretation than existing damage mechanics, which fundamentally stem from mathematical derivation with few physical counterparts. Another novel point of the present work lies in the topological transition-based "smart" steel bar model, notably with evolving compressive buckling length. We presented a systematic framework of information flow between the key ingredients of composite materials (i.e., steel bar and its surrounding concrete elements). The smart steel model suggested can incorporate smooth transition during reversal loading, tensile rupture, early buckling after reversal from excessive tensile loading, and even compressive buckling. Especially, the buckling length is made to evolve according to the damage states of the surrounding elements of each bar, while all other dominant models leave the length unchanged. What lies behind all the aforementioned novel attempts is, of course, the problem-optimized parallel platform. In fact, the parallel computing in our field has been restricted to monotonic shock or blast loading with explicit algorithm which is characteristically feasible to be parallelized. 
In the present study, efficient parallelization strategies for the highly demanding implicit nonlinear finite element analysis (FEA) program for real-scale reinforced concrete (RC) structures under cyclic loading are proposed. Quantitative comparison of state-of-the-art parallel strategies, in terms of factorization, had been carried out, leading to the problem-optimized solver, which is successfully embracing the penalty method and banded nature. Particularly, the penalty method employed imparts considerable smoothness to the global response, which yields a practical superiority of the parallel triangular system solver over other advanced solvers such as parallel preconditioned conjugate gradient method. Other salient issues on parallelization are also addressed. The parallel platform established offers unprecedented access to simulations of real-scale structures, giving new understanding about the physics-based mechanisms adopted and probabilistic randomness at the entire system level. Particularly, the platform enables bold simulations of real-scale RC structures exposed to cyclic loading---H-shaped wall system and 4-story T-shaped wall system. The simulations show the desired capability of accurate prediction of global force-displacement responses, postpeak softening behavior, and compressive buckling of longitudinal steel bars. It is fascinating to see that intrinsic randomness of the 3d interlocking model appears to cause "localized" damage of the real-scale structures, which is consistent with reported observations in different fields such as granular media. Equipped with accuracy, stability and scalability as demonstrated so far, the parallel platform is believed to serve as a fertile ground for the introducing of further physical mechanisms into various research fields as well as the earthquake engineering community. In the near future, it can be further expanded to run in concert with reliable FEA programs such as FRAME3d or OPENSEES. 
Following the central notion of "multiscale" analysis technique, actual infrastructures exposed to extreme natural hazard can be successfully tackled by this next generation analysis tool---the harmonious union of the parallel platform and a general FEA program. At the same time, any type of experiments can be easily conducted by this "virtual laboratory."
Reconfigurable microfluidic hanging drop network for multi-tissue interaction and analysis.

PubMed

Frey, Olivier; Misun, Patrick M; Fluri, David A; Hengstler, Jan G; Hierlemann, Andreas

2014-06-30

Integration of multiple three-dimensional microtissues into microfluidic networks enables new insights into how different organs or tissues of an organism interact. Here, we present a platform that extends the hanging-drop technology, used for multi-cellular spheroid formation, to multifunctional complex microfluidic networks. Engineered as a completely open, 'hanging' microfluidic system at the bottom of a substrate, the platform features high flexibility in microtissue arrangements and interconnections, while fabrication is simple and operation robust. Multiple spheroids of different cell types are formed in parallel on the same platform; the different tissues are then connected in physiological order for multi-tissue experiments through reconfiguration of the fluidic network. Liquid flow is precisely controlled through the hanging drops, which enable nutrient supply, substance dosage and inter-organ metabolic communication. The possibility to perform parallelized microtissue formation on the same chip that is subsequently used for complex multi-tissue experiments renders the developed platform a promising technology for 'body-on-a-chip'-related research.
Motorized manipulator for positioning a TEM specimen

DOEpatents

Schmid, Andreas Karl; Andresen, Nord

2010-12-14

The invention relates to a motorized manipulator for positioning a TEM specimen holder with sub-micron resolution parallel to a y-z plane and rotating the specimen holder in the y-z plane, the manipulator comprising a base (2), and attachment means (30) for attaching the specimen holder to the manipulator, characterized in that the manipulator further comprises at least three nano-actuators (3a, 3b, 3c) mounted on the base, each nano-actuator showing a tip (4a, 4b, 4c), the at least three tips defining the y-z plane, each tip capable of moving with respect to the base in the y-z plane; a platform (5) in contact with the tips of the nano-actuators; and clamping means (6) for pressing the platform against the tips of the nano-actuators; as a result of which the nano-actuators can rotate the platform with respect to the base in the y-z plane and translate the platform parallel to the y-z plane.
High-Throughput Nanofabrication of Infra-red and Chiral Metamaterials using Nanospherical-Lens Lithography

PubMed Central

Chang, Yun-Chorng; Lu, Sih-Chen; Chung, Hsin-Chan; Wang, Shih-Ming; Tsai, Tzung-Da; Guo, Tzung-Fang

2013-01-01

Various infra-red and planar chiral metamaterials were fabricated using the modified Nanospherical-Lens Lithography. By replacing the light source with a hand-held ultraviolet lamp, its asymmetric light emission pattern produces the elliptical-shaped photoresist holes after passing through the spheres. The long axis of the ellipse is parallel to the lamp direction. The fabricated ellipse arrays exhibit localized surface plasmon resonance in mid-infra-red and are ideal platforms for surface enhanced infra-red absorption (SEIRA). We also demonstrate a way to design and fabricate complicated patterns by tuning parameters in each exposure step. This method is both high-throughput and low-cost, which is a powerful tool for future infra-red metamaterials applications. PMID:24284941
Single molecule real-time (SMRT) sequencing comes of age: applications and utilities for medical diagnostics

PubMed Central

Ardui, Simon; Ameur, Adam; Vermeesch, Joris R; Hestand, Matthew S

2018-01-01

Short read massive parallel sequencing has emerged as a standard diagnostic tool in the medical setting. However, short read technologies have inherent limitations such as GC bias, difficulties mapping to repetitive elements, trouble discriminating paralogous sequences, and difficulties in phasing alleles. Long read single molecule sequencers resolve these obstacles. Moreover, they offer higher consensus accuracies and can detect epigenetic modifications from native DNA. The first commercially available long read single molecule platform was the RS system based on PacBio's single molecule real-time (SMRT) sequencing technology, which has since evolved into their RSII and Sequel systems. Here we capsulize how SMRT sequencing is revolutionizing constitutional, reproductive, cancer, microbial and viral genetic testing. PMID:29401301
Parallel computing for automated model calibration

DOE Office of Scientific and Technical Information (OSTI.GOV)

Burke, John S.; Danielson, Gary R.; Schulz, Douglas A.

2002-07-29

Natural resources model calibration is a significant burden on computing and staff resources in modeling efforts. Most assessments must consider multiple calibration objectives (for example, magnitude and timing of stream flow peak). An automated calibration process that allows real-time updating of data and models, so that scientists can focus their effort on improving models, is needed. We are in the process of building a fully featured multi-objective calibration tool capable of processing multiple models cheaply and efficiently using null cycle computing. Our parallel processing and calibration software routines have been written generically, but our focus has been on natural resources model calibration. So far, the natural resources models have been friendly to parallel calibration efforts in that they require no inter-process communication, need only a small amount of input data, and output only a small amount of statistical information for each calibration run. A typical auto calibration run might involve running a model 10,000 times with a variety of input parameters and summary statistical output. In the past, model calibration has been done against individual models for each data set. The individual model runs are relatively fast, ranging from seconds to minutes. The process was run on a single computer using a simple iterative process. We have completed two Auto Calibration prototypes and are currently designing a more feature-rich tool. Our prototypes have focused on running the calibration in a distributed, cross-platform computing environment. They allow incorporation of "smart" calibration parameter generation (using artificial intelligence processing techniques). Null cycle computing similar to SETI@Home has also been a focus of our efforts. This paper details the design of the latest prototype and discusses our plans for the next revision of the software.
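The calibration pattern described in this record (many independent model runs, no inter-process communication, a small statistical summary per run) can be sketched as follows. This is only an illustration under stated assumptions: `toy_model`, `score`, and `calibrate` are hypothetical names, the linear model stands in for a real natural resources model, and a thread pool stands in for the distributed, null-cycle infrastructure the authors describe.

```python
from concurrent.futures import ThreadPoolExecutor


def toy_model(params, inputs):
    """Hypothetical stand-in for a natural resources model: a linear response."""
    slope, offset = params
    return [slope * x + offset for x in inputs]


def score(params, inputs, observed):
    """Run the model once and return (sum of squared errors, params)."""
    simulated = toy_model(params, inputs)
    sse = sum((s - o) ** 2 for s, o in zip(simulated, observed))
    return sse, params


def calibrate(param_sets, inputs, observed, workers=4):
    """Evaluate every parameter set independently and keep the best one.

    Each run needs only its own parameters and returns one small summary,
    so the runs can be farmed out without any communication between them.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda p: score(p, inputs, observed), param_sets)
        return min(results)


inputs = [0.0, 1.0, 2.0, 3.0]
observed = [1.0, 3.0, 5.0, 7.0]  # synthetic data generated with slope=2, offset=1
grid = [(s / 2, o / 2) for s in range(0, 9) for o in range(0, 5)]
best_sse, best_params = calibrate(grid, inputs, observed)
```

For CPU-bound models, a process pool or a cluster scheduler would be the more likely choice; the structure of the loop stays the same.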
Single-cell barcoding and sequencing using droplet microfluidics.

PubMed

Zilionis, Rapolas; Nainys, Juozas; Veres, Adrian; Savova, Virginia; Zemmour, David; Klein, Allon M; Mazutis, Linas

2017-01-01

Single-cell RNA sequencing has recently emerged as a powerful tool for mapping cellular heterogeneity in diseased and healthy tissues, yet high-throughput methods are needed for capturing the unbiased diversity of cells. Droplet microfluidics is among the most promising candidates for capturing and processing thousands of individual cells for whole-transcriptome or genomic analysis in a massively parallel manner with minimal reagent use. We recently established a method called inDrops, which has the capability to index >15,000 cells in an hour. A suspension of cells is first encapsulated into nanoliter droplets with hydrogel beads (HBs) bearing barcoding DNA primers. Cells are then lysed and mRNA is barcoded (indexed) by a reverse transcription (RT) reaction. Here we provide details for (i) establishing an inDrops platform (1 d); (ii) performing hydrogel bead synthesis (4 d); (iii) encapsulating and barcoding cells (1 d); and (iv) RNA-seq library preparation (2 d). inDrops is a robust and scalable platform, and it is unique in its ability to capture and profile >75% of cells in even very small samples, on a scale of thousands or tens of thousands of cells.
A cable-driven parallel robots application: modelling and simulation of a dynamic cable model in Dymola

NASA Astrophysics Data System (ADS)

Othman, M. F.; Kurniawan, R.; Schramm, D.; Ariffin, A. K.

2018-05-01

Modeling a cable that dynamically varies in length, mass and stiffness in a multibody dynamics simulation tool is a challenging task. Simulation of cable-driven parallel robots (CDPR), for instance, requires a cable model that can dynamically change in length for every desired pose of the platform. Thus, in this paper, a detailed procedure for modeling and simulation of a dynamic cable model in Dymola is proposed. The approach is also applicable to other types of Modelica simulation environments. The cable is modeled using standard mechanical elements such as mass, spring, damper and joint. The parameters of the cable model are based on the manufacturer's factsheet and experimental results. Its dynamic ability is tested by applying it to a complete planar CDPR model in which the parameters are based on a prototype named CABLAR, which was developed at the Chair of Mechatronics, University of Duisburg-Essen. The prototype has been developed to demonstrate an application of CDPR as a goods storage and retrieval machine. The performance of the cable model during the simulation is analyzed and discussed.
WaveJava: Wavelet-based network computing

NASA Astrophysics Data System (ADS)

Ma, Kun; Jiao, Licheng; Shi, Zhuoer

1997-04-01

Wavelet is a powerful theory, but its successful application still needs suitable programming tools. Java is a simple, object-oriented, distributed, interpreted, robust, secure, architecture-neutral, portable, high-performance, multithreaded, dynamic language. This paper addresses the design and development of a cross-platform software environment for experimenting with and applying wavelet theory. WaveJava, a wavelet class library designed with object-oriented programming, is developed to take advantage of wavelet features such as multi-resolution analysis and parallel processing in network computing. A new application architecture is designed for the net-wide distributed client-server environment. The data are transmitted in multi-resolution packets. At the distributed sites around the net, these data packets undergo matching or recognition processing in parallel. The results are fed back to determine the next operation, so more robust results can be reached quickly. WaveJava is easy to use and to extend for special applications. This paper gives a solution for a distributed fingerprint information processing system. It is also suitable for other net-based multimedia information processing, such as network libraries, remote teaching, and filmless picture archiving and communications.
CPU/GPU Computing for an Implicit Multi-Block Compressible Navier-Stokes Solver on Heterogeneous Platform

NASA Astrophysics Data System (ADS)

Deng, Liang; Bai, Hanli; Wang, Fang; Xu, Qingxin

2016-06-01

CPU/GPU computing allows scientists to tremendously accelerate their numerical codes. In this paper, we port and optimize a double precision alternating direction implicit (ADI) solver for three-dimensional compressible Navier-Stokes equations from our in-house Computational Fluid Dynamics (CFD) software on heterogeneous platform. First, we implement a full GPU version of the ADI solver to remove a lot of redundant data transfers between CPU and GPU, and then design two fine-grain schemes, namely “one-thread-one-point” and “one-thread-one-line”, to maximize the performance. Second, we present a dual-level parallelization scheme using the CPU/GPU collaborative model to exploit the computational resources of both multi-core CPUs and many-core GPUs within the heterogeneous platform. Finally, considering the fact that memory on a single node becomes inadequate when the simulation size grows, we present a tri-level hybrid programming pattern MPI-OpenMP-CUDA that merges fine-grain parallelism using OpenMP and CUDA threads with coarse-grain parallelism using MPI for inter-node communication. We also propose a strategy to overlap the computation with communication using the advanced features of CUDA and MPI programming. We obtain speedups of 6.0 for the ADI solver on one Tesla M2050 GPU in contrast to two Xeon X5670 CPUs. Scalability tests show that our implementation can offer significant performance improvement on heterogeneous platform.
The Snow Data System at NASA JPL

NASA Astrophysics Data System (ADS)

Laidlaw, R.; Painter, T. H.; Mattmann, C. A.; Ramirez, P.; Bormann, K.; Brodzik, M. J.; Burgess, A. B.; Rittger, K.; Goodale, C. E.; Joyce, M.; McGibbney, L. J.; Zimdars, P.

2014-12-01

NASA JPL's Snow Data System has a data-processing pipeline powered by Apache OODT, an open source software tool. The pipeline has been running for several years and has successfully generated a significant amount of cryosphere data, including MODIS-based products such as MODSCAG, MODDRFS and MODICE, with historical and near-real time windows and covering regions such as the Arctic, Western US, Alaska, Central Europe, Asia, South America, Australia and New Zealand. The team continues to improve the pipeline, using monitoring tools such as Ganglia to give an overview of operations, and improving fault-tolerance with automated recovery scripts. Several alternative adaptations of the Snow Covered Area and Grain size (SCAG) algorithm are being investigated. These include using VIIRS and Landsat TM/ETM+ satellite data as inputs. Parallel computing techniques are being considered for core SCAG processing, such as using the PyCUDA Python API to utilize multi-core GPU architectures. An experimental version of MODSCAG is also being developed for the Google Earth Engine platform, a cloud-based service.

Interactive Visualization of Large-Scale Hydrological Data using Emerging Technologies in Web Systems and Parallel Programming

NASA Astrophysics Data System (ADS)

Demir, I.; Krajewski, W. F.

2013-12-01

As geoscientists are confronted with increasingly massive datasets from environmental observations to simulations, one of the biggest challenges is having the right tools to gain scientific insight from the data and communicate the understanding to stakeholders. Recent developments in web technologies make it easy to manage, visualize and share large data sets with general public. Novel visualization techniques and dynamic user interfaces allow users to interact with data, and modify the parameters to create custom views of the data to gain insight from simulations and environmental observations. This requires developing new data models and intelligent knowledge discovery techniques to explore and extract information from complex computational simulations or large data repositories. Scientific visualization will be an increasingly important component to build comprehensive environmental information platforms. This presentation provides an overview of the trends and challenges in the field of scientific visualization, and demonstrates information visualization and communication tools developed within the light of these challenges.
Technical integration of hippocampus, Basal Ganglia and physical models for spatial navigation.

PubMed

Fox, Charles; Humphries, Mark; Mitchinson, Ben; Kiss, Tamas; Somogyvari, Zoltan; Prescott, Tony

2009-01-01

Computational neuroscience is increasingly moving beyond modeling individual neurons or neural systems to consider the integration of multiple models, often constructed by different research groups. We report on our preliminary technical integration of recent hippocampal formation, basal ganglia and physical environment models, together with visualisation tools, as a case study in the use of Python across the modelling tool-chain. We do not present new modeling results here. The architecture incorporates leaky-integrator and rate-coded neurons, a 3D environment with collision detection and tactile sensors, 3D graphics and 2D plots. We found Python to be a flexible platform, offering a significant reduction in development time, without a corresponding significant increase in execution time. We illustrate this by implementing a part of the model in various alternative languages and coding styles, and comparing their execution times. For very large-scale system integration, communication with other languages and parallel execution may be required, which we demonstrate using the BRAHMS framework's Python bindings.
pyPaSWAS: Python-based multi-core CPU and GPU sequence alignment.

PubMed

Warris, Sven; Timal, N Roshan N; Kempenaar, Marcel; Poortinga, Arne M; van de Geest, Henri; Varbanescu, Ana L; Nap, Jan-Peter

2018-01-01

Our previously published CUDA-only application PaSWAS for Smith-Waterman (SW) sequence alignment of any type of sequence on NVIDIA-based GPUs is platform-specific and has therefore been adopted less widely than it could be. The OpenCL language is supported more widely and allows use on a variety of hardware platforms. Moreover, there is a need to promote the adoption of parallel computing in bioinformatics by making its use and extension simpler through more and better application of high-level languages commonly used in bioinformatics, such as Python. The novel application pyPaSWAS presents the parallel SW sequence alignment code fully packed in Python. It is a generic SW implementation running on several hardware platforms with multi-core systems and/or GPUs that provides accurate sequence alignments that can also be inspected for alignment details. Additionally, pyPaSWAS supports the affine gap penalty. Python libraries are used for automated system configuration, I/O and logging. This way, the Python environment will stimulate further extension and use of pyPaSWAS. pyPaSWAS presents an easy Python-based environment for accurate and retrievable parallel SW sequence alignments on GPUs and multi-core systems. The strategy of integrating Python with high-performance parallel compute languages to create a developer- and user-friendly environment should be considered for other computationally intensive bioinformatics algorithms.
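For readers unfamiliar with the Smith-Waterman recurrence that pyPaSWAS parallelizes, a minimal serial sketch follows. It is not the pyPaSWAS code or API: the function name and scoring values are hypothetical, and it uses a simple linear gap penalty rather than the affine gap penalty the record mentions.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score of strings a and b.

    Linear gap penalty (an assumption made for brevity); only two rows of
    the dynamic-programming matrix are kept in memory at a time.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                   # start a new local alignment
                prev[j - 1] + sub,   # align a[i-1] with b[j-1]
                prev[j] + gap,       # gap in b
                curr[j - 1] + gap,   # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best
```

Each cell depends only on its upper, left, and upper-left neighbours, so cells on the same anti-diagonal are independent of one another, which is the property that GPU implementations of this algorithm commonly exploit.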
Compliance analysis of a 3-DOF spindle head by considering gravitational effects

NASA Astrophysics Data System (ADS)

Li, Qi; Wang, Manxin; Huang, Tian; Chetwynd, Derek G.

2015-01-01

Compliance modeling is one of the most significant issues in the preliminary design stage of a parallel kinematic machine (PKM). Gravity, which is ignored in traditional compliance analysis, has a significant effect on the pose accuracy of the tool center point (TCP) when a PKM is horizontally placed. By taking gravity into account, this paper presents a semi-analytical approach for compliance analysis of a 3-DOF spindle head named the A3 head. The architecture behind the A3 head is a 3-RPS parallel mechanism having one translational and two rotational movement capabilities, which can be employed to form the main body of a 5-DOF hybrid kinematic machine especially designed for high-speed machining of large aircraft components. The force analysis is carried out by considering both the externally applied wrench imposed upon the platform and the gravity of all moving components. Then, the deflection analysis is carried out to establish the relationship between the deflection twist and the compliances of all joints and links using a semi-analytical method. The merit of this approach is that the platform deflection twist throughout the entire task workspace can be evaluated in a very efficient manner. The effectiveness of the proposed approach is verified by FEA and experiment at different configurations, and the results show that the discrepancy of the compliances is less than 0.04 μm/N and that of the deformations is less than 10 μm. The computational and experimental results show that the deflection twist induced by gravity forces of the moving components has a significant bearing on the pose accuracy of the platform, providing informative guidance for the improvement of the current design. The proposed approach can be easily applied to the compliance analysis of PKMs by considering gravitational effects and to evaluate the deformation caused by gravity throughout the entire workspace.
Implementing a Parallel Image Edge Detection Algorithm Based on the Otsu-Canny Operator on the Hadoop Platform.

PubMed

Cao, Jianfang; Chen, Lichao; Wang, Min; Tian, Yun

2018-01-01

The Canny operator is widely used to detect edges in images. However, as the size of the image dataset increases, the edge detection performance of the Canny operator decreases and its runtime becomes excessive. To improve the runtime and edge detection performance of the Canny operator, in this paper, we propose a parallel design and implementation for an Otsu-optimized Canny operator using a MapReduce parallel programming model that runs on the Hadoop platform. The Otsu algorithm is used to optimize the Canny operator's dual threshold and improve the edge detection performance, while the MapReduce parallel programming model facilitates parallel processing for the Canny operator to solve the processing speed and communication cost problems that occur when the Canny edge detection algorithm is applied to big data. For the experiments, we constructed datasets of different scales from the Pascal VOC2012 image database. The proposed parallel Otsu-Canny edge detection algorithm performs better than other traditional edge detection algorithms. The parallel approach reduced the running time by approximately 67.2% on a Hadoop cluster architecture consisting of 5 nodes with a dataset of 60,000 images. Overall, our approach speeds up the system by approximately 3.4 times when processing large-scale datasets, which demonstrates the advantage of our method. The algorithm proposed in this study demonstrates both better edge detection performance and improved time performance.
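As a rough illustration of how an Otsu threshold can feed Canny's dual threshold, consider the sketch below. It is an assumption-laden example, not the authors' method: the function names are hypothetical, the image is a flat list of 8-bit gray levels, and setting the low threshold to half of the Otsu value is a common heuristic chosen here for illustration.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance,
    where class 0 holds pixels <= t and class 1 holds pixels > t."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))
    weight0, sum0 = 0, 0
    best_t, best_var = 0, -1.0
    for t in range(levels):
        weight0 += hist[t]
        sum0 += t * hist[t]
        weight1 = total - weight0
        if weight0 == 0:
            continue  # class 0 is still empty
        if weight1 == 0:
            break     # class 1 is empty from here on
        mean0 = sum0 / weight0
        mean1 = (sum_all - sum0) / weight1
        between = weight0 * weight1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


def canny_thresholds(pixels, low_ratio=0.5):
    """Derive (low, high) hysteresis thresholds; low_ratio is a heuristic."""
    high = otsu_threshold(pixels)
    return low_ratio * high, high
```

In a MapReduce setting, a mapper could plausibly apply such a per-image computation to each input image independently, which is what makes the workload easy to distribute.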
F3D Image Processing and Analysis for Many- and Multi-core Platforms

DOE Office of Scientific and Technical Information (OSTI.GOV)

F3D is written in OpenCL, so it achieves platform-portable parallelism on modern multi-core CPUs and many-core GPUs. The interface and mechanisms to access the F3D core are written in Java as a plugin for Fiji/ImageJ to deliver several key image-processing algorithms necessary to remove artifacts from micro-tomography data. The algorithms consist of data-parallel-aware filters that can efficiently utilize resources, work on out-of-core datasets, and scale efficiently across multiple accelerators. Optimizing for data parallel filters, streaming out-of-core datasets, and efficient resource, memory and data management over complex execution sequences of filters greatly expedites any scientific workflow with image processing requirements. F3D performs several different types of 3D image processing operations, such as non-linear filtering using bilateral filtering and/or median filtering and/or morphological operators (MM). F3D gray-level MM operators are one-pass constant time methods that can perform morphological transformations with a line-structuring element oriented in discrete directions. Additionally, MM operators can be applied to gray-scale images, and consist of two parts: (a) a reference shape or structuring element, which is translated over the image, and (b) a mechanism, or operation, that defines the comparisons to be performed between the image and the structuring element. This tool provides a critical component within many complex pipelines such as those for performing automated segmentation of image stacks. F3D is also called a "descendent" of Quant-CT, another software package we developed in the past. These two modules are to be integrated in a next version. Further details were reported in: D.M. Ushizima, T. Perciano, H. Krishnan, B. Loring, H. Bale, D. Parkinson, and J. Sethian. Structure recognition from high-resolution images of ceramic composites. IEEE International Conference on Big Data, October 2014.
Large Spatial Scale Ground Displacement Mapping through the P-SBAS Processing of Sentinel-1 Data on a Cloud Computing Environment

NASA Astrophysics Data System (ADS)

Casu, F.; Bonano, M.; de Luca, C.; Lanari, R.; Manunta, M.; Manzo, M.; Zinno, I.

2017-12-01

Since its launch in 2014, the Sentinel-1 (S1) constellation has played a key role in SAR data availability and dissemination all over the world. Indeed, the free and open access data policy adopted by the European Copernicus program, together with the global coverage acquisition strategy, makes the Sentinel constellation a game changer in the Earth Observation scenario. As SAR data become ubiquitous, the technological and scientific challenge is to maximize the exploitation of such a huge data flow. In this direction, the use of innovative processing algorithms and distributed computing infrastructures, such as Cloud Computing platforms, can play a crucial role. In this work we present a Cloud Computing solution for the advanced interferometric (DInSAR) processing chain based on the Parallel SBAS (P-SBAS) approach, aimed at processing S1 Interferometric Wide Swath (IWS) data for the generation of large spatial scale deformation time series in an efficient, automatic and systematic way. Such a DInSAR chain ingests Sentinel-1 SLC images and carries out several processing steps to finally compute deformation time series and mean deformation velocity maps. Different parallel strategies have been designed ad hoc for each processing step of the P-SBAS S1 chain, encompassing both multi-core and multi-node programming techniques, in order to maximize the computational efficiency achieved within a Cloud Computing environment and cut down the relevant processing times. The presented P-SBAS S1 processing chain has been implemented on the Amazon Web Services platform, and a thorough analysis of the attained parallel performance has been carried out to identify and overcome the major bottlenecks to scalability. The presented approach is used to perform national-scale DInSAR analyses over Italy, involving the processing of more than 3000 S1 IWS images acquired from both ascending and descending orbits. Such an experiment confirms the big advantage of exploiting the large computational and storage resources of Cloud Computing platforms for large scale DInSAR analysis. The presented Cloud Computing P-SBAS processing chain can be a precious tool for developing operational services, available to the EO scientific community, related to hazard monitoring and risk prevention and mitigation.
Fully Parallel MHD Stability Analysis Tool

NASA Astrophysics Data System (ADS)

Svidzinski, Vladimir; Galkin, Sergei; Kim, Jin-Soo; Liu, Yueqiang

2014-10-01

Progress on full parallelization of the plasma stability code MARS will be reported. MARS calculates eigenmodes in 2D axisymmetric toroidal equilibria in MHD-kinetic plasma models. It is a powerful tool for studying MHD and MHD-kinetic instabilities and it is widely used by the fusion community. The parallel version of MARS is intended for simulations on local parallel clusters. It will be an efficient tool for simulation of MHD instabilities with low, intermediate and high toroidal mode numbers within both the fluid and kinetic plasma models already implemented in MARS. Parallelization of the code includes parallelization of the construction of the matrix for the eigenvalue problem and parallelization of the inverse iterations algorithm, implemented in MARS for the solution of the formulated eigenvalue problem. Construction of the matrix is parallelized by distributing the load among processors assigned to different magnetic surfaces. Parallelization of the solution of the eigenvalue problem is made by repeating steps of the present MARS algorithm using parallel libraries and procedures. Initial results of the code parallelization will be reported. Work is supported by the U.S. DOE SBIR program.
Solar wind interaction with Venus and Mars in a parallel hybrid code

NASA Astrophysics Data System (ADS)

Jarvinen, Riku; Sandroos, Arto

2013-04-01

We discuss the development and applications of a new parallel hybrid simulation, where ions are treated as particles and electrons as a charge-neutralizing fluid, for the interaction between the solar wind and Venus and Mars. The new simulation code under construction is based on the algorithm of the sequential global planetary hybrid model developed at the Finnish Meteorological Institute (FMI) and on the Corsair parallel simulation platform also developed at the FMI. The FMI's sequential hybrid model has been used for studies of plasma interactions of several unmagnetized and weakly magnetized celestial bodies for more than a decade. Especially, the model has been used to interpret in situ particle and magnetic field observations from plasma environments of Mars, Venus and Titan. Further, Corsair is an open source MPI (Message Passing Interface) particle and mesh simulation platform, mainly aimed for simulations of diffusive shock acceleration in solar corona and interplanetary space, but which is now also being extended for global planetary hybrid simulations. In this presentation we discuss challenges and strategies of parallelizing a legacy simulation code as well as possible applications and prospects of a scalable parallel hybrid model for the solar wind interactions of Venus and Mars.
S-Genius, a universal software platform with versatile inverse problem resolution for scatterometry

NASA Astrophysics Data System (ADS)

Fuard, David; Troscompt, Nicolas; El Kalyoubi, Ismael; Soulan, Sébastien; Besacier, Maxime

2013-05-01

S-Genius is a new universal scatterometry platform that gathers all the LTM-CNRS know-how regarding rigorous electromagnetic computation and several inverse problem solver solutions. This software platform is built to be a user-friendly, light, swift, accurate, user-oriented scatterometry tool, compatible with any ellipsometric measurements to fit and any type of pattern. It aims to combine a set of inverse problem solver capabilities (via adapted Levenberg-Marquardt optimization, Kriging, and Neural Network solutions) that greatly improve the reliability and the speed of the solution determination. Furthermore, as the model solution is mainly vulnerable to the optical properties of materials, S-Genius may be coupled with an innovative material refractive indices determination. This paper focuses mainly on the modified Levenberg-Marquardt optimization, one of the indirect method solvers built up by the author in parallel with the overall S-Genius software coding. This modified Levenberg-Marquardt optimization corresponds to a Newton algorithm with a damping parameter adapted to the definition domains of the optimized parameters. Currently, S-Genius is technically ready for scientific collaboration, Python-powered, multi-platform (Windows/Linux/macOS), multi-core, ready for 2D (infinite features along the direction perpendicular to the incident plane), conical, and 3D feature computation, compatible with all kinds of input data from any possible ellipsometers (angle or wavelength resolved) or reflectometers, and widely used in our laboratory for resist trimming studies, etching feature characterization (such as complex stacks) or nano-imprint lithography measurements, for instance. The work on the kriging solver, neural network solver and material refractive indices determination is done (or about to be done) by other LTM members and is about to be integrated into the S-Genius platform.
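The idea of a damped Newton-type step that respects the definition domain of a parameter can be sketched for a one-parameter fit. This is a generic Levenberg-Marquardt illustration under assumptions, not the S-Genius solver: the exponential model, the function name, the clamping of the candidate to `bounds`, and the factor-of-ten damping update are all choices made here for the example.

```python
import math


def fit_decay_rate(xs, ys, k0, bounds=(0.0, 10.0), lam=1e-2, iterations=100):
    """Fit k in y = exp(-k * x) by a scalar Levenberg-Marquardt iteration.

    The damping parameter lam grows when a step fails to reduce the cost
    and shrinks when it succeeds; candidates are clamped to `bounds`.
    """
    def residuals(k):
        return [math.exp(-k * x) - y for x, y in zip(xs, ys)]

    def cost(k):
        return sum(r * r for r in residuals(k))

    k = k0
    for _ in range(iterations):
        r = residuals(k)
        jac = [-x * math.exp(-k * x) for x in xs]    # d(residual)/dk
        grad = sum(j * ri for j, ri in zip(jac, r))  # J^T r
        hess = sum(j * j for j in jac)               # Gauss-Newton J^T J
        step = -grad / (hess + lam)                  # damped step
        candidate = min(max(k + step, bounds[0]), bounds[1])
        if cost(candidate) < cost(k):
            k, lam = candidate, lam / 10.0           # accept, trust the model more
        else:
            lam *= 10.0                              # reject, damp more strongly
    return k


xs = [0.0, 1.0, 2.0, 3.0]
ys = [math.exp(-0.5 * x) for x in xs]  # synthetic data with k = 0.5
k_fit = fit_decay_rate(xs, ys, k0=2.0)
```

With small damping the step approaches a Gauss-Newton step, and with large damping it approaches a short gradient-descent step, which is why the method tends to be robust far from the solution and fast near it.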
Xyce Parallel Electronic Simulator Users' Guide Version 6.7.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Keiter, Eric R.; Aadithya, Karthik Venkatraman; Mei, Ting

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers. A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to develop new types of analysis without requiring the implementation of analysis-specific device models. Device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only). Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase (a message passing parallel implementation), which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. The information herein is subject to change without notice. Copyright © 2002-2017 Sandia Corporation. All rights reserved. Trademarks: Xyce™ Electronic Simulator and Xyce™ are trademarks of Sandia Corporation. Orcad, Orcad Capture, PSpice and Probe are registered trademarks of Cadence Design Systems, Inc. Microsoft, Windows and Windows 7 are registered trademarks of Microsoft Corporation. Medici, DaVinci and Taurus are registered trademarks of Synopsys Corporation. Amtec and TecPlot are trademarks of Amtec Engineering, Inc. All other trademarks are property of their respective owners. Contacts: World Wide Web http://xyce.sandia.gov https://info.sandia.gov/xyce (Sandia only); Email xyce@sandia.gov (outside Sandia) xyce-sandia@sandia.gov (Sandia only); Bug Reports (Sandia only) http://joseki-vm.sandia.gov/bugzilla http://morannon.sandia.gov/bugzilla
Height parallelism of implants in the treatment of the edentulous mandible with ball-retained overdentures: a technical note.

PubMed

Iglesia-Puig, Miguel A

2008-01-01

The objective of this report is to present a device to achieve equal platform height in the vertical axis to allow the spherical abutments to work correctly in mandibular overdentures retained with 2 implants. The device is fabricated over plastic castable abutments, with a plate perpendicular to the implant platforms and located at the top of the platform height. Once implants are inserted, the device is screwed to an implant and allows evaluation of the height of the platforms.
Reducing Response Time Bounds for DAG-Based Task Systems on Heterogeneous Multicore Platforms

DTIC Science & Technology

2016-01-01

synchronous parallel tasks on multicore platforms. In 25th ECRTS, 2013. [10] U. Devi. Soft Real-Time Scheduling on Multiprocessors. PhD thesis...report, Washington University in St Louis, 2014. [18] C. Liu and J. Anderson. Supporting soft real-time DAG-based systems on multiprocessors with...analysis for DAG-based real-time task systems implemented on heterogeneous multicore platforms. The specific analysis problem that is considered was
Implementation and Assessment of a Virtual Laboratory of Parallel Robots Developed for Engineering Students

ERIC Educational Resources Information Center

Gil, Arturo; Peidró, Adrián; Reinoso, Óscar; Marín, José María

2017-01-01

This paper presents a tool, LABEL, oriented to the teaching of parallel robotics. The application, organized as a set of tools developed using Easy Java Simulations, enables the study of the kinematics of parallel robotics. A set of classical parallel structures was implemented such that LABEL can solve the inverse and direct kinematic problem of…
TDat: An Efficient Platform for Processing Petabyte-Scale Whole-Brain Volumetric Images.

PubMed

Li, Yuxin; Gong, Hui; Yang, Xiaoquan; Yuan, Jing; Jiang, Tao; Li, Xiangning; Sun, Qingtao; Zhu, Dan; Wang, Zhenyu; Luo, Qingming; Li, Anan

2017-01-01

Three-dimensional imaging of whole mammalian brains at single-neuron resolution has generated terabyte (TB)- and even petabyte (PB)-sized datasets. Due to their size, processing these massive image datasets can be hindered by the computer hardware and software typically found in biological laboratories. To fill this gap, we have developed an efficient platform named TDat, which adopts a novel data reformatting strategy by reading cuboid data and employing parallel computing. In data reformatting, TDat is more efficient than any other software. In data accessing, we adopted parallelization to fully explore the capability for data transmission in computers. We applied TDat in large-volume data rigid registration and neuron tracing in whole-brain data with single-neuron resolution, which has never been demonstrated in other studies. We also showed its compatibility with various computing platforms, image processing software and imaging systems.
Reusable Component Model Development Approach for Parallel and Distributed Simulation

PubMed Central

Zhu, Feng; Yao, Yiping; Chen, Huilong; Yao, Feng

2014-01-01

Model reuse is a key issue to be resolved in parallel and distributed simulation at present. However, component models built by different domain experts usually have diverse interfaces, are tightly coupled, and are closely bound to simulation platforms. As a result, they are difficult to reuse across different simulation platforms and applications. To address the problem, this paper first proposes a reusable component model framework. Based on this framework, our reusable model development approach is then elaborated, which contains two phases: (1) domain experts create simulation computational modules observing three principles to achieve their independence; (2) the model developer encapsulates these simulation computational modules with six standard service interfaces to improve their reusability. The case study of a radar model indicates that the model developed using our approach has good reusability and is easy to use in different simulation platforms and applications. PMID:24729751
OpenACC performance for simulating 2D radial dambreak using FVM HLLE flux

NASA Astrophysics Data System (ADS)

Gunawan, P. H.; Pahlevi, M. R.

2018-03-01

The aim of this paper is to investigate the performance of the OpenACC platform for computing a 2D radial dambreak. Here, the shallow water equations are used to describe and simulate the 2D radial dambreak with the finite volume method (FVM) using the HLLE flux. OpenACC is a parallel computing platform based on GPU cores. In this research, the platform is used to minimize the computational time of the numerical scheme. The results show that using OpenACC reduces the computational time. For the dry and wet radial dambreak simulations using 2048 grids, the parallel computational times are 575.984 s and 584.830 s, respectively. These results show the success of OpenACC when compared with the serial times of the dry and wet radial dambreak simulations, which are 28047.500 s and 29269.40 s, respectively.
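The speedups implied by these timings follow from dividing each serial time by the matching parallel time. The sketch below uses only the four timings quoted in the abstract; the function name `speedup` is ours and is not taken from the paper.

```python
def speedup(serial_seconds: float, parallel_seconds: float) -> float:
    """Return how many times faster the parallel run is than the serial run."""
    if parallel_seconds <= 0:
        raise ValueError("parallel time must be positive")
    return serial_seconds / parallel_seconds


# Timings quoted in the abstract, in seconds.
dry = speedup(28047.500, 575.984)
wet = speedup(29269.40, 584.830)
print(f"dry: {dry:.1f}x, wet: {wet:.1f}x")
```

Both cases come out at roughly a 49- to 50-fold reduction in wall-clock time.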
Development of an optical inspection platform for surface defect detection in touch panel glass

NASA Astrophysics Data System (ADS)

Chang, Ming; Chen, Bo-Cheng; Gabayno, Jacque Lynn; Chen, Ming-Fu

2016-04-01

An optical inspection platform combining parallel image processing with a high-resolution opto-mechanical module was developed for defect inspection of touch panel glass. Dark field images were acquired using a 12288-pixel line CCD camera with 3.5 µm per pixel resolution and 12 kHz line rate. Key features of the glass surface were analyzed by parallel image processing on combined CPU and GPU platforms. Defect inspection of touch panel glass, which provided 386 megapixels of image data per sample, was completed in roughly 5 seconds. A high detection rate of surface scratches on the touch panel glass was realized, with a minimum defect size of about 10 µm. The implementation of a custom illumination source significantly improved the scattering efficiency on the surface, thereby enhancing the contrast in the acquired images and the overall performance of the inspection system.
Code Optimization and Parallelization on the Origins: Looking from Users' Perspective

NASA Technical Reports Server (NTRS)

Chang, Yan-Tyng Sherry; Thigpen, William W. (Technical Monitor)

2002-01-01

Parallel machines are becoming the main compute engines for high performance computing. Despite their increasing popularity, it is still a challenge for most users to learn the basic techniques to optimize/parallelize their codes on such platforms. In this paper, we present some experiences on learning these techniques for the Origin systems at the NASA Advanced Supercomputing Division. Emphasis of this paper will be on a few essential issues (with examples) that general users should master when they work with the Origins as well as other parallel systems.
A Debugger for Computational Grid Applications

NASA Technical Reports Server (NTRS)

Hood, Robert; Jost, Gabriele; Biegel, Bryan (Technical Monitor)

2001-01-01

This viewgraph presentation gives an overview of a debugger for computational grid applications. Details are given on NAS parallel tools groups (including parallelization support tools, evaluation of various parallelization strategies, and distributed and aggregated computing), debugger dependencies, scalability, initial implementation, the process grid, and information on Globus.

MultiSense: A Multimodal Sensor Tool Enabling the High-Throughput Analysis of Respiration.

PubMed

Keil, Peter; Liebsch, Gregor; Borisjuk, Ljudmilla; Rolletschek, Hardy

2017-01-01

The high-throughput analysis of respiratory activity has become an important component of many biological investigations. Here, a technological platform, denoted the "MultiSense tool," is described. The tool enables the parallel monitoring of respiration in 100 samples over an extended time period, by dynamically tracking the concentrations of oxygen (O2) and/or carbon dioxide (CO2) and/or pH within an airtight vial. Its flexible design supports the quantification of respiration based on either oxygen consumption or carbon dioxide release, thereby allowing for the determination of the physiologically significant respiratory quotient (the ratio between the quantities of CO2 released and the O2 consumed). It requires an LED light source to be mounted above the sample, together with a CCD camera system, adjusted to enable the capture of analyte-specific wavelengths, and fluorescent sensor spots inserted into the sample vial. Here, a demonstration is given of the use of the MultiSense tool to quantify respiration in imbibing plant seeds, for which an appropriate step-by-step protocol is provided. The technology can be easily adapted for a wide range of applications, including the monitoring of gas exchange in any kind of liquid culture system (algae, embryo and tissue culture, cell suspensions, microbial cultures).
Image Processing Using a Parallel Architecture.

DTIC Science & Technology

1987-12-01

ENG/87D-25 Abstract This study developed a set of low level image processing tools on a parallel computer that allows concurrent processing of images...environment, the set of tools offers a significant reduction in the time required to perform some commonly used image processing operations. vI IMAGE...step toward developing these systems, a structured set of image processing tools was implemented using a parallel computer. More important than
Parallelization of ARC3D with Computer-Aided Tools

NASA Technical Reports Server (NTRS)

Jin, Haoqiang; Hribar, Michelle; Yan, Jerry; Saini, Subhash (Technical Monitor)

1998-01-01

A series of efforts have been devoted to investigating methods of porting and parallelizing applications quickly and efficiently for new architectures, such as the SGI Origin 2000 and Cray T3E. This report presents the parallelization of a CFD application, ARC3D, using the computer-aided tools, CAPTools. Steps of parallelizing this code and requirements for achieving better performance are discussed. The generated parallel version has achieved reasonably good performance, for example, a speedup of 30 on 36 Cray T3E processors. However, this performance could not be obtained without modification of the original serial code. It is suggested that in many cases improving serial code and performing necessary code transformations are important parts of the automated parallelization process, although user intervention in many of these parts is still necessary. Nevertheless, development and improvement of useful software tools, such as CAPTools, can help trim down many tedious parallelization details and improve the processing efficiency.
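The quoted figure of a speedup of 30 on 36 processors can be restated as a parallel efficiency, that is, the fraction of ideal linear speedup actually achieved. A minimal sketch under that standard definition; the function name is ours.

```python
def parallel_efficiency(speedup: float, processors: int) -> float:
    """Return speedup divided by processor count (1.0 means ideal scaling)."""
    if processors < 1:
        raise ValueError("need at least one processor")
    return speedup / processors


# Figures quoted in the abstract: speedup of 30 on 36 Cray T3E processors.
efficiency = parallel_efficiency(30.0, 36)
print(f"{efficiency:.1%}")
```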
Beam Dynamics Simulation Platform and Studies of Beam Breakup in Dielectric Wakefield Structures

NASA Astrophysics Data System (ADS)

Schoessow, P.; Kanareykin, A.; Jing, C.; Kustov, A.; Altmark, A.; Gai, W.

2010-11-01

A particle-Green's function beam dynamics code (BBU-3000) for studying beam breakup effects is incorporated into a parallel computing framework based on the BOINC software environment, which supports both task farming on a heterogeneous cluster and local grid computing. User access to the platform is through a web browser.
Testing tubewell platform color as a rapid screening tool for arsenic and manganese in drinking water wells.

PubMed

Biswas, Ashis; Nath, Bibhash; Bhattacharya, Prosun; Halder, Dipti; Kundu, Amit K; Mandal, Ujjal; Mukherjee, Abhijit; Chatterjee, Debashis; Jacks, Gunnar

2012-01-03

A low-cost rapid screening tool for arsenic (As) and manganese (Mn) in groundwater is urgently needed to formulate mitigation policies for sustainable drinking water supply. This study attempts to make statistical comparison between tubewell (TW) platform color and the level of As and Mn concentration in groundwater extracted from the respective TW (n = 423), to validate platform color as a screening tool for As and Mn in groundwater. The result shows that a black colored platform with 73% certainty indicates that well water is safe from As, while with 84% certainty a red colored platform indicates that well water is enriched with As, compared to WHO drinking water guideline of 10 μg/L. With this guideline the efficiency, sensitivity, and specificity of the tool are 79%, 77%, and 81%, respectively. However, the certainty values become 93% and 38%, respectively, for black and red colored platforms at 50 μg/L, the drinking water standards for India and Bangladesh. The respective efficiency, sensitivity, and specificity are 65%, 85%, and 59%. Similarly for Mn, black and red colored platform with 78% and 64% certainty, respectively, indicates that well water is either enriched or free from Mn at the Indian national drinking water standard of 300 μg/L. With this guideline the efficiency, sensitivity, and specificity of the tool are 71%, 67%, and 76%, respectively. Thus, this study demonstrates that TW platform color can be potentially used as an initial screening tool for identifying TWs with elevated dissolved As and Mn, to make further rigorous groundwater testing more intensive and implement mitigation options for safe drinking water supplies.
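The efficiency, sensitivity, and specificity quoted above are the usual summary statistics of a binary screening test. The sketch below uses the conventional definitions, which we assume (but cannot confirm from the abstract) match the paper's; "efficiency" is taken to mean overall agreement. The counts in the example are invented purely for illustration and are not the study's data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningCounts:
    """Hypothetical tally of wells by platform colour versus lab result."""
    true_pos: int   # colour flags the well, lab confirms it exceeds the guideline
    false_pos: int  # colour flags the well, lab finds it below the guideline
    true_neg: int   # colour clears the well, lab confirms it is below the guideline
    false_neg: int  # colour clears the well, lab finds it exceeds the guideline


def screening_metrics(c: ScreeningCounts) -> dict[str, float]:
    """Compute sensitivity, specificity and overall efficiency (accuracy)."""
    total = c.true_pos + c.false_pos + c.true_neg + c.false_neg
    return {
        "sensitivity": c.true_pos / (c.true_pos + c.false_neg),
        "specificity": c.true_neg / (c.true_neg + c.false_pos),
        "efficiency": (c.true_pos + c.true_neg) / total,
    }


# Invented counts, for illustration only.
example = ScreeningCounts(true_pos=40, false_pos=10, true_neg=40, false_neg=10)
print(screening_metrics(example))
```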
Targeting multiple heterogeneous hardware platforms with OpenCL

NASA Astrophysics Data System (ADS)

Fox, Paul A.; Kozacik, Stephen T.; Humphrey, John R.; Paolini, Aaron; Kuller, Aryeh; Kelmelis, Eric J.

2014-06-01

The OpenCL API allows for the abstract expression of parallel, heterogeneous computing, but hardware implementations have substantial implementation differences. The abstractions provided by the OpenCL API are often insufficiently high-level to conceal differences in hardware architecture. Additionally, implementations often do not take advantage of potential performance gains from certain features due to hardware limitations and other factors. These factors make it challenging to produce code that is portable in practice, resulting in much OpenCL code being duplicated for each hardware platform being targeted. This duplication of effort offsets the principal advantage of OpenCL: portability. The use of certain coding practices can mitigate this problem, allowing a common code base to be adapted to perform well across a wide range of hardware platforms. To this end, we explore some general practices for producing performant code that are effective across platforms. Additionally, we explore some ways of modularizing code to enable optional optimizations that take advantage of hardware-specific characteristics. The minimum requirement for portability implies avoiding the use of OpenCL features that are optional, not widely implemented, poorly implemented, or missing in major implementations. Exposing multiple levels of parallelism allows hardware to take advantage of the types of parallelism it supports, from the task level down to explicit vector operations. Static optimizations and branch elimination in device code help the platform compiler to effectively optimize programs. Modularization of some code is important to allow operations to be chosen for performance on target hardware. Optional subroutines exploiting explicit memory locality allow for different memory hierarchies to be exploited for maximum performance. The C preprocessor and JIT compilation using the OpenCL runtime can be used to enable some of these techniques, as well as to factor in hardware-specific optimizations as necessary.
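One way to picture the preprocessor-plus-JIT technique is a host-side helper that turns device properties into `-D name=value` macro definitions, which the OpenCL runtime compiler accepts as build options when a program is built at run time. The sketch below only assembles that option string and does not call any OpenCL library; the `DeviceTraits` class, the macro names `USE_LOCAL_MEM` and `VEC_WIDTH`, and the kernel fragment are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceTraits:
    """Hypothetical summary of the device properties that drive optimization choices."""
    has_local_memory: bool
    preferred_vector_width: int


def build_options(traits: DeviceTraits) -> str:
    """Translate device traits into -D macro definitions for the kernel compiler."""
    macros = {
        "USE_LOCAL_MEM": int(traits.has_local_memory),
        "VEC_WIDTH": traits.preferred_vector_width,
    }
    return " ".join(f"-D {name}={value}" for name, value in macros.items())


# Hypothetical kernel fragment: one source file, two code paths chosen at build time.
KERNEL_SOURCE = """
#if USE_LOCAL_MEM
  /* tiled variant that stages data in __local memory */
#else
  /* plain variant that reads directly from __global memory */
#endif
"""

gpu_like = DeviceTraits(has_local_memory=True, preferred_vector_width=4)
print(build_options(gpu_like))
```

Because the branch is resolved by the preprocessor, the device compiler sees only the selected path, which is in the spirit of the static optimization and branch elimination the abstract recommends.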
Development of a gene synthesis platform for the efficient large scale production of small genes encoding animal toxins.

PubMed

Sequeira, Ana Filipa; Brás, Joana L A; Guerreiro, Catarina I P D; Vincentelli, Renaud; Fontes, Carlos M G A

2016-12-01

Gene synthesis is becoming an important tool in many fields of recombinant DNA technology, including recombinant protein production. De novo gene synthesis is quickly replacing the classical cloning and mutagenesis procedures and allows generating nucleic acids for which no template is available. In addition, when coupled with efficient gene design algorithms that optimize codon usage, it leads to high levels of recombinant protein expression. Here, we describe the development of an optimized gene synthesis platform that was applied to the large scale production of small genes encoding venom peptides. This improved gene synthesis method uses a PCR-based protocol to assemble synthetic DNA from pools of overlapping oligonucleotides and was developed to synthesise multiple genes simultaneously. This technology incorporates an accurate, automated and cost effective ligation independent cloning step to directly integrate the synthetic genes into an effective Escherichia coli expression vector. The robustness of this technology to generate large libraries of dozens to thousands of synthetic nucleic acids was demonstrated through the parallel and simultaneous synthesis of 96 genes encoding animal toxins. An automated platform was developed for the large-scale synthesis of small genes encoding eukaryotic toxins. Large scale recombinant expression of synthetic genes encoding eukaryotic toxins will allow exploring the extraordinary potency and pharmacological diversity of animal venoms, an increasingly valuable but unexplored source of lead molecules for drug discovery.
Analysis of scalability of high-performance 3D image processing platform for virtual colonoscopy

NASA Astrophysics Data System (ADS)

Yoshida, Hiroyuki; Wu, Yin; Cai, Wenli

2014-03-01

One of the key challenges in three-dimensional (3D) medical imaging is to enable the fast turn-around time that is often required for interactive or real-time response. This inevitably requires not only high computational power but also high memory bandwidth due to the massive amount of data that needs to be processed. For this purpose, we previously developed a software platform for high-performance 3D medical image processing, called the HPC 3D-MIP platform, which employs increasingly available and affordable commodity computing systems such as multicore, cluster, and cloud computing systems. To achieve scalable high-performance computing, the platform employed size-adaptive, distributable block volumes as a core data structure for efficient parallelization of a wide range of 3D-MIP algorithms, supported task scheduling for efficient load distribution and balancing, and consisted of layered parallel software libraries that allow image processing applications to share common functionalities. We evaluated the performance of the HPC 3D-MIP platform by applying it to computationally intensive processes in virtual colonoscopy. Experimental results showed a 12-fold performance improvement on a workstation with 12-core CPUs over the original sequential implementation of the processes, indicating the efficiency of the platform. Analysis of performance scalability based on Amdahl's law for symmetric multicore chips showed the potential for high performance scalability of the HPC 3D-MIP platform when a larger number of cores is available.
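The scalability argument can be illustrated with the classical form of Amdahl's law, speedup = 1 / ((1 - p) + p / n), where p is the parallelizable fraction of the work and n the number of cores. This is a sketch of the textbook formula only; the abstract refers to a variant for symmetric multicore chips whose details are not given here, and the value p = 0.99 below is an assumed figure, not one reported by the authors.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Classical Amdahl's law: upper bound on speedup for a given parallel fraction."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if cores < 1:
        raise ValueError("need at least one core")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Assumed parallel fraction, chosen only to show how the bound flattens out.
for cores in (12, 48, 192):
    print(cores, round(amdahl_speedup(0.99, cores), 1))
```

A near-linear 12-fold gain on 12 cores is only possible when the serial fraction is very small, which is why the measured result is read as evidence of good scalability.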
Integrated microfluidic devices for combinatorial cell-based assays.

PubMed

Yu, Zeta Tak For; Kamei, Ken-ichiro; Takahashi, Hiroko; Shu, Chengyi Jenny; Wang, Xiaopu; He, George Wenfu; Silverman, Robert; Radu, Caius G; Witte, Owen N; Lee, Ki-Bum; Tseng, Hsian-Rong

2009-06-01

The development of miniaturized cell culture platforms for performing parallel cultures and combinatorial assays is important in cell biology from the single-cell level to the system level. In this paper we developed an integrated microfluidic cell-culture platform, Cell-microChip (Cell-μChip), for parallel analyses of the effects of microenvironmental cues (i.e., culture scaffolds) on different mammalian cells and their cellular responses to external stimuli. As a model study, we demonstrated the ability of culturing and assaying several mammalian cells, such as NIH 3T3 fibroblast, B16 melanoma and HeLa cell lines, in a parallel way. For functional assays, first we tested drug-induced apoptotic responses from different cell lines. As a second functional assay, we performed "on-chip" transfection of a reporter gene encoding an enhanced green fluorescent protein (EGFP) followed by live-cell imaging of transcriptional activation of cyclooxygenase 2 (Cox-2) expression. Collectively, our Cell-μChip approach demonstrated the capability to carry out parallel operations and the potential to further integrate advanced functions and applications in the broader space of combinatorial chemistry and biology.
Integrated microfluidic devices for combinatorial cell-based assays

PubMed Central

Yu, Zeta Tak For; Kamei, Ken-ichiro; Takahashi, Hiroko; Shu, Chengyi Jenny; Wang, Xiaopu; He, George Wenfu; Silverman, Robert

2010-01-01

The development of miniaturized cell culture platforms for performing parallel cultures and combinatorial assays is important in cell biology from the single-cell level to the system level. In this paper we developed an integrated microfluidic cell-culture platform, Cell-microChip (Cell-μChip), for parallel analyses of the effects of microenvironmental cues (i.e., culture scaffolds) on different mammalian cells and their cellular responses to external stimuli. As a model study, we demonstrated the ability of culturing and assaying several mammalian cells, such as NIH 3T3 fibroblast, B16 melanoma and HeLa cell lines, in a parallel way. For functional assays, first we tested drug-induced apoptotic responses from different cell lines. As a second functional assay, we performed "on-chip" transfection of a reporter gene encoding an enhanced green fluorescent protein (EGFP) followed by live-cell imaging of transcriptional activation of cyclooxygenase 2 (Cox-2) expression. Collectively, our Cell-μChip approach demonstrated the capability to carry out parallel operations and the potential to further integrate advanced functions and applications in the broader space of combinatorial chemistry and biology. PMID:19130244
GenomicTools: a computational platform for developing high-throughput analytics in genomics.

PubMed

Tsirigos, Aristotelis; Haiminen, Niina; Bilal, Erhan; Utro, Filippo

2012-01-15

Recent advances in sequencing technology have resulted in the dramatic increase of sequencing data, which, in turn, requires efficient management of computational resources, such as computing time, memory requirements as well as prototyping of computational pipelines. We present GenomicTools, a flexible computational platform, comprising both a command-line set of tools and a C++ API, for the analysis and manipulation of high-throughput sequencing data such as DNA-seq, RNA-seq, ChIP-seq and MethylC-seq. GenomicTools implements a variety of mathematical operations between sets of genomic regions thereby enabling the prototyping of computational pipelines that can address a wide spectrum of tasks ranging from pre-processing and quality control to meta-analyses. Additionally, the GenomicTools platform is designed to analyze large datasets of any size by minimizing memory requirements. In practical applications, where comparable, GenomicTools outperforms existing tools in terms of both time and memory usage. The GenomicTools platform (version 2.0.0) was implemented in C++. The source code, documentation, user manual, example datasets and scripts are available online at http://code.google.com/p/ibm-cbc-genomic-tools.
Fully Parallel MHD Stability Analysis Tool

NASA Astrophysics Data System (ADS)

Svidzinski, Vladimir; Galkin, Sergei; Kim, Jin-Soo; Liu, Yueqiang

2015-11-01

Progress on full parallelization of the plasma stability code MARS will be reported. MARS calculates eigenmodes in 2D axisymmetric toroidal equilibria in MHD-kinetic plasma models. It is a powerful tool for studying MHD and MHD-kinetic instabilities and is widely used by the fusion community. The parallel version of MARS is intended for simulations on local parallel clusters. It will be an efficient tool for simulation of MHD instabilities with low, intermediate and high toroidal mode numbers within both the fluid and kinetic plasma models already implemented in MARS. Parallelization of the code includes parallelization of the construction of the matrix for the eigenvalue problem and parallelization of the inverse iterations algorithm implemented in MARS for the solution of the formulated eigenvalue problem. Construction of the matrix is parallelized by distributing the load among processors assigned to different magnetic surfaces. Parallelization of the solution of the eigenvalue problem is achieved by repeating the steps of the present MARS algorithm using parallel libraries and procedures. Results of MARS parallelization and of the development of a new fixed-boundary equilibrium code adapted for MARS input will be reported. Work is supported by the U.S. DOE SBIR program.
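Assigning processors to different magnetic surfaces amounts to partitioning a list of surface indices among ranks. The sketch below shows one common choice, contiguous blocks of near-equal size; whether MARS uses block, cyclic, or some other distribution is not stated in the abstract, so the scheme and the function name `assign_surfaces` are assumptions made for illustration.

```python
def assign_surfaces(num_surfaces: int, num_procs: int) -> list[range]:
    """Split surface indices 0..num_surfaces-1 into contiguous, near-equal blocks."""
    if num_procs < 1:
        raise ValueError("need at least one processor")
    base, extra = divmod(num_surfaces, num_procs)
    blocks = []
    start = 0
    for rank in range(num_procs):
        # The first `extra` ranks take one additional surface each.
        size = base + (1 if rank < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


for rank, block in enumerate(assign_surfaces(10, 4)):
    print(rank, list(block))
```

Each rank would then build only the matrix rows belonging to its own surfaces, so no two ranks differ by more than one surface in workload.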
Optimisation of a parallel ocean general circulation model

NASA Astrophysics Data System (ADS)

Beare, M. I.; Stevens, D. P.

1997-10-01

This paper presents the development of a general-purpose parallel ocean circulation model, for use on a wide range of computer platforms, from traditional scalar machines to workstation clusters and massively parallel processors. Parallelism is provided, as a modular option, via high-level message-passing routines, thus hiding the technical intricacies from the user. An initial implementation highlights that the parallel efficiency of the model is adversely affected by a number of factors, for which optimisations are discussed and implemented. The resulting ocean code is portable and, in particular, allows science to be achieved on local workstations that could otherwise only be undertaken on state-of-the-art supercomputers.
Software Tools for Development on the Peregrine System | High-Performance

Science.gov Websites

Computing | NREL ... and manage software at the source code level. Cross-Platform Make and SCons: the "Cross-Platform Make" (CMake) package is from Kitware, and SCons is a modern software build tool based on Python
A benchtop biorobotic platform for in vitro observation of muscle-tendon dynamics with parallel mechanical assistance from an elastic exoskeleton.

PubMed

Robertson, Benjamin D; Vadakkeveedu, Siddarth; Sawicki, Gregory S

2017-05-24

We present a novel biorobotic framework comprised of a biological muscle-tendon unit (MTU) mechanically coupled to a feedback controlled robotic environment simulation that mimics in vivo inertial/gravitational loading and mechanical assistance from a parallel elastic exoskeleton. Using this system, we applied select combinations of biological muscle activation (modulated with rate-coded direct neural stimulation) and parallel elastic assistance (applied via closed-loop mechanical environment simulation) hypothesized to mimic human behavior based on previously published modeling studies. These conditions resulted in constant system-level force-length dynamics (i.e., stiffness), reduced biological loads, increased muscle excursion, and constant muscle average positive power output, all consistent with laboratory experiments on intact humans during exoskeleton assisted hopping. Mechanical assistance led to reduced estimated metabolic cost and MTU apparent efficiency, but increased apparent efficiency for the MTU+Exo system as a whole. Findings from this study suggest that the increased natural resonant frequency of the artificially stiffened MTU+Exo system, along with invariant movement frequencies, may underlie observed limits on the benefits of exoskeleton assistance. Our novel approach demonstrates that it is possible to capture the salient features of human locomotion with exoskeleton assistance in an isolated muscle-tendon preparation, and introduces a powerful new tool for detailed, direct examination of how assistive devices affect muscle-level neuromechanics and energetics.
Implementing a Parallel Image Edge Detection Algorithm Based on the Otsu-Canny Operator on the Hadoop Platform

PubMed Central

Wang, Min; Tian, Yun

2018-01-01

The Canny operator is widely used to detect edges in images. However, as the size of the image dataset increases, the edge detection performance of the Canny operator decreases and its runtime becomes excessive. To improve the runtime and edge detection performance of the Canny operator, in this paper, we propose a parallel design and implementation for an Otsu-optimized Canny operator using a MapReduce parallel programming model that runs on the Hadoop platform. The Otsu algorithm is used to optimize the Canny operator's dual threshold and improve the edge detection performance, while the MapReduce parallel programming model facilitates parallel processing for the Canny operator to solve the processing speed and communication cost problems that occur when the Canny edge detection algorithm is applied to big data. For the experiments, we constructed datasets of different scales from the Pascal VOC2012 image database. The proposed parallel Otsu-Canny edge detection algorithm performs better than other traditional edge detection algorithms. The parallel approach reduced the running time by approximately 67.2% on a Hadoop cluster architecture consisting of 5 nodes with a dataset of 60,000 images. Overall, our approach speeds up the system by approximately 3.4 times when processing large-scale datasets, which demonstrates the clear advantage of our method. The algorithm proposed in this study demonstrates both better edge detection performance and improved time performance. PMID:29861711
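The Otsu step can be sketched as follows: given a histogram of intensity (or gradient-magnitude) values, pick the level that maximizes the between-class variance, then derive the two Canny thresholds from it. The Otsu criterion itself is standard; how the paper maps the Otsu level onto the dual threshold is not stated in the abstract, so the rule used here (high threshold equal to the Otsu level, low threshold a fixed fraction of it, with `low_ratio` defaulting to 0.5) is an assumption.

```python
def otsu_threshold(histogram: list[int]) -> int:
    """Return the level t maximizing between-class variance (values <= t vs. > t)."""
    total = sum(histogram)
    weighted_total = sum(level * count for level, count in enumerate(histogram))
    best_level, best_variance = 0, -1.0
    below_count = 0  # number of samples at or below the candidate level
    below_sum = 0    # sum of the levels of those samples
    for level, count in enumerate(histogram):
        below_count += count
        below_sum += level * count
        above_count = total - below_count
        if below_count == 0 or above_count == 0:
            continue  # one class is empty, so the variance is undefined
        mean_below = below_sum / below_count
        mean_above = (weighted_total - below_sum) / above_count
        # Proportional to w0 * w1 * (mu0 - mu1)^2; the constant factor is irrelevant.
        variance = below_count * above_count * (mean_below - mean_above) ** 2
        if variance > best_variance:
            best_level, best_variance = level, variance
    return best_level


def canny_thresholds(histogram: list[int], low_ratio: float = 0.5) -> tuple[float, float]:
    """Assumed rule: high = Otsu level, low = low_ratio * high."""
    high = float(otsu_threshold(histogram))
    return low_ratio * high, high


# Toy bimodal histogram with peaks at levels 50 and 200.
histogram = [0] * 256
histogram[50] = 100
histogram[200] = 100
print(canny_thresholds(histogram))
```

In a MapReduce setting, a function like this would run inside each map task on one image at a time, which is what makes the per-image work independent and easy to spread across nodes.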
Advances in nanopatterned and nanostructured supported lipid membranes and their applications.

PubMed

Reimhult, Erik; Baumann, Martina; Kaufmann, Stefan; Kumar, Karthik; Spycher, Philipp

2010-01-01

Lipid membranes are versatile and convenient alternatives to study the properties of natural cell membranes. Self-assembled, artificial, substrate-supported lipid membranes have taken a central role in membrane research due to a combination of factors such as ease of creation, control over complexity, stability and the applicability of a large range of different analytical techniques. While supported lipid bilayers have been investigated for several decades, recent advances in the understanding of the assembly of such membranes from liposomes have spawned a renaissance in the field. Supported lipid bilayers are a highly promising tool to study transmembrane proteins in their native state, an application that could have tremendous impact on, e.g., drug discovery, development of biointerfaces and as platforms for glycomics and probing of multivalent binding, which requires ligand mobility. Parallel advances in microfluidics, biosensor design, and micro- and nanofabrication have converged to bring self-assembled supported lipid bilayers closer to a versatile and easy-to-use research tool as well as closer to industrial applications. The field of supported lipid bilayer research and application is thus rapidly expanding and diversifying, with new platforms continuously being proposed and developed. In order to use supported lipid bilayers for such applications, several advances have to be made: decoupling of the membrane from the support while maintaining it close to the surface, making use of biologically relevant lipid compositions, patterning of lipid membranes into arrays, and application to nanostructured substrates and sensors. This review summarizes recent advances in the field that address these challenges.
DensToolKit: A comprehensive open-source package for analyzing the electron density and its derivative scalar and vector fields

NASA Astrophysics Data System (ADS)

Solano-Altamirano, J. M.; Hernández-Pérez, Julio M.

2015-11-01

DensToolKit is a suite of cross-platform, optionally parallelized, programs for analyzing the molecular electron density (ρ) and several fields derived from it. Scalar and vector fields, such as the gradient of the electron density (∇ρ), electron localization function (ELF) and its gradient, localized orbital locator (LOL), region of slow electrons (RoSE), reduced density gradient, localized electrons detector (LED), information entropy, molecular electrostatic potential, kinetic energy densities K and G, among others, can be evaluated on zero, one, two, and three dimensional grids. The suite includes a program for searching critical points and bond paths of the electron density, under the framework of Quantum Theory of Atoms in Molecules. DensToolKit also evaluates the momentum space electron density on spatial grids, and the reduced density matrix of order one along lines joining two arbitrary atoms of a molecule. The source code is distributed under the GNU-GPLv3 license, and we release the code with the intent of establishing an open-source collaborative project. The style of DensToolKit's code follows some of the guidelines of an object-oriented program. This gives the user a simple way to implement new scalar or vector fields, provided they are derived from any of the fields already implemented in the code. In this paper, we present some of the most salient features of the programs contained in the suite, some examples of how to run them, and the mathematical definitions of the implemented fields along with hints of how we optimized their evaluation. We benchmarked our suite against both a freely-available program and a commercial package. Speed-ups of ~2×, and up to 12×, were obtained using a non-parallel compilation of DensToolKit for the evaluation of fields. DensToolKit takes similar times for finding critical points, compared to a commercial package. Finally, we present some perspectives for the future development and growth of the suite.
Mechanically latchable tiltable platform for forming micromirrors and micromirror arrays

DOEpatents

Garcia, Ernest J [Albuquerque, NM; Polosky, Marc A [Tijeras, NM; Sleefe, Gerard E [Cedar Crest, NM

2006-12-12

A microelectromechanical (MEM) apparatus is disclosed which includes a platform that can be electrostatically tilted from being parallel to its supporting substrate to being tilted at an angle of 1-20 degrees with respect to the substrate. Once the platform has been tilted to a maximum angle of tilt, the platform can be locked in position using an electrostatically-operable latching mechanism which engages a tab protruding below the platform. The platform has a light-reflective upper surface which can be optionally coated to provide an enhanced reflectivity and form a micromirror. An array of such micromirrors can be formed on a common substrate for applications including optical switching (e.g. for fiber optic communications), optical information processing, image projection displays or non-volatile optical memories.
Myria: Scalable Analytics as a Service

NASA Astrophysics Data System (ADS)

Howe, B.; Halperin, D.; Whitaker, A.

2014-12-01

At the UW eScience Institute, we're working to empower non-experts, especially in the sciences, to write and use data-parallel algorithms. To this end, we are building Myria, a web-based platform for scalable analytics and data-parallel programming. Myria's internal model of computation is the relational algebra extended with iteration, such that every program is inherently data-parallel, just as every query in a database is inherently data-parallel. But unlike databases, iteration is a first class concept, allowing us to express machine learning tasks, graph traversal tasks, and more. Programs can be expressed in a number of languages and can be executed on a number of execution environments, but we emphasize a particular language called MyriaL that supports both imperative and declarative styles and a particular execution engine called MyriaX that uses an in-memory column-oriented representation and asynchronous iteration. We deliver Myria over the web as a service, providing an editor, performance analysis tools, and catalog browsing features in a single environment. We find that this web-based "delivery vector" is critical in reaching non-experts: they are insulated from irrelevant technical work associated with installation, configuration, and resource management. The MyriaX backend, one of several execution runtimes we support, is a main-memory, column-oriented, RDBMS-on-the-worker system that supports cyclic data flows as a first-class citizen and has been shown to outperform competitive systems on 100-machine cluster sizes. I will describe the Myria system, give a demo, and present some new results in large-scale oceanographic microbiology.
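The idea of "relational algebra extended with iteration" can be illustrated with a small sketch. The Python below is not MyriaL and does not use any Myria API; the function names and the toy edge relation are assumptions made for illustration. It repeats a join-and-project step over a relation of graph edges until no new tuples appear, which is the kind of graph traversal a purely non-iterative query cannot express.

```python
def join_project(reach, edges):
    """Join reach(src, mid) with edges(mid, dst) and project onto (src, dst)."""
    by_src = {}
    for mid, dst in edges:
        by_src.setdefault(mid, set()).add(dst)
    return {(src, dst) for src, mid in reach for dst in by_src.get(mid, ())}


def transitive_closure(edges):
    """Iterate join + union until a fixpoint is reached (no new tuples)."""
    reach = set(edges)
    while True:
        new = join_project(reach, edges) - reach
        if not new:
            return reach
        reach |= new


# Hypothetical edge relation: a -> b -> c -> d
edges = {("a", "b"), ("b", "c"), ("c", "d")}
closure = transitive_closure(edges)
```

Each pass of the loop is an ordinary relational operation over sets of tuples, so in principle every iteration can be partitioned across workers; only the fixpoint test needs a global view.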

FDA's Activities Supporting Regulatory Application of "Next Gen" Sequencing Technologies.

PubMed

Wilson, Carolyn A; Simonyan, Vahan

2014-01-01

Applications of next-generation sequencing (NGS) technologies require availability and access to an information technology (IT) infrastructure and bioinformatics tools for large amounts of data storage and analyses. The U.S. Food and Drug Administration (FDA) anticipates that the use of NGS data to support regulatory submissions will continue to increase as the scientific and clinical communities become more familiar with the technologies and identify more ways to apply these advanced methods to support development and evaluation of new biomedical products. FDA laboratories are conducting research on different NGS platforms and developing the IT infrastructure and bioinformatics tools needed to enable regulatory evaluation of the technologies and the data sponsors will submit. A High-performance Integrated Virtual Environment, or HIVE, has been launched, and development and refinement continues as a collaborative effort between the FDA and George Washington University to provide the tools to support these needs. The use of a highly parallelized environment facilitated by use of distributed cloud storage and computation has resulted in a platform that is both rapid and responsive to changing scientific needs. The FDA plans to further develop in-house capacity in this area, while also supporting engagement by the external community, by sponsoring an open, public workshop to discuss NGS technologies and data formats standardization, and to promote the adoption of interoperability protocols in September 2014. Next-generation sequencing (NGS) technologies are enabling breakthroughs in how the biomedical community is developing and evaluating medical products. One example is the potential application of this method to the detection and identification of microbial contaminants in biologic products. In order for the U.S. 
Food and Drug Administration (FDA) to be able to evaluate the utility of this technology, we need to have the information technology infrastructure and bioinformatics tools to be able to store and analyze large amounts of data. To address this need, we have developed the High-performance Integrated Virtual Environment, or HIVE. HIVE uses a combination of distributed cloud storage and distributed cloud computations to provide a platform that is both rapid and responsive to support the growing and increasingly diverse scientific and regulatory needs of FDA scientists in their evaluation of NGS in research and ultimately for evaluation of NGS data in regulatory submissions. © PDA, Inc. 2014.
geoKepler Workflow Module for Computationally Scalable and Reproducible Geoprocessing and Modeling

NASA Astrophysics Data System (ADS)

Cowart, C.; Block, J.; Crawl, D.; Graham, J.; Gupta, A.; Nguyen, M.; de Callafon, R.; Smarr, L.; Altintas, I.

2015-12-01

The NSF-funded WIFIRE project has developed an open-source, online geospatial workflow platform for unifying geoprocessing tools and models for fire and other geospatially dependent modeling applications. It is a product of WIFIRE's objective to build an end-to-end cyberinfrastructure for real-time and data-driven simulation, prediction and visualization of wildfire behavior. geoKepler includes a set of reusable GIS components, or actors, for the Kepler Scientific Workflow System (https://kepler-project.org). Actors exist for reading and writing GIS data in formats such as Shapefile, GeoJSON, KML, and using OGC web services such as WFS. The actors also allow for calling geoprocessing tools in other packages such as GDAL and GRASS. Kepler integrates functions from multiple platforms and file formats into one framework, thus enabling optimal GIS interoperability, model coupling, and scalability. Products of the GIS actors can be fed directly to models such as FARSITE and WRF. Kepler's ability to schedule and scale processes using Hadoop and Spark also makes geoprocessing ultimately extensible and computationally scalable. The reusable workflows in geoKepler can be made to run automatically when alerted by real-time environmental conditions. Here, we show breakthroughs in the speed of creating complex data for hazard assessments with this platform. We also demonstrate geoKepler workflows that use Data Assimilation to ingest real-time weather data into wildfire simulations, and for data mining techniques to gain insight into environmental conditions affecting fire behavior. Existing machine learning tools and libraries such as R and MLlib are being leveraged for this purpose in Kepler, as well as Kepler's Distributed Data Parallel (DDP) capability to provide a framework for scalable processing. 
geoKepler workflows can be executed via an iPython notebook as a part of a Jupyter hub at UC San Diego for sharing and reporting of the scientific analysis and results from various runs of geoKepler workflows. The communication between iPython and Kepler workflow executions is established through an iPython magic function for Kepler that we have implemented. In summary, geoKepler is an ecosystem that makes geospatial processing and analysis of any kind programmable, reusable, scalable and sharable.
Development of a parallel FE simulator for modeling the whole trans-scale failure process of rock from meso- to engineering-scale

NASA Astrophysics Data System (ADS)

Li, Gen; Tang, Chun-An; Liang, Zheng-Zhao

2017-01-01

Multi-scale high-resolution modeling of rock failure process is a powerful means in modern rock mechanics studies to reveal the complex failure mechanism and to evaluate engineering risks. However, multi-scale continuous modeling of rock, from deformation, damage to failure, has raised high requirements on the design, implementation scheme and computation capacity of the numerical software system. This study is aimed at developing the parallel finite element procedure, a parallel rock failure process analysis (RFPA) simulator that is capable of modeling the whole trans-scale failure process of rock. Based on the statistical meso-damage mechanical method, the RFPA simulator is able to construct heterogeneous rock models with multiple mechanical properties, deal with and represent the trans-scale propagation of cracks, in which the stress and strain fields are solved for the damage evolution analysis of representative volume element by the parallel finite element method (FEM) solver. This paper describes the theoretical basis of the approach and provides the details of the parallel implementation on a Windows - Linux interactive platform. A numerical model is built to test the parallel performance of FEM solver. Numerical simulations are then carried out on a laboratory-scale uniaxial compression test, and field-scale net fracture spacing and engineering-scale rock slope examples, respectively. The simulation results indicate that relatively high speedup and computation efficiency can be achieved by the parallel FEM solver with a reasonable boot process. In laboratory-scale simulation, the well-known physical phenomena, such as the macroscopic fracture pattern and stress-strain responses, can be reproduced. In field-scale simulation, the formation process of net fracture spacing from initiation, propagation to saturation can be revealed completely. In engineering-scale simulation, the whole progressive failure process of the rock slope can be well modeled. 
It is shown that the parallel FE simulator developed in this study is an efficient tool for modeling the whole trans-scale failure process of rock from meso- to engineering-scale.
Performance Models for the Spike Banded Linear System Solver

DOE PAGES

Manguoglu, Murat; Saied, Faisal; Sameh, Ahmed; ...

2011-01-01

With availability of large-scale parallel platforms comprised of tens-of-thousands of processors and beyond, there is significant impetus for the development of scalable parallel sparse linear system solvers and preconditioners. An integral part of this design process is the development of performance models capable of predicting performance and providing accurate cost models for the solvers and preconditioners. There has been some work in the past on characterizing performance of the iterative solvers themselves. In this paper, we investigate the problem of characterizing performance and scalability of banded preconditioners. Recent work has demonstrated the superior convergence properties and robustness of banded preconditioners, compared to state-of-the-art ILU family of preconditioners as well as algebraic multigrid preconditioners. Furthermore, when used in conjunction with efficient banded solvers, banded preconditioners are capable of significantly faster time-to-solution. Our banded solver, the Truncated Spike algorithm, is specifically designed for parallel performance and tolerance to deep memory hierarchies. Its regular structure is also highly amenable to accurate performance characterization. Using these characteristics, we derive the following results in this paper: (i) we develop parallel formulations of the Truncated Spike solver, (ii) we develop a highly accurate pseudo-analytical parallel performance model for our solver, (iii) we show excellent prediction capabilities of our model – based on which we argue the high scalability of our solver. Our pseudo-analytical performance model is based on analytical performance characterization of each phase of our solver. These analytical models are then parameterized using actual runtime information on target platforms. An important consequence of our performance models is that they reveal underlying performance bottlenecks in both serial and parallel formulations. 
All of our results are validated on diverse heterogeneous multiclusters – platforms for which performance prediction is particularly challenging. Finally, we predict the scalability of the Spike algorithm using up to 65,536 cores with our model. In this paper we extend the results presented in the Ninth International Symposium on Parallel and Distributed Computing.
From Analysis to Impact: Challenges and Outcomes from Google's Cloud-based Platforms for Analyzing and Leveraging Petapixels of Geospatial Data

NASA Astrophysics Data System (ADS)

Thau, D.

2017-12-01

For the past seven years, Google has made petabytes of Earth observation data, and the tools to analyze it, freely available to researchers around the world via cloud computing. These data and tools were initially available via Google Earth Engine and are increasingly available on the Google Cloud Platform. We have introduced a number of APIs for both the analysis and presentation of geospatial data that have been successfully used to create impactful datasets and web applications, including studies of global surface water availability, global tree cover change, and crop yield estimation. Each of these projects used the cloud to analyze thousands to millions of Landsat scenes. The APIs support a range of publishing options, from outputting imagery and data for inclusion in papers, to providing tools for full scale web applications that provide analysis tools of their own. Over the course of developing these tools, we have learned a number of lessons about how to build a publicly available cloud platform for geospatial analysis, and about how the characteristics of an API can affect the kinds of impacts a platform can enable. This study will present an overview of how Google Earth Engine works and how Google's geospatial capabilities are extending to Google Cloud Platform. We will provide a number of case studies describing how these platforms, and the data they host, have been leveraged to build impactful decision support tools used by governments, researchers, and other institutions, and we will describe how the available APIs have shaped (or constrained) those tools. [Image Credit: Tyler A. Erickson]
Parallel and Portable Monte Carlo Particle Transport

NASA Astrophysics Data System (ADS)

Lee, S. R.; Cummings, J. C.; Nolen, S. D.; Keen, N. D.

1997-08-01

We have developed a multi-group, Monte Carlo neutron transport code in C++ using object-oriented methods and the Parallel Object-Oriented Methods and Applications (POOMA) class library. This transport code, called MC++, currently computes k and α eigenvalues of the neutron transport equation on a rectilinear computational mesh. It is portable to and runs in parallel on a wide variety of platforms, including MPPs, clustered SMPs, and individual workstations. It contains appropriate classes and abstractions for particle transport and, through the use of POOMA, for portable parallelism. Current capabilities are discussed, along with physics and performance results for several test problems on a variety of hardware, including all three Accelerated Strategic Computing Initiative (ASCI) platforms. Current parallel performance indicates the ability to compute α-eigenvalues in seconds or minutes rather than days or weeks. Current and future work on the implementation of a general transport physics framework (TPF) is also described. This TPF employs modern C++ programming techniques to provide simplified user interfaces, generic STL-style programming, and compile-time performance optimization. Physics capabilities of the TPF will be extended to include continuous energy treatments, implicit Monte Carlo algorithms, and a variety of convergence acceleration techniques such as importance combing.
Soil Monitor: an advanced and freely accessible platform to challenge soil sealing in Italy

NASA Astrophysics Data System (ADS)

Langella, Giuliano; Basile, Angelo; Giannecchini, Simone; Domenico Moccia, Francesco; Munafò, Michele; Terribile, Fabio

2017-04-01

Soil sealing is known to be one of the most serious soil degradation processes since it greatly disturbs or removes essential ecosystem services. Although important policy documents (Roadmap to a Resource Efficient Europe, SDGs) promise to mitigate this problem, there are still no signs of change and today soil sealing continues to increase globally. We believe immediate action is required to reduce the distance between the grand policy declarations and the poor availability of operational - and scientifically robust - tools to challenge soil sealing. These tools must be able to support the decisions made by people who manage and control soil sealing, namely urban and landscape planning professionals and authorities. In this contribution, we demonstrate that soil sealing can be effectively challenged by the implementation of a dedicated Geospatial Cyberinfrastructure. The platform we are developing - named Soil Monitor - is now a well-functioning prototype freely available at http://www.soilmonitor.it/. It has been developed by research scientists coming from different disciplines. The national authority for environmental protection (ISPRA) provided the dataset while INU (Italian association of urban planners) tested the soil sealing and the urban planning indicators. More generally, Soil Monitor has been designed to support the Italian policy documents connected to soil sealing (AS 1181, AS 2383, L. 22 May 2015, n. 68; L. 28 December, n. 221). Thus, it connects many different soil sealing aspects including science, community, policy and economy. Soil Monitor performs geospatial computation in real-time to support decision making in landscape planning. This aims at measuring soil sealing in order to mitigate it and in particular at recognizing actions to achieve land degradation neutrality. The web platform covers all of Italy, even though it is "Country-agnostic". 
Data are processed at a very high spatial resolution (10-20 m), which is a "must" for effective landscape planning. Computation is designed to be highly scalable, enabling real-time responses over a customised range of spatial extents, and high-demand calculations are embedded by means of advanced parallel codes running fast on GPUs (Graphics Processing Units). For any Italian area of interest drawn or selected by the user the analysis includes real-time quantification of (i) land use changes at different times, (ii) rural landscape fragmentation, (iii) loss of ecosystem services after new urbanisation, (iv) potential impact of new green corridors. A library of parallel routines based on the CUDA (Compute Unified Device Architecture) framework is going to be built which enables the easy implementation of new indicators for measuring land state and degradation.
Parallel software tools at Langley Research Center

NASA Technical Reports Server (NTRS)

Moitra, Stuti; Tennille, Geoffrey M.; Lakeotes, Christopher D.; Randall, Donald P.; Arthur, Jarvis J.; Hammond, Dana P.; Mall, Gerald H.

1993-01-01

This document gives a brief overview of parallel software tools available on the Intel iPSC/860 parallel computer at Langley Research Center. It is intended to provide a source of information that is somewhat more concise than vendor-supplied material on the purpose and use of various tools. Each of the chapters on tools is organized in a similar manner covering an overview of the functionality, access information, how to effectively use the tool, observations about the tool and how it compares to similar software, known problems or shortfalls with the software, and reference documentation. It is primarily intended for users of the iPSC/860 at Langley Research Center and is appropriate for both the experienced and novice user.
Synthesizing parallel imaging applications using the CAP (computer-aided parallelization) tool

NASA Astrophysics Data System (ADS)

Gennart, Benoit A.; Mazzariol, Marc; Messerli, Vincent; Hersch, Roger D.

1997-12-01

Imaging applications such as filtering, image transforms and compression/decompression require vast amounts of computing power when applied to large data sets. These applications would potentially benefit from the use of parallel processing. However, dedicated parallel computers are expensive and their processing power per node lags behind that of the most recent commodity components. Furthermore, developing parallel applications remains a difficult task: writing and debugging the application is difficult (deadlocks), programs may not be portable from one parallel architecture to the other, and performance often comes short of expectations. In order to facilitate the development of parallel applications, we propose the CAP computer-aided parallelization tool which enables application programmers to specify at a high-level of abstraction the flow of data between pipelined-parallel operations. In addition, the CAP tool supports the programmer in developing parallel imaging and storage operations. CAP enables combining efficiently parallel storage access routines and image processing sequential operations. This paper shows how processing and I/O intensive imaging applications must be implemented to take advantage of parallelism and pipelining between data access and processing. This paper's contribution is (1) to show how such implementations can be compactly specified in CAP, and (2) to demonstrate that CAP specified applications achieve the performance of custom parallel code. The paper analyzes theoretically the performance of CAP specified applications and demonstrates the accuracy of the theoretical analysis through experimental measurements.
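The pipelining between data access and processing described in this abstract can be sketched with two threads and a bounded queue. This is plain Python, not the CAP tool or its specification language; the stage names, the tile data and the pixel inversion are all hypothetical stand-ins chosen for illustration.

```python
import queue
import threading


def read_tiles(tiles, out_q):
    """Storage stage: hand tiles to the pipeline one at a time."""
    for tile in tiles:
        out_q.put(tile)
    out_q.put(None)  # sentinel: no more data


def invert_tiles(in_q, results):
    """Processing stage: invert 8-bit pixels while the reader keeps fetching."""
    while True:
        tile = in_q.get()
        if tile is None:
            break
        results.append([255 - px for px in tile])


tiles = [[0, 10, 20], [30, 40, 50], [250, 255, 0]]  # hypothetical image tiles
q = queue.Queue(maxsize=2)  # bounded buffer between the two stages
results = []
reader = threading.Thread(target=read_tiles, args=(tiles, q))
worker = threading.Thread(target=invert_tiles, args=(q, results))
reader.start()
worker.start()
reader.join()
worker.join()
```

With a single consumer the output order matches the input order. The bounded queue keeps the reader at most two tiles ahead, so storage access and processing overlap without unbounded buffering.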
Applications of the MapReduce programming framework to clinical big data analysis: current landscape and future trends

PubMed Central

2014-01-01

The emergence of massive datasets in a clinical setting presents both challenges and opportunities in data storage and analysis. This so called “big data” challenges traditional analytic tools and will increasingly require novel solutions adapted from other fields. Advances in information and communication technology present the most viable solutions to big data analysis in terms of efficiency and scalability. It is vital those big data solutions are multithreaded and that data access approaches be precisely tailored to large volumes of semi-structured/unstructured data. The MapReduce programming framework uses two tasks common in functional programming: Map and Reduce. MapReduce is a new parallel processing framework and Hadoop is its open-source implementation on a single computing node or on clusters. Compared with existing parallel processing paradigms (e.g. grid computing and graphical processing unit (GPU)), MapReduce and Hadoop have two advantages: 1) fault-tolerant storage resulting in reliable data processing by replicating the computing tasks, and cloning the data chunks on different computing nodes across the computing cluster; 2) high-throughput data processing via a batch processing framework and the Hadoop distributed file system (HDFS). Data are stored in the HDFS and made available to the slave nodes for computation. In this paper, we review the existing applications of the MapReduce programming framework and its implementation platform Hadoop in clinical big data and related medical health informatics fields. The usage of MapReduce and Hadoop on a distributed system represents a significant advance in clinical big data processing and utilization, and opens up new opportunities in the emerging era of big data analytics. The objective of this paper is to summarize the state-of-the-art efforts in clinical big data analytics and highlight what might be needed to enhance the outcomes of clinical big data analytics tools. 
This paper is concluded by summarizing the potential usage of the MapReduce programming framework and Hadoop platform to process huge volumes of clinical data in medical health informatics related fields. PMID:25383096
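The Map and Reduce tasks mentioned in this abstract can be shown with a minimal in-memory sketch. It does not use Hadoop or HDFS; the record layout, the diagnosis codes and the function names are hypothetical, and the shuffle step that Hadoop performs across nodes is simulated here with a dictionary.

```python
from collections import defaultdict


def map_phase(record):
    """Map: emit a (code, 1) pair for every diagnosis code in one record."""
    for code in record["codes"]:
        yield code, 1


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values emitted for one key."""
    return key, sum(values)


# Hypothetical clinical records, for illustration only
records = [
    {"patient": 1, "codes": ["E11", "I10"]},
    {"patient": 2, "codes": ["I10"]},
    {"patient": 3, "codes": ["E11", "I10", "J45"]},
]
pairs = [kv for record in records for kv in map_phase(record)]
counts = dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())
```

Because each call to `map_phase` sees one record and each call to `reduce_phase` sees one key, both phases can be spread over many nodes without shared state, which is what makes the model a fit for batch processing of large datasets.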
Applications of the MapReduce programming framework to clinical big data analysis: current landscape and future trends.

PubMed

Mohammed, Emad A; Far, Behrouz H; Naugler, Christopher

2014-01-01

The emergence of massive datasets in a clinical setting presents both challenges and opportunities in data storage and analysis. This so called "big data" challenges traditional analytic tools and will increasingly require novel solutions adapted from other fields. Advances in information and communication technology present the most viable solutions to big data analysis in terms of efficiency and scalability. It is vital those big data solutions are multithreaded and that data access approaches be precisely tailored to large volumes of semi-structured/unstructured data. The MapReduce programming framework uses two tasks common in functional programming: Map and Reduce. MapReduce is a new parallel processing framework and Hadoop is its open-source implementation on a single computing node or on clusters. Compared with existing parallel processing paradigms (e.g. grid computing and graphical processing unit (GPU)), MapReduce and Hadoop have two advantages: 1) fault-tolerant storage resulting in reliable data processing by replicating the computing tasks, and cloning the data chunks on different computing nodes across the computing cluster; 2) high-throughput data processing via a batch processing framework and the Hadoop distributed file system (HDFS). Data are stored in the HDFS and made available to the slave nodes for computation. In this paper, we review the existing applications of the MapReduce programming framework and its implementation platform Hadoop in clinical big data and related medical health informatics fields. The usage of MapReduce and Hadoop on a distributed system represents a significant advance in clinical big data processing and utilization, and opens up new opportunities in the emerging era of big data analytics. The objective of this paper is to summarize the state-of-the-art efforts in clinical big data analytics and highlight what might be needed to enhance the outcomes of clinical big data analytics tools. 
This paper is concluded by summarizing the potential usage of the MapReduce programming framework and Hadoop platform to process huge volumes of clinical data in medical health informatics related fields.
Homemade Buckeye-Pi: A Learning Many-Node Platform for High-Performance Parallel Computing

NASA Astrophysics Data System (ADS)

Amooie, M. A.; Moortgat, J.

2017-12-01

We report on the "Buckeye-Pi" cluster, the supercomputer developed in The Ohio State University School of Earth Sciences from 128 inexpensive Raspberry Pi (RPi) 3 Model B single-board computers. Each RPi is equipped with a fast quad-core 1.2 GHz ARMv8 64-bit processor, 1 GB of RAM, and a 32 GB microSD card for local storage. Therefore, the cluster has a total RAM of 128 GB that is distributed on the individual nodes and a flash capacity of 4 TB with 512 processors, while it benefits from low power consumption, easy portability, and low total cost. The cluster uses the Message Passing Interface protocol to manage the communications between the nodes. These features render our platform the most powerful RPi supercomputer to date and suitable for educational applications in high-performance computing (HPC) and handling of large datasets. In particular, we use the Buckeye-Pi to implement optimized parallel codes in our in-house simulator for subsurface media flows with the goal of achieving a massively-parallelized scalable code. We present benchmarking results for the computational performance across various numbers of RPi nodes. We believe our project could inspire scientists and students to consider the proposed unconventional cluster architecture as a mainstream, feasible learning platform for challenging engineering and scientific problems.
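The aggregate figures quoted in this abstract follow directly from the per-node specification, as the short sketch below shows. The two helper functions give the standard definitions of speedup and parallel efficiency that benchmarks across node counts are usually reported in; they are an assumption about how such results might be summarized, not something stated in the abstract, and no timings are supplied here.

```python
NODES = 128             # Raspberry Pi 3 Model B boards
CORES_PER_NODE = 4      # quad-core ARMv8
RAM_GB_PER_NODE = 1
FLASH_GB_PER_NODE = 32  # microSD card

total_cores = NODES * CORES_PER_NODE               # the "512 processors"
total_ram_gb = NODES * RAM_GB_PER_NODE             # 128 GB, distributed
total_flash_tb = NODES * FLASH_GB_PER_NODE / 1024  # 4 TB


def speedup(t_serial, t_parallel):
    """Classic speedup: serial runtime divided by parallel runtime."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, workers):
    """Parallel efficiency: speedup per worker (1.0 is ideal scaling)."""
    return speedup(t_serial, t_parallel) / workers
```

The RAM total is distributed memory, so a single process still sees only 1 GB; a code has to partition its data across nodes to use the full 128 GB.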
Microfluidic immunocapture of circulating pancreatic cells using parallel EpCAM and MUC1 capture: characterization, optimization and downstream analysis.

PubMed

Thege, Fredrik I; Lannin, Timothy B; Saha, Trisha N; Tsai, Shannon; Kochman, Michael L; Hollingsworth, Michael A; Rhim, Andrew D; Kirby, Brian J

2014-05-21

We have developed and optimized a microfluidic device platform for the capture and analysis of circulating pancreatic cells (CPCs) and pancreatic circulating tumor cells (CTCs). Our platform uses parallel anti-EpCAM and cancer-specific mucin 1 (MUC1) immunocapture in a silicon microdevice. Using a combination of anti-EpCAM and anti-MUC1 capture in a single device, we are able to achieve efficient capture while extending immunocapture beyond single marker recognition. We also have detected a known oncogenic KRAS mutation in cells spiked in whole blood using immunocapture, RNA extraction, RT-PCR and Sanger sequencing. To allow for downstream single-cell genetic analysis, intact nuclei were released from captured cells by using targeted membrane lysis. We have developed a staining protocol for clinical samples, including standard CTC markers; DAPI, cytokeratin (CK) and CD45, and a novel marker of carcinogenesis in CPCs, mucin 4 (MUC4). We have also demonstrated a semi-automated approach to image analysis and CPC identification, suitable for clinical hypothesis generation. Initial results from immunocapture of a clinical pancreatic cancer patient sample show that parallel capture may capture more of the heterogeneity of the CPC population. With this platform, we aim to develop a diagnostic biomarker for early pancreatic carcinogenesis and patient risk stratification.
Analysis and design of a high power, digitally-controlled spacecraft power system

NASA Technical Reports Server (NTRS)

Lee, F. C.; Cho, B. H.

1990-01-01

The progress to date on the analysis and design of a high power, digitally controlled spacecraft power system is described. Several battery discharger topologies were compared for use in the space platform application. Updated information has been provided on the battery voltage specification. Initially it was thought to be in the 30 to 40 V range. It is now specified to be 53 V to 84 V. This eliminated the tapped-boost and the current-fed auto-transformer converters from consideration. After consultations with NASA, it was decided to trade-off the following topologies: (1) boost converter; (2) multi-module, multi-phase boost converter; and (3) voltage-fed push-pull with auto-transformer. A non-linear design optimization software tool was employed to facilitate an objective comparison. Non-linear design optimization ensures that the best design of each topology is compared. The results indicate that a four-module, boost converter with each module operating 90 degrees out of phase is the optimum converter for the space platform. Large-signal and small-signal models were generated for the shunt, charger, discharger, battery, and the mode controller. The models were first tested individually according to the space platform power system specifications supplied by NASA. The effect of battery voltage imbalance on parallel dischargers was investigated with respect to dc and small-signal responses. Similarly, the effects of paralleling dischargers and chargers were also investigated. A solar array and shunt model was included in these simulations. A model for the bus mode controller (power control unit) was also developed to interface the Orbital Replacement Unit (ORU) model to the platform power system. Small signal models were used to generate the bus impedance plots in the various operating modes. The large signal models were integrated into a system model, and time domain simulations were performed to verify bus regulation during mode transitions. 
Some changes have subsequently been incorporated into the models. The changes include the use of a four module boost discharger, and a new model for the mode controller, which includes the effects of saturation. The new simulations for the boost discharger show the improvement in bus ripple that can be achieved by phase-shifted operation of each of the boost modules.
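Why phase-shifted operation reduces ripple can be seen from a simplified numerical sketch. It assumes an idealized model that is not taken from the report: each boost module contributes a zero-mean triangular ripple current of unit peak-to-peak amplitude, rising for a fraction `duty` of the switching period and falling for the rest, and the modules are shifted evenly across the period (90 degrees apart for four modules). The duty cycle of 0.4 is an arbitrary illustration value.

```python
def triangle(t, duty):
    """Zero-mean, unit peak-to-peak ripple of one module (period = 1)."""
    t %= 1.0
    if t < duty:
        return -0.5 + t / duty                  # switch on: current rises
    return 0.5 - (t - duty) / (1.0 - duty)      # switch off: current falls


def peak_to_peak(modules, duty, samples=4000):
    """Peak-to-peak of the summed ripple of evenly phase-shifted modules."""
    totals = [
        sum(triangle(i / samples + k / modules, duty) for k in range(modules))
        for i in range(samples)
    ]
    return max(totals) - min(totals)


single = peak_to_peak(1, 0.4)  # one module alone
four = peak_to_peak(4, 0.4)    # four modules, 90 degrees apart
```

Under these assumptions the summed ripple of the four shifted modules comes out at about a quarter of a single module's ripple, and it cancels entirely at a duty cycle of 0.5. A real converter adds effects this sketch ignores, such as component mismatch and the saturation mentioned in the abstract.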
Embedded Streaming Deep Neural Networks Accelerator With Applications.

PubMed

Dundar, Aysegul; Jin, Jonghoon; Martini, Berin; Culurciello, Eugenio

2017-07-01

Deep convolutional neural networks (DCNNs) have become a very powerful tool in visual perception. DCNNs have applications in autonomous robots, security systems, mobile phones, and automobiles, where high throughput of the feedforward evaluation phase and power efficiency are important. Because of this increased usage, many field-programmable gate array (FPGA)-based accelerators have been proposed. In this paper, we present an optimized streaming method for DCNNs' hardware accelerator on an embedded platform. The streaming method acts as a compiler, transforming a high-level representation of DCNNs into operation codes to execute applications in a hardware accelerator. The proposed method utilizes maximum computational resources available based on a novel-scheduled routing topology that combines data reuse and data concatenation. It is tested with a hardware accelerator implemented on the Xilinx Kintex-7 XC7K325T FPGA. The system fully explores weight-level and node-level parallelizations of DCNNs and achieves a peak performance of 247 G-ops while consuming less than 4 W of power. We test our system with applications on object classification and object detection in real-world scenarios. Our results indicate high-performance efficiency, outperforming all other presented platforms while running these applications.
Next-generation sequencing in clinical virology: Discovery of new viruses.

PubMed

Datta, Sibnarayan; Budhauliya, Raghvendra; Das, Bidisha; Chatterjee, Soumya; Vanlalhmuaka; Veer, Vijay

2015-08-12

Viruses cause significant health problems worldwide, especially in developing nations. Due to different anthropological activities, human populations are exposed to different viral pathogens, many of which emerge as outbreaks. In such situations, discovery of novel viruses is of utmost importance for deciding prevention and treatment strategies. Since the last century, a number of virus discovery methods, based on cell culture inoculation and sequence-independent PCR, have been used to identify a variety of viruses. However, the recent emergence and commercial availability of next-generation sequencers (NGS) has entirely changed the field of virus discovery. These massively parallel sequencing platforms can sequence a mixture of genetic materials from a very heterogeneous mix, with high sensitivity. Moreover, these platforms work in a sequence-independent manner, making them ideal tools for virus discovery. However, for their application in clinics, sample preparation or enrichment is necessary to detect low-abundance virus populations. A number of techniques have also been developed for enrichment of viral nucleic acids. In this manuscript, we review the evolution of sequencing, the NGS technologies available today, and widely used virus enrichment technologies. We also discuss the challenges associated with their application in clinical virus discovery.
Reproducibility of Illumina platform deep sequencing errors allows accurate determination of DNA barcodes in cells.

PubMed

Beltman, Joost B; Urbanus, Jos; Velds, Arno; van Rooij, Nienke; Rohr, Jan C; Naik, Shalin H; Schumacher, Ton N

2016-04-02

Next generation sequencing (NGS) of amplified DNA is a powerful tool to describe genetic heterogeneity within cell populations that can both be used to investigate the clonal structure of cell populations and to perform genetic lineage tracing. For applications in which both abundant and rare sequences are biologically relevant, the relatively high error rate of NGS techniques complicates data analysis, as it is difficult to distinguish rare true sequences from spurious sequences that are generated by PCR or sequencing errors. This issue, for instance, applies to cellular barcoding strategies that aim to follow the amount and type of offspring of single cells, by supplying these with unique heritable DNA tags. Here, we use genetic barcoding data from the Illumina HiSeq platform to show that straightforward read threshold-based filtering of data is typically insufficient to filter out spurious barcodes. Importantly, we demonstrate that specific sequencing errors occur at an approximately constant rate across different samples that are sequenced in parallel. We exploit this observation by developing a novel approach to filter out spurious sequences. Application of our new method demonstrates its value in the identification of true sequences amongst spurious sequences in biological data sets.
Parallel Implementation of the Discontinuous Galerkin Method

NASA Technical Reports Server (NTRS)

Baggag, Abdalkader; Atkins, Harold; Keyes, David

1999-01-01

This paper describes a parallel implementation of the discontinuous Galerkin method. Discontinuous Galerkin is a spatially compact method that retains its accuracy and robustness on non-smooth unstructured grids and is well suited for time dependent simulations. Several parallelization approaches are studied and evaluated. The most natural and symmetric of the approaches has been implemented in an object-oriented code used to simulate aeroacoustic scattering. The parallel implementation is MPI-based and has been tested on various parallel platforms such as the SGI Origin, IBM SP2, and clusters of SGI and Sun workstations. The scalability results presented for the SGI Origin show slightly superlinear speedup on a fixed-size problem due to cache effects.
New Database Manipulation Tools in the Easy-Learning On-Line Platform

ERIC Educational Resources Information Center

Radescu, Radu; Davidescu, Andrei; Pupezescu, Valentin

2011-01-01

The present paper deals with the new ORM (object-relational mapping) tool introduced in the easy-learning platform. Propel 1.5 is the latest version of Propel, one of the ORMs fully compatible with the Symfony framework, and in comparison with the older versions it has drastically improved the way the easy-learning platform can manipulate its…
Applications of Parallel Process HiMAP for Large Scale Multidisciplinary Problems

NASA Technical Reports Server (NTRS)

Guruswamy, Guru P.; Potsdam, Mark; Rodriguez, David; Kwak, Dochay (Technical Monitor)

2000-01-01

HiMAP is a three level parallel middleware that can be interfaced to a large scale global design environment for code independent, multidisciplinary analysis using high fidelity equations. Aerospace technology needs are rapidly changing. Computational tools compatible with the requirements of national programs such as space transportation are needed. Conventional computation tools are inadequate for modern aerospace design needs. Advanced, modular computational tools are needed, such as those that incorporate the technology of massively parallel processors (MPP).

Kinematics and dynamics of robotic systems with multiple closed loops

NASA Astrophysics Data System (ADS)

Zhang, Chang-De

The kinematics and dynamics of robotic systems with multiple closed loops, such as Stewart platforms, walking machines, and hybrid manipulators, are studied. In the study of kinematics, focus is on the closed-form solutions of the forward position analysis of different parallel systems. A closed-form solution means that the solution is expressed as a polynomial in one variable. If the order of the polynomial is less than or equal to four, the solution has analytical closed-form. First, the conditions of obtaining analytical closed-form solutions are studied. For a Stewart platform, the condition is found to be that one rotational degree of freedom of the output link is decoupled from the other five. Based on this condition, a class of Stewart platforms which has analytical closed-form solution is formulated. Conditions of analytical closed-form solution for other parallel systems are also studied. Closed-form solutions of forward kinematics for walking machines and multi-fingered grippers are then studied. For a parallel system with three three-degree-of-freedom subchains, there are 84 possible ways to select six independent joints among nine joints. These 84 ways can be classified into three categories: Category 3:3:0, Category 3:2:1, and Category 2:2:2. It is shown that the first category has no solutions; the solutions of the second category have analytical closed-form; and the solutions of the last category are higher order polynomials. The study is then extended to a nearly general Stewart platform. The solution is a 20th order polynomial and the Stewart platform has a maximum of 40 possible configurations. Also, the study is extended to a new class of hybrid manipulators which consists of two serially connected parallel mechanisms. In the study of dynamics, a computationally efficient method for inverse dynamics of manipulators based on the virtual work principle is developed. 
Although this method is comparable with the recursive Newton-Euler method for serial manipulators, its advantage is more noteworthy when applied to parallel systems. An approach of inverse dynamics of a walking machine is also developed, which includes inverse dynamic modeling, foot force distribution, and joint force/torque allocation.
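The count of 84 joint selections quoted in the abstract is the binomial coefficient C(9, 6), and the three categories can be checked by brute-force enumeration. The sketch below is only an illustration: the function name and the labeling of joints as (chain, index) pairs are assumptions, and the per-category counts are computed here rather than taken from the source.

```python
from collections import Counter
from itertools import combinations


def classify_joint_selections(chains=3, joints_per_chain=3, selected=6):
    """Count the ways to pick `selected` joints, grouped by how many fall on each subchain."""
    joints = [(c, j) for c in range(chains) for j in range(joints_per_chain)]
    counts = Counter()
    for combo in combinations(joints, selected):
        per_chain = Counter(chain for chain, _ in combo)
        # Sort the per-chain counts so that (3, 2, 1) and (1, 3, 2) land in the same category.
        pattern = tuple(sorted((per_chain.get(c, 0) for c in range(chains)), reverse=True))
        counts[pattern] += 1
    return counts


counts = classify_joint_selections()
print(sum(counts.values()))                                   # 84
print(counts[(3, 3, 0)], counts[(3, 2, 1)], counts[(2, 2, 2)])  # 3 54 27
```

The enumeration confirms that the three categories named in the abstract are exhaustive: 3 + 54 + 27 = 84.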

Charon Toolkit for Parallel, Implicit Structured-Grid Computations: Functional Design

NASA Technical Reports Server (NTRS)

VanderWijngaart, Rob F.; Kutler, Paul (Technical Monitor)

1997-01-01

In a previous report the design concepts of Charon were presented. Charon is a toolkit that aids engineers in developing scientific programs for structured-grid applications to be run on MIMD parallel computers. It constitutes an augmentation of the general-purpose MPI-based message-passing layer, and provides the user with a hierarchy of tools for rapid prototyping and validation of parallel programs, and subsequent piecemeal performance tuning. Here we describe the implementation of the domain decomposition tools used for creating data distributions across sets of processors. We also present the hierarchy of parallelization tools that allows smooth translation of legacy code (or a serial design) into a parallel program. Along with the actual tool descriptions, we will present the considerations that led to the particular design choices. Many of these are motivated by the requirement that Charon must be useful within the traditional computational environments of Fortran 77 and C. Only the Fortran 77 syntax will be presented in this report.
ASC-ATDM Performance Portability Requirements for 2015-2019

DOE Office of Scientific and Technical Information (OSTI.GOV)

Edwards, Harold C.; Trott, Christian Robert

This report outlines the research, development, and support requirements for the Advanced Simulation and Computing (ASC) Advanced Technology, Development, and Mitigation (ATDM) Performance Portability (a.k.a., Kokkos) project for 2015-2019. The research and development (R&D) goal for Kokkos (v2) has been to create and demonstrate a thread-parallel programming model and standard C++ library-based implementation that enables performance portability across diverse manycore architectures such as multicore CPU, Intel Xeon Phi, and NVIDIA Kepler GPU. This R&D goal has been achieved for algorithms that use data parallel patterns including parallel-for, parallel-reduce, and parallel-scan. Current R&D is focusing on hierarchical parallel patterns such as a directed acyclic graph (DAG) of asynchronous tasks where each task contains nested data parallel algorithms. This five-year plan includes R&D required to fully and performance portably exploit thread parallelism across current and anticipated next generation platforms (NGP). The Kokkos library is being evaluated by many projects exploring algorithms and code design for NGP. Some production libraries and applications such as Trilinos and LAMMPS have already committed to Kokkos as their foundation for manycore parallelism and performance portability. These five-year requirements include support required for current and anticipated ASC projects to be effective and productive in their use of Kokkos on NGP. The greatest risk to the success of Kokkos and ASC projects relying upon Kokkos is a lack of staffing resources to support Kokkos to the degree needed by these ASC projects. This support includes up-to-date tutorials, documentation, multi-platform (hardware and software stack) testing, minor feature enhancements, thread-scalable algorithm consulting, and managing collaborative R&D.
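For readers unfamiliar with the three data parallel patterns named in the abstract, the following sketch shows what each one computes. It is a serial Python illustration of the semantics only, not Kokkos code and not its C++ API; the function names are hypothetical.

```python
import operator
from functools import reduce
from itertools import accumulate


def parallel_for(values, body):
    """Apply `body` to every element; iterations are independent, so order does not matter."""
    return [body(v) for v in values]


def parallel_reduce(values, combine, identity):
    """Fold all elements into one result with an associative `combine`."""
    return reduce(combine, values, identity)


def parallel_scan(values, combine):
    """Inclusive prefix scan: element i holds the combination of elements 0..i."""
    return list(accumulate(values, combine))


squares = parallel_for([1, 2, 3, 4], lambda x: x * x)
total = parallel_reduce(squares, operator.add, 0)
prefix = parallel_scan(squares, operator.add)
print(squares, total, prefix)  # [1, 4, 9, 16] 30 [1, 5, 14, 30]
```

A thread-parallel runtime is free to split each of these loops across cores because the per-element work is independent and the combining operation is associative.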
Parallel Multivariate Spatio-Temporal Clustering of Large Ecological Datasets on Hybrid Supercomputers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sreepathi, Sarat; Kumar, Jitendra; Mills, Richard T.

A proliferation of data from vast networks of remote sensing platforms (satellites, unmanned aircraft systems (UAS), airborne etc.), observational facilities (meteorological, eddy covariance etc.), state-of-the-art sensors, and simulation models offers unprecedented opportunities for scientific discovery. Unsupervised classification is a widely applied data mining approach to derive insights from such data. However, classification of very large data sets is a complex computational problem that requires efficient numerical algorithms and implementations on high performance computing (HPC) platforms. Additionally, increasing power, space, cooling and efficiency requirements have led to the deployment of hybrid supercomputing platforms with complex architectures and memory hierarchies like the Titan system at Oak Ridge National Laboratory. The advent of such accelerated computing architectures offers new challenges and opportunities for big data analytics in general and specifically, large scale cluster analysis in our case. Although there is an existing body of work on parallel cluster analysis, those approaches do not fully meet the needs imposed by the nature and size of our large data sets. Moreover, they had scaling limitations and were mostly limited to traditional distributed memory computing platforms. We present a parallel Multivariate Spatio-Temporal Clustering (MSTC) technique based on k-means cluster analysis that can target hybrid supercomputers like Titan. We developed a hybrid MPI, CUDA and OpenACC implementation that can utilize both CPU and GPU resources on computational nodes. We describe performance results on Titan that demonstrate the scalability and efficacy of our approach in processing large ecological data sets.
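The k-means cluster analysis on which the technique is based can be sketched in a few lines. This is a minimal serial version of the textbook algorithm (Lloyd's iteration), not the authors' MPI, CUDA and OpenACC implementation; the function names, the naive initialization from the first k points, and the toy data are assumptions made for illustration.

```python
def nearest_centroid(point, centroids):
    """Index of the centroid with the smallest squared Euclidean distance to `point`."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(point, centroids[i])),
    )


def kmeans(points, k, iterations=20):
    """Plain k-means: alternate between assigning points and recomputing centroids."""
    centroids = [list(p) for p in points[:k]]  # naive initialization (assumption)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:  # assignment step: independent per point, hence parallelizable
            clusters[nearest_centroid(p, centroids)].append(p)
        for i, members in enumerate(clusters):  # update step: mean of each cluster
            if members:
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    labels = [nearest_centroid(p, centroids) for p in points]
    return centroids, labels


centroids, labels = kmeans([(0, 0), (0, 1), (10, 10), (10, 11)], k=2)
print(centroids, labels)  # [[0.0, 0.5], [10.0, 10.5]] [0, 0, 1, 1]
```

The assignment step dominates the cost for large data sets and treats every point independently, which is why this kind of analysis lends itself to distribution across nodes and accelerators.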
A framework for plasticity implementation on the SpiNNaker neural architecture.

PubMed

Galluppi, Francesco; Lagorce, Xavier; Stromatias, Evangelos; Pfeiffer, Michael; Plana, Luis A; Furber, Steve B; Benosman, Ryad B

2014-01-01

Many of the precise biological mechanisms of synaptic plasticity remain elusive, but simulations of neural networks have greatly enhanced our understanding of how specific global functions arise from the massively parallel computation of neurons and local Hebbian or spike-timing dependent plasticity rules. For simulating large portions of neural tissue, this has created an increasingly strong need for large scale simulations of plastic neural networks on special purpose hardware platforms, because synaptic transmissions and updates are badly matched to computing style supported by current architectures. Because of the great diversity of biological plasticity phenomena and the corresponding diversity of models, there is a great need for testing various hypotheses about plasticity before committing to one hardware implementation. Here we present a novel framework for investigating different plasticity approaches on the SpiNNaker distributed digital neural simulation platform. The key innovation of the proposed architecture is to exploit the reconfigurability of the ARM processors inside SpiNNaker, dedicating a subset of them exclusively to process synaptic plasticity updates, while the rest perform the usual neural and synaptic simulations. We demonstrate the flexibility of the proposed approach by showing the implementation of a variety of spike- and rate-based learning rules, including standard Spike-Timing dependent plasticity (STDP), voltage-dependent STDP, and the rate-based BCM rule. We analyze their performance and validate them by running classical learning experiments in real time on a 4-chip SpiNNaker board. The result is an efficient, modular, flexible and scalable framework, which provides a valuable tool for the fast and easy exploration of learning models of very different kinds on the parallel and reconfigurable SpiNNaker system.
A framework for plasticity implementation on the SpiNNaker neural architecture

PubMed Central

Galluppi, Francesco; Lagorce, Xavier; Stromatias, Evangelos; Pfeiffer, Michael; Plana, Luis A.; Furber, Steve B.; Benosman, Ryad B.

2015-01-01

Many of the precise biological mechanisms of synaptic plasticity remain elusive, but simulations of neural networks have greatly enhanced our understanding of how specific global functions arise from the massively parallel computation of neurons and local Hebbian or spike-timing dependent plasticity rules. For simulating large portions of neural tissue, this has created an increasingly strong need for large scale simulations of plastic neural networks on special purpose hardware platforms, because synaptic transmissions and updates are badly matched to computing style supported by current architectures. Because of the great diversity of biological plasticity phenomena and the corresponding diversity of models, there is a great need for testing various hypotheses about plasticity before committing to one hardware implementation. Here we present a novel framework for investigating different plasticity approaches on the SpiNNaker distributed digital neural simulation platform. The key innovation of the proposed architecture is to exploit the reconfigurability of the ARM processors inside SpiNNaker, dedicating a subset of them exclusively to process synaptic plasticity updates, while the rest perform the usual neural and synaptic simulations. We demonstrate the flexibility of the proposed approach by showing the implementation of a variety of spike- and rate-based learning rules, including standard Spike-Timing dependent plasticity (STDP), voltage-dependent STDP, and the rate-based BCM rule. We analyze their performance and validate them by running classical learning experiments in real time on a 4-chip SpiNNaker board. The result is an efficient, modular, flexible and scalable framework, which provides a valuable tool for the fast and easy exploration of learning models of very different kinds on the parallel and reconfigurable SpiNNaker system. PMID:25653580
GREEN SUPERCOMPUTING IN A DESKTOP BOX

DOE Office of Scientific and Technical Information (OSTI.GOV)

HSU, CHUNG-HSING; FENG, WU-CHUN; CHING, AVERY

2007-01-17

The computer workstation, introduced by Sun Microsystems in 1982, was the tool of choice for scientists and engineers as an interactive computing environment for the development of scientific codes. However, by the mid-1990s, the performance of workstations began to lag behind high-end commodity PCs. This, coupled with the disappearance of BSD-based operating systems in workstations and the emergence of Linux as an open-source operating system for PCs, arguably led to the demise of the workstation as we knew it. Around the same time, computational scientists started to leverage PCs running Linux to create a commodity-based (Beowulf) cluster that provided dedicated computer cycles, i.e., supercomputing for the rest of us, as a cost-effective alternative to large supercomputers, i.e., supercomputing for the few. However, as the cluster movement has matured, with respect to cluster hardware and open-source software, these clusters have become much more like their large-scale supercomputing brethren - a shared (and power-hungry) datacenter resource that must reside in a machine-cooled room in order to operate properly. Consequently, the above observations, when coupled with the ever-increasing performance gap between the PC and cluster supercomputer, provide the motivation for a 'green' desktop supercomputer - a turnkey solution that provides an interactive and parallel computing environment with the approximate form factor of a Sun SPARCstation 1 'pizza box' workstation. In this paper, they present the hardware and software architecture of such a solution as well as its prowess as a developmental platform for parallel codes. In short, imagine a 12-node personal desktop supercomputer that achieves 14 Gflops on Linpack but sips only 185 watts of power at load, resulting in a performance-power ratio that is over 300% better than their reference SMP platform.
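The performance-power ratio quoted at the end of the abstract follows directly from the two figures given there. The short calculation below uses only those two numbers; the function name is hypothetical, and the reference SMP platform is not computed because the abstract does not state its figures.

```python
def performance_per_watt(gflops, watts):
    """Performance-power ratio in Mflops per watt."""
    return gflops * 1000.0 / watts


ratio = performance_per_watt(14.0, 185.0)  # 14 Gflops on Linpack at 185 W
print(round(ratio, 1))  # 75.7
```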
Visualization of protein interaction networks: problems and solutions

PubMed Central

2013-01-01

Background Visualization concerns the representation of data visually and is an important task in scientific research. Protein-protein interactions (PPI) are discovered using either wet lab techniques, such as mass spectrometry, or in silico prediction tools, resulting in large collections of interactions stored in specialized databases. The set of all interactions of an organism forms a protein-protein interaction network (PIN) and is an important tool for studying the behaviour of the cell machinery. Since graphic representation of PINs may highlight important substructures, e.g. protein complexes, visualization is increasingly used to study the underlying graph structure of PINs. Although graphs are well known data structures, there are different open problems regarding PIN visualization: the high number of nodes and connections, the heterogeneity of nodes (proteins) and edges (interactions), and the possibility to annotate proteins and interactions with biological information extracted from ontologies (e.g. Gene Ontology), which enriches the PINs with semantic information but complicates their visualization. Methods In recent years many software tools for the visualization of PINs have been developed. Initially intended for visualization only, some of them have subsequently been enriched with new functions for PPI data management and PIN analysis. The paper analyzes the main software tools for PIN visualization considering four main criteria: (i) technology, i.e. availability/license of the software and supported OS (Operating System) platforms; (ii) interoperability, i.e. ability to import/export networks in various formats, ability to export data in a graphic format, extensibility of the system, e.g. through plug-ins; (iii) visualization, i.e. supported layout and rendering algorithms and availability of parallel implementation; (iv) analysis, i.e. availability of network analysis functions, such as clustering or mining of the graph, and the possibility to interact with external databases. Results Currently, many tools are available and it is not easy for users to choose among them. Some tools offer sophisticated 2D and 3D network visualization with many layout algorithms, while other tools are more data-oriented and support integration of interaction data coming from different sources and data annotation. Finally, some specialized tools are dedicated to the analysis of pathways and cellular processes and are oriented toward systems biology studies, where the dynamic aspects of the processes being studied are central. Conclusion A current trend is the deployment of open, extensible visualization tools (e.g. Cytoscape) that may be incrementally enriched by the interactomics community with novel and more powerful functions for PIN analysis, through the development of plug-ins. On the other hand, another emerging trend concerns the efficient and parallel implementation of the visualization engine, which may provide high interactivity and near real-time response time, as in NAViGaTOR. From a technological point of view, open-source, free and extensible tools, like Cytoscape, guarantee long-term sustainability due to the size of the developer and user communities, and provide great flexibility since new functions are continuously added by the developer community through new plug-ins, but the emerging parallel, often closed-source tools like NAViGaTOR can offer near real-time response time even in the analysis of very large PINs. PMID:23368786
A New Parallel Approach for Accelerating the GPU-Based Execution of Edge Detection Algorithms

PubMed Central

Emrani, Zahra; Bateni, Soroosh; Rabbani, Hossein

2017-01-01

Real-time image processing is used in a wide variety of applications like those in medical care and industrial processes. This technique in medical care has the ability to display important patient information graphically, which can supplement and help the treatment process. Medical decisions made based on real-time images are more accurate and reliable. According to recent research, graphic processing unit (GPU) programming is a useful method for improving the speed and quality of medical image processing and is one of the ways of real-time image processing. Edge detection is an early stage in most of the image processing methods for the extraction of features and object segments from a raw image. The Canny method, Sobel and Prewitt filters, and the Roberts’ Cross technique are some examples of edge detection algorithms that are widely used in image processing and machine vision. In this work, these algorithms are implemented using the Compute Unified Device Architecture (CUDA), Open Source Computer Vision (OpenCV), and Matrix Laboratory (MATLAB) platforms. An existing parallel method for the Canny approach has been modified further to run in a fully parallel manner. This has been achieved by replacing the breadth-first search procedure with a parallel method. These algorithms have been compared by testing them on a database of optical coherence tomography images. The comparison of results shows that the proposed implementation of the Canny method on GPU using the CUDA platform improves the speed of execution by 2–100× compared to the central processing unit-based implementation using the OpenCV and MATLAB platforms. PMID:28487831
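As an illustration of why edge detection filters map well onto GPUs, the sketch below applies the standard 3×3 Sobel kernels to a tiny grayscale image. It is a serial, pure-Python illustration rather than the authors' CUDA code; the function name and the test image are assumptions, and border pixels are simply left at zero.

```python
def sobel_magnitude(image):
    """Gradient magnitude of a 2D grayscale image using the 3x3 Sobel kernels."""
    gx_kernel = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal gradient
    gy_kernel = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical gradient
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    pixel = image[y + dy - 1][x + dx - 1]
                    gx += gx_kernel[dy][dx] * pixel
                    gy += gy_kernel[dy][dx] * pixel
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out


# A 4x4 image with a vertical edge between the second and third columns.
edges = sobel_magnitude([[0, 0, 1, 1]] * 4)
print(edges[1])  # [0.0, 4.0, 4.0, 0.0]
```

Each output pixel depends only on its own 3×3 neighbourhood, so on a GPU one thread can compute one pixel without coordinating with the others.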
A New Parallel Approach for Accelerating the GPU-Based Execution of Edge Detection Algorithms.

PubMed

Emrani, Zahra; Bateni, Soroosh; Rabbani, Hossein

2017-01-01

Real-time image processing is used in a wide variety of applications like those in medical care and industrial processes. This technique in medical care has the ability to display important patient information graphically, which can supplement and help the treatment process. Medical decisions made based on real-time images are more accurate and reliable. According to recent research, graphic processing unit (GPU) programming is a useful method for improving the speed and quality of medical image processing and is one of the ways of real-time image processing. Edge detection is an early stage in most of the image processing methods for the extraction of features and object segments from a raw image. The Canny method, Sobel and Prewitt filters, and the Roberts' Cross technique are some examples of edge detection algorithms that are widely used in image processing and machine vision. In this work, these algorithms are implemented using the Compute Unified Device Architecture (CUDA), Open Source Computer Vision (OpenCV), and Matrix Laboratory (MATLAB) platforms. An existing parallel method for the Canny approach has been modified further to run in a fully parallel manner. This has been achieved by replacing the breadth-first search procedure with a parallel method. These algorithms have been compared by testing them on a database of optical coherence tomography images. The comparison of results shows that the proposed implementation of the Canny method on GPU using the CUDA platform improves the speed of execution by 2-100× compared to the central processing unit-based implementation using the OpenCV and MATLAB platforms.
A Distributed Parallel Genetic Algorithm of Placement Strategy for Virtual Machines Deployment on Cloud Platform

PubMed Central

Dong, Yu-Shuang; Xu, Gao-Chao; Fu, Xiao-Dong

2014-01-01

The cloud platform provides various services to users. More and more cloud centers provide infrastructure as the main way of operating. To improve the utilization rate of the cloud center and to decrease the operating cost, the cloud center provides services according to requirements of users by sharding the resources with virtualization. Considering both QoS for users and cost saving for cloud computing providers, we try to maximize performance and minimize energy cost as well. In this paper, we propose a distributed parallel genetic algorithm (DPGA) of placement strategy for virtual machines deployment on cloud platform. In the first stage, it executes the genetic algorithm in parallel and in a distributed manner on several selected physical hosts. Then it continues to execute the genetic algorithm of the second stage with solutions obtained from the first stage as the initial population. The solution calculated by the genetic algorithm of the second stage is the optimal one of the proposed approach. The experimental results show that the proposed placement strategy of VM deployment can ensure QoS for users and it is more effective and more energy efficient than other placement strategies on the cloud platform. PMID:25097872
A distributed parallel genetic algorithm of placement strategy for virtual machines deployment on cloud platform.

PubMed

Dong, Yu-Shuang; Xu, Gao-Chao; Fu, Xiao-Dong

2014-01-01

The cloud platform provides various services to users. More and more cloud centers provide infrastructure as the main way of operating. To improve the utilization rate of the cloud center and to decrease the operating cost, the cloud center provides services according to requirements of users by sharding the resources with virtualization. Considering both QoS for users and cost saving for cloud computing providers, we try to maximize performance and minimize energy cost as well. In this paper, we propose a distributed parallel genetic algorithm (DPGA) of placement strategy for virtual machines deployment on cloud platform. In the first stage, it executes the genetic algorithm in parallel and in a distributed manner on several selected physical hosts. Then it continues to execute the genetic algorithm of the second stage with solutions obtained from the first stage as the initial population. The solution calculated by the genetic algorithm of the second stage is the optimal one of the proposed approach. The experimental results show that the proposed placement strategy of VM deployment can ensure QoS for users and it is more effective and more energy efficient than other placement strategies on the cloud platform.
Real-Time Compressive Sensing MRI Reconstruction Using GPU Computing and Split Bregman Methods

PubMed Central

Smith, David S.; Gore, John C.; Yankeelov, Thomas E.; Welch, E. Brian

2012-01-01

Compressive sensing (CS) has been shown to enable dramatic acceleration of MRI acquisition in some applications. Being an iterative reconstruction technique, CS MRI reconstructions can be more time-consuming than traditional inverse Fourier reconstruction. We have accelerated our CS MRI reconstruction by factors of up to 27 by using a split Bregman solver combined with a graphics processing unit (GPU) computing platform. The increases in speed we find are similar to those we measure for matrix multiplication on this platform, suggesting that the split Bregman methods parallelize efficiently. We demonstrate that the combination of the rapid convergence of the split Bregman algorithm and the massively parallel strategy of GPU computing can enable real-time CS reconstruction of even acquisition data matrices of dimension 4096² or more, depending on available GPU VRAM. Reconstruction of two-dimensional data matrices of dimension 1024² and smaller took ~0.3 s or less, showing that this platform also provides very fast iterative reconstruction for small-to-moderate size images. PMID:22481908
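One reason split Bregman methods tend to parallelize well is that their L1 subproblem is usually solved by an element-wise soft-thresholding ("shrink") step. The abstract does not describe the solver's internals, so the sketch below shows only that generic operator, with a hypothetical function name and made-up inputs, and is not the authors' reconstruction code.

```python
def shrink(values, threshold):
    """Soft-thresholding: move each value toward zero by `threshold`, clamping at zero."""
    result = []
    for v in values:
        magnitude = abs(v) - threshold
        if magnitude <= 0:
            result.append(0.0)         # small coefficients are set to zero
        elif v > 0:
            result.append(magnitude)
        else:
            result.append(-magnitude)
    return result


shrunk = shrink([3.0, -0.5, -2.0], 1.0)
print(shrunk)  # [2.0, 0.0, -1.0]
```

Because every element is processed independently, a step of this kind can be spread across thousands of GPU threads with no communication between them.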
Real-Time Compressive Sensing MRI Reconstruction Using GPU Computing and Split Bregman Methods.

PubMed

Smith, David S; Gore, John C; Yankeelov, Thomas E; Welch, E Brian

2012-01-01

Compressive sensing (CS) has been shown to enable dramatic acceleration of MRI acquisition in some applications. Being an iterative reconstruction technique, CS MRI reconstructions can be more time-consuming than traditional inverse Fourier reconstruction. We have accelerated our CS MRI reconstruction by factors of up to 27 by using a split Bregman solver combined with a graphics processing unit (GPU) computing platform. The increases in speed we find are similar to those we measure for matrix multiplication on this platform, suggesting that the split Bregman methods parallelize efficiently. We demonstrate that the combination of the rapid convergence of the split Bregman algorithm and the massively parallel strategy of GPU computing can enable real-time CS reconstruction of even acquisition data matrices of dimension 4096² or more, depending on available GPU VRAM. Reconstruction of two-dimensional data matrices of dimension 1024² and smaller took ~0.3 s or less, showing that this platform also provides very fast iterative reconstruction for small-to-moderate size images.
A Critical Appraisal of Techniques, Software Packages, and Standards for Quantitative Proteomic Analysis

PubMed Central

Lawless, Craig; Hubbard, Simon J.; Fan, Jun; Bessant, Conrad; Hermjakob, Henning; Jones, Andrew R.

2012-01-01

New methods for performing quantitative proteome analyses based on differential labeling protocols or label-free techniques are reported in the literature on an almost monthly basis. In parallel, a correspondingly vast number of software tools for the analysis of quantitative proteomics data has also been described in the literature and produced by private companies. In this article we focus on the review of some of the most popular techniques in the field and present a critical appraisal of several software packages available to process and analyze the data produced. We also describe the importance of community standards to support the wide range of software, which may assist researchers in the analysis of data using different platforms and protocols. It is intended that this review will serve bench scientists both as a useful reference and a guide to the selection and use of different pipelines to perform quantitative proteomics data analysis. We have produced a web-based tool (http://www.proteosuite.org/?q=other_resources) to help researchers find appropriate software for their local instrumentation, available file formats, and quantitative methodology. PMID:22804616
Parallel Implementation of MAFFT on CUDA-Enabled Graphics Hardware.

PubMed

Zhu, Xiangyuan; Li, Kenli; Salah, Ahmad; Shi, Lin; Li, Keqin

2015-01-01

Multiple sequence alignment (MSA) constitutes an extremely powerful tool for many biological applications including phylogenetic tree estimation, secondary structure prediction, and critical residue identification. However, aligning large biological sequences with popular tools such as MAFFT requires long runtimes on sequential architectures. Due to the ever increasing sizes of sequence databases, there is increasing demand to accelerate this task. In this paper, we demonstrate how graphic processing units (GPUs), powered by the compute unified device architecture (CUDA), can be used as an efficient computational platform to accelerate the MAFFT algorithm. To fully exploit the GPU's capabilities for accelerating MAFFT, we have optimized the sequence data organization to eliminate the bandwidth bottleneck of memory access, designed a memory allocation and reuse strategy to make full use of limited memory of GPUs, proposed a new modified-run-length encoding (MRLE) scheme to reduce memory consumption, and used high-performance shared memory to speed up I/O operations. Our implementation tested in three NVIDIA GPUs achieves speedup up to 11.28 on a Tesla K20m GPU compared to the sequential MAFFT 7.015.
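The abstract mentions a modified run-length encoding (MRLE) scheme for reducing memory consumption but does not describe it. As a point of reference only, here is a sketch of plain run-length encoding applied to a gapped alignment row; it is not the authors' MRLE, and the function names are hypothetical.

```python
from itertools import groupby


def rle_encode(seq):
    """Collapse each run of identical characters into a (char, length) pair."""
    return [(ch, len(list(run))) for ch, run in groupby(seq)]


def rle_decode(pairs):
    """Rebuild the original string from (char, length) pairs."""
    return "".join(ch * n for ch, n in pairs)


row = "AC----GT--"
encoded = rle_encode(row)
print(encoded)
print(rle_decode(encoded) == row)
```

Long gap runs, which are common in alignments of divergent sequences, are where this kind of encoding saves the most space.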
An Update on Improvements to NiCE Support for PROTEUS

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bennett, Andrew; McCaskey, Alexander J.; Billings, Jay Jay

2015-09-01

The Department of Energy Office of Nuclear Energy's Nuclear Energy Advanced Modeling and Simulation (NEAMS) program has supported the development of the NEAMS Integrated Computational Environment (NiCE), a modeling and simulation workflow environment that provides services and plugins to facilitate tasks such as code execution, model input construction, visualization, and data analysis. This report details the development of workflows for the reactor core neutronics application, PROTEUS. This advanced neutronics application (primarily developed at Argonne National Laboratory) aims to improve nuclear reactor design and analysis by providing an extensible and massively parallel, finite-element solver for current and advanced reactor fuel neutronics modeling. The integration of PROTEUS-specific tools into NiCE is intended to make the advanced capabilities that PROTEUS provides more accessible to the nuclear energy research and development community. This report will detail the work done to improve existing PROTEUS workflow support in NiCE. We will demonstrate and discuss these improvements, including the development of flexible IO services, an improved interface for input generation, and the addition of advanced Fortran development tools natively in the platform.
A scalable parallel black oil simulator on distributed memory parallel computers

NASA Astrophysics Data System (ADS)

Wang, Kun; Liu, Hui; Chen, Zhangxin

2015-11-01

This paper presents our work on developing a parallel black oil simulator for distributed memory computers based on our in-house parallel platform. The parallel simulator is designed to overcome the performance issues of common simulators that are implemented for personal computers and workstations. The finite difference method is applied to discretize the black oil model. In addition, some advanced techniques are employed to strengthen the robustness and parallel scalability of the simulator, including an inexact Newton method, matrix decoupling methods, and algebraic multigrid methods. A new multi-stage preconditioner is proposed to accelerate the solution of linear systems from the Newton methods. Numerical experiments show that our simulator is scalable and efficient, and is capable of simulating extremely large-scale black oil problems with tens of millions of grid blocks using thousands of MPI processes on parallel computers.
Hypergraph partitioning implementation for parallelizing matrix-vector multiplication using CUDA GPU-based parallel computing

NASA Astrophysics Data System (ADS)

Murni, Bustamam, A.; Ernastuti, Handhika, T.; Kerami, D.

2017-07-01

Matrix-vector multiplication in real-world problems often involves large matrices of arbitrary size. Parallelization is therefore needed to speed up a calculation that usually takes a long time. Graph partitioning techniques discussed in previous studies cannot be used to parallelize matrix-vector multiplication for matrices of arbitrary size, because they assume a square, symmetric matrix. Hypergraph partitioning techniques overcome this shortcoming. This paper addresses the efficient parallelization of matrix-vector multiplication through hypergraph partitioning techniques using CUDA GPU-based parallel computing. CUDA (compute unified device architecture) is a parallel computing platform and programming model created by NVIDIA and implemented on its GPUs (graphics processing units).
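To make the idea of partitioned matrix-vector multiplication concrete, the sketch below splits a rectangular matrix into contiguous row blocks and multiplies each block separately. This is a deliberately naive partition chosen for illustration, not the hypergraph partitioning the paper describes, and all function names are assumptions.

```python
def matvec(rows, x):
    """Multiply a list of matrix rows by the vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in rows]


def split_rows(n_rows, n_parts):
    """Return (start, stop) ranges covering n_rows in n_parts near-equal blocks."""
    base, extra = divmod(n_rows, n_parts)
    ranges, start = [], 0
    for p in range(n_parts):
        stop = start + base + (1 if p < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def partitioned_matvec(matrix, x, n_parts):
    # Each block depends only on its own rows and on x, so in a parallel
    # setting every block could be handed to a separate worker.
    y = []
    for start, stop in split_rows(len(matrix), n_parts):
        y.extend(matvec(matrix[start:stop], x))
    return y


matrix = [[1, 2], [3, 4], [5, 6]]  # 3 x 2, i.e. not square
print(partitioned_matvec(matrix, [1, 1], 2))
```

A contiguous row split ignores the sparsity pattern; partitioners such as the hypergraph methods named in the abstract exist precisely to choose blocks that reduce communication between workers.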
Parallel Solver for Diffuse Optical Tomography on Realistic Head Models With Scattering and Clear Regions.

PubMed

Placati, Silvio; Guermandi, Marco; Samore, Andrea; Scarselli, Eleonora Franchi; Guerrieri, Roberto

2016-09-01

Diffuse optical tomography is an imaging technique, based on evaluation of how light propagates within the human head to obtain the functional information about the brain. Precision in reconstructing such an optical properties map is highly affected by the accuracy of the light propagation model implemented, which needs to take into account the presence of clear and scattering tissues. We present a numerical solver based on the radiosity-diffusion model, integrating the anatomical information provided by a structural MRI. The solver is designed to run on parallel heterogeneous platforms based on multiple GPUs and CPUs. We demonstrate how the solver provides a 7 times speed-up over an isotropic-scattered parallel Monte Carlo engine based on a radiative transport equation for a domain composed of 2 million voxels, along with a significant improvement in accuracy. The speed-up greatly increases for larger domains, allowing us to compute the light distribution of a full human head (≈ 3 million voxels) in 116 s for the platform used.

Analysis of parameters for technological equipment of parallel kinematics based on rods of variable length for processing accuracy assurance

NASA Astrophysics Data System (ADS)

Koltsov, A. G.; Shamutdinov, A. H.; Blokhin, D. A.; Krivonos, E. V.

2018-01-01

A new classification of parallel kinematics mechanisms on symmetry coefficient, being proportional to mechanism stiffness and accuracy of the processing product using the technological equipment under study, is proposed. A new version of the Stewart platform with a high symmetry coefficient is presented for analysis. The workspace of the mechanism under study is described, this space being a complex solid figure. The workspace end points are reached by the center of the mobile platform which moves in parallel related to the base plate. Parameters affecting the processing accuracy, namely the static and dynamic stiffness, natural vibration frequencies are determined. The capability assessment of the mechanism operation under various loads, taking into account resonance phenomena at different points of the workspace, was conducted. The study proved that stiffness and therefore, processing accuracy with the use of the above mentioned mechanisms are comparable with the stiffness and accuracy of medium-sized series-produced machines.
Parallel Navier-Stokes computations on shared and distributed memory architectures

NASA Technical Reports Server (NTRS)

Hayder, M. Ehtesham; Jayasimha, D. N.; Pillay, Sasi Kumar

1995-01-01

We study a high order finite difference scheme to solve the time accurate flow field of a jet using the compressible Navier-Stokes equations. As part of our ongoing efforts, we have implemented our numerical model on three parallel computing platforms to study the computational, communication, and scalability characteristics. The platforms chosen for this study are a cluster of workstations connected through fast networks (the LACE experimental testbed at NASA Lewis), a shared memory multiprocessor (the Cray YMP), and a distributed memory multiprocessor (the IBM SP1). Our focus in this study is on the LACE testbed. We present some results for the Cray YMP and the IBM SP1 mainly for comparison purposes. On the LACE testbed, we study: (1) the communication characteristics of Ethernet, FDDI, and the ALLNODE networks and (2) the overheads induced by the PVM message passing library used for parallelizing the application. We demonstrate that clustering of workstations is effective and has the potential to be computationally competitive with supercomputers at a fraction of the cost.
Planning for Pre-Exascale Platform Environment (Fiscal Year 2015 Level 2 Milestone 5216)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Springmeyer, R.; Lang, M.; Noe, J.

This Plan for ASC Pre-Exascale Platform Environments document constitutes the deliverable for the fiscal year 2015 (FY15) Advanced Simulation and Computing (ASC) Program Level 2 milestone Planning for Pre-Exascale Platform Environment. It acknowledges and quantifies challenges and recognized gaps for moving the ASC Program towards effective use of exascale platforms and recommends strategies to address these gaps. This document also presents an update to the concerns, strategies, and plans presented in the FY08 predecessor document that dealt with the upcoming (at the time) petascale high performance computing (HPC) platforms. With the looming push towards exascale systems, a review of the earlier document was appropriate in light of the myriad architectural choices currently under consideration. The ASC Program believes the platforms to be fielded in the 2020s will be fundamentally different systems that stress ASC’s ability to modify codes to take full advantage of new or unique features. In addition, the scale of components will increase the difficulty of maintaining an error-free system, thus driving new approaches to resilience and error detection/correction. The code revamps of the past, from serial- to vector-centric code to distributed memory to threaded implementations, will be revisited as codes adapt to a new message passing interface (MPI) plus “x” or more advanced and dynamic programming models based on architectural specifics. Development efforts are already underway in some cases, and more difficult or uncertain aspects of the new architectures will require research and analysis that may inform future directions for program choices. In addition, the potential diversity of system architectures may require parallel if not duplicative efforts to analyze and modify environments, codes, subsystems, libraries, debugging tools, and performance analysis techniques as well as exploring new monitoring methodologies. It is difficult if not impossible to selectively eliminate some of these activities until more information is available through simulations of potential architectures, analysis of systems designs, and informed study of commodity technologies that will be the constituent parts of future platforms.
The Transition to a Many-core World

NASA Astrophysics Data System (ADS)

Mattson, T. G.

2012-12-01

The need to increase performance within a fixed energy budget has pushed the computer industry to many-core processors. This is grounded in the physics of computing and is not a trend that will just go away. It is hard to overestimate the profound impact of many-core processors on software developers. Virtually every facet of the software development process will need to change to adapt to these new processors. In this talk, we will look at many-core hardware and consider its evolution from a perspective grounded in the CPU. We will show that the number of cores will inevitably increase, but in addition, a quest to maximize performance per watt will push these cores to be heterogeneous. We will show that the inevitable result of these changes is a computing landscape where the distinction between the CPU and the GPU is blurred. We will then consider the much more pressing problem of software in a many-core world. Writing software for heterogeneous many-core processors is well beyond the ability of current programmers. One solution is to support a software development process where programmer teams are split into two distinct groups: a large group of domain-expert productivity programmers and a much smaller team of computer-scientist efficiency programmers. The productivity programmers work in terms of high level frameworks to express the concurrency in their problems while avoiding any details for how that concurrency is exploited. The second group, the efficiency programmers, map applications expressed in terms of these frameworks onto the target many-core system. In other words, we can solve the many-core software problem by creating a software infrastructure that only requires a small subset of programmers to become master parallel programmers. This is different from the discredited dream of automatic parallelism. 
Note that productivity programmers still need to define the architecture of their software in a way that exposes the concurrency inherent in their problem. We submit that domain-expert programmers understand "what is concurrent". The parallel programming problem emerges from the complexity of "how that concurrency is utilized" on real hardware. The research described in this talk was carried out in collaboration with the ParLab at UC Berkeley. We use a design pattern language to define the high level frameworks exposed to domain-expert, productivity programmers. We then use tools from the SEJITS project (Selective embedded Just In time Specializers) to build the software transformation tool chains that turn these framework-oriented designs into highly efficient code. The final ingredient is a software platform to serve as a target for these tools. One such platform is the OpenCL industry standard for programming heterogeneous systems. We will briefly describe OpenCL and show how it provides a vendor-neutral software target for current and future many-core systems: CPU-based, GPU-based, and heterogeneous combinations of the two.
Parallel and Preemptable Dynamically Dimensioned Search Algorithms for Single and Multi-objective Optimization in Water Resources

NASA Astrophysics Data System (ADS)

Tolson, B.; Matott, L. S.; Gaffoor, T. A.; Asadzadeh, M.; Shafii, M.; Pomorski, P.; Xu, X.; Jahanpour, M.; Razavi, S.; Haghnegahdar, A.; Craig, J. R.

2015-12-01

We introduce asynchronous parallel implementations of the Dynamically Dimensioned Search (DDS) family of algorithms including DDS, discrete DDS, PA-DDS and DDS-AU. These parallel algorithms are unique from most existing parallel optimization algorithms in the water resources field in that parallel DDS is asynchronous and does not require an entire population (set of candidate solutions) to be evaluated before generating and then sending a new candidate solution for evaluation. One key advance in this study is developing the first parallel PA-DDS multi-objective optimization algorithm. The other key advance is enhancing the computational efficiency of solving optimization problems (such as model calibration) by combining a parallel optimization algorithm with the deterministic model pre-emption concept. These two efficiency techniques can only be combined because of the asynchronous nature of parallel DDS. Model pre-emption functions to terminate simulation model runs early, prior to completely simulating the model calibration period for example, when intermediate results indicate the candidate solution is so poor that it will definitely have no influence on the generation of further candidate solutions. The computational savings of deterministic model preemption available in serial implementations of population-based algorithms (e.g., PSO) disappear in synchronous parallel implementations as these algorithms. In addition to the key advances above, we implement the algorithms across a range of computation platforms (Windows and Unix-based operating systems from multi-core desktops to a supercomputer system) and package these for future modellers within a model-independent calibration software package called Ostrich as well as MATLAB versions. 
Results across multiple platforms and multiple case studies (from 4 to 64 processors) demonstrate the vast improvement over serial DDS-based algorithms and highlight the important role model pre-emption plays in the performance of parallel, pre-emptable DDS algorithms. Case studies include single- and multiple-objective optimization problems in water resources model calibration and in many cases linear or near linear speedups are observed.
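The model pre-emption idea described above can be sketched in a few lines. The example assumes the calibration objective is a sum of squared errors accumulated time step by time step, so the running total can only grow; that objective, the function name `preemptable_sse`, and the toy models are all assumptions for illustration and are not taken from Ostrich.

```python
def preemptable_sse(observed, simulate_step, best_sse):
    """Accumulate squared error step by step; stop once it exceeds best_sse.

    Returns (sse, steps_run). Because the running sum never decreases,
    stopping early cannot change which candidate is judged better.
    """
    sse = 0.0
    for t, obs in enumerate(observed):
        sse += (simulate_step(t) - obs) ** 2
        if sse > best_sse:
            return sse, t + 1  # pre-empted: already worse than the best so far
    return sse, len(observed)


observed = [1.0, 2.0, 3.0, 4.0]

# A candidate that matches the observations runs to completion.
print(preemptable_sse(observed, lambda t: t + 1.0, best_sse=50.0))

# A poor candidate is cut off after the first step.
print(preemptable_sse(observed, lambda t: 10.0, best_sse=50.0))
```

In an asynchronous parallel search, each worker can apply this check against the best objective value known when its run started, which is consistent with the abstract's point that the two techniques combine only because parallel DDS is asynchronous.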
A Cloud Based Real-Time Collaborative Platform for eHealth.

PubMed

Ionescu, Bogdan; Gadea, Cristian; Solomon, Bogdan; Ionescu, Dan; Stoicu-Tivadar, Vasile; Trifan, Mircea

2015-01-01

For more than a decade, the eHealth initiative has been a government concern of many countries. In an Electronic Health Record (EHR) System, there is a need for sharing the data with a group of specialists simultaneously. Collaborative platforms alone are only part of a solution; what is urgently needed is a collaborative platform with parallel editing capabilities and synchronized data streaming. In this paper, the design and implementation of a collaborative platform used in healthcare is introduced by describing the high level architecture and its implementation. A series of eHealth services are identified and usage examples in a healthcare environment are given.
HPCC Methodologies for Structural Design and Analysis on Parallel and Distributed Computing Platforms

NASA Technical Reports Server (NTRS)

Farhat, Charbel

1998-01-01

In this grant, we have proposed a three-year research effort focused on developing High Performance Computation and Communication (HPCC) methodologies for structural analysis on parallel processors and clusters of workstations, with emphasis on reducing the structural design cycle time. Besides consolidating and further improving the FETI solver technology to address plate and shell structures, we have proposed to tackle the following design related issues: (a) parallel coupling and assembly of independently designed and analyzed three-dimensional substructures with non-matching interfaces, (b) fast and smart parallel re-analysis of a given structure after it has undergone design modifications, (c) parallel evaluation of sensitivity operators (derivatives) for design optimization, and (d) fast parallel analysis of mildly nonlinear structures. While our proposal was accepted, support was provided only for one year.
Hierarchical random cellular neural networks for system-level brain-like signal processing.

PubMed

Kozma, Robert; Puljic, Marko

2013-09-01

Sensory information processing and cognition in brains are modeled using dynamic systems theory. The brain's dynamic state is described by a trajectory evolving in a high-dimensional state space. We introduce a hierarchy of random cellular automata as the mathematical tools to describe the spatio-temporal dynamics of the cortex. The corresponding brain model is called neuropercolation which has distinct advantages compared to traditional models using differential equations, especially in describing spatio-temporal discontinuities in the form of phase transitions. Phase transitions demarcate singularities in brain operations at critical conditions, which are viewed as hallmarks of higher cognition and awareness experience. The introduced Monte-Carlo simulations obtained by parallel computing point to the importance of computer implementations using very large-scale integration (VLSI) and analog platforms. Copyright © 2013 Elsevier Ltd. All rights reserved.
Distributed Fast Self-Organized Maps for Massive Spectrophotometric Data Analysis.

PubMed

Dafonte, Carlos; Garabato, Daniel; Álvarez, Marco A; Manteiga, Minia

2018-05-03

Analyzing huge amounts of data becomes essential in the era of Big Data, where databases are populated with hundreds of Gigabytes that must be processed to extract knowledge. Hence, classical algorithms must be adapted towards distributed computing methodologies that leverage the underlying computational power of these platforms. Here, a parallel, scalable, and optimized design for self-organized maps (SOM) is proposed in order to analyze massive data gathered by the spectrophotometric sensor of the European Space Agency (ESA) Gaia spacecraft, although it could be extrapolated to other domains. The performance comparison between the sequential implementation and the distributed ones based on Apache Hadoop and Apache Spark is an important part of the work, as well as the detailed analysis of the proposed optimizations. Finally, a domain-specific visualization tool to explore astronomical SOMs is presented.
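For readers unfamiliar with self-organized maps, a single training step can be sketched as follows. The sketch assumes a one-dimensional map and a simple "bubble" neighborhood (all units within a fixed index distance of the winner are updated equally); these choices and the function names are illustrative assumptions, not the design proposed in the paper.

```python
def best_matching_unit(weights, sample):
    """Index of the unit whose weight vector is closest to the sample."""
    def dist2(w):
        return sum((wi - si) ** 2 for wi, si in zip(w, sample))
    return min(range(len(weights)), key=lambda i: dist2(weights[i]))


def som_step(weights, sample, learning_rate, radius):
    """Move the winner and its neighbors on a 1-D map toward the sample."""
    bmu = best_matching_unit(weights, sample)
    for i, w in enumerate(weights):
        if abs(i - bmu) <= radius:
            weights[i] = [wi + learning_rate * (si - wi)
                          for wi, si in zip(w, sample)]
    return bmu


weights = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]
winner = som_step(weights, [1.0, 2.0], learning_rate=0.5, radius=0)
print(winner, weights)
```

The search for the best matching unit is a distance computation repeated over many samples and units, which is the kind of workload that lends itself to distribution across a cluster.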
Developing an Intelligent Diagnosis and Assessment E-Learning Tool for Introductory Programming

ERIC Educational Resources Information Center

Huang, Chenn-Jung; Chen, Chun-Hua; Luo, Yun-Cheng; Chen, Hong-Xin; Chuang, Yi-Ta

2008-01-01

Recently, many open source e-learning platforms have been offered for free on the Internet. We thus incorporate the intelligent diagnosis and assessment tool into an open software e-learning platform developed for programming language courses, wherein the proposed learning diagnosis assessment tools based on text mining and machine learning…
minimega

DOE Office of Scientific and Technical Information (OSTI.GOV)

David Fritz, John Floren

2013-08-27

Minimega is a simple emulytics platform for creating testbeds of networked devices. The platform consists of easily deployable tools to facilitate bringing up large networks of virtual machines including Windows, Linux, and Android. Minimega attempts to allow experiments to be brought up quickly with nearly no configuration. Minimega also includes tools for simple cluster management, as well as tools for creating Linux based virtual machine images.
Practical Formal Verification of MPI and Thread Programs

NASA Astrophysics Data System (ADS)

Gopalakrishnan, Ganesh; Kirby, Robert M.

Large-scale simulation codes in science and engineering are written using the Message Passing Interface (MPI). Shared memory threads are widely used directly, or to implement higher level programming abstractions. Traditional debugging methods for MPI or thread programs are incapable of providing useful formal guarantees about coverage. They get bogged down in the sheer number of interleavings (schedules), often missing shallow bugs. In this tutorial we will introduce two practical formal verification tools: ISP (for MPI C programs) and Inspect (for Pthread C programs). Unlike other formal verification tools, ISP and Inspect run directly on user source codes (much like a debugger). They pursue only the relevant set of process interleavings, using our own customized Dynamic Partial Order Reduction algorithms. For a given test harness, DPOR allows these tools to guarantee the absence of deadlocks, instrumented MPI object leaks and communication races (using ISP), and shared memory races (using Inspect). ISP and Inspect have been used to verify large pieces of code: in excess of 10,000 lines of MPI/C for ISP in under 5 seconds, and about 5,000 lines of Pthread/C code in a few hours (and much faster with the use of a cluster or by exploiting special cases such as symmetry) for Inspect. We will also demonstrate the Microsoft Visual Studio and Eclipse Parallel Tools Platform integrations of ISP (these will be available on the LiveCD).
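The "sheer number of interleavings" can be made concrete with a toy example. The sketch below exhaustively enumerates every schedule of two threads that each read a shared counter and then write back the value plus one, and groups the schedules by final result. It is a brute-force illustration only: it does not implement Dynamic Partial Order Reduction and does not use ISP or Inspect, and all names are hypothetical.

```python
from itertools import combinations


def interleavings(n_a, n_b):
    """Yield every schedule of n_a steps of thread 'A' and n_b steps of 'B'."""
    total = n_a + n_b
    for a_slots in combinations(range(total), n_a):
        slots = set(a_slots)
        yield ["A" if i in slots else "B" for i in range(total)]


def run_increment(schedule):
    """Each thread does: local = shared (read), then shared = local + 1 (write)."""
    shared = 0
    local = {}
    for thread in schedule:
        if thread not in local:   # first step of this thread: read
            local[thread] = shared
        else:                     # second step: write back
            shared = local[thread] + 1
    return shared


outcomes = {}
for schedule in interleavings(2, 2):
    outcomes.setdefault(run_increment(schedule), []).append("".join(schedule))
print(outcomes)
```

Even with two steps per thread there are six schedules, and four of them lose an update. The count grows as a binomial coefficient in the number of steps, which is why tools that explore only the relevant interleavings matter.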
Optimization of Microelectronic Devices for Sensor Applications

NASA Technical Reports Server (NTRS)

Cwik, Tom; Klimeck, Gerhard

2000-01-01

The NASA/JPL goal to reduce payload in future space missions while increasing mission capability demands miniaturization of active and passive sensors, analytical instruments and communication systems among others. Currently, typical system requirements include the detection of particular spectral lines, associated data processing, and communication of the acquired data to other systems. Advances in lithography and deposition methods result in more advanced devices for space application, while the sub-micron resolution currently available opens a vast design space. Though an experimental exploration of this widening design space (searching for optimized performance by repeated fabrication efforts) is unfeasible, it does motivate the development of reliable software design tools. These tools necessitate models based on fundamental physics and mathematics of the device to accurately model effects such as diffraction and scattering in opto-electronic devices, or bandstructure and scattering in heterostructure devices. The software tools must have convenient turn-around times and interfaces that allow effective usage. The first issue is addressed by the application of high-performance computers and the second by the development of graphical user interfaces driven by properly developed data structures. These tools can then be integrated into an optimization environment, and with the available memory capacity and computational speed of high performance parallel platforms, simulation of optimized components can proceed. In this paper, specific applications of the electromagnetic modeling of infrared filtering, as well as heterostructure device design will be presented using genetic algorithm global optimization methods.
Solution Conformations of Graphene Oxide Sheets, and Two-Dimensional Nanofluidics

NASA Astrophysics Data System (ADS)

Koltonow, Andrew R.

This work reports studies on the physical properties of collections of nanosheets. First, the configurations of graphene oxide sheets in solution are studied. Polarized optical microscopy reveals quickly and decisively that sheets remain flat and form lyotropic liquid crystals over a wide range of solvent conditions. When solvent conditions are inhospitable enough, sheets agglomerate into stacks rather than crumpling upon themselves. Theory and simulation suggest that the crumpled state, which can be formed by compressing sheets, is metastable. This work might correct a persistent misunderstanding about the solution physics of graphene oxide. The other major area of study concerns the hydration layers in between lamellar stacks of exfoliated, restacked nanosheets. These layers comprise massive arrays of parallel two-dimensional nanofluidic channels, which exhibit enhanced unipolar ionic conductivity with counterions as the majority charge carriers. Based on the previously discovered graphene oxide nanofluidic platform, exfoliated vermiculite nanofluidic channels are constructed, which shuttle protons through the hydration channels by a Grotthuss mechanism, and which show superior thermal stability to graphene oxide. The 2D nanofluidics platform is also used to demonstrate "kirigami nanofluidics", where ion transport can be manipulated by cutting the film into specific shapes. This can give rise to ionic current rectification. The rectification effect is attributed to the size and shape mismatch of the concentration polarization zones developed at the inlets and outlets of the nanofluidic channels. The kirigami nanofluidic platform can be used to fabricate ionic diodes and other simple devices. This material platform is expected to be a useful tool for nanofluidics researchers, because it offers a way to carry out nanofluidic experiments quickly with minimal equipment and little expense.
Determination of High-affinity Antibody-antigen Binding Kinetics Using Four Biosensor Platforms.

PubMed

Yang, Danlin; Singh, Ajit; Wu, Helen; Kroe-Barrett, Rachel

2017-04-17

Label-free optical biosensors are powerful tools in drug discovery for the characterization of biomolecular interactions. In this study, we describe the use of four routinely used biosensor platforms in our laboratory to evaluate the binding affinity and kinetics of ten high-affinity monoclonal antibodies (mAbs) against human proprotein convertase subtilisin kexin type 9 (PCSK9). While both Biacore T100 and ProteOn XPR36 are derived from the well-established Surface Plasmon Resonance (SPR) technology, the former has four flow cells connected by serial flow configuration, whereas the latter presents 36 reaction spots in parallel through an improvised 6 x 6 crisscross microfluidic channel configuration. The IBIS MX96 also operates based on the SPR sensor technology, with an additional imaging feature that provides detection in spatial orientation. This detection technique coupled with the Continuous Flow Microspotter (CFM) expands the throughput significantly by enabling multiplex array printing and detection of 96 reaction spots simultaneously. In contrast, the Octet RED384 is based on the BioLayer Interferometry (BLI) optical principle, with fiber-optic probes acting as the biosensor to detect interference pattern changes upon binding interactions at the tip surface. Unlike the SPR-based platforms, the BLI system does not rely on continuous flow fluidics; instead, the sensor tips collect readings while they are immersed in analyte solutions of a 384-well microplate during orbital agitation. Each of these biosensor platforms has its own advantages and disadvantages. To provide a direct comparison of these instruments' ability to provide quality kinetic data, the described protocols illustrate experiments that use the same assay format and the same high-quality reagents to characterize antibody-antigen kinetics that fit the simple 1:1 molecular interaction model.
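The simple 1:1 interaction model mentioned above has a standard closed form, sketched here for reference. The association rate constant `ka`, dissociation rate constant `kd`, maximum response `rmax`, and the numeric values in the usage lines are illustrative assumptions and are not data from the study.

```python
import math


def equilibrium_dissociation_constant(ka, kd):
    """KD = kd / ka, in molar units when ka is in 1/(M*s) and kd in 1/s."""
    return kd / ka


def association_response(t, conc, ka, kd, rmax):
    """Sensor response at time t while analyte at concentration conc is bound."""
    kobs = ka * conc + kd             # observed rate constant
    req = rmax * ka * conc / kobs     # steady-state response at this conc
    return req * (1.0 - math.exp(-kobs * t))


def dissociation_response(t, r0, kd):
    """Response t seconds after analyte is removed, starting from r0."""
    return r0 * math.exp(-kd * t)


ka, kd = 1e5, 1e-4                    # hypothetical rate constants
print(equilibrium_dissociation_constant(ka, kd))
print(association_response(1e6, 1e-9, ka, kd, rmax=100.0))
```

With the analyte concentration equal to KD, the steady-state response is half of `rmax`, which is a convenient sanity check when fitting curves of this form.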
NSTX-U Control System Upgrades

DOE PAGES

Erickson, K. G.; Gates, D. A.; Gerhardt, S. P.; ...

2014-06-01

The National Spherical Tokamak Experiment (NSTX) is undergoing a wealth of upgrades (NSTX-U). These upgrades, especially including an elongated pulse length, require broad changes to the control system that has served NSTX well. A new fiber serial Front Panel Data Port input and output (I/O) stream will supersede the aging copper parallel version. Driver support for the new I/O and cyber security concerns require updating the operating system from Redhat Enterprise Linux (RHEL) v4 to RedHawk (based on RHEL) v6. While the basic control system continues to use the General Atomics Plasma Control System (GA PCS), the effort to forward port the entire software package to run under 64-bit Linux instead of 32-bit Linux included PCS modifications subsequently shared with GA and other PCS users. Software updates focused on three key areas: (1) code modernization through coding standards (C99/C11), (2) code portability and maintainability through use of the GA PCS code generator, and (3) support of 64-bit platforms. Central to the control system upgrade is the use of a complete real time (RT) Linux platform provided by Concurrent Computer Corporation, consisting of a computer (iHawk), an operating system and drivers (RedHawk), and RT tools (NightStar). Strong vendor support coupled with an extensive RT toolset influenced this decision. The new real-time Linux platform, I/O, and software engineering will foster enhanced capability and performance for NSTX-U plasma control.
A Re-programmable Platform for Dynamic Burn-in Test of Xilinx Virtex-II 3000 FPGA for Military and Aerospace Applications

NASA Technical Reports Server (NTRS)

Roosta, Ramin; Wang, Xinchen; Sadigursky, Michael; Tracton, Phil

2004-01-01

Field Programmable Gate Arrays (FPGA) have played increasingly important roles in military and aerospace applications. Xilinx SRAM-based FPGAs have been extensively used in commercial applications. They have been used less frequently in space flight applications due to their susceptibility to single-event upsets. Reliability of these devices in space applications is a concern that has not been addressed. The objective of this project is to design a fully programmable hardware/software platform that allows (but is not limited to) comprehensive static/dynamic burn-in test of Virtex-II 3000 FPGAs, at speed test and SEU test. Conventional methods test very few discrete AC parameters (primarily switching) of a given integrated circuit. This approach will test any possible configuration of the FPGA and any associated performance parameters. It allows complete or partial re-programming of the FPGA and verification of the program by using read back followed by dynamic test. Designers have full control over which functional elements of the FPGA to stress. They can completely simulate all possible types of configurations/functions. Another benefit of this platform is that it allows collecting information on elevation of the junction temperature as a function of gate utilization, operating frequency and functionality. A software tool has been implemented to demonstrate the various features of the system. The software consists of three major parts: the parallel interface driver, main system procedure and a graphical user interface (GUI).
Computer-aided programming for message-passing system; Problems and a solution

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wu, M.Y.; Gajski, D.D.

1989-12-01

As the number of processors and the complexity of problems to be solved increase, programming multiprocessing systems becomes more difficult and error-prone. Program development tools are necessary since programmers are not able to develop complex parallel programs efficiently. Parallel models of computation, parallelization problems, and tools for computer-aided programming (CAP) are discussed. As an example, a CAP tool that performs scheduling and inserts communication primitives automatically is described. It also generates the performance estimates and other program quality measures to help programmers in improving their algorithms and programs.
Automatic Generation of Directive-Based Parallel Programs for Shared Memory Parallel Systems

NASA Technical Reports Server (NTRS)

Jin, Hao-Qiang; Yan, Jerry; Frumkin, Michael

2000-01-01

The shared-memory programming model is a very effective way to achieve parallelism on shared memory parallel computers. As great progress was made in hardware and software technologies, performance of parallel programs with compiler directives has demonstrated large improvement. The introduction of OpenMP directives, the industrial standard for shared-memory programming, has minimized the issue of portability. Due to its ease of programming and its good performance, the technique has become very popular. In this study, we have extended CAPTools, a computer-aided parallelization toolkit, to automatically generate directive-based, OpenMP, parallel programs. We outline techniques used in the implementation of the tool and present test results on the NAS parallel benchmarks and ARC3D, a CFD application. This work demonstrates the great potential of using computer-aided tools to quickly port parallel programs and also achieve good performance.
C 3, A Command-line Catalog Cross-match Tool for Large Astrophysical Catalogs

NASA Astrophysics Data System (ADS)

Riccio, Giuseppe; Brescia, Massimo; Cavuoti, Stefano; Mercurio, Amata; di Giorgio, Anna Maria; Molinari, Sergio

2017-02-01

Modern Astrophysics is based on multi-wavelength data organized into large and heterogeneous catalogs. Hence, the need for efficient, reliable and scalable catalog cross-matching methods plays a crucial role in the era of the petabyte scale. Furthermore, multi-band data often have very different angular resolutions, requiring the highest generality of cross-matching features, mainly in terms of region shape and resolution. In this work we present C 3 (Command-line Catalog Cross-match), a multi-platform application designed to efficiently cross-match massive catalogs. It is based on a multi-core parallel processing paradigm and conceived to be executed as a stand-alone command-line process or integrated within any generic data reduction/analysis pipeline, providing the maximum flexibility to the end-user, in terms of portability, parameter configuration, catalog formats, angular resolution, region shapes, coordinate units and cross-matching types. Using real data, extracted from public surveys, we discuss the cross-matching capabilities and computing time efficiency also through a direct comparison with some publicly available tools, chosen among the most used within the community, and representative of different interface paradigms. We verified that the C 3 tool has excellent capabilities to perform an efficient and reliable cross-matching between large data sets. Although the elliptical cross-match and the parametric handling of angular orientation and offset are known concepts in the astrophysical context, their availability in the presented command-line tool makes C 3 competitive in the context of public astronomical tools.
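As a rough illustration of what a positional cross-match involves, the sketch below pairs each source in one catalog with its nearest neighbour in another, within a circular search radius. It is a brute-force, single-threaded simplification and is not the C 3 algorithm: the elliptical regions, orientation handling and multi-core processing described in the abstract are not reproduced, and the function names are hypothetical.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula), inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    a = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    # Clamp to 1.0 to guard against rounding slightly above the asin domain.
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(a))))


def cross_match(cat_a, cat_b, radius_deg):
    """Return (index_a, index_b, separation_deg) for the nearest match within radius.

    cat_a and cat_b are lists of (ra_deg, dec_deg) tuples. The loop is O(N*M);
    a production tool would use spatial indexing instead.
    """
    matches = []
    for i, (ra_a, dec_a) in enumerate(cat_a):
        best = None
        for j, (ra_b, dec_b) in enumerate(cat_b):
            sep = angular_separation_deg(ra_a, dec_a, ra_b, dec_b)
            if sep <= radius_deg and (best is None or sep < best[1]):
                best = (j, sep)
        if best is not None:
            matches.append((i, best[0], best[1]))
    return matches
```

A typical call would pass a radius expressed in degrees, for example `5.0 / 3600.0` for a 5 arcsecond search radius.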

The R package "sperrorest" : Parallelized spatial error estimation and variable importance assessment for geospatial machine learning

NASA Astrophysics Data System (ADS)

Schratz, Patrick; Herrmann, Tobias; Brenning, Alexander

2017-04-01

Computational and statistical prediction methods such as the support vector machine have gained popularity in remote-sensing applications in recent years and are often compared to more traditional approaches like maximum-likelihood classification. However, the accuracy assessment of such predictive models in a spatial context needs to account for the presence of spatial autocorrelation in geospatial data by using spatial cross-validation and bootstrap strategies instead of their now more widely used non-spatial equivalent. The R package sperrorest by A. Brenning [IEEE International Geoscience and Remote Sensing Symposium, 1, 374 (2012)] provides a generic interface for performing (spatial) cross-validation of any statistical or machine-learning technique available in R. Since spatial statistical models as well as flexible machine-learning algorithms can be computationally expensive, parallel computing strategies are required to perform cross-validation efficiently. The most recent major release of sperrorest therefore comes with two new features (aside from improved documentation): The first one is the parallelized version of sperrorest(), parsperrorest(). This function features two parallel modes to greatly speed up cross-validation runs. Both parallel modes are platform independent and provide progress information. par.mode = 1 relies on the pbapply package and calls interactively (depending on the platform) parallel::mclapply() or parallel::parApply() in the background. While forking is used on Unix-Systems, Windows systems use a cluster approach for parallel execution. par.mode = 2 uses the foreach package to perform parallelization. This method uses a different way of cluster parallelization than the parallel package does. In summary, the robustness of parsperrorest() is increased with the implementation of two independent parallel modes. A new way of partitioning the data in sperrorest is provided by partition.factor.cv(). 
This function gives the user the possibility to perform cross-validation at the level of some grouping structure. As an example, in remote sensing of agricultural land uses, pixels from the same field contain nearly identical information and will thus be jointly placed in either the test set or the training set. Other spatial resampling strategies are already available and can be extended by the user.
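The idea of cross-validating at the level of a grouping structure can be sketched as follows. This is an illustration in Python rather than R, and it does not reproduce the signature or behaviour of partition.factor.cv(); the function name and arguments are hypothetical. Every observation sharing a group label (for example, pixels from the same field) lands in the same fold.

```python
import random


def partition_by_group(groups, n_folds, seed=0):
    """Assign whole groups to folds and return one list of test indices per fold.

    groups is a sequence with one group label per observation. Labels are
    shuffled reproducibly and dealt out to folds in round-robin order, so no
    group is ever split between a training set and a test set.
    """
    labels = sorted(set(groups))
    rng = random.Random(seed)
    rng.shuffle(labels)
    fold_of = {label: k % n_folds for k, label in enumerate(labels)}
    folds = [[] for _ in range(n_folds)]
    for idx, label in enumerate(groups):
        folds[fold_of[label]].append(idx)
    return folds


if __name__ == "__main__":
    field_ids = ["f1", "f1", "f2", "f3", "f3", "f3", "f4"]
    for k, test_idx in enumerate(partition_by_group(field_ids, n_folds=2)):
        train_idx = [i for i in range(len(field_ids)) if i not in test_idx]
        print("fold", k, "test:", test_idx, "train:", train_idx)
```

Because folds are built from groups rather than individual observations, fold sizes can be unequal when group sizes differ.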
Solving optimization problems on computational grids.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wright, S. J.; Mathematics and Computer Science

2001-05-01

Multiprocessor computing platforms, which have become more and more widely available since the mid-1980s, are now heavily used by organizations that need to solve very demanding computational problems. Parallel computing is now central to the culture of many research communities. Novel parallel approaches were developed for global optimization, network optimization, and direct-search methods for nonlinear optimization. Activity was particularly widespread in parallel branch-and-bound approaches for various problems in combinatorial and network optimization. As the cost of personal computers and low-end workstations has continued to fall, while the speed and capacity of processors and networks have increased dramatically, 'cluster' platforms have become popular in many settings. A somewhat different type of parallel computing platform known as a computational grid (alternatively, metacomputer) has arisen in comparatively recent times. Broadly speaking, this term refers not to a multiprocessor with identical processing nodes but rather to a heterogeneous collection of devices that are widely distributed, possibly around the globe. The advantage of such platforms is obvious: they have the potential to deliver enormous computing power. Just as obviously, however, the complexity of grids makes them very difficult to use. The Condor team, headed by Miron Livny at the University of Wisconsin, were among the pioneers in providing infrastructure for grid computations. More recently, the Globus project has developed technologies to support computations on geographically distributed platforms consisting of high-end computers, storage and visualization devices, and other scientific instruments. In 1997, we started the metaneos project as a collaborative effort between optimization specialists and the Condor and Globus groups.
Our aim was to address complex, difficult optimization problems in several areas, designing and implementing the algorithms and the software infrastructure needed to solve these problems on computational grids. This article describes some of the results we have obtained during the first three years of the metaneos project. Our efforts have led to development of the runtime support library MW for implementing algorithms with master-worker control structure on Condor platforms. This work is discussed here, along with work on algorithms and codes for integer linear programming, the quadratic assignment problem, and stochastic linear programming. Our experiences in the metaneos project have shown that cheap, powerful computational grids can be used to tackle large optimization problems of various types. In an industrial or commercial setting, the results demonstrate that one may not have to buy powerful computational servers to solve many of the large problems arising in areas such as scheduling, portfolio optimization, or logistics; the idle time on employee workstations (or, at worst, an investment in a modest cluster of PCs) may do the job. For the optimization research community, our results motivate further work on parallel, grid-enabled algorithms for solving very large problems of other types. The fact that very large problems can be solved cheaply allows researchers to better understand issues of 'practical' complexity and of the role of heuristics.
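The master-worker control structure named in this abstract can be sketched in a few lines. The example below is not the MW library API: it uses Python's standard concurrent.futures thread pool as a stand-in for workers on a Condor pool, and the objective function, chunking scheme and function names are hypothetical choices made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def objective(x):
    """Toy objective to minimise; a real application would do expensive work here."""
    return (x - 3) ** 2 + 1


def worker(chunk):
    """Worker: evaluate every candidate in its chunk and report the best (value, x)."""
    return min((objective(x), x) for x in chunk)


def master(candidates, n_workers=4, chunk_size=25):
    """Master: split the candidates into tasks, farm them out, and merge the results."""
    chunks = [candidates[i:i + chunk_size]
              for i in range(0, len(candidates), chunk_size)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(worker, chunks))
    return min(results)


if __name__ == "__main__":
    best_value, best_x = master(list(range(-50, 51)))
    print("best x:", best_x, "objective:", best_value)
```

The design point is that workers never talk to each other; only the master holds global state, which is what makes the pattern tolerant of workers that join or leave a grid at unpredictable times.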
Gas chromatography fractionation platform featuring parallel flame-ionization detection and continuous high-resolution analyte collection in 384-well plates.

PubMed

Jonker, Willem; Clarijs, Bas; de Witte, Susannah L; van Velzen, Martin; de Koning, Sjaak; Schaap, Jaap; Somsen, Govert W; Kool, Jeroen

2016-09-02

Gas chromatography (GC) is a superior separation technique for many compounds. However, fractionation of a GC eluate for analyte isolation and/or post-column off-line analysis is not straightforward, and existing platforms are limited in the number of fractions that can be collected. Moreover, aerosol formation may cause serious analyte losses. Previously, our group has developed a platform that resolved these limitations of GC fractionation by post-column infusion of a trap solvent prior to continuous small-volume fraction collection in a 96-well plate (Pieke et al., 2013 [17]). Still, this GC fractionation set-up lacked a chemical detector for the on-line recording of chromatograms, and the introduction of trap solvent resulted in extensive peak broadening for late-eluting compounds. This paper reports advancements to the fractionation platform allowing flame ionization detection (FID) parallel to high-resolution collection of full GC chromatograms in up to 384 nanofractions of 7 s each. To this end, a post-column split was incorporated which directs part of the eluate towards FID. Furthermore, a solvent heating device was developed for stable delivery of preheated/vaporized trap solvent, which significantly reduced band broadening by post-column infusion. In order to achieve optimal analyte trapping, several solvents were tested at different flow rates. The repeatability of the optimized GC fraction collection process was assessed demonstrating the possibility of up-concentration of isolated analytes by repetitive analyses of the same sample. The feasibility of the improved GC fractionation platform for bioactivity screening of toxic compounds was studied by the analysis of a mixture of test pesticides, which after fractionation were subjected to a post-column acetylcholinesterase (AChE) assay. Fractions showing AChE inhibition could be unambiguously correlated with peaks from the parallel-recorded FID chromatogram. Copyright © 2016 Elsevier B.V.
All rights reserved.
Exploiting Parallel R in the Cloud with SPRINT

PubMed Central

Piotrowski, M.; McGilvary, G.A.; Sloan, T. M.; Mewissen, M.; Lloyd, A.D.; Forster, T.; Mitchell, L.; Ghazal, P.; Hill, J.

2012-01-01

Background Advances in DNA Microarray devices and next-generation massively parallel DNA sequencing platforms have led to an exponential growth in data availability but the arising opportunities require adequate computing resources. High Performance Computing (HPC) in the Cloud offers an affordable way of meeting this need. Objectives Bioconductor, a popular tool for high-throughput genomic data analysis, is distributed as add-on modules for the R statistical programming language but R has no native capabilities for exploiting multi-processor architectures. SPRINT is an R package that enables easy access to HPC for genomics researchers. This paper investigates: setting up and running SPRINT-enabled genomic analyses on Amazon’s Elastic Compute Cloud (EC2), the advantages of submitting applications to EC2 from different parts of the world and, if resource underutilization can improve application performance. Methods The SPRINT parallel implementations of correlation, permutation testing, partitioning around medoids and the multi-purpose papply have been benchmarked on data sets of various size on Amazon EC2. Jobs have been submitted from both the UK and Thailand to investigate monetary differences. Results It is possible to obtain good, scalable performance but the level of improvement is dependent upon the nature of the algorithm. Resource underutilization can further improve the time to result. End-user’s location impacts on costs due to factors such as local taxation. Conclusions Although not designed to satisfy HPC requirements, Amazon EC2 and cloud computing in general provides an interesting alternative and provides new possibilities for smaller organisations with limited funds. PMID:23223611
Exploiting parallel R in the cloud with SPRINT.

PubMed

Piotrowski, M; McGilvary, G A; Sloan, T M; Mewissen, M; Lloyd, A D; Forster, T; Mitchell, L; Ghazal, P; Hill, J

2013-01-01

Advances in DNA Microarray devices and next-generation massively parallel DNA sequencing platforms have led to an exponential growth in data availability but the arising opportunities require adequate computing resources. High Performance Computing (HPC) in the Cloud offers an affordable way of meeting this need. Bioconductor, a popular tool for high-throughput genomic data analysis, is distributed as add-on modules for the R statistical programming language but R has no native capabilities for exploiting multi-processor architectures. SPRINT is an R package that enables easy access to HPC for genomics researchers. This paper investigates: setting up and running SPRINT-enabled genomic analyses on Amazon's Elastic Compute Cloud (EC2), the advantages of submitting applications to EC2 from different parts of the world and, if resource underutilization can improve application performance. The SPRINT parallel implementations of correlation, permutation testing, partitioning around medoids and the multi-purpose papply have been benchmarked on data sets of various size on Amazon EC2. Jobs have been submitted from both the UK and Thailand to investigate monetary differences. It is possible to obtain good, scalable performance but the level of improvement is dependent upon the nature of the algorithm. Resource underutilization can further improve the time to result. End-user's location impacts on costs due to factors such as local taxation. Although not designed to satisfy HPC requirements, Amazon EC2 and cloud computing in general provides an interesting alternative and provides new possibilities for smaller organisations with limited funds.
Parallelization of Rocket Engine Simulator Software (PRESS)

NASA Technical Reports Server (NTRS)

Cezzar, Ruknet

1998-01-01

We have outlined our work in the last half of the funding period. We have shown how a demo package for RESSAP using MPI can be done. However, we also mentioned the difficulties with the UNIX platform. We have reiterated some of the suggestions made during the presentation of the progress at the Fourth Annual HBCU Conference. Although we have discussed, in some detail, how TURBDES/PUMPDES software can be run in parallel using MPI, at present, we are unable to experiment any further with either MPI or PVM. Due to X windows not being implemented, we are also not able to experiment further with XPVM, which it will be recalled, has a nice GUI interface. There are also some concerns, on our part, about MPI being an appropriate tool. The best thing about MPI is that it is public domain. Although plenty of documentation exists for the intricacies of using MPI, little information is available on its actual implementations. Other than very typical, somewhat contrived examples, such as the Jacobi algorithm for solving Laplace's equation, there are few examples which can readily be applied to real situations, such as in our case. In effect, the review of literature on both MPI and PVM, and there is a lot, indicates something similar to the enormous effort which was spent on LISP and LISP-like languages as tools for artificial intelligence research. During the development of a book on programming languages [12], when we searched the literature for very simple examples like taking averages, reading and writing records, multiplying matrices, etc., we could hardly find any! Yet, so much was said and done on that topic in academic circles. It appears that we faced the same problem with MPI, where despite significant documentation, we could not find even a simple example which supports coarse-grain parallelism involving only a few processes.
From the foregoing, it appears that a new direction may be required for more productive research during the extension period (10/19/98 - 10/18/99). At the least, the research would need to be done on Windows 95/Windows NT based platforms. Moreover, with the acquisition of the Lahey Fortran package for the PC platform, and the existing Borland C++ 5.0, we can do work on C++ wrapper issues. We have carefully studied the blueprint for Space Transportation Propulsion Integrated Design Environment for the next 25 years [13] and found the inclusion of HBCUs in that effort encouraging. Especially in the long period for which a map is provided, there is no doubt that HBCUs will grow and become better equipped to do meaningful research. In the shorter period, as was suggested in our presentation at the HBCU conference, some key decisions regarding the aging Fortran based software for rocket propellants will need to be made. One important issue is whether or not object oriented languages such as C++ or Java should be used for distributed computing. Whether or not "distributed computing" is necessary for the existing software is yet another, larger, question to be tackled.
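For readers unfamiliar with the textbook example this report refers to, the Jacobi iteration for Laplace's equation on a rectangular grid can be sketched as below. This is a serial illustration only and makes no claim about the RESSAP, TURBDES or PUMPDES codes; the function name, tolerance and grid are hypothetical. In a message-passing version, each process would typically own a strip of rows and exchange boundary rows with its neighbours after every sweep.

```python
def jacobi_laplace(grid, tol=1e-8, max_iters=10_000):
    """Solve Laplace's equation on the interior of grid by Jacobi iteration.

    grid is a list of lists of floats; its outer ring holds fixed boundary
    values. Returns (solution, iterations_used).
    """
    rows, cols = len(grid), len(grid[0])
    current = [row[:] for row in grid]
    for iteration in range(1, max_iters + 1):
        updated = [row[:] for row in current]
        max_change = 0.0
        for i in range(1, rows - 1):
            for j in range(1, cols - 1):
                # Each interior point becomes the average of its four neighbours,
                # always read from the previous sweep (that is what makes it Jacobi).
                updated[i][j] = 0.25 * (current[i - 1][j] + current[i + 1][j]
                                        + current[i][j - 1] + current[i][j + 1])
                max_change = max(max_change, abs(updated[i][j] - current[i][j]))
        current = updated
        if max_change < tol:
            return current, iteration
    return current, max_iters


if __name__ == "__main__":
    n = 5
    # Boundary held at 1.0, interior initialised to 0.0.
    start = [[1.0 if i in (0, n - 1) or j in (0, n - 1) else 0.0
              for j in range(n)] for i in range(n)]
    solution, iters = jacobi_laplace(start)
    print("iterations:", iters, "centre value:", solution[2][2])
```

Because every update reads only values from the previous sweep, the rows can be updated independently, which is why this method is a common first example of coarse-grain parallelism.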
Characterization of three-dimensional cancer cell migration in mixed collagen-Matrigel scaffolds using microfluidics and image analysis.

PubMed

Anguiano, María; Castilla, Carlos; Maška, Martin; Ederra, Cristina; Peláez, Rafael; Morales, Xabier; Muñoz-Arrieta, Gorka; Mujika, Maite; Kozubek, Michal; Muñoz-Barrutia, Arrate; Rouzaut, Ana; Arana, Sergio; Garcia-Aznar, José Manuel; Ortiz-de-Solorzano, Carlos

2017-01-01

Microfluidic devices are becoming mainstream tools to recapitulate in vitro the behavior of cells and tissues. In this study, we use microfluidic devices filled with hydrogels of mixed collagen-Matrigel composition to study the migration of lung cancer cells under different cancer invasion microenvironments. We present the design of the microfluidic device, characterize the hydrogels morphologically and mechanically and use quantitative image analysis to measure the migration of H1299 lung adenocarcinoma cancer cells in different experimental conditions. Our results show the plasticity of lung cancer cell migration, which turns from mesenchymal in collagen only matrices, to lobopodial in collagen-Matrigel matrices that approximate the interface between a disrupted basement membrane and the underlying connective tissue. Our quantification of migration speed confirms a biphasic role of Matrigel. At low concentration, Matrigel facilitates migration, most probably by providing a supportive and growth factor retaining environment. At high concentration, Matrigel slows down migration, possibly due to excessive attachment. Finally, we show that antibody-based integrin blockade promotes a change in migration phenotype from mesenchymal or lobopodial to amoeboid and analyze the effect of this change in migration dynamics, in regards to the structure of the matrix. In summary, we describe and characterize a robust microfluidic platform and a set of software tools that can be used to study lung cancer cell migration under different microenvironments and experimental conditions. This platform could be used in future studies, thus benefitting from the advantages introduced by microfluidic devices: precise control of the environment, excellent optical properties, parallelization for high throughput studies and efficient use of therapeutic drugs.
A parallel and sensitive software tool for methylation analysis on multicore platforms.

PubMed

Tárraga, Joaquín; Pérez, Mariano; Orduña, Juan M; Duato, José; Medina, Ignacio; Dopazo, Joaquín

2015-10-01

DNA methylation analysis suffers from very long processing time, as the advent of Next-Generation Sequencers has shifted the bottleneck of genomic studies from the sequencers that obtain the DNA samples to the software that performs the analysis of these samples. The existing software for methylation analysis does not seem to scale efficiently either with the size of the dataset or with the length of the reads to be analyzed. As it is expected that the sequencers will provide longer and longer reads in the near future, efficient and scalable methylation software should be developed. We present a new software tool, called HPG-Methyl, which efficiently maps bisulphite sequencing reads on DNA, analyzing DNA methylation. The strategy used by this software consists of leveraging the speed of the Burrows-Wheeler Transform to map a large number of DNA fragments (reads) rapidly, as well as the accuracy of the Smith-Waterman algorithm, which is exclusively employed to deal with the most ambiguous and shortest reads. Experimental results on platforms with Intel multicore processors show that HPG-Methyl significantly outperforms in both execution time and sensitivity state-of-the-art software such as Bismark, BS-Seeker or BSMAP, particularly for long bisulphite reads. Availability: Software in the form of C libraries and functions, together with instructions to compile and execute this software, is available by sftp to anonymous@clariano.uv.es (password 'anonymous'). Contact: juan.orduna@uv.es or jdopazo@cipf.es. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
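The Smith-Waterman step mentioned above is a dynamic-programming local alignment. The sketch below computes only the best local alignment score with a linear gap penalty; it is a textbook version, not the HPG-Methyl implementation, and the scoring values and function name are assumptions chosen for illustration.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Uses the classic recurrence
        H[i][j] = max(0, H[i-1][j-1] + s(a_i, b_j), H[i-1][j] + gap, H[i][j-1] + gap)
    and keeps only two rows of the matrix to save memory.
    """
    cols = len(b) + 1
    prev = [0] * cols
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * cols
        for j in range(1, cols):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "ACGT"))
```

The cost grows with the product of the two sequence lengths, which is consistent with the abstract's strategy of reserving this step for the reads that the faster Burrows-Wheeler mapping cannot place confidently.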
GeolOkit 1.0: a new Open Source, Cross-Platform software for geological data visualization in Google Earth environment

NASA Astrophysics Data System (ADS)

Triantafyllou, Antoine; Bastin, Christophe; Watlet, Arnaud

2016-04-01

GIS software suites are today's essential tools to gather and visualise geological data, to apply spatial and temporal analysis and, in fine, to create and share interactive maps for further geoscience investigations. For these purposes, we developed GeolOkit: an open-source, freeware and lightweight software, written in Python, a high-level, cross-platform programming language. GeolOkit software is accessible through a graphical user interface, designed to run in parallel with Google Earth. It is a user-friendly toolbox that allows 'geo-users' to import their raw data (e.g. GPS, sample locations, structural data, field pictures, maps), to use fast data analysis tools and to plot these into the Google Earth environment using KML code. This workflow requires no third-party software except Google Earth itself. GeolOkit comes with a large number of geoscience labels, symbols, colours and placemarks and may process: (i) multi-point data, (ii) contours via several interpolation methods, (iii) discrete planar and linear structural data in 2D or 3D, supporting a large range of structure input formats, (iv) clustered stereonets and rose diagrams, (v) drawn cross-sections as vertical sections, (vi) georeferenced maps and vectors, (vii) field pictures using either geo-tracking metadata from a camera's built-in GPS module or the same-day track of an external GPS. We invite you to discover all the functionalities of GeolOkit software. As this project is under development, we welcome discussions regarding your own needs, ideas and contributions to the GeolOkit project.
Parallel Computing Strategies for Irregular Algorithms

NASA Technical Reports Server (NTRS)

Biswas, Rupak; Oliker, Leonid; Shan, Hongzhang; Biegel, Bryan (Technical Monitor)

2002-01-01

Parallel computing promises several orders of magnitude increase in our ability to solve realistic computationally-intensive problems, but relies on their efficient mapping and execution on large-scale multiprocessor architectures. Unfortunately, many important applications are irregular and dynamic in nature, making their effective parallel implementation a daunting task. Moreover, with the proliferation of parallel architectures and programming paradigms, the typical scientist is faced with a plethora of questions that must be answered in order to obtain an acceptable parallel implementation of the solution algorithm. In this paper, we consider three representative irregular applications: unstructured remeshing, sparse matrix computations, and N-body problems, and parallelize them using various popular programming paradigms on a wide spectrum of computer platforms ranging from state-of-the-art supercomputers to PC clusters. We present the underlying problems, the solution algorithms, and the parallel implementation strategies. Smart load-balancing, partitioning, and ordering techniques are used to enhance parallel performance. Overall results demonstrate the complexity of efficiently parallelizing irregular algorithms.
Parallel rendering

NASA Technical Reports Server (NTRS)

Crockett, Thomas W.

1995-01-01

This article provides a broad introduction to the subject of parallel rendering, encompassing both hardware and software systems. The focus is on the underlying concepts and the issues which arise in the design of parallel rendering algorithms and systems. We examine the different types of parallelism and how they can be applied in rendering applications. Concepts from parallel computing, such as data decomposition, task granularity, scalability, and load balancing, are considered in relation to the rendering problem. We also explore concepts from computer graphics, such as coherence and projection, which have a significant impact on the structure of parallel rendering algorithms. Our survey covers a number of practical considerations as well, including the choice of architectural platform, communication and memory requirements, and the problem of image assembly and display. We illustrate the discussion with numerous examples from the parallel rendering literature, representing most of the principal rendering methods currently used in computer graphics.
The Parallel Worm Tracker: A Platform for Measuring Average Speed and Drug-Induced Paralysis in Nematodes

PubMed Central

Ramot, Daniel; Johnson, Brandon E.; Berry, Tommie L.; Carnell, Lucinda; Goodman, Miriam B.

2008-01-01

Background Caenorhabditis elegans locomotion is a simple behavior that has been widely used to dissect genetic components of behavior, synaptic transmission, and muscle function. Many of the paradigms that have been created to study C. elegans locomotion rely on qualitative experimenter observation. Here we report the implementation of an automated tracking system developed to quantify the locomotion of multiple individual worms in parallel. Methodology/Principal Findings Our tracking system generates a consistent measurement of locomotion that allows direct comparison of results across experiments and experimenters and provides a standard method to share data between laboratories. The tracker utilizes a video camera attached to a zoom lens and a software package implemented in MATLAB®. We demonstrate several proof-of-principle applications for the tracker including measuring speed in the absence and presence of food and in the presence of serotonin. We further use the tracker to automatically quantify the time course of paralysis of worms exposed to aldicarb and levamisole and show that tracker performance compares favorably to data generated using a hand-scored metric. Conclusions/Significance Although this is not the first automated tracking system developed to measure C. elegans locomotion, our tracking software package is freely available and provides a simple interface that includes tools for rapid data collection and analysis. By contrast with other tools, it is not dependent on a specific set of hardware. We propose that the tracker may be used for a broad range of additional worm locomotion applications including genetic and chemical screening. PMID:18493300
A fast ultrasonic simulation tool based on massively parallel implementations

NASA Astrophysics Data System (ADS)

Lambert, Jason; Rougeron, Gilles; Lacassagne, Lionel; Chatillon, Sylvain

2014-02-01

This paper presents a CIVA optimized ultrasonic inspection simulation tool, which takes advantage of the power of massively parallel architectures: graphical processing units (GPU) and multi-core general purpose processors (GPP). This tool is based on the classical approach used in CIVA: the interaction model is based on Kirchhoff, and the ultrasonic field around the defect is computed by the pencil method. The model has been adapted and parallelized for both architectures. At this stage, the configurations addressed by the tool are: multi and mono-element probes, planar specimens made of simple isotropic materials, planar rectangular defects or side drilled holes of small diameter. Validations of the model accuracy and performance measurements are presented.
e-Collaboration for Earth observation (E-CEO): the Cloud4SAR interferometry data challenge

NASA Astrophysics Data System (ADS)

Casu, Francesco; Manunta, Michele; Boissier, Enguerran; Brito, Fabrice; Aas, Christina; Lavender, Samantha; Ribeiro, Rita; Farres, Jordi

2014-05-01

The e-Collaboration for Earth Observation (E-CEO) project addresses the technologies and architectures needed to provide a collaborative research Platform for automating data mining and processing, and information extraction experiments. The Platform serves for the implementation of Data Challenge Contests focusing on Information Extraction for Earth Observation (EO) applications. The possibility to implement multiple processors within a Common Software Environment facilitates the validation, evaluation and transparent peer comparison among different methodologies, which is one of the main requirements raised by scientists who develop algorithms in the EO field. In this scenario, we set up a Data Challenge, referred to as Cloud4SAR (http://wiki.services.eoportal.org/tiki-index.php?page=ECEO), to foster the deployment of Interferometric SAR (InSAR) processing chains within a Cloud Computing platform. While a large variety of InSAR processing software tools are available, they require a high level of expertise and complex user interaction to be run effectively. Computing a co-seismic interferogram or a 20-year deformation time series on a volcanic area is not an easy task to perform in a fully unsupervised way and/or in a very short time (hours or less). Benefiting from ESA's E-CEO platform, participants can optimise algorithms on a Virtual Sandbox environment without being expert programmers, and compute results on high-performing Cloud platforms. Cloud4SAR requires solving a relatively easy InSAR problem by trying to maximize the exploitation of the processing capabilities provided by a Cloud Computing infrastructure. The proposed challenge offers two different frameworks, each dedicated to participants with different skills, identified as Beginners and Experts.
For both of them, the contest mainly resides in the degree of automation of the deployed algorithms, no matter which one is used, as well as in the capability of taking effective benefit from a parallel computing environment.
Software for pre-processing Illumina next-generation sequencing short read sequences

PubMed Central

2014-01-01

Background When compared to Sanger sequencing technology, next-generation sequencing (NGS) technologies are hindered by shorter sequence read length, higher base-call error rate, non-uniform coverage, and platform-specific sequencing artifacts. These characteristics lower the quality of their downstream analyses, e.g. de novo and reference-based assembly, by introducing sequencing artifacts and errors that may contribute to incorrect interpretation of data. Although many tools have been developed for quality control and pre-processing of NGS data, none of them provide flexible and comprehensive trimming options in conjunction with parallel processing to expedite pre-processing of large NGS datasets. Methods We developed ngsShoRT (next-generation sequencing Short Reads Trimmer), a flexible and comprehensive open-source software package written in Perl that provides a set of algorithms commonly used for pre-processing NGS short read sequences. We compared the features and performance of ngsShoRT with existing tools: CutAdapt, NGS QC Toolkit and Trimmomatic. We also compared the effects of using pre-processed short read sequences generated by different algorithms on de novo and reference-based assembly for three different genomes: Caenorhabditis elegans, Saccharomyces cerevisiae S288c, and Escherichia coli O157 H7. Results Several combinations of ngsShoRT algorithms were tested on publicly available Illumina GA II, HiSeq 2000, and MiSeq eukaryotic and bacteria genomic short read sequences with the focus on removing sequencing artifacts and low-quality reads and/or bases. Our results show that across three organisms and three sequencing platforms, trimming improved the mean quality scores of trimmed sequences. Using trimmed sequences for de novo and reference-based assembly improved assembly quality as well as assembler performance. 
In general, ngsShoRT outperformed comparable trimming tools in terms of trimming speed and improvement of de novo and reference-based assembly as measured by assembly contiguity and correctness. Conclusions Trimming of short read sequences can improve the quality of de novo and reference-based assembly and assembler performance. The parallel processing capability of ngsShoRT reduces trimming time and improves the memory efficiency when dealing with large datasets. We recommend combining sequencing artifacts removal, and quality score based read filtering and base trimming as the most consistent method for improving sequence quality and downstream assemblies. ngsShoRT source code, user guide and tutorial are available at http://research.bioinformatics.udel.edu/genomics/ngsShoRT/. ngsShoRT can be incorporated as a pre-processing step in genome and transcriptome assembly projects. PMID:24955109
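The recommended combination of quality-score-based base trimming and read filtering can be illustrated with a minimal Python sketch. It is not ngsShoRT code (ngsShoRT is written in Perl); the function names, the Phred+33 offset, and the thresholds are illustrative assumptions.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into Phred scores (offset 33 is assumed)."""
    return [ord(ch) - offset for ch in quality_string]


def trim_3prime(sequence, quality_string, min_quality=20, offset=33):
    """Remove trailing bases whose quality is below min_quality (hypothetical helper)."""
    scores = phred_scores(quality_string, offset)
    end = len(scores)
    # Walk back from the 3' end until a base meets the threshold.
    while end > 0 and scores[end - 1] < min_quality:
        end -= 1
    return sequence[:end], quality_string[:end]


def passes_filter(quality_string, min_mean_quality=20, min_length=4, offset=33):
    """Keep a read only if it is long enough and its mean quality is high enough."""
    scores = phred_scores(quality_string, offset)
    if len(scores) < min_length:
        return False
    return sum(scores) / len(scores) >= min_mean_quality
```

A typical use would trim each read first and then apply the filter to what remains, so that reads shortened below the length cutoff are dropped.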
AC losses in horizontally parallel HTS tapes for possible wireless power transfer applications

NASA Astrophysics Data System (ADS)

Shen, Boyang; Geng, Jianzhao; Zhang, Xiuchang; Fu, Lin; Li, Chao; Zhang, Heng; Dong, Qihuan; Ma, Jun; Gawith, James; Coombs, T. A.

2017-12-01

This paper presents the concept of using horizontally parallel HTS tapes, together with a study of their AC losses and an investigation of possible wireless power transfer (WPT) applications. An example of three parallel HTS tapes was proposed, whose AC losses were studied both experimentally, using the electrical method, and by simulation, using the 2D H-formulation on the FEM platform of COMSOL Multiphysics. The electromagnetic induction around the three parallel tapes was monitored using COMSOL simulation. The electromagnetic induction and AC losses generated by a conventional three-turn coil were simulated as well, and then compared to the case of three parallel tapes with the same AC transport current. The analysis demonstrates that HTS parallel tapes could potentially be used in wireless power transfer systems, where they could have lower total AC losses than conventional HTS coils.
Parallel Grid Manipulations in Earth Science Calculations

NASA Technical Reports Server (NTRS)

Sawyer, W.; Lucchesi, R.; daSilva, A.; Takacs, L. L.

1999-01-01

The National Aeronautics and Space Administration (NASA) Data Assimilation Office (DAO) at the Goddard Space Flight Center is moving its data assimilation system to massively parallel computing platforms. This parallel implementation of GEOS DAS will be used in the DAO's normal activities, which include reanalysis of data, and operational support for flight missions. Key components of GEOS DAS, including the gridpoint-based general circulation model and a data analysis system, are currently being parallelized. The parallelization of GEOS DAS is also one of the HPCC Grand Challenge Projects. The GEOS-DAS software employs several distinct grids. Some examples are: an observation grid (an unstructured grid of points at which observed or measured physical quantities from instruments or satellites are associated), a highly-structured latitude-longitude grid of points spanning the earth at given latitude-longitude coordinates at which prognostic quantities are determined, and a computational lat-lon grid in which the pole has been moved to a different location to avoid computational instabilities. Each of these grids has a different structure and number of constituent points. In spite of that, there are numerous interactions between the grids, e.g., values on one grid must be interpolated to another, or, in other cases, grids need to be redistributed on the underlying parallel platform. The DAO has designed a parallel integrated library for grid manipulations (PILGRIM) to support the needed grid interactions with maximum efficiency. It offers a flexible interface to generate new grids, define transformations between grids and apply them. Basic communication is currently MPI; however, the interfaces defined here could conceivably be implemented with other message-passing libraries, e.g., Cray SHMEM, or with shared-memory constructs. The library is written in Fortran 90.
First performance results indicate that even difficult problems, such as the above-mentioned pole rotation (a sparse interpolation with little data locality between the physical lat-lon grid and a pole-rotated computational grid), can be solved efficiently and at the GFlop/s rates needed to solve tomorrow's high resolution earth science models. In the subsequent presentation we will discuss the design and implementation of PILGRIM as well as a number of the problems it is required to solve. Some conclusions will be drawn about the potential performance of the overall earth science models on the supercomputer platforms foreseen for these problems.
Seismicity map tools for earthquake studies

NASA Astrophysics Data System (ADS)

Boucouvalas, Anthony; Kaskebes, Athanasios; Tselikas, Nikos

2014-05-01

We report on the development of a new online set of tools for earthquake research for use within Google Maps. We demonstrate this server-based online platform (developed with PHP, JavaScript, and MySQL) and its new tools using a database system with earthquake data. The platform allows us to carry out statistical and deterministic analysis of earthquake data using Google Maps and to plot various seismicity graphs. The toolbox has been extended to draw line segments on the map, multiple straight lines horizontally and vertically, as well as multiple circles, including geodesic lines. The application is demonstrated using localized seismic data from the geographic region of Greece as well as other global earthquake data. The application also offers regional segmentation (NxN), which allows the study of earthquake clustering and of earthquake cluster shift within the segments in space. The platform offers many filters, such as for plotting selected magnitude ranges or time periods. The plotting facility allows statistically based plots such as cumulative earthquake magnitude plots and earthquake magnitude histograms, calculation of 'b', etc. What is novel for the platform is the additional deterministic tools. Using the newly developed horizontal and vertical line and circle tools we have studied the spatial distribution trends of many earthquakes, and we here show for the first time the link between Fibonacci Numbers and spatiotemporal location of some earthquakes. The new tools are valuable for examining and visualizing trends in earthquake research, as they allow calculation of statistics as well as deterministic precursors. We plan to show many new results based on our newly developed platform.
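The statistical plots mentioned above (cumulative magnitude counts and the 'b' value) can be sketched in a few lines of Python. The abstract does not say how the platform computes 'b'; the sketch below assumes the standard Aki maximum-likelihood estimator for the Gutenberg-Richter relation, and all function and parameter names are hypothetical.

```python
import math


def cumulative_counts(magnitudes, thresholds):
    """Number of events with magnitude >= each threshold (cumulative magnitude plot data)."""
    return [sum(1 for m in magnitudes if m >= t) for t in thresholds]


def b_value(magnitudes, completeness_magnitude, bin_width=0.0):
    """Aki maximum-likelihood b-value: log10(e) / (mean(M) - (Mc - bin_width / 2)).

    Only events at or above the completeness magnitude Mc are used; bin_width
    applies the usual half-bin correction for binned catalogues (0 = no correction).
    """
    selected = [m for m in magnitudes if m >= completeness_magnitude]
    if len(selected) < 2:
        raise ValueError("need at least two events above the completeness magnitude")
    mean_magnitude = sum(selected) / len(selected)
    denominator = mean_magnitude - (completeness_magnitude - bin_width / 2.0)
    if denominator <= 0:
        raise ValueError("mean magnitude must exceed the corrected completeness magnitude")
    return math.log10(math.e) / denominator
```

The estimate is sensitive to the choice of completeness magnitude, so in practice it is worth recomputing it for several candidate values before interpreting the result.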
Microfluidic Platform for Parallel Single Cell Analysis for Diagnostic Applications.

PubMed

Le Gac, Séverine

2017-01-01

Cell populations are heterogeneous: they can comprise different cell types or even cells at different stages of the cell cycle and/or of biological processes. Furthermore, molecular processes taking place in cells are stochastic in nature. Therefore, cellular analysis must be brought down to the single cell level to get useful insight into biological processes, and to access essential molecular information that would be lost when using a cell population analysis approach. Furthermore, to fully characterize a cell population, ideally, information both at the single cell level and on the whole cell population is required, which calls for analyzing each individual cell in a population in a parallel manner. This single cell level analysis approach is particularly important for diagnostic applications to unravel molecular perturbations at the onset of a disease, to identify biomarkers, and for personalized medicine, not only because of the heterogeneity of the cell sample, but also due to the availability of a reduced amount of cells, or even unique cells. This chapter presents a versatile platform meant for the parallel analysis of individual cells, with a particular focus on diagnostic applications and the analysis of cancer cells. We first describe one essential step of this parallel single cell analysis protocol, which is the trapping of individual cells in dedicated structures. Following this, we report different steps of a whole analytical process, including on-chip cell staining and imaging, cell membrane permeabilization and/or lysis using either chemical or physical means, and retrieval of the cell molecular content in dedicated channels for further analysis. This series of experiments illustrates the versatility of the herein-presented platform and its suitability for various analysis schemes and different analytical purposes.
pcircle - A Suite of Scalable Parallel File System Tools

DOE Office of Scientific and Technical Information (OSTI.GOV)

WANG, FEIYI

2015-10-01

Most file system software is written for conventional local file systems; it is serialized and cannot take advantage of a large-scale parallel file system. The "pcircle" software builds on top of the ubiquitous MPI in cluster computing environments and the "work-stealing" pattern to provide a scalable, high-performance suite of file system tools. In particular, it implements parallel data copy and parallel data checksumming, with advanced features such as asynchronous progress reporting, checkpoint and restart, and integrity checking.
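The idea of parallel checksumming can be sketched as follows. This is not pcircle's implementation: pcircle uses MPI and work stealing, whereas this illustration substitutes a thread pool from the Python standard library, and the chunk size, helper names, and digest-of-digests scheme are assumptions made for the example.

```python
import hashlib
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def write_demo_file(size):
    """Create a temporary file with deterministic contents (demo helper)."""
    handle = tempfile.NamedTemporaryFile(delete=False)
    with handle:
        handle.write(bytes(i % 251 for i in range(size)))
    return handle.name


def chunk_digest(path, offset, length):
    """Hash one chunk; each worker opens its own file handle."""
    with open(path, "rb") as handle:
        handle.seek(offset)
        return hashlib.sha256(handle.read(length)).digest()


def parallel_checksum(path, chunk_size=1 << 20, workers=4):
    """Hash fixed-size chunks concurrently, then hash the ordered chunk digests."""
    size = os.path.getsize(path)
    offsets = range(0, size, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so the result is deterministic.
        digests = list(pool.map(lambda off: chunk_digest(path, off, chunk_size), offsets))
    combined = hashlib.sha256()
    for digest in digests:
        combined.update(digest)
    return combined.hexdigest()
```

Because the final value is a digest of per-chunk digests, it depends on the chunk size and is not interchangeable with a plain SHA-256 of the whole file; both sides of a comparison must use the same chunking.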

Evaluation and demonstration of commercialization potential of CCSI tools within gPROMS advanced simulation platform

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lawal, Adekola; Schmal, Pieter; Ramos, Alfredo

PSE, in the first phase of the CCSI commercialization project, set out to identify market opportunities for the CCSI tools combined with existing gPROMS platform capabilities and develop a clear technical plan for the proposed commercialization activities.
Guide to Sea Ice Information and Sea Ice Data Online - the Sea Ice Knowledge and Data Platform www.meereisportal.de and www.seaiceportal.de

NASA Astrophysics Data System (ADS)

Treffeisen, R. E.; Nicolaus, M.; Bartsch, A.; Fritzsch, B.; Grosfeld, K.; Haas, C.; Hendricks, S.; Heygster, G.; Hiller, W.; Krumpen, T.; Melsheimer, C.; Ricker, R.; Weigelt, M.

2016-12-01

The combination of multi-disciplinary sea ice science and the rising demand of society for up-to-date information and user customized products places emphasis on creating new ways of communication between science and society. The new knowledge platform is a contribution to the cross-linking of scientifically qualified information on climate change, and focuses on the theme `sea ice' in both Polar Regions. With this platform, science opens up to these changing societal demands. It is the first comprehensive German-language knowledge platform on sea ice; the platform went online in 2013. The web site delivers popularized information for the general public as well as scientific data meant primarily for the more expert readers and scientists. It also provides various tools allowing for visitor interaction. The demand for the web site indicates a high level of interest from both the general public and experts. It communicates science-based information to improve awareness and understanding of sea ice related research. The principal concept of the new knowledge platform is based on three pillars: (1) sea ice knowledge and background information, (2) data portal with visualizations, and (3) expert knowledge, latest research results and press releases. Since then, the content and selection of data sets have increased and the data portal has received increasing attention, also from the international science community. Meanwhile, we are providing near-real time and archived data of many key parameters of sea ice and its snow cover. The data sets result from measurements acquired by various platforms as well as numerical simulations. Satellite observations (e.g., AMSR2, CryoSat-2 and SMOS) of sea ice concentration, freeboard, thickness and drift are available as gridded data sets. Sea ice and snow temperatures and thickness as well as atmospheric parameters are available from autonomous ice-tethered platforms (buoys).
Additional ship observations, ice station measurements, and mooring time series are compiled as data collections over the last decade. In parallel, we are continuously extending our meta-data and uncertainty information for all data sets. We will present the portal, its content and function, but we are also asking for direct user feedback and are open for potential new partners.
Interfacing Computer Aided Parallelization and Performance Analysis

NASA Technical Reports Server (NTRS)

Jost, Gabriele; Jin, Haoqiang; Labarta, Jesus; Gimenez, Judit; Biegel, Bryan A. (Technical Monitor)

2003-01-01

When porting sequential applications to parallel computer architectures, the program developer will typically go through several cycles of source code optimization and performance analysis. We have started a project to develop an environment where the user can jointly navigate through program structure and performance data information in order to make efficient optimization decisions. In a prototype implementation we have interfaced the CAPO computer aided parallelization tool with the Paraver performance analysis tool. We describe both tools and their interface and give an example for how the interface helps within the program development cycle of a benchmark code.
Scalable, High-performance 3D Imaging Software Platform: System Architecture and Application to Virtual Colonoscopy

PubMed Central

Yoshida, Hiroyuki; Wu, Yin; Cai, Wenli; Brett, Bevin

2013-01-01

One of the key challenges in three-dimensional (3D) medical imaging is to enable the fast turn-around time that is often required for interactive or real-time response. This inevitably requires not only high computational power but also high memory bandwidth due to the massive amount of data that needs to be processed. In this work, we have developed a software platform that is designed to support high-performance 3D medical image processing for a wide range of applications using increasingly available and affordable commodity computing systems: multi-core, clusters, and cloud computing systems. To achieve scalable, high-performance computing, our platform (1) employs size-adaptive, distributable block volumes as a core data structure for efficient parallelization of a wide range of 3D image processing algorithms; (2) supports task scheduling for efficient load distribution and balancing; and (3) consists of layered parallel software libraries that allow a wide range of medical applications to share the same functionalities. We evaluated the performance of our platform by applying it to an electronic cleansing system in virtual colonoscopy, with initial experimental results showing a 10 times performance improvement on an 8-core workstation over the original sequential implementation of the system. PMID:23366803
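The block-volume idea can be illustrated with a small Python sketch: a 3D volume is cut into blocks, and each block is processed as an independent task. This is not the platform's API; the names are hypothetical, the blocks are fixed-size with smaller edge blocks rather than truly size-adaptive, and a thread pool stands in for the platform's task scheduler.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def block_ranges(extent, block):
    """Split one axis of length extent into [start, stop) ranges of at most block voxels."""
    return [(start, min(start + block, extent)) for start in range(0, extent, block)]


def split_into_blocks(shape, block_shape):
    """Return the (z, y, x) index ranges of every block covering the volume."""
    per_axis = [block_ranges(extent, block) for extent, block in zip(shape, block_shape)]
    return list(product(*per_axis))


def block_sum(volume, bounds):
    """Example per-block operation: sum the voxel values inside one block."""
    (z0, z1), (y0, y1), (x0, x1) = bounds
    return sum(
        volume[z][y][x]
        for z in range(z0, z1)
        for y in range(y0, y1)
        for x in range(x0, x1)
    )


def parallel_sum(volume, block_shape, workers=4):
    """Process every block as a separate task and combine the partial results."""
    shape = (len(volume), len(volume[0]), len(volume[0][0]))
    blocks = split_into_blocks(shape, block_shape)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(lambda bounds: block_sum(volume, bounds), blocks))
```

A reduction such as a sum needs no overlap between blocks; neighborhood operations such as filtering would additionally need a halo of voxels around each block, which this sketch leaves out.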
Algorithms and programming tools for image processing on the MPP

NASA Technical Reports Server (NTRS)

Reeves, A. P.

1985-01-01

Topics addressed include: data mapping and rotational algorithms for the Massively Parallel Processor (MPP); Parallel Pascal language; documentation for the Parallel Pascal Development system; and a description of the Parallel Pascal language used on the MPP.
Sierra Structural Dynamics User's Notes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Reese, Garth M.

2015-10-19

Sierra/SD provides a massively parallel implementation of structural dynamics finite element analysis, required for high fidelity, validated models used in modal, vibration, static and shock analysis of weapons systems. This document provides a user's guide to the input for Sierra/SD. Details of input specifications for the different solution types, output options, element types and parameters are included. The appendices contain detailed examples, and instructions for running the software on parallel platforms.
Sierra/SD User's Notes.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Munday, Lynn Brendon; Day, David M.; Bunting, Gregory

Sierra/SD provides a massively parallel implementation of structural dynamics finite element analysis, required for high fidelity, validated models used in modal, vibration, static and shock analysis of weapons systems. This document provides a user's guide to the input for Sierra/SD. Details of input specifications for the different solution types, output options, element types and parameters are included. The appendices contain detailed examples, and instructions for running the software on parallel platforms.
Tools for Creating Mobile Applications for Extension

ERIC Educational Resources Information Center

Drill, Sabrina L.

2012-01-01

Considerations and tools for developing mobile applications for Extension include evaluating the topic, purpose, and audience. Different computing platforms may be used, and apps designed as modified Web pages or implicitly programmed for a particular platform. User privacy is another important consideration, especially for data collection apps.…
OPAL: An Open-Source MPI-IO Library over Cray XT

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yu, Weikuan; Vetter, Jeffrey S; Canon, Richard Shane

Parallel IO over Cray XT is supported by a vendor-supplied MPI-IO package. This package contains a proprietary ADIO implementation built on top of the sysio library. While it is reasonable to maintain a stable code base for application scientists' convenience, it is also very important to the system developers and researchers to analyze and assess the effectiveness of parallel IO software, and accordingly, tune and optimize the MPI-IO implementation. A proprietary parallel IO code base relinquishes such flexibilities. On the other hand, a generic UFS-based MPI-IO implementation is typically used on many Linux-based platforms. We have developed an open-source MPI-IO package over Lustre, referred to as OPAL (OPportunistic and Adaptive MPI-IO Library over Lustre). OPAL provides a single source-code base for MPI-IO over Lustre on Cray XT and Linux platforms. Compared to the Cray implementation, OPAL provides a number of good features, including arbitrary specification of striping patterns and Lustre-stripe aligned file domain partitioning. This paper presents the performance comparisons between OPAL and Cray's proprietary implementation. Our evaluation demonstrates that OPAL achieves performance comparable to the Cray implementation. We also exemplify the benefits of an open source package in revealing the underpinning of the parallel IO performance.
A holistic approach to SIM platform and its application to early-warning satellite system

NASA Astrophysics Data System (ADS)

Sun, Fuyu; Zhou, Jianping; Xu, Zheyao

2018-01-01

This study proposes a new simulation platform named Simulation Integrated Management (SIM) for the analysis of parallel and distributed systems. The platform eases the process of designing and testing both applications and architectures. The main characteristics of SIM are flexibility, scalability, and expandability. To improve the efficiency of project development, new models of early-warning satellite system were designed based on the SIM platform. Finally, through a series of experiments, the correctness of SIM platform and the aforementioned early-warning satellite models was validated, and the systematical analyses for the orbital determination precision of the ballistic missile during its entire flight process were presented, as well as the deviation of the launch/landing point. Furthermore, the causes of deviation and prevention methods will be fully explained. The simulation platform and the models will lay the foundations for further validations of autonomy technology in space attack-defense architecture research.
Pre-polishing on a CNC platform with bound abrasive contour tools

NASA Astrophysics Data System (ADS)

Schoeffler, Adrienne E.

2003-05-01

Deterministic microgrinding (DMG) of optical glasses and ceramics is the commercial manufacturing process of choice to shape glass surfaces prior to final finishing. This process employs rigid bound matrix diamond tooling resulting in surface roughness values of 3-5μm peak to valley and 100-400nm rms, as well as mid-spatial frequency tool marks that require subsequent removal in secondary finishing steps. The ability to pre-polish optical surfaces within the grinding platform would reduce final finishing process times. Bound abrasive contour wheels containing cerium oxide, alumina or zirconia abrasives were constructed with an epoxy matrix. The effects of abrasive type, composition, and erosion promoters were examined for tool hardness (Shore D), and tested with commercial optical glasses in an Optipro CNC grinding platform. Metrology protocols were developed to examine tool wear and subsequent surface roughness. Work is directed to demonstrating effective material removal, improved surface roughness and cutter mark removal.
Prepolishing on a CNC platform with bound abrasive contour tools

NASA Astrophysics Data System (ADS)

Schoeffler, Adrienne E.; Gregg, Leslie L.; Schoen, John M.; Fess, Edward M.; Hakiel, Michael; Jacobs, Stephen D.

2003-05-01

Deterministic microgrinding (DMG) of optical glasses and ceramics is the commercial manufacturing process of choice to shape glass surfaces prior to final finishing. This process employs rigid bound matrix diamond tooling resulting in surface roughness values of 3-5μm peak to valley and 100-400nm rms, as well as mid-spatial frequency tool marks that require subsequent removal in secondary finishing steps. The ability to pre-polish optical surfaces within the grinding platform would reduce final finishing process times. Bound abrasive contour wheels containing cerium oxide, alumina or zirconia abrasives were constructed with an epoxy matrix. The effects of abrasive type, composition, and erosion promoters were examined for tool hardness (Shore D), and tested with commercial optical glasses in an Optipro CNC grinding platform. Metrology protocols were developed to examine tool wear and subsequent surface roughness. Work is directed to demonstrating effective material removal, improved surface roughness and cutter mark removal.
Mobile Building Energy Audit and Modeling Tools: Cooperative Research and Development Final Report, CRADA Number CRD-11-00441

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brackney, L.

Broadly accessible, low cost, accurate, and easy-to-use energy auditing tools remain out of reach for managers of the aging U.S. building population (over 80% of U.S. commercial buildings are more than 10 years old*). concept3D and NREL's commercial buildings group will work to translate and extend NREL's existing spreadsheet-based energy auditing tool for a browser-friendly and mobile-computing platform. NREL will also work with concept3D to further develop a prototype geometry capture and materials inference tool operable on a smart phone/pad platform. These tools will be developed to interoperate with NREL's Building Component Library and OpenStudio energy modeling platforms, and will be marketed by concept3D to commercial developers, academic institutions and governmental agencies. concept3D is NREL's lead developer and subcontractor of the Building Component Library.
The performance of low-cost commercial cloud computing as an alternative in computational chemistry.

PubMed

Thackston, Russell; Fortenberry, Ryan C

2015-05-05

The growth of commercial cloud computing (CCC) as a viable means of computational infrastructure is largely unexplored for the purposes of quantum chemistry. In this work, the PSI4 suite of computational chemistry programs is installed on five different types of Amazon Web Services CCC platforms. The performance for a set of electronically excited state single-point energies is compared between these CCC platforms and typical, "in-house" physical machines. Further considerations are made for the number of cores or virtual CPUs (vCPUs, for the CCC platforms), but no considerations are made for full parallelization of the program (even though parallelization of the BLAS library is implemented), complete high-performance computing cluster utilization, or steal time. Even with this most pessimistic view of the computations, CCC resources are shown to be more cost effective for significant numbers of typical quantum chemistry computations. Large numbers of large computations are still best handled by more traditional means, but smaller-scale research may be more effectively undertaken through CCC services. © 2015 Wiley Periodicals, Inc.
A Soft Parallel Kinematic Mechanism.

PubMed

White, Edward L; Case, Jennifer C; Kramer-Bottiglio, Rebecca

2018-02-01

In this article, we describe a novel holonomic soft robotic structure based on a parallel kinematic mechanism. The design is based on the Stewart platform, which uses six sensors and actuators to achieve full six-degree-of-freedom motion. Our design is much less complex than a traditional platform, since it replaces the 12 spherical and universal joints found in a traditional Stewart platform with a single highly deformable elastomer body and flexible actuators. This reduces the total number of parts in the system and simplifies the assembly process. Actuation is achieved through coiled-shape memory alloy actuators. State observation and feedback is accomplished through the use of capacitive elastomer strain gauges. The main structural element is an elastomer joint that provides antagonistic force. We report the response of the actuators and sensors individually, then report the response of the complete assembly. We show that the completed robotic system is able to achieve full position control, and we discuss the limitations associated with using responsive material actuators. We believe that control demonstrated on a single body in this work could be extended to chains of such bodies to create complex soft robots.
Support for Debugging Automatically Parallelized Programs

NASA Technical Reports Server (NTRS)

Jost, Gabriele; Hood, Robert; Biegel, Bryan (Technical Monitor)

2001-01-01

We describe a system that simplifies the process of debugging programs produced by computer-aided parallelization tools. The system uses relative debugging techniques to compare serial and parallel executions in order to show where the computations begin to differ. If the original serial code is correct, errors due to parallelization will be isolated by the comparison. One of the primary goals of the system is to minimize the effort required of the user. To that end, the debugging system uses information produced by the parallelization tool to drive the comparison process. In particular the debugging system relies on the parallelization tool to provide information about where variables may have been modified and how arrays are distributed across multiple processes. User effort is also reduced through the use of dynamic instrumentation. This allows us to modify the program execution without changing the way the user builds the executable. The use of dynamic instrumentation also permits us to compare the executions in a fine-grained fashion and only involve the debugger when a difference has been detected. This reduces the overhead of executing instrumentation.
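The comparison step of relative debugging can be sketched in Python: values from the parallel run are reassembled from their per-process pieces and compared with the serial values at each checkpoint, stopping at the first divergence. This is an illustration only, not the system described above; it assumes a simple contiguous block distribution, and the function names and trace format are hypothetical.

```python
import math


def gather_block_distributed(parts):
    """Reassemble an array split into contiguous per-process blocks (assumed layout)."""
    merged = []
    for part in parts:
        merged.extend(part)
    return merged


def first_difference(serial, parallel, rel_tol=1e-9, abs_tol=0.0):
    """Return the first index where the two arrays differ beyond tolerance, else None."""
    if len(serial) != len(parallel):
        raise ValueError("arrays must have the same length")
    for index, (s, p) in enumerate(zip(serial, parallel)):
        if not math.isclose(s, p, rel_tol=rel_tol, abs_tol=abs_tol):
            return index
    return None


def compare_checkpoints(serial_trace, parallel_trace):
    """Walk matching checkpoints in order and report (label, index) of the first mismatch.

    serial_trace holds (label, values); parallel_trace holds (label, per-process parts).
    """
    for (label, serial_values), (_, parts) in zip(serial_trace, parallel_trace):
        index = first_difference(serial_values, gather_block_distributed(parts))
        if index is not None:
            return label, index
    return None
```

A tolerance is used because floating-point sums evaluated in a different order in the parallel run can legitimately differ in the last bits from the serial result.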
Relative Debugging of Automatically Parallelized Programs

NASA Technical Reports Server (NTRS)

Jost, Gabriele; Hood, Robert; Biegel, Bryan (Technical Monitor)

2002-01-01

We describe a system that simplifies the process of debugging programs produced by computer-aided parallelization tools. The system uses relative debugging techniques to compare serial and parallel executions in order to show where the computations begin to differ. If the original serial code is correct, errors due to parallelization will be isolated by the comparison. One of the primary goals of the system is to minimize the effort required of the user. To that end, the debugging system uses information produced by the parallelization tool to drive the comparison process. In particular, the debugging system relies on the parallelization tool to provide information about where variables may have been modified and how arrays are distributed across multiple processes. User effort is also reduced through the use of dynamic instrumentation. This allows us to modify the program execution without changing the way the user builds the executable. The use of dynamic instrumentation also permits us to compare the executions in a fine-grained fashion and only involve the debugger when a difference has been detected. This reduces the overhead of executing instrumentation.
Technology Enhanced Learning for People with Intellectual Disabilities and Cerebral Paralysis: The MAS Platform

NASA Astrophysics Data System (ADS)

Colomo-Palacios, Ricardo; Paniagua-Martín, Fernando; García-Crespo, Ángel; Ruiz-Mezcua, Belén

Education for students with disabilities now takes place in a wide range of settings and therefore involves a wider range of assistive tools. As a result, one of the most interesting application domains of technology enhanced learning is the adoption of learning technologies and designs for people with disabilities. Following this trend, this paper presents MAS, a software platform aimed at helping people with severe intellectual disabilities and cerebral paralysis in their learning processes. MAS, as a technology enhanced learning platform, provides several tools that support learning and monitoring for people with special needs, including adaptive games, data processing and monitoring tools. Installed in a special needs education institution in Madrid, Spain, MAS provides special educators with a tool that improved students' education processes.
pyAmpli: an amplicon-based variant filter pipeline for targeted resequencing data.

PubMed

Beyens, Matthias; Boeckx, Nele; Van Camp, Guy; Op de Beeck, Ken; Vandeweyer, Geert

2017-12-14

Haloplex targeted resequencing is a popular method to analyze both germline and somatic variants in gene panels. However, the wet-lab procedures involved may introduce false positives that need to be considered in subsequent data analysis. To our knowledge, no variant filtering rationale that addresses systematic errors related to amplicon enrichment exists in the form of an all-in-one package. We present pyAmpli, a platform-independent parallelized Python package that implements an amplicon-based germline and somatic variant filtering strategy for Haloplex data. pyAmpli can filter variants for systematic errors by user pre-defined criteria. We show that pyAmpli significantly increases specificity without reducing sensitivity, which is essential for reporting true positive clinically relevant mutations in gene panel data. pyAmpli is an easy-to-use software tool which increases the true positive variant call rate in targeted resequencing data. It specifically reduces errors related to PCR-based enrichment of targeted regions.
Complex modulation using tandem polarization modulators

NASA Astrophysics Data System (ADS)

Hasan, Mehedi; Hall, Trevor

2017-11-01

A novel photonic technique for implementing frequency up-conversion or complex modulation is proposed. The proposed circuit consists of a quarter-wave plate sandwiched between two polarization modulators, driven, respectively, by in-phase and quadrature-phase signals. The operation of the circuit is modelled using a transmission matrix method. The theoretical prediction is then validated by simulation using an industry-standard software tool. The intrinsic conversion efficiency of the architecture is improved by 6 dB over a functionally equivalent design based on dual parallel Mach-Zehnder modulators. Non-ideal scenarios such as imperfect alignment of the optical components and power imbalances and phase errors in the electric drive signals are also analysed. As light travels along one physical path, the proposed design can be implemented using discrete components with greater control of relative optical path length differences. The circuit can further be integrated in any material platform that offers electro-optic polarization modulators.

Single-cell imaging tools for brain energy metabolism: a review

PubMed Central

San Martín, Alejandro; Sotelo-Hitschfeld, Tamara; Lerchundi, Rodrigo; Fernández-Moncada, Ignacio; Ceballo, Sebastian; Valdebenito, Rocío; Baeza-Lehnert, Felipe; Alegría, Karin; Contreras-Baeza, Yasna; Garrido-Gerter, Pamela; Romero-Gómez, Ignacio; Barros, L. Felipe

2014-01-01

Abstract. Neurophotonics comes to light at a time in which advances in microscopy and improved calcium reporters are paving the way toward high-resolution functional mapping of the brain. This review relates to a parallel revolution in metabolism. We argue that metabolism needs to be approached both in vitro and in vivo, and that it does not just exist as a low-level platform but is also a relevant player in information processing. In recent years, genetically encoded fluorescent nanosensors have been introduced to measure glucose, glutamate, ATP, NADH, lactate, and pyruvate in mammalian cells. Reporting relative metabolite levels, absolute concentrations, and metabolic fluxes, these sensors are instrumental for the discovery of new molecular mechanisms. Sensors continue to be developed, which together with a continued improvement in protein expression strategies and new imaging technologies, herald an exciting era of high-resolution characterization of metabolism in the brain and other organs. PMID:26157964
A novel processing platform for post tape out flows

NASA Astrophysics Data System (ADS)

Vu, Hien T.; Kim, Soohong; Word, James; Cai, Lynn Y.

2018-03-01

As the computational requirements for post tape out (PTO) flows increase at the 7nm and below technology nodes, there is a need to increase the scalability of the computational tools in order to reduce the turn-around time (TAT) of the flows. Utilization of design hierarchy has been one proven method to provide sufficient partitioning to enable PTO processing. However, as the data is processed through the PTO flow, its effective hierarchy is reduced. The reduction is necessary to achieve the desired accuracy. Also, the sequential nature of the PTO flow is inherently non-scalable. To address these limitations, we are proposing a quasi-hierarchical solution that combines multiple levels of parallelism to increase the scalability of the entire PTO flow. In this paper, we describe the system and present experimental results demonstrating the runtime reduction through scalable processing with thousands of computational cores.
Modeling and simulation of a Stewart platform type parallel structure robot

NASA Technical Reports Server (NTRS)

Lim, Gee Kwang; Freeman, Robert A.; Tesar, Delbert

1989-01-01

The kinematics and dynamics of a Stewart Platform type parallel structure robot (NASA's Dynamic Docking Test System) were modeled using the method of kinematic influence coefficients (KIC) and isomorphic transformations of system dependence from one set of generalized coordinates to another. By specifying the end-effector (platform) time trajectory, the required generalized input forces which would theoretically yield the desired motion were determined. It was found that the relationship between the platform motion and the actuator motion was nonlinear. In addition, the contribution of the acceleration related terms to the total generalized forces required at the actuators was found to be more significant than that of the velocity related terms. Hence, the curve representing the total required actuator force generally resembled the curve for the acceleration related force. Another observation revealed that the acceleration related effective inertia matrix I_dd had the tendency to decouple, with the elements on the main diagonal of I_dd being larger than the off-diagonal elements, while the velocity related inertia power array P_ddd did not show such a tendency. This tendency results in the acceleration related force curve of a given actuator resembling the acceleration profile of that particular actuator. Furthermore, it was indicated that the effective inertia matrix for the legs is more decoupled than that for the platform. These observations provide essential information for further research to develop an effective control strategy for real-time control of the Dynamic Docking Test System.
Parallel transformation of K-SVD solar image denoising algorithm

NASA Astrophysics Data System (ADS)

Liang, Youwen; Tian, Yu; Li, Mei

2017-02-01

Images obtained by observing the sun through a large telescope always suffer from noise due to the low SNR. The K-SVD denoising algorithm can effectively remove Gaussian white noise. Training dictionaries for sparse representations is a time-consuming task, due to the large size of the data involved and the complexity of the training algorithms. In this paper, OpenMP parallel programming is used to transform the serial algorithm into a parallel version, following a data parallelism model. The biggest change is that multiple atoms, rather than a single atom, are updated simultaneously. The denoising effect and acceleration performance are tested after completion of the parallel algorithm. The speedup of the program is 13.563 when using 16 cores. This parallel version can fully utilize multi-core CPU hardware resources, greatly reduces running time, and is easy to port to multi-core platforms.
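The reported speedup can be put in context with two standard derived measures. The sketch below is an illustration rather than part of the paper: it computes parallel efficiency (speedup divided by core count) and the Karp-Flatt estimate of the serial fraction from the two numbers given in the abstract. The function names are hypothetical.

```python
def parallel_efficiency(speedup: float, cores: int) -> float:
    """Fraction of ideal linear speedup that was achieved."""
    return speedup / cores


def karp_flatt_serial_fraction(speedup: float, cores: int) -> float:
    """Experimentally determined serial fraction (Karp-Flatt metric).

    e = (1/S - 1/p) / (1 - 1/p), defined for p > 1.
    """
    if cores <= 1:
        raise ValueError("the Karp-Flatt metric needs more than one core")
    return (1.0 / speedup - 1.0 / cores) / (1.0 - 1.0 / cores)


# Values taken from the abstract: speedup 13.563 on 16 cores.
efficiency = parallel_efficiency(13.563, 16)
serial_fraction = karp_flatt_serial_fraction(13.563, 16)
print(f"efficiency: {efficiency:.3f}, serial fraction: {serial_fraction:.4f}")
```

With these inputs the efficiency is roughly 0.85 and the estimated serial fraction is on the order of one percent, which is consistent with the abstract's claim that the parallel version uses the multi-core hardware well.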
Handling Big Data in Medical Imaging: Iterative Reconstruction with Large-Scale Automated Parallel Computation

PubMed Central

Lee, Jae H.; Yao, Yushu; Shrestha, Uttam; Gullberg, Grant T.; Seo, Youngho

2014-01-01

The primary goal of this project is to implement the iterative statistical image reconstruction algorithm, in this case maximum likelihood expectation maximization (MLEM) used for dynamic cardiac single photon emission computed tomography, on Spark/GraphX. This involves porting the algorithm to run on large-scale parallel computing systems. Spark is an easy-to-program software platform that can handle large amounts of data in parallel. GraphX is a graph analytic system running on top of Spark to handle graph and sparse linear algebra operations in parallel. The main advantage of implementing the MLEM algorithm in Spark/GraphX is that it allows users to parallelize such computation without any expertise in parallel computing or prior knowledge in computer science. In this paper we demonstrate a successful implementation of MLEM in Spark/GraphX and present the performance gains, with the goal of eventually making it usable in a clinical setting. PMID:27081299
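For readers unfamiliar with MLEM, the following is a minimal serial sketch of the standard textbook update, written in plain Python rather than Spark/GraphX. It is not the authors' implementation; the dense list-of-lists system matrix, the function name `mlem`, and the uniform starting image are assumptions made for illustration. The forward projection and back projection are sums over matrix rows and columns, which is the kind of sparse linear algebra the abstract says GraphX handles in parallel.

```python
from typing import List


def mlem(system: List[List[float]], counts: List[float], iterations: int = 20) -> List[float]:
    """Standard MLEM update: x_j <- x_j / s_j * sum_i a_ij * y_i / (A x)_i.

    `system[i][j]` is the assumed probability that an emission in pixel j is
    detected in bin i, and `counts[i]` is the measured count in bin i.
    """
    n_bins, n_pixels = len(system), len(system[0])
    # Sensitivity of each pixel: column sums of the system matrix.
    sensitivity = [sum(system[i][j] for i in range(n_bins)) for j in range(n_pixels)]
    image = [1.0] * n_pixels  # uniform, strictly positive starting image
    for _ in range(iterations):
        # Forward projection of the current estimate.
        projection = [sum(system[i][j] * image[j] for j in range(n_pixels)) for i in range(n_bins)]
        # Ratio of measured to estimated counts, guarding against empty bins.
        ratio = [counts[i] / projection[i] if projection[i] > 0 else 0.0 for i in range(n_bins)]
        # Back projection of the ratio, normalized by sensitivity.
        image = [
            image[j] / sensitivity[j] * sum(system[i][j] * ratio[i] for i in range(n_bins))
            if sensitivity[j] > 0 else 0.0
            for j in range(n_pixels)
        ]
    return image


# Tiny hypothetical problem: 3 detector bins, 2 pixels.
A = [[1.0, 0.5], [0.2, 1.0], [0.3, 0.3]]
y = [4.0, 6.0, 2.0]
print(mlem(A, y))
```

A useful sanity check on any implementation is that each update preserves the total expected counts: the sensitivity-weighted sum of the image equals the sum of the measured counts.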
Handling Big Data in Medical Imaging: Iterative Reconstruction with Large-Scale Automated Parallel Computation.

PubMed

Lee, Jae H; Yao, Yushu; Shrestha, Uttam; Gullberg, Grant T; Seo, Youngho

2014-11-01

The primary goal of this project is to implement the iterative statistical image reconstruction algorithm, in this case maximum likelihood expectation maximization (MLEM) used for dynamic cardiac single photon emission computed tomography, on Spark/GraphX. This involves porting the algorithm to run on large-scale parallel computing systems. Spark is an easy-to-program software platform that can handle large amounts of data in parallel. GraphX is a graph analytic system running on top of Spark to handle graph and sparse linear algebra operations in parallel. The main advantage of implementing the MLEM algorithm in Spark/GraphX is that it allows users to parallelize such computation without any expertise in parallel computing or prior knowledge in computer science. In this paper we demonstrate a successful implementation of MLEM in Spark/GraphX and present the performance gains, with the goal of eventually making it usable in a clinical setting.
Scalability and Portability of Two Parallel Implementations of ADI

NASA Technical Reports Server (NTRS)

Phung, Thanh; VanderWijngaart, Rob F.

1994-01-01

Two domain decompositions for the implementation of the NAS Scalar Penta-diagonal Parallel Benchmark on MIMD systems are investigated, namely transposition and multi-partitioning. Hardware platforms considered are the Intel iPSC/860 and Paragon XP/S-15, and clusters of SGI workstations on ethernet, communicating through PVM. It is found that the multi-partitioning strategy offers the kind of coarse granularity that allows scaling up to hundreds of processors on a massively parallel machine. Moreover, efficiency is retained when the code is ported verbatim (save message passing syntax) to a PVM environment on a modest size cluster of workstations.
Method for implementation of recursive hierarchical segmentation on parallel computers

NASA Technical Reports Server (NTRS)

Tilton, James C. (Inventor)

2005-01-01

A method, computer readable storage, and apparatus for implementing a recursive hierarchical segmentation algorithm on a parallel computing platform. The method includes setting a bottom level of recursion that defines where a recursive division of an image into sections stops dividing, and setting an intermediate level of recursion where the recursive division changes from a parallel implementation into a serial implementation. The segmentation algorithm is implemented according to the set levels. The method can also include setting a convergence check level of recursion with which the first level of recursion communicates when performing a convergence check.
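The role of the two recursion levels can be sketched as follows. This is only an illustration of the control structure named in the abstract, not the patented method: the quadrant split, the thread pool, and the names `segment`, `bottom_level` and `serial_level` are assumptions, and the actual segmentation of each section is replaced by a placeholder that simply returns the section.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

Section = Tuple[int, int, int, int]  # (first_row, first_col, n_rows, n_cols)


def quadrants(section: Section) -> List[Section]:
    """Split a section into four sub-sections."""
    r0, c0, rows, cols = section
    hr, hc = rows // 2, cols // 2
    return [
        (r0, c0, hr, hc),
        (r0, c0 + hc, hr, cols - hc),
        (r0 + hr, c0, rows - hr, hc),
        (r0 + hr, c0 + hc, rows - hr, cols - hc),
    ]


def segment(section: Section, level: int, bottom_level: int, serial_level: int) -> List[Section]:
    """Recursively divide `section`; return the sections reached at the bottom level.

    Division stops at `bottom_level`. Levels above `serial_level` hand their
    sub-sections to parallel workers; from `serial_level` down, the recursion
    runs serially inside whichever worker reached it.
    """
    if level >= bottom_level:
        return [section]  # placeholder: a real implementation would segment here
    parts = quadrants(section)
    if level < serial_level:
        with ThreadPoolExecutor(max_workers=4) as pool:
            results = list(pool.map(
                lambda part: segment(part, level + 1, bottom_level, serial_level), parts))
    else:
        results = [segment(part, level + 1, bottom_level, serial_level) for part in parts]
    return [leaf for sub in results for leaf in sub]


# An 8x8 image, divided down to level 3, parallel only at level 1.
leaves = segment((0, 0, 8, 8), level=1, bottom_level=3, serial_level=2)
print(len(leaves))  # 16
```

Because `pool.map` preserves input order, the parallel and fully serial runs return the same list of sections, which makes the switch level a pure performance setting in this sketch.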
Understanding and Improving High-Performance I/O Subsystems

NASA Technical Reports Server (NTRS)

El-Ghazawi, Tarek A.; Frieder, Gideon; Clark, A. James

1996-01-01

This research program has been conducted in the framework of the NASA Earth and Space Science (ESS) evaluations led by Dr. Thomas Sterling. In addition to the many important research findings for NASA and the prestigious publications, the program has helped orient the doctoral research of two students towards parallel input/output in high-performance computing. Further, the experimental results in the case of the MasPar were very useful to MasPar, with whose technical management the P.I. has had many interactions. The contributions of this program are drawn from three experimental studies conducted on different high-performance computing testbeds/platforms, and are therefore presented in three segments as follows: 1. Evaluating the parallel input/output subsystem of NASA high-performance computing testbeds, namely the MasPar MP-1 and MP-2; 2. Characterizing the physical input/output request patterns for NASA ESS applications, which used the Beowulf platform; and 3. Dynamic scheduling techniques for hiding I/O latency in parallel applications such as sparse matrix computations. This study has also been conducted on the Intel Paragon and has provided an experimental evaluation of the Parallel File System (PFS) and parallel input/output on the Paragon. This report is organized as follows. The summary of findings discusses the results of each of the aforementioned three studies. Three appendices, each containing a key scholarly research paper that details the work in one of the studies, are included.
A simple capacitive method to evaluate ethanol fuel samples

NASA Astrophysics Data System (ADS)

Vello, Tatiana P.; de Oliveira, Rafael F.; Silva, Gustavo O.; de Camargo, Davi H. S.; Bufon, Carlos C. B.

2017-02-01

Ethanol is a biofuel used worldwide. However, the presence of excessive water, either from the distillation process or from fraudulent adulteration, is a major concern in the use of ethanol fuel. High water levels may cause engine malfunction, in addition to being considered illegal. Here, we describe the development of a simple, fast and accurate platform based on nanostructured sensors to evaluate ethanol samples. The device fabrication is facile, based on standard microfabrication and thin-film deposition methods. The sensor operation relies on capacitance measurements employing a parallel plate capacitor containing a conformational aluminum oxide (Al2O3) thin layer (15 nm). The sensor operates over the full range of water concentration, i.e., from approximately 0% to 100% vol. of water in ethanol, with water traces being detectable down to 0.5% vol. These characteristics make the proposed device unique with respect to other platforms. Finally, the good agreement between the sensor response and analyses performed by gas chromatography of ethanol biofuel endorses the accuracy of the proposed method. Due to the full operation range, the reported sensor has the technological potential for use as a point-of-care analytical tool at gas stations or in the chemical, pharmaceutical, and beverage industries, to mention a few.
Crop classification and mapping based on Sentinel missions data in cloud environment

NASA Astrophysics Data System (ADS)

Lavreniuk, M. S.; Kussul, N.; Shelestov, A.; Vasiliev, V.

2017-12-01

Availability of high resolution satellite imagery (Sentinel-1/2/3, Landsat) over large territories opens new opportunities in agricultural monitoring. In particular, it becomes feasible to solve crop classification and crop mapping tasks at country and regional scale using time series of heterogeneous satellite imagery. But in this case, we face the problem of Big Data. When dealing with time series of high resolution (10 m) multispectral imagery, we need to download huge volumes of data and then process them. The solution is to move the "processing chain" closer to the data itself to drastically shorten the time for data transfer. One more advantage of such an approach is the possibility to parallelize the data processing workflow and efficiently implement machine learning algorithms. This could be done with a cloud platform where Sentinel imagery is stored. In this study, we investigate the usability and efficiency of two different cloud platforms, Amazon and Google, for crop classification and crop mapping problems. Two pilot areas were investigated: Ukraine and England. Google provides a user-friendly environment, Google Earth Engine, for Earth observation applications with a lot of data processing and machine learning tools already deployed. At the same time, with Amazon one gets much more flexibility in implementing one's own workflow. Detailed analysis of pros and cons will be done in the presentation.
Performance Characterization of Global Address Space Applications: A Case Study with NWChem

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hammond, Jeffrey R.; Krishnamoorthy, Sriram; Shende, Sameer

The use of global address space languages and one-sided communication for complex applications is gaining attention in the parallel computing community. However, lack of good evaluative methods to observe multiple levels of performance makes it difficult to isolate the cause of performance deficiencies and to understand the fundamental limitations of system and application design for future improvement. NWChem is a popular computational chemistry package which depends on the Global Arrays/ARMCI suite for partitioned global address space functionality to deliver high-end molecular modeling capabilities. A workload characterization methodology was developed to support NWChem performance engineering on large-scale parallel platforms. The research involved both the integration of performance instrumentation and measurement in the NWChem software, as well as the analysis of one-sided communication performance in the context of NWChem workloads. Scaling studies were conducted for NWChem on Blue Gene/P and on two large-scale clusters using different generation Infiniband interconnects and x86 processors. The performance analysis and results show how subtle changes in the runtime parameters related to the communication subsystem could have significant impact on performance behavior. The tool has successfully identified several algorithmic bottlenecks which are already being tackled by computational chemists to improve NWChem performance.
Robotic platform for parallelized cultivation and monitoring of microbial growth parameters in microwell plates.

PubMed

Knepper, Andreas; Heiser, Michael; Glauche, Florian; Neubauer, Peter

2014-12-01

The enormous variation possibilities of bioprocesses challenge process development to fix a commercial process with respect to costs and time. Although some cultivation systems and some devices for unit operations combine the latest technology on miniaturization, parallelization, and sensing, the degree of automation in upstream and downstream bioprocess development is still limited to single steps. We aim to face this challenge by an interdisciplinary approach to significantly shorten development times and costs. As a first step, we scaled down analytical assays to the microliter scale and created automated procedures for starting the cultivation and monitoring the optical density (OD), pH, concentrations of glucose and acetate in the culture medium, and product formation in fed-batch cultures in the 96-well format. Then, the separate measurements of pH, OD, and concentrations of acetate and glucose were combined to one method. This method enables automated process monitoring at dedicated intervals (e.g., also during the night). By this approach, we managed to increase the information content of cultivations in 96-microwell plates, thus turning them into a suitable tool for high-throughput bioprocess development. Here, we present the flowcharts as well as cultivation data of our automation approach. © 2014 Society for Laboratory Automation and Screening.
A Digital Knowledge Preservation Platform for Environmental Sciences

NASA Astrophysics Data System (ADS)

Aguilar Gómez, Fernando; de Lucas, Jesús Marco; Pertinez, Esther; Palacio, Aida; Perez, David

2017-04-01

The Digital Knowledge Preservation Platform is the evolution of a pilot project for Open Data supporting the full research data life cycle. It is currently being developed at IFCA (Instituto de Física de Cantabria) as a combination of different open tools that have been extended: DMPTool (https://dmptool.org/) with pilot semantics features (RDF export, parameters definition), a customized version of INVENIO (http://invenio-software.org/) to integrate the entire research data life cycle, and Jupyter (http://jupyter.org/) as processing tool and reproducibility environment. This complete platform aims to provide an integrated environment for research data management following the FAIR+R principles: -Findable: The Web portal based on Invenio provides a search engine and all elements, including metadata, to make them easily findable. -Accessible: Both data and software are available online with internal PIDs and DOIs (provided by DataCite). -Interoperable: Datasets can be combined to perform new analysis. The OAI-PMH standard is also integrated. -Re-usable: different license types and embargo periods can be defined. -+Reproducible: directly integrated with cloud computing resources. The deployment of the entire system over a Cloud framework helps to build a dynamic and scalable solution, not only for managing open datasets but also as a useful tool for the final user, who is able to directly process and analyse the open data. In parallel, the direct use of semantics and metadata is being explored and integrated in the framework. Ontologies, being a knowledge representation, can contribute to defining the elements and relationships of the research data life cycle, including DMP, datasets, software, etc. The first advantage of developing an ontology of a knowledge domain is that it provides a common vocabulary hierarchy (i.e. a conceptual schema) that can be used and standardized by all the agents interested in the domain (either humans or machines).
This way of using ontologies is one of the bases of the Semantic Web, where ontologies are set to play a key role in establishing a common terminology between agents. To develop the ontology we are using Protégé, a graphical ontology-development tool which supports a rich knowledge model and is open-source and freely available. However, in order to process and manage the ontology from the web framework, we are using Semantic MediaWiki, which is able to process queries. Semantic MediaWiki is an extension of MediaWiki where we can do semantic search and export data in RDF and CSV format. This system is used as a testbed for the potential use of semantics in a more general environment. This Digital Knowledge Preservation Platform is very closely related to the INDIGO-DataCloud project (https://www.indigo-datacloud.eu), since the same data life cycle approach is taken into account (Planning, Collect, Curate, Analyze, Publish, Preserve). INDIGO-DataCloud solutions will be able to support all the different elements in the system, as we showed in the last Research Data Alliance Plenary. This presentation will show the different elements of the system and how they work, as well as the roadmap of their continuous integration.
Error modeling and sensitivity analysis of a parallel robot with SCARA (selective compliance assembly robot arm) motions

NASA Astrophysics Data System (ADS)

Chen, Yuzhen; Xie, Fugui; Liu, Xinjun; Zhou, Yanhua

2014-07-01

Parallel robots with SCARA (selective compliance assembly robot arm) motions are utilized widely in the field of high speed pick-and-place manipulation. Error modeling for these robots generally simplifies the parallelogram structures included in the robots to a single link. As an error model established this way fails to reflect the error features of the parallelogram structures, the effectiveness of accuracy design and kinematic calibration based on that error model is undermined. An error modeling methodology is proposed to establish an error model of parallel robots with parallelogram structures. The error model can embody the geometric errors of all joints, including the joints of parallelogram structures. Thus it captures more exhaustively the factors that reduce the accuracy of the robot. Based on the error model and some sensitivity indices defined in the sense of statistics, sensitivity analysis is carried out. Accordingly, some atlases are depicted to express each geometric error's influence on the moving platform's pose errors. From these atlases, the geometric errors that have greater impact on the accuracy of the moving platform are identified, and some sensitive areas where the pose errors of the moving platform are extremely sensitive to the geometric errors are also identified. By taking into account the error factors which are generally neglected in existing modeling methods, the proposed modeling method can thoroughly disclose the process of error transmission and enhance the efficacy of accuracy design and calibration.
Access to CAMAC from VxWorks and UNIX in DART

DOE Office of Scientific and Technical Information (OSTI.GOV)

Streets, J.; Meadows, J.; Moore, C.

1995-05-01

As part of the DART Project the authors have developed a package of software for CAMAC access from UNIX and VxWorks platforms, with support for several hardware interfaces. They report on developments for the CES CBD8210 VME to parallel CAMAC, the Hytec VSD2992 VME to serial CAMAC and Jorway 411S SCSI to parallel and serial CAMAC branch drivers, and give a summary of the timings obtained.
Comparison of Quantitative Mass Spectrometry Platforms for Monitoring Kinase ATP Probe Uptake in Lung Cancer.

PubMed

Hoffman, Melissa A; Fang, Bin; Haura, Eric B; Rix, Uwe; Koomen, John M

2018-01-05

Recent developments in instrumentation and bioinformatics have led to new quantitative mass spectrometry platforms including LC-MS/MS with data-independent acquisition (DIA) and targeted analysis using parallel reaction monitoring mass spectrometry (LC-PRM), which provide alternatives to well-established methods, such as LC-MS/MS with data-dependent acquisition (DDA) and targeted analysis using multiple reaction monitoring mass spectrometry (LC-MRM). These tools have been used to identify signaling perturbations in lung cancers and other malignancies, supporting the development of effective kinase inhibitors and, more recently, providing insights into therapeutic resistance mechanisms and drug repurposing opportunities. However, detection of kinases in biological matrices can be challenging; therefore, activity-based protein profiling enrichment of ATP-utilizing proteins was selected as a test case for exploring the limits of detection of low-abundance analytes in complex biological samples. To examine the impact of different MS acquisition platforms, quantification of kinase ATP uptake following kinase inhibitor treatment was analyzed by four different methods: LC-MS/MS with DDA and DIA, LC-MRM, and LC-PRM. For discovery data sets, DIA increased the number of identified kinases by 21% and reduced missingness when compared with DDA. In this context, MRM and PRM were most effective at identifying global kinome responses to inhibitor treatment, highlighting the value of a priori target identification and manual evaluation of quantitative proteomics data sets. We compare results for a selected set of desthiobiotinylated peptides from PRM, MRM, and DIA and identify considerations for selecting a quantification method and postprocessing steps that should be used for each data acquisition strategy.
B-MIC: An Ultrafast Three-Level Parallel Sequence Aligner Using MIC.

PubMed

Cui, Yingbo; Liao, Xiangke; Zhu, Xiaoqian; Wang, Bingqiang; Peng, Shaoliang

2016-03-01

Sequence alignment is the central process for sequence analysis, in which raw sequencing data are mapped to a reference genome. The large amount of data generated by NGS is far beyond the processing capabilities of existing alignment tools. Consequently, sequence alignment becomes the bottleneck of sequence analysis. Intensive computing power is required to address this challenge. Intel recently announced the MIC coprocessor, which can provide massive computing power. The Tianhe-2 is the world's fastest supercomputer and is now equipped with three MIC coprocessors per compute node. A key feature of sequence alignment is that different reads are independent. Considering this property, we proposed a MIC-oriented three-level parallelization strategy to speed up BWA, a widely used sequence alignment tool, and developed our ultrafast parallel sequence aligner: B-MIC. B-MIC contains three levels of parallelization: firstly, parallelization of data IO and read alignment by a three-stage parallel pipeline; secondly, parallelization enabled by MIC coprocessor technology; thirdly, inter-node parallelization implemented by MPI. In this paper, we demonstrate that B-MIC outperforms BWA by a combination of those techniques using an Inspur NF5280M server and the Tianhe-2 supercomputer. To the best of our knowledge, B-MIC is the first sequence alignment tool to run on Intel MIC, and it can achieve more than fivefold speedup over the original BWA while maintaining the alignment precision.
Simplified Parallel Domain Traversal

DOE Office of Scientific and Technical Information (OSTI.GOV)

Erickson III, David J

2011-01-01

Many data-intensive scientific analysis techniques require global domain traversal, which over the years has been a bottleneck for efficient parallelization across distributed-memory architectures. Inspired by MapReduce and other simplified parallel programming approaches, we have designed DStep, a flexible system that greatly simplifies efficient parallelization of domain traversal techniques at scale. In order to deliver both simplicity to users as well as scalability on HPC platforms, we introduce a novel two-tiered communication architecture for managing and exploiting asynchronous communication loads. We also integrate our design with advanced parallel I/O techniques that operate directly on native simulation output. We demonstrate DStep by performing teleconnection analysis across ensemble runs of terascale atmospheric CO2 and climate data, and we show scalability results on up to 65,536 IBM BlueGene/P cores.
Runtime support for parallelizing data mining algorithms

NASA Astrophysics Data System (ADS)

Jin, Ruoming; Agrawal, Gagan

2002-03-01

With recent technological advances, shared memory parallel machines have become more scalable, and offer large main memories and high bus bandwidths. They are emerging as good platforms for data warehousing and data mining. In this paper, we focus on shared memory parallelization of data mining algorithms. We have developed a series of techniques for parallelization of data mining algorithms, including full replication, full locking, fixed locking, optimized full locking, and cache-sensitive locking. Unlike previous work on shared memory parallelization of specific data mining algorithms, all of our techniques apply to a large number of common data mining algorithms. In addition, we propose a reduction-object based interface for specifying a data mining algorithm. We show how our runtime system can apply any of the techniques we have developed starting from a common specification of the algorithm.
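Of the techniques listed, full replication is the simplest to illustrate: each thread updates a private copy of the reduction object, and the copies are merged once at the end, so no locking is needed during the scan. The sketch below is a hedged illustration and not the authors' runtime system; the item-counting task, the `Counter` used as the reduction object, and the function names are assumptions. In CPython the threads will not give a real speedup because of the global interpreter lock, so the code shows the structure only.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence


def local_reduce(chunk: Sequence[Sequence[str]]) -> Counter:
    """Update a thread-private reduction object (item counts) for one chunk."""
    local = Counter()
    for transaction in chunk:
        for item in transaction:
            local[item] += 1
    return local


def full_replication_count(transactions: List[List[str]], n_threads: int = 4) -> Counter:
    """Count item occurrences with one private reduction object per thread."""
    # Round-robin partition of the data among threads.
    chunks = [transactions[i::n_threads] for i in range(n_threads)]
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        partials = list(pool.map(local_reduce, chunks))
    # Merge phase: combine the replicated reduction objects.
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


# Hypothetical transaction data.
data = [["a", "b"], ["a"], ["b", "c"], ["a", "c"]]
print(full_replication_count(data, n_threads=2))
```

The trade-off this sketch makes visible is memory against synchronization: replication avoids locks entirely but needs one copy of the reduction object per thread, which is what the locking variants named in the abstract avoid.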

Assessing Digital Humanities Tools: Use of Scalar at a Research University

ERIC Educational Resources Information Center

Tracy, Daniel G.

2016-01-01

As librarians increasingly support digital publication platforms, they must also understand the user experience of these tools. This case study assesses use of Scalar, a digital humanities publishing platform for media-rich projects, at the University of Illinois at Urbana-Champaign. Based on a survey, interviews, and content analysis, the study…
CHRONIOUS: a wearable platform for monitoring and management of patients with chronic disease.

PubMed

Bellos, Christos; Papadopoulos, Athanassios; Rosso, Roberto; Fotiadis, Dimitrios I

2011-01-01

The CHRONIOUS system has been developed based on an open architecture design that consists of a set of subsystems which interact in order to provide all the needed services to chronic disease patients. An advanced multi-parametric expert system is being implemented that fuses information effectively from various sources using intelligent techniques. Data are collected by sensors of a body network controlling vital signals while additional tools record dietary habits and plans, drug intake, environmental and biochemical parameters and activity data. The CHRONIOUS platform provides guidelines and standards for the future generations of "chronic disease management systems" and facilitates sophisticated monitoring tools. In addition, an ontological information retrieval system is being delivered satisfying the necessities for up-to-date clinical information of Chronic Obstructive Pulmonary Disease (COPD) and Chronic Kidney Disease (CKD). Moreover, support tools are being embedded in the system, such as the Mental Tools for the monitoring of patient mental health status. The integrated platform provides real-time patient monitoring and supervision, both indoors and outdoors, and represents a generic platform for the management of various chronic diseases.
OpenNEX, a private-public partnership in support of the national climate assessment

NASA Astrophysics Data System (ADS)

Nemani, R. R.; Wang, W.; Michaelis, A.; Votava, P.; Ganguly, S.

2016-12-01

The NASA Earth Exchange (NEX) is a collaborative computing platform that has been developed with the objective of bringing scientists together with the software tools, massive global datasets, and supercomputing resources necessary to accelerate research in Earth systems science and global change. NEX is funded as an enabling tool for sustaining the national climate assessment. Over the past five years, researchers have used the NEX platform and produced a number of data sets highly relevant to the National Climate Assessment. These include high-resolution climate projections using different downscaling techniques and trends in historical climate from satellite data. To enable a broader community to exploit the above datasets, the NEX team partnered with public cloud providers to create the OpenNEX platform. OpenNEX provides ready access to NEX data holdings on a number of public cloud platforms along with pertinent analysis tools and workflows in the form of Machine Images and Docker Containers, as well as lectures and tutorials by experts. We will showcase some of the applications of OpenNEX data and tools by the community on Amazon Web Services, Google Cloud and the NEX Sandbox.
Performance Comparison of a Set of Periodic and Non-Periodic Tridiagonal Solvers on SP2 and Paragon Parallel Computers

NASA Technical Reports Server (NTRS)

Sun, Xian-He; Moitra, Stuti

1996-01-01

Various tridiagonal solvers have been proposed in recent years for different parallel platforms. In this paper, the performance of three tridiagonal solvers, namely, the parallel partition LU algorithm, the parallel diagonal dominant algorithm, and the reduced diagonal dominant algorithm, is studied. These algorithms are designed for distributed-memory machines and are tested on an Intel Paragon and an IBM SP2 machine. Measured results are reported in terms of execution time and speedup. Analytical studies are conducted for different communication topologies and for different tridiagonal systems. The measured results match the analytical results closely. In addition to addressing implementation issues, performance considerations such as problem sizes and models of speedup are also discussed.
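For orientation, the sketch below shows the serial Thomas algorithm, the standard sequential method for a non-periodic tridiagonal system and a common baseline when measuring the speedup of parallel solvers. It is not one of the three parallel algorithms studied in the paper; the function name and argument layout are assumptions made for illustration.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a non-periodic tridiagonal system with the serial Thomas algorithm.

    lower[i] is the sub-diagonal entry of row i (lower[0] is unused),
    upper[i] is the super-diagonal entry of row i (upper[-1] is unused).
    Assumes no pivoting is needed, e.g. a diagonally dominant matrix.
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    # Forward elimination: remove the sub-diagonal row by row.
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    # Back substitution.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x
```

The forward sweep carries a dependency from each row to the next, which is the reason parallel solvers restructure the computation rather than simply distributing this loop.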
Toward 4D Nanoprinting with Tip-Induced Organic Surface Reactions.

PubMed

Carbonell, Carlos; Braunschweig, Adam B

2017-02-21

Future nanomanufacturing tools will prepare organic materials with complex four-dimensional (4D) structure, where the position (x, y, z) and chemical composition within a volume is controlled with sub-1 μm spatial resolution. Such tools could produce substrates that mimic biological interfaces, like the cell surface or the extracellular matrix, whose topology and chemical complexity combine to direct some of the most sophisticated biological events. The control of organic materials at the nanoscale-level of spatial resolution could revolutionize the assembly of next generation optical and electronic devices or substrates for tissue engineering or enable fundamental biological or material science investigations. Organic chemistry provides the requisite control over the orientation and position of matter within a nanoscale reference frame through the formation of new covalent bonds. Several challenges however preclude the integration of organic chemistry with conventional nanomanufacturing approaches, namely most nanolithography platforms would denature or destroy delicate organic and biologically active matter, confirming covalent bond formation at interfaces remains difficult, and finally, only a small handful of the reactions used to transform molecules in solution have been validated on surfaces. Thus, entirely new approaches, where organic transformations and spatial control are considered equally important contributors, are needed to create 4D organic nanoprinting platforms. This Account describes efforts from our group to reconcile nanolithography, and specifically massively parallel scanning probe lithography (SPL), with organic chemistry to further the goal of 4D organic nanoprinting. Massively parallel SPL involves arrays of elastomeric pyramids mounted onto piezoelectric actuators, and creates patterns with feature diameters below 50 nm by using the pyramidal tips for either the direct deposition of ink or the localized delivery of energy to a surface. 
While other groups have focused on tip and array architectures, our efforts have been on exploring their use for localizing organic chemistry on surfaces with nanoscale spatial resolution in 3D. Herein we describe the use of massively parallel SPL to create covalently immobilized patterns of organic materials using thermal, catalytic, photochemical, and force-accelerated reactions. In doing so, we have developed a high-throughput protocol for confirming interfacial bond formation. These efforts have resulted in new opportunities for the preparation of glycan arrays, novel approaches for covalently patterning graphene, and a 3D nanoprinter by combining photochemical brush polymerizations with SPL. Achieving true 4D nanoprinting involves advances in surface chemistry and instrumentation development, and to this end 4D micropatterns were produced in a microfluidic photoreactor that can position polymers composed of different monomers within micrometer proximity. A substantial gap remains, however, between these current technologies and the future's 4D nanomanufacturing tools, but the marriage of SPL with organic chemistry is an important step toward this goal. As this field continues to mature we can expect bottom-up 4D nanomanufacturing to begin supplanting conventional top-down strategies for preparing electronics, bioarrays, and functional substrates. In addition, these new printing technologies may enable the preparation of synthetic targets, such as artificial biological interfaces, with a level of organic sophistication that is entirely unachievable using existing technologies.
[Acute lymphoblastic leukemia: a genomic perspective].

PubMed

Jiménez-Morales, Silvia; Hidalgo-Miranda, Alfredo; Ramírez-Bello, Julián

In parallel to the human genome sequencing project, several technological platforms have been developed that let us gain insight into the genome structure of human entities, as well as evaluate their usefulness in the clinical approach of the patient. Thus, in acute lymphoblastic leukemia (ALL), the most common pediatric malignancy, genomic tools promise to be useful to detect patients at high risk of relapse, either at diagnosis or during treatment (minimal residual disease), and they also increase the possibility to identify cases at risk of adverse reactions to chemotherapy. Therefore, the physician could offer patient-tailored therapeutic schemes. A clear example of the useful genomic tools is the identification of single nucleotide polymorphisms (SNPs) in the thiopurine methyl transferase (TPMT) gene, where the presence of two null alleles (homozygous or compound heterozygous) indicates the need to reduce the dose of mercaptopurine by up to 90% to avoid toxic effects which could lead to the death of the patient. In this review, we provide an overview of the genomic perspective of ALL, describing some strategies that contribute to the identification of biomarkers with potential clinical application. Copyright © 2017 Hospital Infantil de México Federico Gómez. Publicado por Masson Doyma México S.A. All rights reserved.
Launching GUPPI: the Green Bank Ultimate Pulsar Processing Instrument

NASA Astrophysics Data System (ADS)

DuPlain, Ron; Ransom, Scott; Demorest, Paul; Brandt, Patrick; Ford, John; Shelton, Amy L.

2008-08-01

The National Radio Astronomy Observatory (NRAO) is launching the Green Bank Ultimate Pulsar Processing Instrument (GUPPI), a prototype flexible digital signal processor designed for pulsar observations with the Robert C. Byrd Green Bank Telescope (GBT). GUPPI uses field programmable gate array (FPGA) hardware and design tools developed by the Center for Astronomy Signal Processing and Electronics Research (CASPER) at the University of California, Berkeley. The NRAO has been concurrently developing GUPPI software and hardware using minimal software resources. The software handles instrument monitor and control, data acquisition, and hardware interfacing. GUPPI is currently an expert-only spectrometer, but supports future integration with the full GBT production system. The NRAO was able to take advantage of the unique flexibility of the CASPER FPGA hardware platform, develop hardware and software in parallel, and build a suite of software tools for monitoring, controlling, and acquiring data with a new instrument over a short timeline of just a few months. The NRAO interacts regularly with CASPER and its users, and GUPPI stands as an example of what reconfigurable computing and open-source development can do for radio astronomy. GUPPI is modular for portability, and the NRAO provides the results of development as an open-source resource.
Continuum approach for aerothermal flow through ablative porous material using discontinuous Galerkin discretization.

NASA Astrophysics Data System (ADS)

Schrooyen, Pierre; Chatelain, Philippe; Hillewaert, Koen; Magin, Thierry E.

2014-11-01

The atmospheric entry of spacecraft presents several challenges in simulating the aerothermal flow around the heat shield. Predicting an accurate heat flux is a complex task, especially regarding the interaction between the flow in the free stream and the erosion of the thermal protection material. To capture this interaction, a continuum approach is developed to go progressively from the region fully occupied by fluid to a receding porous medium. The volume-averaged Navier-Stokes equations are used to model both phases in the same computational domain considering a single set of conservation laws. The porosity is itself a variable of the computation, allowing volumetric ablation to be taken into account through adequate source terms. This approach is implemented within a computational tool based on a high-order discontinuous Galerkin discretization. The multi-dimensional tool has already been validated and has proven its efficient parallel implementation. Within this platform, a fully implicit method was developed to simulate multi-phase reacting flows. Numerical results to verify and validate the methodology are considered within this work. Interactions between the flow and the ablated geometry are also presented. Supported by Fund for Research Training in Industry and Agriculture.
Solution of task related to control of swiss-type automatic lathe to get planes parallel to part axis

NASA Astrophysics Data System (ADS)

Tabekina, N. A.; Chepchurov, M. S.; Evtushenko, E. I.; Dmitrievsky, B. S.

2018-05-01

This work addresses the automation of a machining process, namely turning, to produce parts with planes parallel to the part's axis of rotation without using special tools. According to the results, equipping the lathe with a high-speed electromechanical drive to control its operative movements makes it possible to produce planes parallel to the part axis. The method is based on a mathematical model, presented as a functional dependency between the conveying velocity of the driven element and time, which describes the operative movements of the lathe along the entire tool path. Using the model of tool movement, it was found that the conveying velocity varies from its maximum value to zero, which allows the drive to be reversed. A scheme for placing the tool relative to the workpiece is proposed for unidirectional movement of the driven element at high conveying velocity. The control method can be used on CNC machines to produce geometrically complex parts on a lathe without using special milling tools.
Heterogeneous computing architecture for fast detection of SNP-SNP interactions.

PubMed

Sluga, Davor; Curk, Tomaz; Zupan, Blaz; Lotric, Uros

2014-06-25

The extent of data in a typical genome-wide association study (GWAS) poses considerable computational challenges to software tools for gene-gene interaction discovery. Exhaustive evaluation of all interactions among hundreds of thousands to millions of single nucleotide polymorphisms (SNPs) may require weeks or even months of computation. Massively parallel hardware within a modern Graphic Processing Unit (GPU) and Many Integrated Core (MIC) coprocessors can shorten the run time considerably. While the utility of GPU-based implementations in bioinformatics has been well studied, MIC architecture has been introduced only recently and may provide a number of comparative advantages that have yet to be explored and tested. We have developed a heterogeneous, GPU and Intel MIC-accelerated software module for SNP-SNP interaction discovery to replace the previously single-threaded computational core in the interactive web-based data exploration program SNPsyn. We report on differences between these two modern massively parallel architectures and their software environments. Their utility resulted in an order of magnitude shorter execution times when compared to the single-threaded CPU implementation. GPU implementation on a single Nvidia Tesla K20 runs twice as fast as that for the MIC architecture-based Xeon Phi P5110 coprocessor, but also requires considerably more programming effort. General purpose GPUs are a mature platform with large amounts of computing power capable of tackling inherently parallel problems, but can prove demanding for the programmer. On the other hand, the new MIC architecture, albeit lacking in performance, reduces the programming effort and makes up for it with a more general architecture suitable for a wider range of problems.
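The cost of exhaustive evaluation follows from simple counting: n SNPs give n(n-1)/2 unordered pairs. The Python sketch below computes that count and a rough run time; the SNP count of 500,000 and any per-pair throughput passed to `runtime_days` are illustrative assumptions, not figures from the paper.

```python
from math import comb


def snp_pair_count(n_snps):
    """Number of unordered SNP-SNP pairs to evaluate: n * (n - 1) / 2."""
    return comb(n_snps, 2)


def runtime_days(n_snps, pairs_per_second):
    """Rough run time in days for a given (assumed) evaluation throughput."""
    return snp_pair_count(n_snps) / pairs_per_second / 86_400


if __name__ == "__main__":
    # 500,000 SNPs is an illustrative size within the range the abstract mentions.
    print(snp_pair_count(500_000))  # 124999750000 pairs
```

Because the pair count grows quadratically, a tenfold increase in the number of SNPs means roughly a hundredfold increase in work, which is why the pairwise tests are a natural fit for massively parallel hardware.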
Heterogeneous computing architecture for fast detection of SNP-SNP interactions

PubMed Central

2014-01-01

Background: The extent of data in a typical genome-wide association study (GWAS) poses considerable computational challenges to software tools for gene-gene interaction discovery. Exhaustive evaluation of all interactions among hundreds of thousands to millions of single nucleotide polymorphisms (SNPs) may require weeks or even months of computation. Massively parallel hardware within a modern Graphic Processing Unit (GPU) and Many Integrated Core (MIC) coprocessors can shorten the run time considerably. While the utility of GPU-based implementations in bioinformatics has been well studied, MIC architecture has been introduced only recently and may provide a number of comparative advantages that have yet to be explored and tested. Results: We have developed a heterogeneous, GPU and Intel MIC-accelerated software module for SNP-SNP interaction discovery to replace the previously single-threaded computational core in the interactive web-based data exploration program SNPsyn. We report on differences between these two modern massively parallel architectures and their software environments. Their utility resulted in an order of magnitude shorter execution times when compared to the single-threaded CPU implementation. GPU implementation on a single Nvidia Tesla K20 runs twice as fast as that for the MIC architecture-based Xeon Phi P5110 coprocessor, but also requires considerably more programming effort. Conclusions: General purpose GPUs are a mature platform with large amounts of computing power capable of tackling inherently parallel problems, but can prove demanding for the programmer. On the other hand, the new MIC architecture, albeit lacking in performance, reduces the programming effort and makes up for it with a more general architecture suitable for a wider range of problems. PMID:24964802
Geospatial Service Platform for Education and Research

NASA Astrophysics Data System (ADS)

Gong, J.; Wu, H.; Jiang, W.; Guo, W.; Zhai, X.; Yue, P.

2014-04-01

We propose to advance scientific understanding through applications of geospatial service platforms, which can help students and researchers investigate various scientific problems in a Web-based environment with online tools and services. The platform also offers capabilities for sharing data, algorithms, and problem-solving knowledge. To fulfil this goal, the paper introduces a new course, named "Geospatial Service Platform for Education and Research", to be held in the ISPRS summer school in May 2014 at Wuhan University, China. The course will share cutting-edge achievements of a geospatial service platform with students from different countries, and train them with online tools from the platform for geospatial data processing and scientific research. The content of the course includes the basic concepts of geospatial Web services, service-oriented architecture, geoprocessing modelling and chaining, and problem-solving using geospatial services. In particular, the course will offer a geospatial service platform for hands-on practice. There will be three kinds of exercises in the course: geoprocessing algorithm sharing through service development, geoprocessing modelling through service chaining, and online geospatial analysis using geospatial services. Students can choose one of them, depending on their interests and background. Existing geoprocessing services from OpenRS and GeoPW will be introduced. The summer course offers two service chaining tools, GeoChaining and GeoJModelBuilder, as instances to explain specifically the method for building service chains in view of different demands. After this course, students can learn how to use online service platforms for geospatial resource sharing and problem-solving.
Chromatography process development in the quality by design paradigm I: Establishing a high-throughput process development platform as a tool for estimating "characterization space" for an ion exchange chromatography step.

PubMed

Bhambure, R; Rathore, A S

2013-01-01

This article describes the development of a high-throughput process development (HTPD) platform for developing chromatography steps. An assessment of the platform as a tool for establishing the "characterization space" for an ion exchange chromatography step has been performed by using design of experiments. Case studies involving use of a biotech therapeutic, granulocyte colony-stimulating factor, have been used to demonstrate the performance of the platform. We discuss the various challenges that arise when working at such small volumes along with the solutions that we propose to alleviate these challenges to make the HTPD data suitable for empirical modeling. Further, we have also validated the scalability of this platform by comparing the results from the HTPD platform (2 and 6 μL resin volumes) against those obtained at the traditional laboratory scale (resin volume, 0.5 mL). We find that after integration of the proposed correction factors, the HTPD platform is capable of performing the process optimization studies at 170-fold higher productivity. The platform is capable of providing semi-quantitative assessment of the effects of the various input parameters under consideration. We think that a platform such as the one presented is an excellent tool for examining the "characterization space" and reducing the extensive experimentation at the traditional lab scale that is otherwise required for establishing the "design space." Thus, this platform will specifically aid in successful implementation of quality by design in biotech process development. This is especially significant in view of the constraints with respect to time and resources that the biopharma industry faces today. Copyright © 2013 American Institute of Chemical Engineers.
Ground and Aerial Digital Documentation of Cultural Heritage: Providing Tools for 3d Exploitation of Archaeological Data

NASA Astrophysics Data System (ADS)

Cantoro, G.

2017-02-01

Archaeology is by its nature strictly connected with the physical landscape and as such it explores the inter-relations of individuals with the places in which they live and the nature that surrounds them. Since its earliest stages, archaeology demonstrated its permeability to scientific methods and innovative techniques or technologies. Archaeologists were indeed among the first to adopt GIS platforms on a large scale (for almost three decades now) and are now among the most demanding customers for emerging technologies such as digital photogrammetry and drone-aided aerial photography. This paper aims at presenting case studies where the "3D approach" can be critically analysed and compared with more traditional means of documentation. The spotlight is directed towards the benefits of a specifically designed platform for users to access the 3D point-clouds and explore their characteristics. Besides simple measuring and editing tools, models are presented in their actual context and location, with historical and archaeological information provided on the side. As the final step of a parallel project on geo-referencing and making available a large archive of aerial photographs, 3D models derived from photogrammetric processing of images have been uploaded and linked to photo-footprint polygons. Of great importance in such a context is the possibility to interchange the point-cloud colours with satellite imagery from OpenLayers. This approach makes it possible to explore different landscape configurations due to time-changes with simple clicks. In these cases, photogrammetry or 3D laser scanning replaced, complemented or integrated legacy documentation, creating at once a new set of information for forthcoming research and ideally new discoveries.
Characterization of three-dimensional cancer cell migration in mixed collagen-Matrigel scaffolds using microfluidics and image analysis

PubMed Central

Maška, Martin; Ederra, Cristina; Peláez, Rafael; Morales, Xabier; Muñoz-Arrieta, Gorka; Mujika, Maite; Kozubek, Michal; Muñoz-Barrutia, Arrate; Rouzaut, Ana; Arana, Sergio; Garcia-Aznar, José Manuel; Ortiz-de-Solorzano, Carlos

2017-01-01

Microfluidic devices are becoming mainstream tools to recapitulate in vitro the behavior of cells and tissues. In this study, we use microfluidic devices filled with hydrogels of mixed collagen-Matrigel composition to study the migration of lung cancer cells under different cancer invasion microenvironments. We present the design of the microfluidic device, characterize the hydrogels morphologically and mechanically and use quantitative image analysis to measure the migration of H1299 lung adenocarcinoma cancer cells in different experimental conditions. Our results show the plasticity of lung cancer cell migration, which turns from mesenchymal in collagen-only matrices to lobopodial in collagen-Matrigel matrices that approximate the interface between a disrupted basement membrane and the underlying connective tissue. Our quantification of migration speed confirms a biphasic role of Matrigel. At low concentration, Matrigel facilitates migration, most probably by providing a supportive and growth factor retaining environment. At high concentration, Matrigel slows down migration, possibly due to excessive attachment. Finally, we show that antibody-based integrin blockade promotes a change in migration phenotype from mesenchymal or lobopodial to amoeboid and analyze the effect of this change on migration dynamics with regard to the structure of the matrix. In summary, we describe and characterize a robust microfluidic platform and a set of software tools that can be used to study lung cancer cell migration under different microenvironments and experimental conditions. This platform could be used in future studies, thus benefitting from the advantages introduced by microfluidic devices: precise control of the environment, excellent optical properties, parallelization for high throughput studies and efficient use of therapeutic drugs. PMID:28166248
Geolokit: An interactive tool for visualising and exploring geoscientific data in Google Earth

NASA Astrophysics Data System (ADS)

Triantafyllou, Antoine; Watlet, Arnaud; Bastin, Christophe

2017-10-01

Virtual globes have been developed to showcase different types of data combining a digital elevation model and basemaps of high resolution satellite imagery. Hence, they became a standard to share spatial data and information, although they suffer from a lack of toolboxes dedicated to the formatting of large geoscientific datasets. From this perspective, we developed Geolokit: a free and lightweight software that allows geoscientists - and every scientist working with spatial data - to import their data (e.g., sample collections, structural geology, cross-sections, field pictures, georeferenced maps), to handle and to transcribe them to Keyhole Markup Language (KML) files. KML files are then automatically opened in the Google Earth virtual globe and the spatial data accessed and shared. Geolokit comes with a large number of dedicated tools that can process and display: (i) multi-points data, (ii) scattered data interpolations, (iii) structural geology features in 2D and 3D, (iv) rose diagrams, stereonets and dip-plunge polar histograms, (v) cross-sections and oriented rasters, (vi) georeferenced field pictures, (vii) georeferenced maps and projected gridding. Therefore, together with Geolokit, Google Earth becomes not only a powerful georeferenced data viewer but also a stand-alone work platform. The toolbox (available online at http://www.geolokit.org) is written in Python, a high-level, cross-platform programming language and is accessible through a graphical user interface, designed to run in parallel with Google Earth, through a workflow that requires no additional third party software. Geolokit features are demonstrated in this paper using typical datasets gathered from two case studies illustrating its applicability at multiple scales of investigation: a petro-structural investigation of the Ile d'Yeu orthogneissic unit (Western France) and data collection of the Mariana oceanic subduction zone (Western Pacific).
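As a rough idea of what transcribing point data to KML involves, the following Python sketch writes a list of sample locations as KML placemarks using only the standard library. It is not Geolokit's code; the function name `samples_to_kml`, the tuple layout, and the sample values are hypothetical.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"  # KML 2.2 namespace


def samples_to_kml(samples):
    """Return a KML string for (name, longitude, latitude) tuples in decimal degrees."""
    ET.register_namespace("", KML_NS)  # serialize KML as the default namespace
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for name, lon, lat in samples:
        placemark = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML orders coordinates as longitude,latitude[,altitude].
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat}"
    return ET.tostring(kml, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical sample location, for illustration only.
    print(samples_to_kml([("Sample-01", -2.35, 46.71)]))
```

Saving the returned string to a `.kml` file gives a document that a virtual globe such as Google Earth can open.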
AEGIS: a wildfire prevention and management information system

NASA Astrophysics Data System (ADS)

Kalabokidis, K.; Ager, A.; Finney, M.; Athanasis, N.; Palaiologou, P.; Vasilakos, C.

2015-10-01

A Web-GIS wildfire prevention and management platform (AEGIS) was developed as an integrated and easy-to-use decision support tool (http://aegis.aegean.gr). The AEGIS platform assists with early fire warning, fire planning, fire control and coordination of firefighting forces by providing access to information that is essential for wildfire management. Databases were created with spatial and non-spatial data to support key system functionalities. Updated land use/land cover maps were produced by combining field inventory data with high resolution multispectral satellite images (RapidEye) to be used as inputs in fire propagation modeling with the Minimum Travel Time algorithm. End users provide a minimum number of inputs such as fire duration, ignition point and weather information to conduct a fire simulation. AEGIS offers three types of simulations; i.e. single-fire propagations, conditional burn probabilities and at the landscape-level, similar to the FlamMap fire behavior modeling software. Artificial neural networks (ANN) were utilized for wildfire ignition risk assessment based on various parameters, training methods, activation functions, pre-processing methods and network structures. The combination of ANNs and expected burned area maps produced an integrated output map for fire danger prediction. The system also incorporates weather measurements from remote automatic weather stations and weather forecast maps. The structure of the algorithms relies on parallel processing techniques (i.e. High Performance Computing and Cloud Computing) that ensure computational power and speed. All AEGIS functionalities are accessible to authorized end users through a web-based graphical user interface. An innovative mobile application, AEGIS App, acts as a complementary tool to the web-based version of the system.
Open | SpeedShop: An Open Source Infrastructure for Parallel Performance Analysis

DOE PAGES

Schulz, Martin; Galarowicz, Jim; Maghrak, Don; ...

2008-01-01

Over the last decades a large number of performance tools has been developed to analyze and optimize high performance applications. Their acceptance by end users, however, has been slow: each tool alone is often limited in scope and comes with widely varying interfaces and workflow constraints, requiring different changes in the often complex build and execution infrastructure of the target application. We started the Open | SpeedShop project about 3 years ago to overcome these limitations and provide efficient, easy to apply, and integrated performance analysis for parallel systems. Open | SpeedShop has two different faces: it provides an interoperable tool set covering the most common analysis steps as well as a comprehensive plugin infrastructure for building new tools. In both cases, the tools can be deployed to large scale parallel applications using DPCL/Dyninst for distributed binary instrumentation. Further, all tools developed within or on top of Open | SpeedShop are accessible through multiple fully equivalent interfaces including an easy-to-use GUI as well as an interactive command line interface reducing the usage threshold for those tools.
AZTEC. Parallel Iterative method Software for Solving Linear Systems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hutchinson, S.; Shadid, J.; Tuminaro, R.

1995-07-01

AZTEC is an iterative library that greatly simplifies the parallelization process when solving the linear systems of equations Ax=b where A is a user supplied n X n sparse matrix, b is a user supplied vector of length n and x is a vector of length n to be computed. AZTEC is intended as a software tool for users who want to avoid cumbersome parallel programming details but who have large sparse linear systems which require an efficiently utilized parallel processing system. A collection of data transformation tools is provided that allows for easy creation of distributed sparse unstructured matrices for parallel solutions.
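To make the problem statement concrete, the sketch below solves a small sparse system Ax=b with the conjugate gradient method, one classic iterative method for such systems. It is a serial illustration only and does not use or reproduce the AZTEC API; the row-dictionary matrix format and the function names are assumptions, and the method as written requires A to be symmetric positive definite.

```python
def sparse_matvec(rows, x):
    """Multiply a sparse matrix by x; rows[i] is a dict {column: value} of nonzeros."""
    return [sum(v * x[j] for j, v in row.items()) for row in rows]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(rows, b, tol=1e-10, max_iter=1000):
    """Solve Ax = b for a symmetric positive definite sparse matrix A."""
    x = [0.0] * len(b)
    r = list(b)  # residual b - A x for the initial guess x = 0
    p = list(r)  # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:  # converged: residual norm is small enough
            break
        ap = sparse_matvec(rows, p)
        alpha = rs_old / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x
```

In a distributed setting the matrix rows and vectors are split across processes, so the matrix-vector product and the dot products become the communication steps that a library can hide from the user.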
The "Vsoil Platform" : a tool to integrate the various physical, chemical and biological processes contributing to the soil functioning at the local scale.

NASA Astrophysics Data System (ADS)

Lafolie, François; Cousin, Isabelle; Mollier, Alain; Pot, Valérie; Moitrier, Nicolas; Balesdent, Jérome; Bruckler, Laurent; Moitrier, Nathalie; Nouguier, Cédric; Richard, Guy

2014-05-01

Models describing soil functioning are valuable tools for addressing challenging issues related to agricultural production, soil protection or biogeochemical cycles. Coupling models that address different scientific fields is required in order to develop numerical tools able to simulate the complex interactions and feedbacks occurring within a soil profile in interaction with climate and human activities. We present here a component-based modelling platform named "VSoil", which aims at designing, developing, implementing and coupling numerical representations of biogeochemical and physical processes in soil, from the aggregate to the profile scales. The platform consists of four software components: i) Vsoil_Processes, dedicated to the conceptual description of processes and of their inputs and outputs; ii) Vsoil_Modules, devoted to the development of numerical representations of elementary processes as modules; iii) Vsoil_Models, which permits the coupling of modules to create models; and iv) Vsoil_Player, for running the model and the primary analysis of results. The platform is designed to be a collaborative tool, helping scientists to share not only their models, but also the scientific knowledge on which the models are built. The platform is based on the idea that processes of any kind can be described and characterized by their inputs (state variables required) and their outputs. The links between the processes are automatically detected by the platform software. For any process, several numerical representations (modules) can be developed and made available to platform users. When developing modules, the platform takes care of many aspects of the development task so that the user can focus on numerical calculations. Fortran2008 and C++ are the supported languages and existing codes can be easily incorporated into platform modules. Building a model from available modules simply requires selecting the processes being accounted for and, for each process, a module. 
During this task, the platform displays available modules and checks the compatibility between the modules. The model (main program) is automatically created when compatible modules have been selected for all the processes. A GUI is automatically generated to help the user provide parameters and initial situations. Numerical results can be immediately visualized, archived and exported. The platform also provides facilities to carry out sensitivity analysis. Parameter estimation and links with databases are being developed. The platform can be freely downloaded from the web site (http://www.inra.fr/sol_virtuel/) with a set of processes, variables, modules and models. However, it is designed so that any user can add their own components. These add-ons can be shared with co-workers by means of an export/import mechanism using e-mail. The add-ons can also be made available to the whole community of platform users when developers ask for it. A filtering tool is available to explore the content of the platform (processes, variables, modules, models).
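The idea that processes are characterized by their inputs and outputs, and that links between them can be detected automatically, can be sketched in a few lines of Python. This is a hypothetical illustration only: the `Process` class, the two helper functions and the process and variable names are assumptions made for the example and do not reflect VSoil's actual data model or file formats.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, List, Tuple


@dataclass(frozen=True)
class Process:
    """A process described only by the variables it needs and produces."""
    name: str
    inputs: FrozenSet[str]
    outputs: FrozenSet[str]


def detect_links(processes: List[Process]) -> List[Tuple[str, str, str]]:
    """Return (producer, variable, consumer) for every output that another process reads."""
    links = []
    for producer in processes:
        for consumer in processes:
            if producer is consumer:
                continue
            for var in sorted(producer.outputs & consumer.inputs):
                links.append((producer.name, var, consumer.name))
    return links


def missing_inputs(processes: List[Process],
                   external: Iterable[str]) -> Dict[str, List[str]]:
    """List, per process, the inputs that no process or external source provides."""
    available = set(external)
    for p in processes:
        available |= p.outputs
    return {p.name: sorted(p.inputs - available)
            for p in processes if p.inputs - available}


# Hypothetical processes and variables, chosen only for illustration.
processes = [
    Process("water_flow", frozenset({"rainfall"}), frozenset({"water_content"})),
    Process("heat_transport",
            frozenset({"water_content", "air_temperature"}),
            frozenset({"soil_temperature"})),
    Process("organic_matter_decay",
            frozenset({"water_content", "soil_temperature"}),
            frozenset({"co2_flux"})),
]

links = detect_links(processes)
gaps = missing_inputs(processes, external={"rainfall"})
```

Here `gaps` reports that `heat_transport` still needs `air_temperature`, which is the kind of compatibility check a model-assembly step would perform before generating a main program.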

Grace: A cross-platform micromagnetic simulator on graphics processing units

NASA Astrophysics Data System (ADS)

Zhu, Ru

2015-12-01

A micromagnetic simulator running on graphics processing units (GPUs) is presented. Unlike GPU implementations from other research groups, which predominantly run on NVidia's CUDA platform, this simulator is developed with C++ Accelerated Massive Parallelism (C++ AMP) and is hardware platform independent. It runs on GPUs from vendors including NVidia, AMD and Intel, and achieves a significant performance boost compared to previous central processing unit (CPU) simulators, up to two orders of magnitude. The simulator paves the way for running large micromagnetic simulations on both high-end workstations with dedicated graphics cards and low-end personal computers with integrated graphics cards, and is freely available to download.
Cross-platform validation and analysis environment for particle physics

NASA Astrophysics Data System (ADS)

Chekanov, S. V.; Pogrebnyak, I.; Wilbern, D.

2017-11-01

A multi-platform validation and analysis framework for public Monte Carlo simulation for high-energy particle collisions is discussed. The front-end of this framework uses the Python programming language, while the back-end is written in Java, which provides a multi-platform environment that can be run from a web browser and can easily be deployed at the grid sites. The analysis package includes all major software tools used in high-energy physics, such as Lorentz vectors, jet algorithms, histogram packages, graphic canvases, and tools for providing data access. This multi-platform software suite, designed to minimize OS-specific maintenance and deployment time, is used for online validation of Monte Carlo event samples through a web interface.
Portable parallel stochastic optimization for the design of aeropropulsion components

NASA Technical Reports Server (NTRS)

Sues, Robert H.; Rhodes, G. S.

1994-01-01

This report presents the results of Phase 1 research to develop a methodology for performing large-scale Multi-disciplinary Stochastic Optimization (MSO) for the design of aerospace systems ranging from aeropropulsion components to complete aircraft configurations. The current research recognizes that such design optimization problems are computationally expensive, and require the use of either massively parallel or multiple-processor computers. The methodology also recognizes that many operational and performance parameters are uncertain, and that uncertainty must be considered explicitly to achieve optimum performance and cost. The objective of this Phase 1 research was to initialize the development of an MSO methodology that is portable to a wide variety of hardware platforms, while achieving efficient, large-scale parallelism when multiple processors are available. The first effort in the project was a literature review of available computer hardware, as well as a review of portable, parallel programming environments. The second effort was to implement the MSO methodology for a problem using the portable parallel programming language, Parallel Virtual Machine (PVM). The third and final effort was to demonstrate the example on a variety of computers, including a distributed-memory multiprocessor, a distributed-memory network of workstations, and a single-processor workstation. Results indicate the MSO methodology can be applied effectively to large-scale aerospace design problems. Nearly perfect linear speedup was demonstrated for computation of optimization sensitivity coefficients on both a 128-node distributed-memory multiprocessor (the Intel iPSC/860) and a network of workstations (speedups of almost 19 times achieved for 20 workstations). Very high parallel efficiencies (75 percent for 31 processors and 60 percent for 50 processors) were also achieved for computation of aerodynamic influence coefficients on the Intel. 
Finally, the multi-level parallelization strategy that will be needed for large-scale MSO problems was demonstrated to be highly efficient. The same parallel code instructions were used on both platforms, demonstrating portability. There are many applications to which MSO can be applied, including NASA's High-Speed Civil Transport and advanced propulsion systems. The use of MSO will reduce design and development time and testing costs dramatically.
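The speedup and efficiency figures quoted in this abstract are related by the usual definition, parallel efficiency = speedup / number of processors. The short Python sketch below simply applies that definition to the numbers given; the function names are hypothetical, and the speedup of 19 is used as an upper bound because the abstract says "almost 19".

```python
def parallel_efficiency(speedup: float, processors: int) -> float:
    """Efficiency E = S / p, where S is the speedup on p processors."""
    return speedup / processors


def implied_speedup(efficiency: float, processors: int) -> float:
    """Speedup S = E * p implied by a reported efficiency."""
    return efficiency * processors


# Network of workstations: a speedup of almost 19 on 20 workstations.
workstation_efficiency = parallel_efficiency(19.0, 20)   # just under 95 percent

# Aerodynamic influence coefficients: reported efficiencies.
speedup_31 = implied_speedup(0.75, 31)   # roughly 23x on 31 processors
speedup_50 = implied_speedup(0.60, 50)   # roughly 30x on 50 processors
```

Read this way, going from 31 to 50 processors still raises the absolute speedup, but with diminishing returns per added processor.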
An Expert Assistant for Computer Aided Parallelization

NASA Technical Reports Server (NTRS)

Jost, Gabriele; Chun, Robert; Jin, Haoqiang; Labarta, Jesus; Gimenez, Judit

2004-01-01

The prototype implementation of an expert system was developed to assist the user in the computer aided parallelization process. The system interfaces to tools for automatic parallelization and performance analysis. By fusing static program structure information and dynamic performance analysis data the expert system can help the user to filter, correlate, and interpret the data gathered by the existing tools. Sections of the code that show poor performance and require further attention are rapidly identified and suggestions for improvements are presented to the user. In this paper we describe the components of the expert system and discuss its interface to the existing tools. We present a case study to demonstrate the successful use in full scale scientific applications.
GPU computing in medical physics: a review.

PubMed

Pratx, Guillem; Xing, Lei

2011-05-01

The graphics processing unit (GPU) has emerged as a competitive platform for computing massively parallel problems. Many computing applications in medical physics can be formulated as data-parallel tasks that exploit the capabilities of the GPU for reducing processing times. The authors review the basic principles of GPU computing as well as the main performance optimization techniques, and survey existing applications in three areas of medical physics, namely image reconstruction, dose calculation and treatment plan optimization, and image processing.
The Open Source Snowpack modelling ecosystem

NASA Astrophysics Data System (ADS)

Bavay, Mathias; Fierz, Charles; Egger, Thomas; Lehning, Michael

2016-04-01

Although a large number of numerical snow models are available, a few stand out as quite mature and widespread. One such model is SNOWPACK, the Open Source model that is developed at the WSL Institute for Snow and Avalanche Research SLF. Over the years, various tools have been developed around SNOWPACK in order to expand its use or to integrate additional features. Today, the model is part of a whole ecosystem that has evolved to offer both seamless integration and high modularity so each tool can easily be used outside the ecosystem. Many of these Open Source tools experience their own, autonomous development and are successfully used in their own right in other models and applications. There is Alpine3D, the spatially distributed version of SNOWPACK, that forces it with terrain-corrected radiation fields and optionally with blowing and drifting snow. This model can be used on parallel systems (either with OpenMP or MPI) and has been used for applications ranging from climate change to reindeer herding. There is the MeteoIO pre-processing library that offers fully integrated data access, data filtering, data correction, data resampling and spatial interpolations. This library is now used by several other models and applications. There is the SnopViz snow profile visualization library and application that supports both measured and simulated snow profiles (relying on the CAAML standard) as well as time series. This JavaScript application can be used standalone without any internet connection or served on the web together with simulation results. There is the OSPER data platform effort with a data management service (built on the Global Sensor Network (GSN) platform) as well as a data documenting system (metadata management as a wiki). 
There are several distributed hydrological models for mountainous areas in ongoing development that require very little information about the soil structure, based on the assumption that in steep terrain the most relevant information is contained in the Digital Elevation Model (DEM). Finally, there is a set of tools making up the operational chain to automatically run, monitor and publish SNOWPACK simulations for operational avalanche warning purposes. This tool chain has been developed with the aim of offering very low maintenance operation and very fast deployment, and of adapting easily to other avalanche services.
Which benefits in the use of a modeling platform : The VSoil example.

NASA Astrophysics Data System (ADS)

Lafolie, François; Cousin, Isabelle; Mollier, Alain; Pot, Valérie; Maron, Pierre-Alain; Moitrier, Nicolas; Nouguier, Cedric; Moitrier, Nathalie; Beudez, Nicolas

2015-04-01

In the environmental community, the need for coupling models and the associated knowledge has emerged recently. The development of a coupling tool or of a modeling platform is mainly driven by the necessity to create models accounting for multiple processes and to take into account the feedback between these processes. Models focusing on a restricted number of processes exist, and thus the coupling of these numerical tools appeared as an efficient and rapid means to fill the identified gaps. Several tools have been proposed: OMS3 (David et al. 2013); CSDMS framework (Peckham et al. 2013); the Open MI project developed within the frame of European Community (Open MI, 2011). However, what we should expect from a modeling platform could be more ambitious than only coupling existing numerical codes. We believe that we need to share easily not only our numerical representations but also the attached knowledge. We need to rapidly and easily develop complex models in order to have tools that bring responses to current issues on soil functioning and soil evolution within the frame of global change. We also need to share in a common frame our visions of soil functioning at various scales, on the one hand to strengthen our collaborations and, on the other hand, to make them visible to the other communities working on environmental issues. The presentation will briefly present the VSoil platform. The platform is able to manipulate concepts and numerical representations of these processes. The tool helps in assembling modules to create a model and automatically generates an executable code and a GUI. Potentialities of the tool will be illustrated on a few selected cases.
Instrumentation, performance visualization, and debugging tools for multiprocessors

NASA Technical Reports Server (NTRS)

Yan, Jerry C.; Fineman, Charles E.; Hontalas, Philip J.

1991-01-01

The need for computing power has forced a migration from serial computation on a single processor to parallel processing on multiprocessor architectures. However, without effective means to monitor (and visualize) program execution, debugging and tuning parallel programs become intractably difficult as program complexity increases with the number of processors. Research on performance evaluation tools for multiprocessors is being carried out at ARC. Besides investigating new techniques for instrumenting, monitoring, and presenting the state of parallel program execution in a coherent and user-friendly manner, prototypes of software tools are being incorporated into the run-time environments of various hardware testbeds to evaluate their impact on user productivity. Our current tool set, the Ames Instrumentation Systems (AIMS), incorporates features from various software systems developed in academia and industry. The execution of FORTRAN programs on the Intel iPSC/860 can be automatically instrumented and monitored. Performance data collected in this manner can be displayed graphically on workstations supporting X-Windows. We have successfully compared various parallel algorithms for computational fluid dynamics (CFD) applications in collaboration with scientists from the Numerical Aerodynamic Simulation Systems Division. By performing these comparisons, we show that performance monitors and debuggers such as AIMS are practical and can illuminate the complex dynamics that occur within parallel programs.
Does Knowing Lead to Doing in the Case of Learning Platforms?

ERIC Educational Resources Information Center

Underwood, Jean D. M.; Stiller, James

2014-01-01

There have been significant advances in educational technology but they have not always brought about measurable shifts in user behavior. This study examined the relationship between teachers' knowledge about a tool and their use of that tool. In many secondary schools use of a Learning Platform (LP) is no longer optional although the degree of…
Full speed ahead for software

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wolfe, A.

1986-03-10

Supercomputing software is moving into high gear, spurred by the rapid spread of supercomputers into new applications. The critical challenge is how to develop tools that will make it easier for programmers to write applications that take advantage of vectorizing in the classical supercomputer and the parallelism that is emerging in supercomputers and minisupercomputers. Writing parallel software is a challenge that every programmer must face because parallel architectures are springing up across the range of computing. Cray is developing a host of tools for programmers. Tools to support multitasking (in supercomputer parlance, multitasking means dividing up a single program to run on multiple processors) are high on Cray's agenda. On tap for multitasking is Premult, dubbed a microtasking tool. As a preprocessor for Cray's CFT77 FORTRAN compiler, Premult will provide fine-grain multitasking.
ProteoWizard: open source software for rapid proteomics tools development.

PubMed

Kessner, Darren; Chambers, Matt; Burke, Robert; Agus, David; Mallick, Parag

2008-11-01

The ProteoWizard software project provides a modular and extensible set of open-source, cross-platform tools and libraries. The tools perform proteomics data analyses; the libraries enable rapid tool creation by providing a robust, pluggable development framework that simplifies and unifies data file access, and performs standard proteomics and LCMS dataset computations. The library contains readers and writers of the mzML data format, which has been written using modern C++ techniques and design principles and supports a variety of platforms with native compilers. The software has been specifically released under the Apache v2 license to ensure it can be used in both academic and commercial projects. In addition to the library, we also introduce a rapidly growing set of companion tools whose implementation helps to illustrate the simplicity of developing applications on top of the ProteoWizard library. Cross-platform software that compiles using native compilers (i.e. GCC on Linux, MSVC on Windows and XCode on OSX) is available for download free of charge, at http://proteowizard.sourceforge.net. This website also provides code examples, and documentation. It is our hope the ProteoWizard project will become a standard platform for proteomics development; consequently, code use, contribution and further development are strongly encouraged.
Acceleration of Radiance for Lighting Simulation by Using Parallel Computing with OpenCL

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zuo, Wangda; McNeil, Andrew; Wetter, Michael

2011-09-06

We report on the acceleration of annual daylighting simulations for fenestration systems in the Radiance ray-tracing program. The algorithm was optimized to reduce both the redundant data input/output operations and the floating-point operations. To further accelerate the simulation speed, the calculation for matrix multiplications was implemented using parallel computing on a graphics processing unit. We used OpenCL, which is a cross-platform parallel programming language. Numerical experiments show that the combination of the above measures can speed up the annual daylighting simulations 101.7 times or 28.6 times when the sky vector has 146 or 2306 elements, respectively.
On efficiency of fire simulation realization: parallelization with greater number of computational meshes

NASA Astrophysics Data System (ADS)

Valasek, Lukas; Glasa, Jan

2017-12-01

Current fire simulation systems are capable of exploiting the high-performance computing (HPC) platforms available and of modelling fires efficiently in parallel. In this paper, the efficiency of a corridor fire simulation on an HPC cluster is discussed. The parallel MPI version of Fire Dynamics Simulator is used to test the efficiency of selected strategies for allocating the computational resources of the cluster when a greater number of computational cores is used. Simulation results indicate that if the number of cores used is not equal to a multiple of the total number of cluster node cores, there are allocation strategies which provide more efficient calculations.
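To make the notion of an allocation strategy concrete, the Python sketch below compares two simple ways of placing MPI processes on cluster nodes when the process count is not a multiple of the cores per node. Both strategies, the function names and the figure of 12 cores per node are assumptions chosen for illustration; the abstract does not state which strategies were tested or which performed best.

```python
import math
from typing import List


def packed_allocation(n_procs: int, cores_per_node: int) -> List[int]:
    """Fill each node completely before starting the next one."""
    full_nodes, remainder = divmod(n_procs, cores_per_node)
    return [cores_per_node] * full_nodes + ([remainder] if remainder else [])


def balanced_allocation(n_procs: int, cores_per_node: int) -> List[int]:
    """Use the same number of nodes, but spread processes as evenly as possible."""
    n_nodes = math.ceil(n_procs / cores_per_node)
    base, extra = divmod(n_procs, n_nodes)
    return [base + 1 if i < extra else base for i in range(n_nodes)]


# Hypothetical cluster with 12 cores per node and 30 MPI processes.
packed = packed_allocation(30, 12)       # processes per node, nodes filled in order
balanced = balanced_allocation(30, 12)   # processes per node, evenly spread
```

With these inputs the packed layout loads two nodes fully and one node half, while the balanced layout places ten processes on each of the three nodes. Which layout runs faster depends on memory bandwidth and communication patterns, so it has to be measured rather than assumed.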
A fully coupled method for massively parallel simulation of hydraulically driven fractures in 3-dimensions

DOE PAGES

Settgast, Randolph R.; Fu, Pengcheng; Walsh, Stuart D. C.; ...

2016-09-18

This study describes a fully coupled finite element/finite volume approach for simulating field-scale hydraulically driven fractures in three dimensions, using massively parallel computing platforms. The proposed method is capable of capturing realistic representations of local heterogeneities, layering and natural fracture networks in a reservoir. A detailed description of the numerical implementation is provided, along with numerical studies comparing the model with both analytical solutions and experimental results. The results demonstrate the effectiveness of the proposed method for modeling large-scale problems involving hydraulically driven fractures in three dimensions.
Integrating Cache Performance Modeling and Tuning Support in Parallelization Tools

NASA Technical Reports Server (NTRS)

Waheed, Abdul; Yan, Jerry; Saini, Subhash (Technical Monitor)

1998-01-01

With the resurgence of distributed shared memory (DSM) systems based on cache-coherent Non Uniform Memory Access (ccNUMA) architectures and the increasing disparity between memory and processor speeds, data locality overheads are becoming the greatest bottlenecks in the way of realizing the potential high performance of these systems. While parallelization tools and compilers help users port their sequential applications to a DSM system, a lot of time and effort is needed to tune the memory performance of these applications to achieve reasonable speedup. In this paper, we show that integrating cache performance modeling and tuning support within a parallelization environment can alleviate this problem. The Cache Performance Modeling and Prediction Tool (CPMP) employs trace-driven simulation techniques without the overhead of generating and managing detailed address traces. CPMP predicts the cache performance impact of source code level "what-if" modifications in a program to assist a user in the tuning process. CPMP is built on top of a customized version of the Computer Aided Parallelization Tools (CAPTools) environment. Finally, we demonstrate how CPMP can be applied to tune a real Computational Fluid Dynamics (CFD) application.
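The kind of source-level "what-if" question mentioned above can be illustrated with a toy cache simulator in Python. This sketch is not CPMP's method: it walks an explicit address stream, which is exactly the overhead the abstract says CPMP avoids, and the cache geometry (1 KiB, direct-mapped, 32-byte lines) and function names are assumptions made for the example. It compares the miss counts of two loop orders over the same row-major array.

```python
from typing import Iterable, Iterator, List, Optional


def simulate_direct_mapped(addresses: Iterable[int],
                           cache_bytes: int = 1024,
                           line_bytes: int = 32) -> int:
    """Count misses of a direct-mapped cache over a stream of byte addresses."""
    n_lines = cache_bytes // line_bytes
    resident: List[Optional[int]] = [None] * n_lines   # memory block held by each line
    misses = 0
    for addr in addresses:
        block = addr // line_bytes      # which memory block the address falls in
        index = block % n_lines         # the only cache line that block may occupy
        if resident[index] != block:
            resident[index] = block     # miss: load the block, evicting the old one
            misses += 1
    return misses


def traversal_addresses(n_rows: int, n_cols: int, by_rows: bool,
                        elem_bytes: int = 8) -> Iterator[int]:
    """Addresses touched when sweeping a row-major array by rows or by columns."""
    if by_rows:
        for i in range(n_rows):
            for j in range(n_cols):
                yield (i * n_cols + j) * elem_bytes
    else:
        for j in range(n_cols):
            for i in range(n_rows):
                yield (i * n_cols + j) * elem_bytes


# "What if the loops were interchanged?" for a 64 x 64 array of 8-byte elements.
row_order_misses = simulate_direct_mapped(traversal_addresses(64, 64, by_rows=True))
col_order_misses = simulate_direct_mapped(traversal_addresses(64, 64, by_rows=False))
```

In this configuration the row-order sweep misses once per cache line, while the column-order sweep misses on every access, so the loop interchange changes the miss count by a factor of four.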
Xyce Parallel Electronic Simulator Users Guide Version 6.2.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Keiter, Eric R.; Mei, Ting; Russo, Thomas V.

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers. A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to develop new types of analysis without requiring the implementation of analysis-specific device models. Device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only). Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase -- a message passing parallel implementation -- which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. Trademarks The information herein is subject to change without notice. Copyright © 2002-2014 Sandia Corporation. All rights reserved. Xyce™ Electronic Simulator and Xyce™ are trademarks of Sandia Corporation. Portions of the Xyce™ code are: Copyright © 2002, The Regents of the University of California. Produced at the Lawrence Livermore National Laboratory. Written by Alan Hindmarsh, Allan Taylor, Radu Serban. UCRL-CODE-2002-59 All rights reserved. 
Orcad, Orcad Capture, PSpice and Probe are registered trademarks of Cadence Design Systems, Inc. Microsoft, Windows and Windows 7 are registered trademarks of Microsoft Corporation. Medici, DaVinci and Taurus are registered trademarks of Synopsys Corporation. Amtec and TecPlot are trademarks of Amtec Engineering, Inc. Xyce's expression library is based on that inside Spice 3F5 developed by the EECS Department at the University of California. The EKV3 MOSFET model was developed by the EKV Team of the Electronics Laboratory-TUC of the Technical University of Crete. All other trademarks are property of their respective owners. Contacts Bug Reports (Sandia only) http://joseki.sandia.gov/bugzilla http://charleston.sandia.gov/bugzilla World Wide Web http://xyce.sandia.gov http://charleston.sandia.gov/xyce (Sandia only) Email xyce@sandia.gov (outside Sandia) xyce-sandia@sandia.gov (Sandia only)
Xyce Parallel Electronic Simulator Users Guide Version 6.4

DOE Office of Scientific and Technical Information (OSTI.GOV)

Keiter, Eric R.; Mei, Ting; Russo, Thomas V.

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers. A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to develop new types of analysis without requiring the implementation of analysis-specific device models. Device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only). Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase -- a message passing parallel implementation -- which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. Trademarks The information herein is subject to change without notice. Copyright © 2002-2015 Sandia Corporation. All rights reserved. Xyce™ Electronic Simulator and Xyce™ are trademarks of Sandia Corporation. Portions of the Xyce™ code are: Copyright © 2002, The Regents of the University of California. Produced at the Lawrence Livermore National Laboratory. Written by Alan Hindmarsh, Allan Taylor, Radu Serban. UCRL-CODE-2002-59 All rights reserved. 
A software platform for the analysis of dermatology images

NASA Astrophysics Data System (ADS)

Vlassi, Maria; Mavraganis, Vlasios; Asvestas, Panteleimon

2017-11-01

The purpose of this paper is to present a software platform developed in the Python programming environment that can be used for the processing and analysis of dermatology images. The platform provides the capability for reading a file that contains a dermatology image. The platform supports image formats such as Windows bitmaps, JPEG, JPEG2000, portable network graphics and TIFF. Furthermore, it provides suitable tools for selecting, either manually or automatically, a region of interest (ROI) on the image. The automated selection of a ROI includes filtering for smoothing the image and thresholding. The proposed software platform has a friendly and clear graphical user interface and could be a useful second-opinion tool to a dermatologist. Furthermore, it could be used to classify images from other anatomical parts, such as breast or lung, after proper re-training of the classification algorithms.
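The automated ROI step described above (smoothing followed by thresholding) can be sketched in plain Python on a small grayscale grid. This is a minimal illustration under stated assumptions: the 3x3 mean filter, the fixed threshold, the "lesion is darker than skin" convention and the function names are all hypothetical choices, since the abstract does not specify which filter or thresholding rule the platform uses.

```python
from typing import List, Optional, Tuple

Image = List[List[float]]


def box_smooth(img: Image) -> Image:
    """3x3 mean filter; border pixels average only the neighbours that exist."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[yy][xx]
                    for yy in range(max(0, y - 1), min(h, y + 2))
                    for xx in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = sum(vals) / len(vals)
    return out


def threshold_roi(img: Image, threshold: float) -> Optional[Tuple[int, int, int, int]]:
    """Bounding box (top, left, bottom, right) of smoothed pixels darker than threshold."""
    smoothed = box_smooth(img)
    hits = [(y, x)
            for y, row in enumerate(smoothed)
            for x, value in enumerate(row)
            if value < threshold]
    if not hits:
        return None
    ys = [y for y, _ in hits]
    xs = [x for _, x in hits]
    return (min(ys), min(xs), max(ys), max(xs))


# Synthetic 8x8 image: bright background (200), a dark 4x4 patch (50),
# and one isolated dark pixel in the corner standing in for noise.
image = [[200.0] * 8 for _ in range(8)]
for y in range(2, 6):
    for x in range(3, 7):
        image[y][x] = 50.0
image[0][0] = 50.0

roi = threshold_roi(image, threshold=125.0)
```

The smoothing step is what keeps the isolated dark pixel out of the ROI: after averaging with its bright neighbours it no longer falls below the threshold, so the bounding box covers only the dark patch.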
Integrating Remote and Social Sensing Data for a Scenario on Secure Societies in Big Data Platform

NASA Astrophysics Data System (ADS)

Albani, Sergio; Lazzarini, Michele; Koubarakis, Manolis; Taniskidou, Efi Karra; Papadakis, George; Karkaletsis, Vangelis; Giannakopoulos, George

2016-08-01

In the framework of the Horizon 2020 project BigDataEurope (Integrating Big Data, Software & Communities for Addressing Europe's Societal Challenges), a pilot for the Secure Societies Societal Challenge was designed considering the requirements coming from relevant stakeholders. The pilot focuses on the integration in a Big Data platform of data coming from remote and social sensing. The information on land changes coming from the Copernicus Sentinel 1A sensor (Change Detection workflow) is integrated with information coming from selected Twitter and news agency accounts (Event Detection workflow) in order to provide the user with multiple sources of information. The Change Detection workflow implements a processing chain in a distributed parallel manner, exploiting the Big Data capabilities in place; the Event Detection workflow implements parallel and distributed monitoring of social media and news agencies as well as suitable mechanisms to detect and geo-annotate the related events.

Matrix Factorizations at Scale: a Comparison of Scientific Data Analytics in Spark and C+MPI Using Three Case Studies

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gittens, Alex; Devarakonda, Aditya; Racah, Evan

We explore the trade-offs of performing linear algebra using Apache Spark, compared to traditional C and MPI implementations on HPC platforms. Spark is designed for data analytics on cluster computing platforms with access to local disks and is optimized for data-parallel tasks. We examine three widely-used and important matrix factorizations: NMF (for physical plausibility), PCA (for its ubiquity) and CX (for data interpretability). We apply these methods to 1.6TB particle physics, 2.2TB and 16TB climate modeling and 1.1TB bioimaging data. The data matrices are tall-and-skinny, which enables the algorithms to map conveniently into Spark’s data parallel model. We perform scaling experiments on up to 1600 Cray XC40 nodes, describe the sources of slowdowns, and provide tuning guidance to obtain high performance.
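One way to see why tall-and-skinny matrices fit a data-parallel model: for an m x n matrix A with m much larger than n, the small n x n Gram matrix A^T A is a sum of independent contributions from row partitions, so each worker can process its own rows and only n x n partial results need to be combined. The sketch below uses plain Python `map` and `reduce` as a stand-in for a distributed runtime; it does not use the Spark API, and the names `partial_gram`, `add_matrices` and `gram` are hypothetical. Whether the authors compute PCA through the Gram matrix is not stated in the abstract; the eigenvectors of A^T A are the right singular vectors of A, which is one standard route.

```python
from functools import reduce


def partial_gram(rows):
    """Contribution A_p^T A_p of one row partition to the Gram matrix."""
    n = len(rows[0])
    g = [[0.0] * n for _ in range(n)]
    for row in rows:
        for i in range(n):
            for j in range(n):
                g[i][j] += row[i] * row[j]
    return g


def add_matrices(x, y):
    """Element-wise sum of two equally sized matrices (the reduce step)."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]


def gram(partitions):
    """Map each partition to its partial Gram matrix, then sum them."""
    return reduce(add_matrices, map(partial_gram, partitions))


# Hypothetical 3 x 2 matrix split into two row partitions.
partitions = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0]]]
g = gram(partitions)
```

The map step touches every row exactly once and the reduce step moves only n x n values per partition, which is why the communication cost stays small when n is small.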
Kinematics and design of a class of parallel manipulators

NASA Astrophysics Data System (ADS)

Hertz, Roger Barry

1998-12-01

This dissertation is concerned with the kinematic analysis and design of a class of three degree-of-freedom, spatial parallel manipulators. The class of manipulators is characterized by two platforms, between which are three legs, each possessing a succession of revolute, spherical, and revolute joints. The class is termed the "revolute-spherical-revolute" class of parallel manipulators. Two members of this class are examined. The first mechanism is a double-octahedral variable-geometry truss, and the second is termed a double tripod. The history of the mechanisms is explored: the variable-geometry truss dates back to 1984, while predecessors of the double tripod mechanism date back to 1869. This work centers on the displacement analysis of these three-degree-of-freedom mechanisms. Two types of problem are solved: the forward displacement analysis (forward kinematics) and the inverse displacement analysis (inverse kinematics). The kinematic model of the class of mechanism is general in nature. A classification scheme for the revolute-spherical-revolute class of mechanism is introduced, which uses dominant geometric features to group designs into 8 different sub-classes. The forward kinematics problem is discussed: given a set of independently controllable input variables, solve for the relative position and orientation between the two platforms. For the variable-geometry truss, the controllable input variables are assumed to be the linear (prismatic) joints. For the double tripod, the controllable input variables are the three revolute joints adjacent to the base (proximal) platform. Multiple solutions are presented to the forward kinematics problem, indicating that there are many different positions (assemblies) that the manipulator can assume with equivalent inputs.
For the double tripod these solutions can be expressed as a 16th degree polynomial in one unknown, while for the variable-geometry truss there exist two 16th degree polynomials, giving rise to 256 solutions. For special cases of the double tripod, the forward kinematics problem is shown to have a closed-form solution. Numerical examples are presented for the solution to the forward kinematics. A double tripod is presented that admits 16 unique and real forward kinematics solutions. Another example for a variable-geometry truss is given that possesses 64 real solutions: 8 for each 16th order polynomial. The inverse kinematics problem is also discussed: given the relative position of the hand (end-effector), which is rigidly attached to one platform, solve for the independently controlled joint variables. Iterative solutions are proposed for both the variable-geometry truss and the double tripod. For special cases of both mechanisms, closed-form solutions are given. The practical problems of designing, building, and controlling a double-tripod manipulator are addressed. The resulting manipulator is a first-of-its-kind prototype of a tapered (asymmetric) double-tripod manipulator. Real-time forward and inverse kinematics algorithms on an industrial robot controller are presented. The resulting performance of the prototype is impressive, since it was able to achieve a maximum tool-tip speed of 4064 mm/s, maximum acceleration of 5 g, and a cycle time of 1.2 seconds for a typical pick-and-place pattern.
ESTEST: An Open Science Platform for Electronic Structure Research

ERIC Educational Resources Information Center

Yuan, Gary

2012-01-01

Open science platforms in support of data generation, analysis, and dissemination are becoming indispensable tools for conducting research. These platforms use informatics and information technologies to address significant problems in open science data interoperability, verification & validation, comparison, analysis, post-processing,…
Scalable Algorithms for Clustering Large Geospatiotemporal Data Sets on Manycore Architectures

NASA Astrophysics Data System (ADS)

Mills, R. T.; Hoffman, F. M.; Kumar, J.; Sreepathi, S.; Sripathi, V.

2016-12-01

The increasing availability of high-resolution geospatiotemporal data sets from sources such as observatory networks, remote sensing platforms, and computational Earth system models has opened new possibilities for knowledge discovery using data sets fused from disparate sources. Traditional algorithms and computing platforms are impractical for the analysis and synthesis of data sets of this size; however, new algorithmic approaches that can effectively utilize the complex memory hierarchies and the extremely high levels of available parallelism in state-of-the-art high-performance computing platforms can enable such analysis. We describe a massively parallel implementation of accelerated k-means clustering and some optimizations to boost computational intensity and utilization of wide SIMD lanes on state-of-the-art multi- and manycore processors, including the second-generation Intel Xeon Phi ("Knights Landing") processor based on the Intel Many Integrated Core (MIC) architecture, which adds several new features such as on-package high-bandwidth memory. We also analyze the code in the context of a few practical applications to the analysis of climatic and remotely-sensed vegetation phenology data sets, and speculate on some of the new applications that such scalable analysis methods may enable.
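For readers unfamiliar with the clustering kernel mentioned above, the following is a minimal sketch of the plain Lloyd iteration for k-means. It is not the accelerated variant the abstract refers to, and it is serial; the function names `assign`, `update` and `kmeans` are hypothetical. The assignment step is the part that parallelizes naturally, since each point's nearest-centroid search is independent of every other point's.

```python
def assign(points, centroids):
    """Index of the nearest centroid (squared Euclidean distance) per point."""
    def d2(p, c):
        return sum((a - b) ** 2 for a, b in zip(p, c))
    return [min(range(len(centroids)), key=lambda k: d2(p, centroids[k]))
            for p in points]


def update(points, labels, centroids):
    """Mean of each cluster; an empty cluster keeps its previous centroid."""
    new = []
    for k, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == k]
        if members:
            new.append(tuple(sum(x) / len(members) for x in zip(*members)))
        else:
            new.append(old)
    return new


def kmeans(points, centroids, max_iter=100):
    """Alternate assignment and update until the labels stop changing."""
    labels = None
    for _ in range(max_iter):
        new_labels = assign(points, centroids)
        if new_labels == labels:
            break
        labels = new_labels
        centroids = update(points, labels, centroids)
    return labels, centroids


# Hypothetical toy data: two well-separated groups in the plane.
pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
labels, centers = kmeans(pts, [(0.0, 0.0), (10.0, 10.0)])
```

Accelerated variants typically avoid most of the distance evaluations in `assign` by keeping distance bounds between iterations; that bookkeeping is omitted here.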
Optimizing Irregular Applications for Energy and Performance on the Tilera Many-core Architecture

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chavarría-Miranda, Daniel; Panyala, Ajay R.; Halappanavar, Mahantesh

Optimizing applications simultaneously for energy and performance is a complex problem. High performance, parallel, irregular applications are notoriously hard to optimize due to their data-dependent memory accesses, lack of structured locality and complex data structures and code patterns. Irregular kernels are growing in importance in applications such as machine learning, graph analytics and combinatorial scientific computing. Performance- and energy-efficient implementation of these kernels on modern, energy efficient, multicore and many-core platforms is therefore an important and challenging problem. We present results from optimizing two irregular applications, the Louvain method for community detection (Grappolo) and high-performance conjugate gradient (HPCCG), on the Tilera many-core system. We have significantly extended MIT's OpenTuner auto-tuning framework to conduct a detailed study of platform-independent and platform-specific optimizations to improve performance as well as reduce total energy consumption. We explore the optimization design space along three dimensions: memory layout schemes, compiler-based code transformations, and optimization of parallel loop schedules. Using auto-tuning, we demonstrate whole node energy savings of up to 41% relative to a baseline instantiation, and up to 31% relative to manually optimized variants.
Two degrees of freedom parallel linkage to track solar thermal platforms installed on ships

NASA Astrophysics Data System (ADS)

Visa, I.; Cotorcea, A.; Moldovan, M.; Neagoe, M.

2016-08-01

Transportation is responsible for one third of total energy consumption at the global level. Solutions to reduce conventional fuel consumption are under research, to improve the systems’ efficiency and to replace the current fossil fuels. There are already several applications, usually on small maritime vehicles, using photovoltaic systems to cover the electric energy demand on board and to support the owners’ commitment towards sustainability. In most cases, these systems are fixed and aligned parallel to the deck; thus, the amount of solar energy received is heavily reduced (down to 50%) as compared to the available irradiance. Large-scale, feasible applications require maximizing the energy output of the solar convertors implemented on ships; using solar tracking systems is an obvious path, allowing a gain of up to 35 to 40% in the output energy, as compared to fixed systems. Spatial limitations, continuous movement of the ship and harsh navigation conditions are the main barriers to implementation. This paper proposes a solar tracking system with two degrees of freedom, for a solar thermal platform, based on a parallel linkage with spherical joints, considered as a multibody system. The analytical model for mobile platform position, pressure angles and a numerical example are given in the paper.
An embedded multi-core parallel model for real-time stereo imaging

NASA Astrophysics Data System (ADS)

He, Wenjing; Hu, Jian; Niu, Jingyu; Li, Chuanrong; Liu, Guangyu

2018-04-01

Real-time processing on embedded systems will enhance the application capability of stereo imaging for LiDAR and hyperspectral sensors. Work on task partitioning and scheduling strategies for embedded multiprocessor systems started relatively late compared with that for PCs. In this paper, a parallel model for stereo imaging, aimed at an embedded multi-core processing platform, is studied and verified. After analyzing the computing amount, throughput capacity and buffering requirements, a two-stage pipeline parallel model based on message transmission is established. This model can be applied to fast stereo imaging for airborne sensors with various characteristics. To demonstrate the feasibility and effectiveness of the parallel model, parallel software was designed using test flight data, based on the 8-core DSP processor TMS320C6678. The results indicate that the design performed well in workload distribution and had a speed-up ratio of up to 6.4.
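The structure of a two-stage, message-based pipeline can be sketched as follows. This is a generic illustration, not the authors' DSP implementation: Python threads and bounded queues stand in for cores and message buffers, and the names `stage`, `run_pipeline`, `parallel_efficiency` and the two stage functions passed in are hypothetical. The last helper shows the arithmetic behind the reported figure: a speed-up of 6.4 on 8 cores corresponds to a parallel efficiency of 6.4 / 8 = 0.8.

```python
import queue
import threading

SENTINEL = None  # marks the end of the message stream


def stage(worker, inbox, outbox):
    """Apply `worker` to every message from `inbox` and forward the result."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            outbox.put(SENTINEL)  # propagate shutdown downstream
            return
        outbox.put(worker(item))


def run_pipeline(items, first, second):
    """Run items through two stages connected by bounded message queues."""
    q1, q2, q3 = queue.Queue(maxsize=4), queue.Queue(maxsize=4), queue.Queue()
    threads = [threading.Thread(target=stage, args=(first, q1, q2)),
               threading.Thread(target=stage, args=(second, q2, q3))]
    for t in threads:
        t.start()
    for item in items:
        q1.put(item)          # blocks when the first buffer is full
    q1.put(SENTINEL)
    for t in threads:
        t.join()
    results = []
    while True:
        out = q3.get()
        if out is SENTINEL:
            return results
        results.append(out)


def parallel_efficiency(speedup, cores):
    """Fraction of ideal linear speed-up actually achieved."""
    return speedup / cores


# Hypothetical stage functions standing in for the two processing stages.
outputs = run_pipeline([1, 2, 3], lambda x: x + 1, lambda x: x * 2)
efficiency = parallel_efficiency(6.4, 8)
```

The bounded queues model the buffering requirement the abstract mentions: a fast producer is throttled when the downstream stage falls behind, so memory use stays fixed.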
The method of parallel-hierarchical transformation for rapid recognition of dynamic images using GPGPU technology

NASA Astrophysics Data System (ADS)

Timchenko, Leonid; Yarovyi, Andrii; Kokriatskaya, Nataliya; Nakonechna, Svitlana; Abramenko, Ludmila; Ławicki, Tomasz; Popiel, Piotr; Yesmakhanova, Laura

2016-09-01

The paper presents a method of parallel-hierarchical transformations for rapid recognition of dynamic images using GPU technology. Direct parallel-hierarchical transformations are based on a cluster CPU- and GPU-oriented hardware platform. Mathematical models of training of the parallel-hierarchical (PH) network for the transformation are developed, as well as a training method of the PH network for recognition of dynamic images. This research is most relevant to problems of organizing high-performance computations on very large arrays of information, designed to implement multi-stage sensing and processing as well as compaction and recognition of data in informational structures and computer devices. The method's advantages include high performance through the use of recent advances in parallelization, the ability to work with images of very large dimension, ease of scaling when the number of nodes in the cluster changes, and automatic scanning of the local network to detect compute nodes.
Implementation of Parallel Dynamic Simulation on Shared-Memory vs. Distributed-Memory Environments

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jin, Shuangshuang; Chen, Yousu; Wu, Di

2015-12-09

Power system dynamic simulation computes the system response to a sequence of large disturbances, such as sudden changes in generation or load, or a network short circuit followed by protective branch switching operation. It consists of a large set of differential and algebraic equations, which is computationally intensive and challenging to solve using a single-processor-based dynamic simulation solution. High-performance computing (HPC) based parallel computing is a very promising technology to speed up the computation and facilitate the simulation process. This paper presents two different parallel implementations of power grid dynamic simulation using Open Multi-processing (OpenMP) on a shared-memory platform, and Message Passing Interface (MPI) on distributed-memory clusters, respectively. The differences between the parallel simulation algorithms and architectures of the two HPC technologies are illustrated, and their performance for running parallel dynamic simulation is compared and demonstrated.
Generalized parallel-perspective stereo mosaics from airborne video.

PubMed

Zhu, Zhigang; Hanson, Allen R; Riseman, Edward M

2004-02-01

In this paper, we present a new method for automatically and efficiently generating stereoscopic mosaics by seamless registration of images collected by a video camera mounted on an airborne platform. Using a parallel-perspective representation, a pair of geometrically registered stereo mosaics can be precisely constructed under quite general motion. A novel parallel ray interpolation for stereo mosaicing (PRISM) approach is proposed to make stereo mosaics seamless in the presence of obvious motion parallax and for rather arbitrary scenes. Parallel-perspective stereo mosaics generated with the PRISM method have better depth resolution than perspective stereo due to the adaptive baseline geometry. Moreover, unlike previous results showing that parallel-perspective stereo has a constant depth error, we conclude that the depth estimation error of stereo mosaics is in fact a linear function of the absolute depths of a scene. Experimental results on long video sequences are given.
Xyce Parallel Electronic Simulator - Users' Guide Version 2.1.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hutchinson, Scott A; Hoekstra, Robert J.; Russo, Thomas V.

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: 1) Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). Note that this includes support for most popular parallel and serial computers. 2) Improved performance for all numerical kernels (e.g., time integrator, nonlinear and linear solvers) through state-of-the-art algorithms and novel techniques. 3) Device models which are specifically tailored to meet Sandia's needs, including many radiation-aware devices. 4) Object-oriented code design and implementation using modern coding practices that ensure that the Xyce Parallel Electronic Simulator will be maintainable and extensible far into the future. Xyce is a parallel code in the most general sense of the phrase - a message passing parallel implementation - which allows it to run efficiently on the widest possible number of computing platforms. These include serial, shared-memory and distributed-memory parallel as well as heterogeneous platforms. Careful attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. The development of Xyce provides a platform for computational research and development aimed specifically at the needs of the Laboratory. With Xyce, Sandia has an "in-house" capability with which both new electrical (e.g., device model development) and algorithmic (e.g., faster time-integration methods, parallel solver algorithms) research and development can be performed.
As a result, Xyce is a unique electrical simulation capability, designed to meet the unique needs of the laboratory. Acknowledgements: The authors would like to acknowledge the entire Sandia National Laboratories HPEMS (High Performance Electrical Modeling and Simulation) team, including Steve Wix, Carolyn Bogdan, Regina Schells, Ken Marx, Steve Brandon and Bill Ballard, for their support on this project. We also appreciate very much the work of Jim Emery, Becky Arnold and Mike Williamson for the help in reviewing this document. Lastly, a very special thanks to Hue Lai for typesetting this document with LaTeX. Trademarks: The information herein is subject to change without notice. Copyright © 2002-2003 Sandia Corporation. All rights reserved. Xyce™ Electronic Simulator and Xyce™ are trademarks of Sandia Corporation. Orcad, Orcad Capture, PSpice and Probe are registered trademarks of Cadence Design Systems, Inc. Silicon Graphics, the Silicon Graphics logo and IRIX are registered trademarks of Silicon Graphics, Inc. Microsoft, Windows and Windows 2000 are registered trademarks of Microsoft Corporation. Solaris and UltraSPARC are registered trademarks of Sun Microsystems Corporation. Medici, DaVinci and Taurus are registered trademarks of Synopsys Corporation. HP and Alpha are registered trademarks of Hewlett-Packard Company. Amtec and TecPlot are trademarks of Amtec Engineering, Inc. Xyce's expression library is based on that inside Spice 3F5 developed by the EECS Department at the University of California. All other trademarks are property of their respective owners. Contacts: Bug reports: http://tvrusso.sandia.gov/bugzilla. Email: xyce-support@sandia.gov. World Wide Web: http://www.cs.sandia.gov/xyce
Task Parallel Incomplete Cholesky Factorization using 2D Partitioned-Block Layout

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kim, Kyungjoo; Rajamanickam, Sivasankaran; Stelle, George Widgery

We introduce a task-parallel algorithm for sparse incomplete Cholesky factorization that utilizes a 2D sparse partitioned-block layout of a matrix. Our factorization algorithm follows the idea of algorithms-by-blocks by using the block layout. The algorithm-by-blocks approach induces a task graph for the factorization. These tasks are related to each other through their data dependences in the factorization algorithm. To process the tasks on various manycore architectures in a portable manner, we also present a portable tasking API that incorporates different tasking backends and device-specific features using an open-source framework for manycore platforms, namely Kokkos. A performance evaluation is presented on both Intel Sandybridge and Xeon Phi platforms for matrices from the University of Florida sparse matrix collection to illustrate the merits of the proposed task-based factorization. Experimental results demonstrate that our task-parallel implementation delivers about 26.6x speedup (geometric mean) over single-threaded incomplete Cholesky-by-blocks and 19.2x speedup over serial Cholesky performance, which does not carry tasking overhead, using 56 threads on the Intel Xeon Phi processor for sparse matrices arising from various application problems.
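As background for the factorization named above, here is a minimal sketch of incomplete Cholesky with zero fill-in, IC(0), the simplest member of the family. It is scalar, serial, and uses dense storage for readability; it is not the authors' blocked, task-parallel algorithm, and the abstract does not say which dropping rule they use. The function name `ic0` is hypothetical, and the sketch does not guard against the breakdown (a non-positive pivot) that incomplete factorizations can hit on some matrices.

```python
import math


def ic0(a):
    """IC(0) factor L of a symmetric positive definite matrix `a`.

    L is lower triangular and keeps the sparsity pattern of the lower
    triangle of `a`: entries that are zero in `a` stay zero in L, so any
    fill-in that exact Cholesky would create is dropped.
    """
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        diag = a[j][j] - sum(l[j][k] ** 2 for k in range(j))
        l[j][j] = math.sqrt(diag)
        for i in range(j + 1, n):
            if a[i][j] != 0:  # only positions present in A are computed
                s = sum(l[i][k] * l[j][k] for k in range(j))
                l[i][j] = (a[i][j] - s) / l[j][j]
    return l


# Tridiagonal case: exact Cholesky creates no fill, so IC(0) is exact.
tri = ic0([[4.0, 2.0, 0.0], [2.0, 5.0, 2.0], [0.0, 2.0, 5.0]])

# Arrow-shaped case: exact Cholesky would fill position (2, 1); IC(0) drops it.
arrow = ic0([[4.0, 2.0, 2.0], [2.0, 5.0, 0.0], [2.0, 0.0, 5.0]])
```

In a by-blocks formulation the same three kinds of update (factor a diagonal block, solve against it, update trailing blocks) become tasks, and the data dependences between them form the task graph the abstract describes.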
Graphics processing unit (GPU)-based computation of heat conduction in thermally anisotropic solids

NASA Astrophysics Data System (ADS)

Nahas, C. A.; Balasubramaniam, Krishnan; Rajagopal, Prabhu

2013-01-01

Numerical modeling of anisotropic media is a computationally intensive task, since the physical properties differ with direction and this adds complexity to the field problem. Composite materials, widely used in the aerospace industry because of their light weight, are a very good example of thermally anisotropic media. With advancements in video gaming technology, parallel processors are much cheaper today, and access to higher-end graphics processing devices has increased dramatically over the past couple of years. Since these massively parallel GPUs handle floating point arithmetic very well, they provide a new platform for engineers and scientists to accelerate their numerical models using commodity hardware. In this paper we implement a parallel finite difference model of thermal diffusion through anisotropic media using NVIDIA CUDA (Compute Unified Device Architecture). We use the NVIDIA GeForce GTX 560 Ti as our primary computing device, which consists of 384 CUDA cores clocked at 1645 MHz, with a standard desktop PC as the host platform. We compare the results with those of a standard CPU implementation in terms of accuracy and speed, and draw implications for simulation using the GPU paradigm.
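A minimal sketch of the kind of finite difference update involved is given below, under simplifying assumptions that the abstract does not confirm: two dimensions, principal conduction axes aligned with the grid (so the model is dT/dt = kx * T_xx + ky * T_yy with no cross-derivative term), an explicit forward-time centred-space scheme, and fixed boundary values. It is written serially in Python rather than CUDA; the names `step` and `stable_dt` are hypothetical. Each interior point is updated only from the previous time level, which is why one GPU thread per grid point is a natural mapping.

```python
def step(t, kx, ky, dx, dy, dt):
    """One explicit step of dT/dt = kx*T_xx + ky*T_yy; boundaries held fixed."""
    ny, nx = len(t), len(t[0])
    new = [row[:] for row in t]
    for j in range(1, ny - 1):
        for i in range(1, nx - 1):
            t_xx = (t[j][i + 1] - 2 * t[j][i] + t[j][i - 1]) / dx ** 2
            t_yy = (t[j + 1][i] - 2 * t[j][i] + t[j - 1][i]) / dy ** 2
            new[j][i] = t[j][i] + dt * (kx * t_xx + ky * t_yy)
    return new


def stable_dt(kx, ky, dx, dy):
    """Largest time step allowed by the explicit stability limit."""
    return 0.5 / (kx / dx ** 2 + ky / dy ** 2)


# Hypothetical 5x5 plate with a hot centre; diffusivity is twice as large in x.
grid = [[0.0] * 5 for _ in range(5)]
grid[2][2] = 1.0
after = step(grid, kx=2.0, ky=1.0, dx=1.0, dy=1.0, dt=0.1)
```

After one step the neighbours along x have received twice as much heat as the neighbours along y, which is the directional behaviour that distinguishes the anisotropic case from the isotropic one.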
Microfluidic Pneumatic Logic Circuits and Digital Pneumatic Microprocessors for Integrated Microfluidic Systems

PubMed Central

Rhee, Minsoung

2010-01-01

We have developed pneumatic logic circuits and microprocessors built with microfluidic channels and valves in polydimethylsiloxane (PDMS). The pneumatic logic circuits perform various combinational and sequential logic calculations with binary pneumatic signals (atmosphere and vacuum), producing cascadable outputs based on Boolean operations. A complex microprocessor is constructed from combinations of various logic circuits and receives pneumatically encoded serial commands at a single input line. The device then decodes the temporal command sequence by spatial parallelization, computes necessary logic calculations between parallelized command bits, stores command information for signal transportation and maintenance, and finally executes the command for the target devices. Thus, such pneumatic microprocessors will function as a universal on-chip control platform to perform complex parallel operations for large-scale integrated microfluidic devices. To demonstrate the working principles, we have built 2-bit, 3-bit, 4-bit, and 8-bit microprocessors to control various target devices for applications such as four-color dye mixing and multiplexed channel fluidic control. By significantly reducing the need for external controllers, the digital pneumatic microprocessor can be used as a universal on-chip platform to autonomously manipulate microfluids in a high throughput manner. PMID:19823730
Learning in Parallel: Using Parallel Corpora to Enhance Written Language Acquisition at the Beginning Level

ERIC Educational Resources Information Center

Bluemel, Brody

2014-01-01

This article illustrates the pedagogical value of incorporating parallel corpora in foreign language education. It explores the development of a Chinese/English parallel corpus designed specifically for pedagogical application. The corpus tool was created to aid language learners in reading comprehension and writing development by making foreign…
Using a blog as an integrated eLearning tool and platform.

PubMed

Goh, Poh Sun

2016-06-01

Technology-enhanced learning, or eLearning, allows educators to expand access to educational content, promotes engagement with students and makes it easier for students to access educational material at a time, place and pace which suits them. The challenge for educators beginning their eLearning journey is to decide where to start, which includes the choice of an eLearning tool and platform. This article will share one educator's decision making process, and experience using blogs as a flexible and versatile integrated eLearning tool and platform. Apart from being a cost-effective or free tool and platform, blogs offer the possibility of creating a hyperlinked indexed content repository, for both created and curated educational material, as well as a distribution and engagement tool and platform. Incorporating pedagogically sound activities and educational practices into a blog promotes a structured, templated teaching process, which can be reproduced. Moving from undergraduate to postgraduate training, educational blogs supported by a comprehensive online case-based repository offer the possibility of training beyond competency towards proficiency and expert level performance through a process of deliberate practice. By documenting educational content and the student engagement and learning process, as well as feedback and personal reflection of educational sessions, blogs can also form the basis for a teaching portfolio, and provide evidence and data of scholarly teaching and educational scholarship. Looking into the future, having a collection of readily accessible indexed hyperlinked teaching material offers the potential to do on the spot teaching with illustrative material called up onto smart surfaces, and displayed on holographic interfaces.
Digital imaging of root traits (DIRT): a high-throughput computing and collaboration platform for field-based root phenomics.

PubMed

Das, Abhiram; Schneider, Hannah; Burridge, James; Ascanio, Ana Karine Martinez; Wojciechowski, Tobias; Topp, Christopher N; Lynch, Jonathan P; Weitz, Joshua S; Bucksch, Alexander

2015-01-01

Plant root systems are key drivers of plant function and yield. They are also under-explored targets to meet global food and energy demands. Many new technologies have been developed to characterize crop root system architecture (CRSA). These technologies have the potential to accelerate the progress in understanding the genetic control and environmental response of CRSA. Putting this potential into practice requires new methods and algorithms to analyze CRSA in digital images. Most prior approaches have solely focused on the estimation of root traits from images, yet no integrated platform exists that allows easy and intuitive access to trait extraction and analysis methods from images combined with storage solutions linked to metadata. Automated high-throughput phenotyping methods are increasingly used in laboratory-based efforts to link plant genotype with phenotype, whereas similar field-based studies remain predominantly manual and low-throughput. Here, we present an open-source phenomics platform, "DIRT", as a means to integrate scalable supercomputing architectures into field experiments and analysis pipelines. DIRT is an online platform that enables researchers to store images of plant roots, measure dicot and monocot root traits under field conditions, and share data and results within collaborative teams and the broader community. The DIRT platform seamlessly connects end-users with large-scale compute "commons" enabling the estimation and analysis of root phenotypes from field experiments of unprecedented size. DIRT is an automated high-throughput computing and collaboration platform for field-based crop root phenomics. The platform is accessible at http://www.dirt.iplantcollaborative.org/ and hosted on the iPlant cyber-infrastructure using high-throughput grid computing resources of the Texas Advanced Computing Center (TACC).
DIRT is a high volume central depository and high-throughput RSA trait computation platform for plant scientists working on crop roots. It enables scientists to store, manage and share crop root images with metadata and compute RSA traits from thousands of images in parallel. It makes high-throughput RSA trait computation available to the community with just a few button clicks. As such it enables plant scientists to spend more time on science rather than on technology. All stored and computed data is easily accessible to the public and broader scientific community. We hope that easy data accessibility will attract new tool developers and spur creative data usage that may even be applied to other fields of science.
Assessing students' sentiments towards the use of a Building Information Modelling (BIM) learning platform in a construction project management course

NASA Astrophysics Data System (ADS)

Suwal, Sunil; Singh, Vishal

2018-07-01

Building Information Modelling (BIM) tools and processes are increasingly adopted and implemented in the construction industry. Consequently, BIM education is considered increasingly important in Architecture, Engineering and Construction (AEC) education. While most of the research and literature on BIM education in engineering studies has focused on BIM implementation strategies, processes, benefits, and challenges, there has been limited study of students' perceptions of the implementation of BIM courses, of online BIM learning platforms, or of the BIM tools themselves. This paper takes the first steps towards addressing this gap. The study analyses the perceptions and sentiments of 57 students towards the use of an online BIM learning platform and explores the potential implications of the findings for BIM education. The findings suggest that students rate online BIM learning platforms highly as a positive learning experience, indicating the need for greater integration of such tools and approaches in AEC courses.
Optical RISC computer

NASA Astrophysics Data System (ADS)

Guilfoyle, Peter S.; Stone, Richard V.; Hessenbruch, John M.; Zeise, Frederick F.

1993-07-01

A second generation digital optical computer (DOC II) has been developed which utilizes a RISC based operating system as its host. This 32 bit, high performance (12.8 GByte/sec), computing platform demonstrates a number of basic principles that are inherent to parallel free space optical interconnects such as speed (up to 10^12 bit operations per second) and low power (1.2 fJ per bit). Although DOC II is a general purpose machine, special purpose applications have been developed and are currently being evaluated on the optical platform.
FLAME: A platform for high performance computing of complex systems, applied for three case studies

DOE PAGES

Kiran, Mariam; Bicak, Mesude; Maleki-Dizaji, Saeedeh; ...

2011-01-01

FLAME allows complex models to be automatically parallelised on High Performance Computing (HPC) grids, enabling large numbers of agents to be simulated over short periods of time. Modellers are hindered by the complexities of porting models to parallel platforms and by the time taken to run large simulations on a single machine, both of which FLAME overcomes. Three case studies from different disciplines were modelled using FLAME, and are presented along with their performance results on a grid.

An Energy-Efficient and Scalable Deep Learning/Inference Processor With Tetra-Parallel MIMD Architecture for Big Data Applications.

PubMed

Park, Seong-Wook; Park, Junyoung; Bong, Kyeongryeol; Shin, Dongjoo; Lee, Jinmook; Choi, Sungpill; Yoo, Hoi-Jun

2015-12-01

Deep learning algorithms are widely used for various pattern recognition applications such as text recognition, object recognition and action recognition because of their best-in-class recognition accuracy compared to hand-crafted and shallow-learning-based algorithms. The long learning time caused by their complex structure, however, has so far limited their use to high-cost servers or many-core GPU platforms. On the other hand, the demand for customized pattern recognition within personal devices will grow gradually as more deep learning applications are developed. This paper presents an SoC implementation that enables deep learning applications to run on low-cost platforms such as mobile or portable devices. Unlike conventional works, which have adopted massively parallel architectures, this work adopts a task-flexible architecture and exploits multiple forms of parallelism to cover the complex functions of the convolutional deep belief network, which is one of the popular deep learning/inference algorithms. In this paper, we implement the most energy-efficient deep learning and inference processor for wearable systems. The implemented 2.5 mm × 4.0 mm deep learning/inference processor is fabricated using 65 nm 8-metal CMOS technology for a battery-powered platform with real-time deep inference and deep learning operation. It consumes 185 mW average power, and 213.1 mW peak power at 200 MHz operating frequency and 1.2 V supply voltage. It achieves 411.3 GOPS peak performance and 1.93 TOPS/W energy efficiency, which is 2.07× higher than the state-of-the-art.
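The reported efficiency figure can be cross-checked from the other numbers in the abstract. The short calculation below assumes, and this is only an inference from the arithmetic, that the 1.93 TOPS/W value is peak performance divided by peak power; dividing by the 185 mW average power instead would give a larger number. The function name `tops_per_watt` is hypothetical.

```python
def tops_per_watt(gops, milliwatts):
    """Energy efficiency in TOPS/W from throughput (GOPS) and power (mW)."""
    ops_per_second = gops * 1e9
    watts = milliwatts * 1e-3
    return ops_per_second / watts / 1e12


# Figures quoted in the abstract: 411.3 GOPS peak, 213.1 mW peak power.
at_peak_power = tops_per_watt(411.3, 213.1)

# For comparison only: the same throughput over the 185 mW average power.
at_average_power = tops_per_watt(411.3, 185.0)
```

Rounded to two decimals, the peak-power ratio reproduces the abstract's 1.93 TOPS/W, which supports the stated assumption about how the figure was derived.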
Community Crowd-Funded Solar Finance

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jagerson, Gordon "Ty"

The award supported the demonstration and development of the Village Power Platform, which enables community organizations to more readily develop, finance and operate solar installations on local community organizations. The platform enables partial or complete local ownership of the solar installation. The award specifically supported key features including financial modeling tools, community communications tools, crowdfunding mechanisms, a mobile app, and other critical features.
WebArray: an online platform for microarray data analysis

PubMed Central

Xia, Xiaoqin; McClelland, Michael; Wang, Yipeng

2005-01-01

Background Many cutting-edge microarray analysis tools and algorithms, including commonly used limma and affy packages in Bioconductor, need sophisticated knowledge of mathematics, statistics and computer skills for implementation. Commercially available software can provide a user-friendly interface at considerable cost. To facilitate the use of these tools for microarray data analysis on an open platform we developed an online microarray data analysis platform, WebArray, for bench biologists to utilize these tools to explore data from single/dual color microarray experiments. Results The currently implemented functions were based on limma and affy package from Bioconductor, the spacings LOESS histogram (SPLOSH) method, PCA-assisted normalization method and genome mapping method. WebArray incorporates these packages and provides a user-friendly interface for accessing a wide range of key functions of limma and others, such as spot quality weight, background correction, graphical plotting, normalization, linear modeling, empirical bayes statistical analysis, false discovery rate (FDR) estimation, chromosomal mapping for genome comparison. Conclusion WebArray offers a convenient platform for bench biologists to access several cutting-edge microarray data analysis tools. The website is freely available at . It runs on a Linux server with Apache and MySQL. PMID:16371165
GC31G-1182: Opennex, a Private-Public Partnership in Support of the National Climate Assessment

NASA Technical Reports Server (NTRS)

Nemani, Ramakrishna R.; Wang, Weile; Michaelis, Andrew; Votava, Petr; Ganguly, Sangram

2016-01-01

The NASA Earth Exchange (NEX) is a collaborative computing platform that has been developed with the objective of bringing scientists together with the software tools, massive global datasets, and supercomputing resources necessary to accelerate research in Earth systems science and global change. NEX is funded as an enabling tool for sustaining the national climate assessment. Over the past five years, researchers have used the NEX platform and produced a number of data sets highly relevant to the National Climate Assessment. These include high-resolution climate projections using different downscaling techniques and trends in historical climate from satellite data. To enable a broader community in exploiting the above datasets, the NEX team partnered with public cloud providers to create the OpenNEX platform. OpenNEX provides ready access to NEX data holdings on a number of public cloud platforms along with pertinent analysis tools and workflows in the form of Machine Images and Docker Containers, lectures and tutorials by experts. We will showcase some of the applications of OpenNEX data and tools by the community on Amazon Web Services, Google Cloud and the NEX Sandbox.
Cross-platform validation and analysis environment for particle physics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chekanov, S. V.; Pogrebnyak, I.; Wilbern, D.

A multi-platform validation and analysis framework for public Monte Carlo simulation for high-energy particle collisions is discussed. The front-end of this framework uses the Python programming language, while the back-end is written in Java, which provides a multi-platform environment that can be run from a web browser and can easily be deployed at the grid sites. The analysis package includes all major software tools used in high-energy physics, such as Lorentz vectors, jet algorithms, histogram packages, graphic canvases, and tools for providing data access. This multi-platform software suite, designed to minimize OS-specific maintenance and deployment time, is used for online validation of Monte Carlo event samples through a web interface.
SBML-PET-MPI: a parallel parameter estimation tool for Systems Biology Markup Language based models.

PubMed

Zi, Zhike

2011-04-01

Parameter estimation is crucial for the modeling and dynamic analysis of biological systems. However, implementing parameter estimation is time consuming and computationally demanding. Here, we introduced a parallel parameter estimation tool for Systems Biology Markup Language (SBML)-based models (SBML-PET-MPI). SBML-PET-MPI allows the user to perform parameter estimation and parameter uncertainty analysis by collectively fitting multiple experimental datasets. The tool is developed and parallelized using the message passing interface (MPI) protocol, which provides good scalability with the number of processors. SBML-PET-MPI is freely available for non-commercial use at http://www.bioss.uni-freiburg.de/cms/sbml-pet-mpi.html or http://sites.google.com/site/sbmlpetmpi/.
DCL System Using Deep Learning Approaches for Land-based or Ship-based Real-Time Recognition and Localization of Marine Mammals

DTIC Science & Technology

2012-09-30

platform (HPC) was developed, called the HPC-Acoustic Data Accelerator, or HPC-ADA for short. The HPC-ADA was designed based on fielded systems [1-4...software (Detection cLassification for MAchine learning - High Performance Computing). The software package was designed to utilize parallel and...Sedna [7] and is designed using a parallel architecture, allowing existing algorithms to distribute to the various processing nodes with minimal changes
Micro/Nanoscale Parallel Patterning of Functional Biomolecules, Organic Fluorophores and Colloidal Nanocrystals

PubMed Central

2009-01-01

We describe the design and optimization of a reliable strategy that combines self-assembly and lithographic techniques, leading to very precise micro-/nanopositioning of biomolecules for the realization of micro- and nanoarrays of functional DNA and antibodies. Moreover, based on the covalent immobilization of stable and versatile SAMs of programmable chemical reactivity, this approach constitutes a general platform for the parallel site-specific deposition of a wide range of molecules such as organic fluorophores and water-soluble colloidal nanocrystals. PMID:20596482
Access to CAMAC from VxWorks and UNIX in DART

NASA Astrophysics Data System (ADS)

Streets, J.; Meadows, J.; Moore, C.; Pordes, R.; Slimmer, D.; Vittone, M.; Stern, E.

1996-02-01

As part of the DART Project [Data acquisition for the next Generation Fermilab Fixed Target Experiments] we have developed a package of software for CAMAC access from UNIX and VxWorks platforms, with support for several hardware interfaces. We report on developments for the CES CBD8210 VME to parallel CAMAC, the Hytec VSD2992 VME to serial CAMAC and Jorway 411s SCSI to parallel and serial CAMAC branch drivers, and give a summary of the timings obtained.
High-throughput process development: I. Process chromatography.

PubMed

Rathore, Anurag S; Bhambure, Rahul

2014-01-01

Chromatographic separation serves as "a workhorse" for downstream process development and plays a key role in removal of product-related, host cell-related, and process-related impurities. Complex and poorly characterized raw materials and feed material, low feed concentration, product instability, and poor mechanistic understanding of the processes are some of the critical challenges that are faced during development of a chromatographic step. Traditional process development is performed as trial-and-error-based evaluation and often leads to a suboptimal process. A high-throughput process development (HTPD) platform involves an integration of miniaturization, automation, and parallelization and provides a systematic approach for time- and resource-efficient chromatography process development. Creation of such platforms requires integration of mechanistic knowledge of the process with various statistical tools for data analysis. The relevance of such a platform is high in view of the constraints with respect to time and resources that the biopharma industry faces today. This protocol describes the steps involved in performing HTPD of a process chromatography step. It describes operation of a commercially available device (PreDictor™ plates from GE Healthcare). This device is available in 96-well format with 2 or 6 μL well size. We also discuss the challenges that one faces when performing such experiments as well as possible solutions to alleviate them. Besides describing the operation of the device, the protocol also presents an approach for statistical analysis of the data that is gathered from such a platform. A case study involving use of the protocol for examining ion-exchange chromatography of granulocyte colony-stimulating factor (GCSF), a therapeutic product, is briefly discussed. This is intended to demonstrate the usefulness of this protocol in generating data that is representative of the data obtained at the traditional lab scale.
The agreement in the data is indeed very significant (regression coefficient 0.93). We think that this protocol will be of significant value to those involved in performing high-throughput process development of process chromatography.
30 CFR 56.7051 - Loose objects on the mast or drill platform.

Code of Federal Regulations, 2010 CFR

2010-07-01

... 30 Mineral Resources 1 2010-07-01 2010-07-01 false Loose objects on the mast or drill platform. 56... Drilling and Rotary Jet Piercing Drilling § 56.7051 Loose objects on the mast or drill platform. To prevent injury to personnel, tools and other objects shall not be left loose on the mast or drill platform. ...
30 CFR 56.7051 - Loose objects on the mast or drill platform.

Code of Federal Regulations, 2011 CFR

2011-07-01

... 30 Mineral Resources 1 2011-07-01 2011-07-01 false Loose objects on the mast or drill platform. 56... Drilling and Rotary Jet Piercing Drilling § 56.7051 Loose objects on the mast or drill platform. To prevent injury to personnel, tools and other objects shall not be left loose on the mast or drill platform. ...
30 CFR 56.7051 - Loose objects on the mast or drill platform.

Code of Federal Regulations, 2014 CFR

2014-07-01

... 30 Mineral Resources 1 2014-07-01 2014-07-01 false Loose objects on the mast or drill platform. 56... Drilling and Rotary Jet Piercing Drilling § 56.7051 Loose objects on the mast or drill platform. To prevent injury to personnel, tools and other objects shall not be left loose on the mast or drill platform. ...
Bio-jETI: a service integration, design, and provisioning platform for orchestrated bioinformatics processes.

PubMed

Margaria, Tiziana; Kubczak, Christian; Steffen, Bernhard

2008-04-25

With Bio-jETI, we introduce a service platform for interdisciplinary work on biological application domains and illustrate its use in a concrete application concerning statistical data processing in R and xcms for an LC/MS analysis of FAAH gene knockout. Bio-jETI uses the jABC environment for service-oriented modeling and design as a graphical process modeling tool and the jETI service integration technology for remote tool execution. As a service definition and provisioning platform, Bio-jETI has the potential to become a core technology in interdisciplinary service orchestration and technology transfer. Domain experts, like biologists not trained in computer science, directly define complex service orchestrations as process models and use efficient and complex bioinformatics tools in a simple and intuitive way.
Simulation platform of LEO satellite communication system based on OPNET

NASA Astrophysics Data System (ADS)

Zhang, Yu; Zhang, Yong; Li, Xiaozhuo; Wang, Chuqiao; Li, Haihao

2018-02-01

For the purpose of verifying communication protocol in the low earth orbit (LEO) satellite communication system, an Optimized Network Engineering Tool (OPNET) based simulation platform is built. Using the three-layer modeling mechanism, the network model, the node model and the process model of the satellite communication system are built respectively from top to bottom, and the protocol will be implemented by finite state machine and Proto-C language. According to satellite orbit parameters, orbit files are generated via Satellite Tool Kit (STK) and imported into OPNET, and the satellite nodes move along their orbits. The simulation platform adopts time-slot-driven mode, divides simulation time into continuous time slots, and allocates slot number for each time slot. A resource allocation strategy is simulated on this platform, and the simulation results such as resource utilization rate, system throughput and packet delay are analyzed, which indicate that this simulation platform has outstanding versatility.
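The time-slot-driven mode described above can be illustrated with a small toy loop. This is a sketch in plain Python, not OPNET or Proto-C code: the function `simulate`, its parameters, and the first-come first-served allocation rule are all hypothetical stand-ins, since the abstract does not specify the resource allocation strategy that was actually simulated.

```python
import random
from collections import deque


def simulate(num_slots=1000, channels=4, users=6, arrival_prob=0.6, seed=1):
    """Toy time-slot-driven simulation with a fixed number of channels.

    Simulation time is divided into consecutive numbered slots. In each
    slot every user may generate one packet, and up to `channels` queued
    packets are served first-come first-served (an assumed strategy).
    """
    rng = random.Random(seed)
    queue = deque()           # arrival slot number of each waiting packet
    served = 0
    busy_channel_slots = 0
    total_delay = 0

    for slot in range(num_slots):          # the slot number acts as the clock
        for _ in range(users):
            if rng.random() < arrival_prob:
                queue.append(slot)
        for _ in range(min(channels, len(queue))):
            total_delay += slot - queue.popleft()
            served += 1
            busy_channel_slots += 1

    return {
        "utilization": busy_channel_slots / (channels * num_slots),
        "throughput": served / num_slots,                        # packets per slot
        "mean_delay": total_delay / served if served else 0.0,   # in slots
    }


print(simulate())
```

The three returned values correspond to the kinds of results named in the abstract (resource utilization rate, system throughput, packet delay); the numbers themselves depend entirely on the assumed parameters.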
Scalable Visual Analytics of Massive Textual Datasets

DOE Office of Scientific and Technical Information (OSTI.GOV)

Krishnan, Manoj Kumar; Bohn, Shawn J.; Cowley, Wendy E.

2007-04-01

This paper describes the first scalable implementation of a text processing engine used in Visual Analytics tools. These tools aid information analysts in interacting with and understanding large textual information content through visual interfaces. By developing a parallel implementation of the text processing engine, we enabled visual analytics tools to exploit cluster architectures and handle massive datasets. The paper describes key elements of our parallelization approach and demonstrates virtually linear scaling when processing multi-gigabyte data sets such as Pubmed. This approach enables interactive analysis of large datasets beyond the capabilities of existing state-of-the-art visual analytics tools.
Gas diffusion as a new fluidic unit operation for centrifugal microfluidic platforms.

PubMed

Ymbern, Oriol; Sández, Natàlia; Calvo-López, Antonio; Puyol, Mar; Alonso-Chamarro, Julian

2014-03-07

A centrifugal microfluidic platform prototype with an integrated membrane for gas diffusion is presented for the first time. The centrifugal platform allows multiple and parallel analysis on a single disk and integrates at least ten independent microfluidic subunits, which allow both calibration and sample determination. It is constructed with a polymeric substrate material and it is designed to perform colorimetric determinations by the use of a simple miniaturized optical detection system. The determination of three different analytes, sulfur dioxide, nitrite and carbon dioxide, is carried out as a proof of concept of a versatile microfluidic system for the determination of analytes which involve a gas diffusion separation step during the analytical procedure.
Automatic Multilevel Parallelization Using OpenMP

NASA Technical Reports Server (NTRS)

Jin, Hao-Qiang; Jost, Gabriele; Yan, Jerry; Ayguade, Eduard; Gonzalez, Marc; Martorell, Xavier; Biegel, Bryan (Technical Monitor)

2002-01-01

In this paper we describe the extension of the CAPO (CAPtools (Computer Aided Parallelization Toolkit) OpenMP) parallelization support tool to support multilevel parallelism based on OpenMP directives. CAPO generates OpenMP directives with extensions supported by the NanosCompiler to allow for directive nesting and definition of thread groups. We report some results for several benchmark codes and one full application that have been parallelized using our system.
High-performance computing — an overview

NASA Astrophysics Data System (ADS)

Marksteiner, Peter

1996-08-01

An overview of high-performance computing (HPC) is given. Different types of computer architectures used in HPC are discussed: vector supercomputers, high-performance RISC processors, various parallel computers like symmetric multiprocessors, workstation clusters, massively parallel processors. Software tools and programming techniques used in HPC are reviewed: vectorizing compilers, optimization and vector tuning, optimization for RISC processors; parallel programming techniques like shared-memory parallelism, message passing and data parallelism; and numerical libraries.
Arc4nix: A cross-platform geospatial analytical library for cluster and cloud computing

NASA Astrophysics Data System (ADS)

Tang, Jingyin; Matyas, Corene J.

2018-02-01

Big Data in geospatial technology is a grand challenge for processing capacity. The ability to use a GIS for geospatial analysis on Cloud Computing and High Performance Computing (HPC) clusters has emerged as a new approach to provide feasible solutions. However, users lack the ability to migrate existing research tools to a Cloud Computing or HPC-based environment because of the incompatibility of the market-dominating ArcGIS software stack and Linux operating system. This manuscript details a cross-platform geospatial library "arc4nix" to bridge this gap. Arc4nix provides an application programming interface compatible with ArcGIS and its Python library "arcpy". Arc4nix uses a decoupled client-server architecture that permits geospatial analytical functions to run on the remote server and other functions to run on the native Python environment. It uses functional programming and meta-programming language to dynamically construct Python codes containing actual geospatial calculations, send them to a server and retrieve results. Arc4nix allows users to employ their arcpy-based script in a Cloud Computing and HPC environment with minimal or no modification. It also supports parallelizing tasks using multiple CPU cores and nodes for large-scale analyses. A case study of geospatial processing of a numerical weather model's output shows that arcpy scales linearly in a distributed environment. Arc4nix is open-source software.

GATECloud.net: a platform for large-scale, open-source text processing on the cloud.

PubMed

Tablan, Valentin; Roberts, Ian; Cunningham, Hamish; Bontcheva, Kalina

2013-01-28

Cloud computing is increasingly being regarded as a key enabler of the 'democratization of science', because on-demand, highly scalable cloud computing facilities enable researchers anywhere to carry out data-intensive experiments. In the context of natural language processing (NLP), algorithms tend to be complex, which makes their parallelization and deployment on cloud platforms a non-trivial task. This study presents a new, unique, cloud-based platform for large-scale NLP research--GATECloud.net. It enables researchers to carry out data-intensive NLP experiments by harnessing the vast, on-demand compute power of the Amazon cloud. Important infrastructural issues are dealt with by the platform, completely transparently for the researcher: load balancing, efficient data upload and storage, deployment on the virtual machines, security and fault tolerance. We also include a cost-benefit analysis and usage evaluation.
Use of the NetBeans Platform for NASA Robotic Conjunction Assessment Risk Analysis

NASA Technical Reports Server (NTRS)

Sabey, Nickolas J.

2014-01-01

The latest Java and JavaFX technologies are very attractive software platforms for customers involved in space mission operations such as those of NASA and the US Air Force. For NASA Robotic Conjunction Assessment Risk Analysis (CARA), the NetBeans platform provided an environment in which scalable software solutions could be developed quickly and efficiently. Both Java 8 and the NetBeans platform are in the process of simplifying CARA development in secure environments by providing a significant amount of capability in a single accredited package, where accreditation alone can account for 6-8 months for each library or software application. Capabilities either in use or being investigated by CARA include: 2D and 3D displays with JavaFX, parallelization with the new Streams API, and scalability through the NetBeans plugin architecture.
Utilization of Social Media Platforms for Educational Purposes among the Faculty of Higher Education with Special Reference to Tamil Nadu

ERIC Educational Resources Information Center

Vivakaran, Mangala Vadivu; Neelamalar, M.

2018-01-01

Social media tools are observed to play a vital role in the renovation of the conventional teaching and learning practices across the globe. Though primarily developed for online social communication, social media platforms tend to possess suitable tools that can be used for instructional purposes in order to initiate active learning among…
Tools to Analyze Morphology and Spatially Mapped Molecular Data | Informatics Technology for Cancer Research (ITCR)

Cancer.gov

This project is to develop, deploy, and disseminate a suite of open source tools and integrated informatics platform that will facilitate multi-scale, correlative analyses of high resolution whole slide tissue image data, spatially mapped genetics and molecular data for cancer research. This platform will play an essential role in supporting studies of tumor initiation, development, heterogeneity, invasion, and metastasis.
Highly scalable parallel processing of extracellular recordings of Multielectrode Arrays.

PubMed

Gehring, Tiago V; Vasilaki, Eleni; Giugliano, Michele

2015-01-01

Technological advances of Multielectrode Arrays (MEAs) used for multisite, parallel electrophysiological recordings, lead to an ever increasing amount of raw data being generated. Arrays with hundreds up to a few thousands of electrodes are slowly seeing widespread use and the expectation is that more sophisticated arrays will become available in the near future. In order to process the large data volumes resulting from MEA recordings there is a pressing need for new software tools able to process many data channels in parallel. Here we present a new tool for processing MEA data recordings that makes use of new programming paradigms and recent technology developments to unleash the power of modern highly parallel hardware, such as multi-core CPUs with vector instruction sets or GPGPUs. Our tool builds on and complements existing MEA data analysis packages. It shows high scalability and can be used to speed up some performance critical pre-processing steps such as data filtering and spike detection, helping to make the analysis of larger data sets tractable.
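Because each electrode channel can be processed independently, a pre-processing step such as spike detection maps naturally onto a pool of workers. The following is a minimal sketch under stated assumptions, not the tool described above: `detect_spikes` and `detect_all` are hypothetical names, detection is a simple upward threshold crossing, and a thread pool is used only for portability (real speedups would need processes, vector instructions or a GPU).

```python
from concurrent.futures import ThreadPoolExecutor


def detect_spikes(samples, threshold):
    """Return the indices where the signal crosses `threshold` from below."""
    return [
        i
        for i in range(1, len(samples))
        if samples[i - 1] < threshold <= samples[i]
    ]


def detect_all(channels, threshold, workers=4):
    """Run spike detection on every channel; channels are independent."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves channel order in the returned list.
        return list(pool.map(lambda ch: detect_spikes(ch, threshold), channels))


# Three made-up channels of raw samples (arbitrary units).
channels = [
    [0, 1, 5, 1, 0, 6, 7, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [4, 0, 4, 0, 4, 0, 4, 0],
]
print(detect_all(channels, threshold=4))  # [[2, 5], [], [2, 4, 6]]
```

The per-channel function has no shared state, which is what makes the many-channel workload straightforward to distribute.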
The Parallel System for Integrating Impact Models and Sectors (pSIMS)

NASA Technical Reports Server (NTRS)

Elliott, Joshua; Kelly, David; Chryssanthacopoulos, James; Glotter, Michael; Jhunjhnuwala, Kanika; Best, Neil; Wilde, Michael; Foster, Ian

2014-01-01

We present a framework for massively parallel climate impact simulations: the parallel System for Integrating Impact Models and Sectors (pSIMS). This framework comprises a) tools for ingesting and converting large amounts of data to a versatile datatype based on a common geospatial grid; b) tools for translating this datatype into custom formats for site-based models; c) a scalable parallel framework for performing large ensemble simulations, using any one of a number of different impacts models, on clusters, supercomputers, distributed grids, or clouds; d) tools and data standards for reformatting outputs to common datatypes for analysis and visualization; and e) methodologies for aggregating these datatypes to arbitrary spatial scales such as administrative and environmental demarcations. By automating many time-consuming and error-prone aspects of large-scale climate impacts studies, pSIMS accelerates computational research, encourages model intercomparison, and enhances reproducibility of simulation results. We present the pSIMS design and use example assessments to demonstrate its multi-model, multi-scale, and multi-sector versatility.
Composing Data Parallel Code for a SPARQL Graph Engine

DOE Office of Scientific and Technical Information (OSTI.GOV)

Castellana, Vito G.; Tumeo, Antonino; Villa, Oreste

Big data analytics process large amounts of data to extract knowledge from them. Semantic databases are big data applications that adopt the Resource Description Framework (RDF) to structure metadata through a graph-based representation. The graph-based representation provides several benefits, such as the possibility to perform in-memory processing with large amounts of parallelism. SPARQL is a language used to perform queries on RDF-structured data through graph matching. In this paper we present a tool that automatically translates SPARQL queries to parallel graph crawling and graph matching operations. The tool also supports complex SPARQL constructs, which require more than basic graph matching for their implementation. The tool generates parallel code annotated with OpenMP pragmas for x86 Shared-memory Multiprocessors (SMPs). With respect to commercial database systems such as Virtuoso, our approach reduces memory occupation due to join operations and provides higher performance. We show the scaling of the automatically generated graph-matching code on a 48-core SMP.
Sentinel-1 automatic processing chain for volcanic and seismic areas monitoring within the Geohazards Exploitation Platform (GEP)

NASA Astrophysics Data System (ADS)

De Luca, Claudio; Zinno, Ivana; Manunta, Michele; Lanari, Riccardo; Casu, Francesco

2016-04-01

The microwave remote sensing scenario is rapidly evolving through development of new sensor technology for Earth Observation (EO). In particular, Sentinel-1A (S1A) is the first of a sensors' constellation designed to provide a satellite data stream for the Copernicus European program. Sentinel-1A has been specifically designed to provide, over land, Differential Interferometric Synthetic Aperture Radar (DInSAR) products to analyze and investigate Earth's surface displacements. S1A peculiarities include wide ground coverage (250 km of swath), C-band operational frequency and short revisit time (that will reduce from 12 to 6 days when the twin system Sentinel-1B will be placed in orbit during 2016). Such characteristics, together with the global coverage acquisition policy, make the Sentinel-1 constellation extremely suitable for studying and monitoring volcanic and seismic areas worldwide, thus allowing the generation of both ground displacement information with increasing rapidity and new geological understanding. The main acquisition mode over land is the so-called Interferometric Wide Swath (IWS), which is based on the Terrain Observation by Progressive Scans (TOPS) technique and guarantees the mentioned S1A large coverage characteristics at the expense of a non-trivial interferometric processing. Moreover, the satellite spatial coverage and the reduced revisit time will lead to an exponential increase of the data archives that, after the launch of Sentinel-1B, will reach about 3 TB per day. Therefore, the EO scientific community needs, on the one hand, automated and effective DInSAR tools able to address the S1A processing complexity and, on the other hand, the computing and storage capacities to cope with the expected large amount of data. It is therefore becoming more crucial to move processors and tools close to the satellite archives, since the approach of downloading and processing data with in-house computing facilities is no longer efficient.
To address these issues, ESA recently funded the development of the Geohazards Exploitation Platform (GEP), a project aimed at putting together data, processing tools and results to make them accessible to the EO scientific community, with particular emphasis to the Geohazard Supersites & Natural Laboratories and the CEOS Seismic Hazards and Volcanoes Pilots. In this work we present the integration of the parallel version of a well-known DInSAR algorithm referred to as Small BAseline Subset (P-SBAS) within the GEP platform for processing Sentinel-1 data. The integration allowed us to set up an operational on-demand web tool, open to every user, aimed at automatically processing S1A data for the generation of SBAS displacement time-series. Main characteristics as well as a number of experimental results obtained by using the implemented web tool will be also shown. This work is partially supported by: the RITMARE project of Italian MIUR, the DPC-CNR agreement and the ESA GEP project.
The High-Throughput Protein Sample Production Platform of the Northeast Structural Genomics Consortium

PubMed Central

Xiao, Rong; Anderson, Stephen; Aramini, James; Belote, Rachel; Buchwald, William A.; Ciccosanti, Colleen; Conover, Ken; Everett, John K.; Hamilton, Keith; Huang, Yuanpeng Janet; Janjua, Haleema; Jiang, Mei; Kornhaber, Gregory J.; Lee, Dong Yup; Locke, Jessica Y.; Ma, Li-Chung; Maglaqui, Melissa; Mao, Lei; Mitra, Saheli; Patel, Dayaban; Rossi, Paolo; Sahdev, Seema; Sharma, Seema; Shastry, Ritu; Swapna, G.V.T.; Tong, Saichu N.; Wang, Dongyan; Wang, Huang; Zhao, Li; Montelione, Gaetano T.; Acton, Thomas B.

2014-01-01

We describe the core Protein Production Platform of the Northeast Structural Genomics Consortium (NESG) and outline the strategies used for producing high-quality protein samples. The platform is centered on the cloning, expression and purification of 6X-His-tagged proteins using T7-based Escherichia coli systems. The 6X-His tag allows for similar purification procedures for most targets and implementation of high-throughput (HTP) parallel methods. In most cases, the 6X-His-tagged proteins are sufficiently purified (> 97% homogeneity) using a HTP two-step purification protocol for most structural studies. Using this platform, the open reading frames of over 16,000 different targeted proteins (or domains) have been cloned as > 26,000 constructs. Over the past nine years, more than 16,000 of these expressed protein, and more than 4,400 proteins (or domains) have been purified to homogeneity in tens of milligram quantities (see Summary Statistics, http://nesg.org/statistics.html). Using these samples, the NESG has deposited more than 900 new protein structures to the Protein Data Bank (PDB). The methods described here are effective in producing eukaryotic and prokaryotic protein samples in E. coli. This paper summarizes some of the updates made to the protein production pipeline in the last five years, corresponding to phase 2 of the NIGMS Protein Structure Initiative (PSI-2) project. The NESG Protein Production Platform is suitable for implementation in a large individual laboratory or by a small group of collaborating investigators. These advanced automated and/or parallel cloning, expression, purification, and biophysical screening technologies are of broad value to the structural biology, functional proteomics, and structural genomics communities. PMID:20688167
Dynamic Load-Balancing for Distributed Heterogeneous Computing of Parallel CFD Problems

NASA Technical Reports Server (NTRS)

Ecer, A.; Chien, Y. P.; Boenisch, T.; Akay, H. U.

2000-01-01

The developed methodology is aimed at improving the efficiency of executing block-structured algorithms on parallel, distributed, heterogeneous computers. The basic approach of these algorithms is to divide the flow domain into many sub-domains called blocks, and solve the governing equations over these blocks. The dynamic load balancing problem is defined as the efficient distribution of the blocks among the available processors over a period of several hours of computations. In environments with computers of different architecture, operating systems, CPU speed, memory size, load, and network speed, balancing the loads and managing the communication between processors becomes crucial. Load balancing software tools for mutually dependent parallel processes have been created to efficiently utilize an advanced computation environment and algorithms. These tools are dynamic in nature because of the changes in the computer environment during execution time. More recently, these tools were extended to a second operating system: NT. In this paper, the problems associated with this application will be discussed. Also, the developed algorithms were combined with the load sharing capability of LSF to efficiently utilize workstation clusters for parallel computing. Finally, results will be presented on running a NASA based code ADPAC to demonstrate the developed tools for dynamic load balancing.
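The block-distribution problem described above can be sketched with a simple greedy heuristic. This is an assumption-laden illustration, not the authors' algorithm: the function `balance`, the block costs and the processor speeds are hypothetical, communication cost is ignored, and the "dynamic" aspect is modelled only by re-running the assignment when measured speeds change.

```python
def balance(block_costs, speeds):
    """Greedily assign blocks to processors of different speeds.

    Blocks are taken largest first and each goes to the processor that
    would finish it earliest. Returns (assignment, estimated makespan).
    """
    finish = [0.0] * len(speeds)                  # projected finish time per processor
    assignment = {p: [] for p in range(len(speeds))}
    for block, cost in sorted(block_costs.items(), key=lambda kv: -kv[1]):
        p = min(range(len(speeds)), key=lambda q: finish[q] + cost / speeds[q])
        finish[p] += cost / speeds[p]
        assignment[p].append(block)
    return assignment, max(finish)


# Hypothetical work per block (arbitrary units) and relative processor speeds.
blocks = {"b0": 8, "b1": 6, "b2": 4, "b3": 2}

print(balance(blocks, speeds=[2.0, 1.0]))
# If the load on the machines changes during the run, rebalance with new speeds.
print(balance(blocks, speeds=[1.0, 2.0]))
```

With the first set of speeds the faster processor receives three of the four blocks; after the speeds are swapped, the same rule shifts those blocks to the other processor, which is the behaviour a dynamic balancer needs when the computing environment changes during execution.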
A framework for accelerated phototrophic bioprocess development: integration of parallelized microscale cultivation, laboratory automation and Kriging-assisted experimental design.

PubMed

Morschett, Holger; Freier, Lars; Rohde, Jannis; Wiechert, Wolfgang; von Lieres, Eric; Oldiges, Marco

2017-01-01

Even though microalgae-derived biodiesel has regained interest within the last decade, industrial production is still challenging for economic reasons. Besides reactor design, as well as value chain and strain engineering, laborious and slow early-stage parameter optimization represents a major drawback. The present study introduces a framework for the accelerated development of phototrophic bioprocesses. A state-of-the-art micro-photobioreactor supported by a liquid-handling robot for automated medium preparation and product quantification was used. To take full advantage of the technology's experimental capacity, Kriging-assisted experimental design was integrated to enable highly efficient execution of screening applications. The resulting platform was used for medium optimization of a lipid production process using Chlorella vulgaris toward maximum volumetric productivity. Within only four experimental rounds, lipid production was increased approximately threefold to 212 ± 11 mg L⁻¹ d⁻¹. Besides nitrogen availability as a key parameter, magnesium, calcium and various trace elements were shown to be of crucial importance. Here, synergistic multi-parameter interactions as revealed by the experimental design introduced significant further optimization potential. The integration of parallelized microscale cultivation, laboratory automation and Kriging-assisted experimental design proved to be a fruitful tool for the accelerated development of phototrophic bioprocesses. By means of the proposed technology, the targeted optimization task was conducted in a very timely and material-efficient manner.
Fully accelerating quantum Monte Carlo simulations of real materials on GPU clusters

NASA Astrophysics Data System (ADS)

Esler, Kenneth

2011-03-01

Quantum Monte Carlo (QMC) has proved to be an invaluable tool for predicting the properties of matter from fundamental principles, combining very high accuracy with extreme parallel scalability. By solving the many-body Schrödinger equation through a stochastic projection, it achieves greater accuracy than mean-field methods and better scaling with system size than quantum chemical methods, enabling scientific discovery across a broad spectrum of disciplines. In recent years, graphics processing units (GPUs) have provided a high-performance and low-cost new approach to scientific computing, and GPU-based supercomputers are now among the fastest in the world. The multiple forms of parallelism afforded by QMC algorithms make the method an ideal candidate for acceleration in the many-core paradigm. We present the results of porting the QMCPACK code to run on GPU clusters using the NVIDIA CUDA platform. Using mixed precision on GPUs and MPI for intercommunication, we observe typical full-application speedups of approximately 10x to 15x relative to quad-core CPUs alone, while reproducing the double-precision CPU results within statistical error. We discuss the algorithm modifications necessary to achieve good performance on this heterogeneous architecture and present the results of applying our code to molecules and bulk materials. Supported by the U.S. DOE under Contract No. DOE-DE-FG05-08OR23336 and by the NSF under No. 0904572.
A parallel solver for huge dense linear systems

NASA Astrophysics Data System (ADS)

Badia, J. M.; Movilla, J. L.; Climente, J. I.; Castillo, M.; Marqués, M.; Mayo, R.; Quintana-Ortí, E. S.; Planelles, J.

2011-11-01

HDSS (Huge Dense Linear System Solver) is a Fortran Application Programming Interface (API) to facilitate the parallel solution of very large dense systems for scientists and engineers. The API makes use of parallelism to yield an efficient solution of the systems on a wide range of parallel platforms, from clusters of processors to massively parallel multiprocessors. It exploits out-of-core strategies to leverage the secondary memory in order to solve huge linear systems O(100.000). The API is based on the parallel linear algebra library PLAPACK, and on its Out-Of-Core (OOC) extension POOCLAPACK. Both PLAPACK and POOCLAPACK use the Message Passing Interface (MPI) as the communication layer and BLAS to perform the local matrix operations. The API provides a friendly interface to the users, hiding almost all the technical aspects related to the parallel execution of the code and the use of the secondary memory to solve the systems. In particular, the API can automatically select the best way to store and solve the systems, depending on the dimension of the system, the number of processes and the main memory of the platform. Experimental results on several parallel platforms report high performance, reaching more than 1 TFLOP with 64 cores to solve a system with more than 200 000 equations and more than 10 000 right-hand side vectors.
New version program summary
Program title: Huge Dense System Solver (HDSS)
Catalogue identifier: AEHU_v1_1
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEHU_v1_1.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html
No. of lines in distributed program, including test data, etc.: 87 062
No. of bytes in distributed program, including test data, etc.: 1 069 110
Distribution format: tar.gz
Programming language: Fortran90, C
Computer: Parallel architectures: multiprocessors, computer clusters
Operating system: Linux/Unix
Has the code been vectorized or parallelized?: Yes, includes MPI primitives.
RAM: Tested for up to 190 GB
Classification: 6.5
External routines: MPI (http://www.mpi-forum.org/), BLAS (http://www.netlib.org/blas/), PLAPACK (http://www.cs.utexas.edu/~plapack/), POOCLAPACK (ftp://ftp.cs.utexas.edu/pub/rvdg/PLAPACK/pooclapack.ps) (code for PLAPACK and POOCLAPACK is included in the distribution).
Catalogue identifier of previous version: AEHU_v1_0
Journal reference of previous version: Comput. Phys. Comm. 182 (2011) 533
Does the new version supersede the previous version?: Yes
Nature of problem: Huge scale dense systems of linear equations, Ax=B, beyond standard LAPACK capabilities.
Solution method: The linear systems are solved by means of parallelized routines based on the LU factorization, using efficient secondary storage algorithms when the available main memory is insufficient.
Reasons for new version: In many applications we need to guarantee a high accuracy in the solution of very large linear systems and we can do it by using double-precision arithmetic.
Summary of revisions: Version 1.1 can be used to solve linear systems using double-precision arithmetic. New version of the initialization routine. The user can choose the kind of arithmetic and the values of several parameters of the environment.
Running time: About 5 hours to solve a system with more than 200 000 equations and more than 10 000 right-hand side vectors using double-precision arithmetic on an eight-node commodity cluster with a total of 64 Intel cores.
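The solution method named in this program summary, LU factorization, can be shown on a toy scale with the Python sketch below. It is a serial, in-core illustration only: it does not use or reproduce the HDSS, PLAPACK, or POOCLAPACK interfaces, and the function name `lu_solve` is a hypothetical choice. It applies partial pivoting (row swaps), a standard safeguard for numerical stability.

```python
def lu_solve(A, b):
    """Solve A x = b by LU factorization with partial pivoting.

    Serial, in-memory illustration (hypothetical helper, not the HDSS API).
    A is a list of n rows of n numbers; b is a list of n numbers.
    """
    n = len(A)
    LU = [list(row) for row in A]   # work on a copy; L and U share storage
    perm = list(range(n))           # row permutation from pivoting

    for k in range(n):
        # Pick the row with the largest entry in column k as pivot row.
        piv = max(range(k, n), key=lambda i: abs(LU[i][k]))
        if LU[piv][k] == 0:
            raise ValueError("matrix is singular")
        if piv != k:
            LU[k], LU[piv] = LU[piv], LU[k]
            perm[k], perm[piv] = perm[piv], perm[k]
        # Eliminate below the pivot, storing multipliers in place (L part).
        for i in range(k + 1, n):
            LU[i][k] /= LU[k][k]
            for j in range(k + 1, n):
                LU[i][j] -= LU[i][k] * LU[k][j]

    # Forward substitution: L y = P b (L has an implicit unit diagonal).
    y = [0.0] * n
    for i in range(n):
        y[i] = b[perm[i]] - sum(LU[i][j] * y[j] for j in range(i))

    # Back substitution: U x = y.
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(LU[i][j] * x[j] for j in range(i + 1, n))) / LU[i][i]
    return x


if __name__ == "__main__":
    print(lu_solve([[2.0, 1.0], [4.0, 3.0]], [3.0, 7.0]))  # [1.0, 1.0]
```

For several right-hand sides, the factorization loop would be run once and only the two substitution steps repeated per vector, which is why LU suits the many-right-hand-side case mentioned in the abstract.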
Distributed and Collaborative Software Analysis

NASA Astrophysics Data System (ADS)

Ghezzi, Giacomo; Gall, Harald C.

Throughout the years software engineers have come up with a myriad of specialized tools and techniques that focus on a certain type of software analysis, such as source code analysis, co-change analysis or bug prediction. However, easy and straightforward synergies between these analyses and tools rarely exist because of their stand-alone nature, their platform dependence, their different input and output formats and the variety of data to analyze. As a consequence, distributed and collaborative software analysis scenarios and in particular interoperability are severely limited. We describe a distributed and collaborative software analysis platform that allows for a seamless interoperability of software analysis tools across platform, geographical and organizational boundaries. We realize software analysis tools as services that can be accessed and composed over the Internet. These distributed analysis services shall be widely accessible in our incrementally augmented Software Analysis Broker where organizations and tool providers can register and share their tools. To allow (semi-) automatic use and composition of these tools, they are classified and mapped into a software analysis taxonomy and adhere to specific meta-models and ontologies for their category of analysis.
A parallel algorithm for multi-level logic synthesis using the transduction method. M.S. Thesis

NASA Technical Reports Server (NTRS)

Lim, Chieng-Fai

1991-01-01

The Transduction Method has been shown to be a powerful tool in the optimization of multilevel networks. Many tools such as the SYLON synthesis system (X90), (CM89), (LM90) have been developed based on this method. A parallel implementation is presented of SYLON-XTRANS (XM89) on an eight processor Encore Multimax shared memory multiprocessor. It minimizes multilevel networks consisting of simple gates through parallel pruning, gate substitution, gate merging, generalized gate substitution, and gate input reduction. This implementation, called Parallel TRANSduction (PTRANS), also uses partitioning to break large circuits up and performs inter- and intra-partition dynamic load balancing. With this, good speedups and high processor efficiencies are achievable without sacrificing the resulting circuit quality.
Architecture-Adaptive Computing Environment: A Tool for Teaching Parallel Programming

NASA Technical Reports Server (NTRS)

Dorband, John E.; Aburdene, Maurice F.

2002-01-01

Recently, networked and cluster computation have become very popular. This paper is an introduction to a new C based parallel language for architecture-adaptive programming, aCe C. The primary purpose of aCe (Architecture-adaptive Computing Environment) is to encourage programmers to implement applications on parallel architectures by providing them the assurance that future architectures will be able to run their applications with a minimum of modification. A secondary purpose is to encourage computer architects to develop new types of architectures by providing an easily implemented software development environment and a library of test applications. This new language should be an ideal tool to teach parallel programming. In this paper, we will focus on some fundamental features of aCe C.
Light-weight Parallel Python Tools for Earth System Modeling Workflows

NASA Astrophysics Data System (ADS)

Mickelson, S. A.; Paul, K.; Xu, H.; Dennis, J.; Brown, D. I.

2015-12-01

With the growth in computing power over the last 30 years, earth system modeling codes have become increasingly data-intensive. As an example, it is expected that the data required for the next Intergovernmental Panel on Climate Change (IPCC) Assessment Report (AR6) will increase by more than 10x to an expected 25PB per climate model. Faced with this daunting challenge, developers of the Community Earth System Model (CESM) have chosen to change the format of their data for long-term storage from time-slice to time-series, in order to reduce the required download bandwidth needed for later analysis and post-processing by climate scientists. Hence, efficient tools are required to (1) perform the transformation of the data from time-slice to time-series format and to (2) compute climatology statistics, needed for many diagnostic computations, on the resulting time-series data. To address the first of these two challenges, we have developed a parallel Python tool for converting time-slice model output to time-series format. To address the second of these challenges, we have developed a parallel Python tool to perform fast time-averaging of time-series data. These tools are designed to be light-weight, be easy to install, have very few dependencies, and can be easily inserted into the Earth system modeling workflow with negligible disruption. In this work, we present the motivation, approach, and testing results of these two light-weight parallel Python tools, as well as our plans for future research and development.
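The time-slice to time-series transformation and the subsequent time-averaging described in this abstract can be sketched in a few lines of Python. This is a serial toy version over in-memory dictionaries, not the authors' tools: the function names `slices_to_series` and `time_average` and the sample variables `T` and `P` are hypothetical, and real model output would be read from files rather than built by hand.

```python
def slices_to_series(time_slices):
    """Turn time-slices into time-series.

    time_slices: list with one dict per time step, mapping a variable
    name to its value at that step (all variables, one time).
    Returns a dict mapping each variable name to its list of values
    over time (one variable, all times).
    """
    series = {}
    for snapshot in time_slices:
        for name, value in snapshot.items():
            series.setdefault(name, []).append(value)
    return series


def time_average(series):
    """Compute the mean over time of every variable's series."""
    return {name: sum(values) / len(values)
            for name, values in series.items()}


if __name__ == "__main__":
    slices = [
        {"T": 280.0, "P": 1000.0},   # time step 0
        {"T": 282.0, "P": 1002.0},   # time step 1
    ]
    series = slices_to_series(slices)
    print(series)                # {'T': [280.0, 282.0], 'P': [1000.0, 1002.0]}
    print(time_average(series))  # {'T': 281.0, 'P': 1001.0}
```

Because each variable's series is independent of the others once the slices are read, the per-variable work could be handed to separate workers; that is one plausible route to parallelism, assumed here rather than taken from the abstract.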
Parallel processing optimization strategy based on MapReduce model in cloud storage environment

NASA Astrophysics Data System (ADS)

Cui, Jianming; Liu, Jiayi; Li, Qiuyan

2017-05-01

Currently, many documents in cloud storage are packaged only after all of their packets have been received. When data are sent from the local transmitter to the server in this way, packing and unpacking consume a lot of time, and the transmission efficiency is low. A new parallel processing algorithm is proposed to optimize the transmission mode. Following the working model of MapReduce, MPI technology is used to execute the Mapper and Reducer mechanisms in parallel. In simulation experiments on a Hadoop cloud computing platform, the algorithm not only accelerates the file transfer rate but also shortens the waiting time of the Reducer mechanism. It breaks through traditional sequential transmission constraints and reduces storage coupling to improve transmission efficiency.
Parallel algorithm of VLBI software correlator under multiprocessor environment

NASA Astrophysics Data System (ADS)

Zheng, Weimin; Zhang, Dong

2007-11-01

The correlator is the key signal processing equipment of a Very Long Baseline Interferometry (VLBI) synthetic aperture telescope. It receives the mass data collected by the VLBI observatories and produces the visibility function of the target, which can be used for spacecraft positioning, baseline length measurement, synthesis imaging, and other scientific applications. VLBI data correlation is both data-intensive and computation-intensive. This paper presents the algorithms of two parallel software correlators under multiprocessor environments. A near real-time correlator for spacecraft tracking adopts pipelining and thread-parallel technology, and runs on SMP (Symmetric Multiple Processor) servers. Another high-speed prototype correlator using a mixed Pthreads and MPI (Message Passing Interface) parallel algorithm is realized on a small Beowulf cluster platform. Both correlators feature a flexible structure, scalability, and 10-station data correlating abilities.
A dynamic bead-based microarray for parallel DNA detection

NASA Astrophysics Data System (ADS)

Sochol, R. D.; Casavant, B. P.; Dueck, M. E.; Lee, L. P.; Lin, L.

2011-05-01

A microfluidic system has been designed and constructed by means of micromachining processes to integrate both microfluidic mixing of mobile microbeads and hydrodynamic microbead arraying capabilities on a single chip to simultaneously detect multiple bio-molecules. The prototype system has four parallel reaction chambers, which include microchannels of 18 × 50 µm² cross-sectional area and a microfluidic mixing section of 22 cm length. Parallel detection of multiple DNA oligonucleotide sequences was achieved via molecular beacon probes immobilized on polystyrene microbeads of 16 µm diameter. Experimental results show quantitative detection of three distinct DNA oligonucleotide sequences from the Hepatitis C viral (HCV) genome with single base-pair mismatch specificity. Our dynamic bead-based microarray offers an effective microfluidic platform to increase parallelization of reactions and improve microbead handling for various biological applications, including bio-molecule detection, medical diagnostics and drug screening.

Fast forward kinematics algorithm for real-time and high-precision control of the 3-RPS parallel mechanism

NASA Astrophysics Data System (ADS)

Wang, Yue; Yu, Jingjun; Pei, Xu

2018-06-01

A new forward kinematics algorithm for the mechanism of 3-RPS (R: Revolute; P: Prismatic; S: Spherical) parallel manipulators is proposed in this study. This algorithm is primarily based on the special geometric conditions of the 3-RPS parallel mechanism, and it eliminates the errors produced by parasitic motions to improve and ensure accuracy. Specifically, the errors can be less than 10⁻⁶. In this method, only the group of solutions that is consistent with the actual situation of the platform is obtained rapidly. This algorithm substantially improves calculation efficiency because the selected initial values are reasonable, and all the formulas in the calculation are analytical. This novel forward kinematics algorithm is well suited for real-time and high-precision control of the 3-RPS parallel mechanism.
Evaluation of a new eLearning platform for distance teaching of microsurgery.

PubMed

Messaoudi, T; Bodin, F; Hidalgo Diaz, J J; Ichihara, S; Fikry, T; Lacreuse, I; Liverneaux, P; Facca, S

2015-06-01

Online learning (or eLearning) is in constant evolution in medicine. An analytical survey of the websites of eight academic societies and medical schools was carried out. These sites were evaluated against parameters that define the quality of an eLearning website, as well as the shareable content object reference model (SCORM) technical standards. All studied platforms were maintained by a webmaster and regularly updated. Only two platforms had teleconference opportunities, five had courses in PDF format, and four allowed online testing. Based on SCORM standards, only four platforms allowed direct access without a password. The content of all platforms was adaptable, interoperable and reusable. However, their sustainability was difficult to assess. In parallel, we developed the first eLearning platform to be used as part of a university diploma in microsurgery in France. The platform was evaluated by students enrolled in this diploma program. A satisfaction survey and platform evaluation showed that students were generally satisfied and had used the platform for microsurgery education, especially the seven students living abroad. ELearning for microsurgery allows the content to be continuously updated, makes for fewer classroom visits, provides easy remote access, and especially better training time management and cost savings in terms of travel and accommodations. Copyright © 2015 Elsevier Masson SAS. All rights reserved.
XMOS XC-2 Development Board for Mechanical Control and Data Collection

NASA Technical Reports Server (NTRS)

Jarnot, Robert F.; Bowden, William J.

2011-01-01

The scanning microwave limb sounder (SMLS) will use technological improvements in low-noise mixers to provide precise data on the Earth's atmospheric composition with high spatial resolution. This project focuses on the design and implementation of a real-time control system needed for airborne engineering tests of the SMLS. The system must coordinate the actuation of optical components using four motors with encoder readback, while collecting synchronized telemetric data from a GPS receiver and 3-axis gyrometric system. A graphical user interface for testing the control system was also designed using Python. Although the system could have been implemented with an FPGA (field-programmable gate array)-based setup, a processor development kit manufactured by XMOS was chosen. The XMOS architecture allows parallel execution of multiple tasks on separate threads, making it ideal for this application. It is easily programmed using XC (a subset of C). The necessary communication interfaces were implemented in software, including Ethernet, with significant cost and time reduction compared to an FPGA-based approach. A simple approach to control the chopper, calibration mirror, and gimbal for the airborne SMLS was needed. The XMOS board allows for multiple threads and real-time data acquisition. The XC-2 development kit is an attractive choice for synchronized, real-time, event-driven applications. The XMOS is based on the transputer microprocessor architecture developed for parallel computing, which is being revamped in this new platform. The XMOS device has multiple cores capable of running parallel applications on separate threads. The threads communicate with each other via user-defined channels capable of transmitting data within the device. XMOS provides a C-based development environment using XC, which eliminates the need for custom tool kits associated with FPGA programming. The XC-2 has four cores and necessary hardware for Ethernet I/O.
Generic accelerated sequence alignment in SeqAn using vectorization and multi-threading.

PubMed

Rahn, René; Budach, Stefan; Costanza, Pascal; Ehrhardt, Marcel; Hancox, Jonny; Reinert, Knut

2018-05-03

Pairwise sequence alignment is undoubtedly a central tool in many bioinformatics analyses. In this paper, we present a generically accelerated module for pairwise sequence alignments applicable for a broad range of applications. In our module, we unified the standard dynamic programming kernel used for pairwise sequence alignments and extended it with a generalized inter-sequence vectorization layout, such that many alignments can be computed simultaneously by exploiting SIMD (Single Instruction Multiple Data) instructions of modern processors. We then extended the module by adding two layers of thread-level parallelization, where we a) distribute many independent alignments on multiple threads and b) inherently parallelize a single alignment computation using a work stealing approach producing a dynamic wavefront progressing along the minor diagonal. We evaluated our alignment vectorization and parallelization on different processors, including the newest Intel® Xeon® (Skylake) and Intel® Xeon Phi™ (KNL) processors, and use cases. The instruction set AVX512-BW (Byte and Word), available on Skylake processors, can genuinely improve the performance of vectorized alignments. We could run single alignments 1600 times faster on the Xeon Phi™ and 1400 times faster on the Xeon® than executing them with our previous sequential alignment module. The module is programmed in C++ using the SeqAn (Reinert et al., 2017) library and distributed with version 2.4 under the BSD license. We support SSE4, AVX2, AVX512 instructions and included UME::SIMD, a SIMD-instruction wrapper library, to extend our module for further instruction sets. We thoroughly test all alignment components with all major C++ compilers on various platforms. Contact: rene.rahn@fu-berlin.de.
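The wavefront idea mentioned in this abstract can be illustrated with a short Python sketch of a global alignment score (Needleman-Wunsch recurrence with a linear gap penalty) filled in along anti-diagonals. It is not SeqAn code and uses no SIMD or threads: the function name `nw_score_wavefront` and the scoring values are hypothetical. The point it shows is that every cell on one anti-diagonal depends only on cells from the two previous anti-diagonals, so the cells of one diagonal could be computed concurrently.

```python
def nw_score_wavefront(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score of strings a and b, computed by anti-diagonals.

    Hypothetical serial illustration of a wavefront traversal; the scoring
    scheme (match, mismatch, linear gap) is an assumption for the example.
    """
    n, m = len(a), len(b)
    H = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        H[i][0] = i * gap          # a[:i] aligned against gaps only
    for j in range(1, m + 1):
        H[0][j] = j * gap          # b[:j] aligned against gaps only

    # Anti-diagonal d contains all interior cells (i, j) with i + j == d.
    for d in range(2, n + m + 1):
        # The cells in this inner loop are mutually independent.
        for i in range(max(1, d - m), min(n, d - 1) + 1):
            j = d - i
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(H[i - 1][j - 1] + s,   # align a[i-1] with b[j-1]
                          H[i - 1][j] + gap,     # gap in b
                          H[i][j - 1] + gap)     # gap in a
    return H[n][m]


if __name__ == "__main__":
    print(nw_score_wavefront("GATTACA", "GATTACA"))  # 7
    print(nw_score_wavefront("AC", "AG"))            # 0
```

A row-by-row traversal gives the same scores; the anti-diagonal order is chosen here only because it exposes the independent cells that a parallel wavefront exploits.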
One-dimensional acoustic standing waves in rectangular channels for flow cytometry.

PubMed

Austin Suthanthiraraj, Pearlson P; Piyasena, Menake E; Woods, Travis A; Naivar, Mark A; López, Gabriel P; Graves, Steven W

2012-07-01

Flow cytometry has become a powerful analytical tool for applications ranging from blood diagnostics to high throughput screening of molecular assemblies on microsphere arrays. However, instrument size, expense, throughput, and consumable use limit its use in resource-poor areas of the world, as a component in environmental monitoring, and for detection of very rare cell populations. For these reasons, new technologies to improve the size and cost-to-performance ratio of flow cytometry are required. One such technology is the use of acoustic standing waves that efficiently concentrate cells and particles to the center of flow channels for analysis. The simplest form of this method uses one-dimensional acoustic standing waves to focus particles in rectangular channels. We have developed one-dimensional acoustic focusing flow channels that can be fabricated in simple capillary devices or easily microfabricated using photolithography and deep reactive ion etching. Image and video analysis demonstrates that these channels precisely focus single flowing streams of particles and cells for traditional flow cytometry analysis. Additionally, use of standing waves with increasing harmonics and in parallel microfabricated channels is shown to effectively create many parallel focused streams. Furthermore, we present the fabrication of an inexpensive optical platform for flow cytometry in rectangular channels and use of the system to provide precise analysis. The simplicity and low cost of the acoustic focusing devices developed here promise to be effective for flow cytometers that have reduced size, cost, and consumable use. Finally, the straightforward path to parallel flow streams using one-dimensional multinode acoustic focusing indicates that simple acoustic focusing in rectangular channels may also have a prominent role in high-throughput flow cytometry. Copyright © 2012 Elsevier Inc. All rights reserved.
Development of Modern Performance Assessment Tools and Capabilities for Underground Disposal of Transuranic Waste at WIPP

NASA Astrophysics Data System (ADS)

Zeitler, T.; Kirchner, T. B.; Hammond, G. E.; Park, H.

2014-12-01

The Waste Isolation Pilot Plant (WIPP) has been developed by the U.S. Department of Energy (DOE) for the geologic (deep underground) disposal of transuranic (TRU) waste. Containment of TRU waste at the WIPP is regulated by the U.S. Environmental Protection Agency (EPA). The DOE demonstrates compliance with the containment requirements by means of performance assessment (PA) calculations. WIPP PA calculations estimate the probability and consequence of potential radionuclide releases from the repository to the accessible environment for a regulatory period of 10,000 years after facility closure. The long-term performance of the repository is assessed using a suite of sophisticated computational codes. In a broad modernization effort, the DOE has overseen the transfer of these codes to modern hardware and software platforms. Additionally, there is a current effort to establish new performance assessment capabilities through the further development of the PFLOTRAN software, a state-of-the-art massively parallel subsurface flow and reactive transport code. Improvements to the current computational environment will result in greater detail in the final models due to the parallelization afforded by the modern code. Parallelization will allow for relatively faster calculations, as well as a move from a two-dimensional calculation grid to a three-dimensional grid. The result of the modernization effort will be a state-of-the-art subsurface flow and transport capability that will serve WIPP PA into the future. Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000. This research is funded by WIPP programs administered by the Office of Environmental Management (EM) of the U.S. Department of Energy.
Automatic Multilevel Parallelization Using OpenMP

NASA Technical Reports Server (NTRS)

Jin, Hao-Qiang; Jost, Gabriele; Yan, Jerry; Ayguade, Eduard; Gonzalez, Marc; Martorell, Xavier; Biegel, Bryan (Technical Monitor)

2002-01-01

In this paper we describe the extension of the CAPO parallelization support tool to support multilevel parallelism based on OpenMP directives. CAPO generates OpenMP directives with extensions supported by the NanosCompiler to allow for directive nesting and definition of thread groups. We report first results for several benchmark codes and one full application that have been parallelized using our system.
Support for Debugging Automatically Parallelized Programs

NASA Technical Reports Server (NTRS)

Hood, Robert; Jost, Gabriele; Biegel, Bryan (Technical Monitor)

2001-01-01

This viewgraph presentation provides information on the technical aspects of debugging computer code that has been automatically converted for use in a parallel computing system. Shared memory parallelization and distributed memory parallelization entail separate and distinct challenges for a debugging program. A prototype system has been developed which integrates various tools for the debugging of automatically parallelized programs including the CAPTools Database which provides variable definition information across subroutines as well as array distribution information.
Code Parallelization with CAPO: A User Manual

NASA Technical Reports Server (NTRS)

Jin, Hao-Qiang; Frumkin, Michael; Yan, Jerry; Biegel, Bryan (Technical Monitor)

2001-01-01

A software tool has been developed to assist the parallelization of scientific codes. This tool, CAPO, extends an existing parallelization toolkit, CAPTools developed at the University of Greenwich, to generate OpenMP parallel codes for shared memory architectures. This is an interactive toolkit to transform a serial Fortran application code to an equivalent parallel version of the software - in a small fraction of the time normally required for a manual parallelization. We first discuss the way in which loop types are categorized and how efficient OpenMP directives can be defined and inserted into the existing code using the in-depth interprocedural analysis. The use of the toolkit on a number of application codes ranging from benchmark to real-world application codes is presented. This will demonstrate the great potential of using the toolkit to quickly parallelize serial programs as well as the good performance achievable on a large number of processors. The second part of the document gives references to the parameters and the graphic user interface implemented in the toolkit. Finally a set of tutorials is included for hands-on experiences with this toolkit.
Bio-jETI: a service integration, design, and provisioning platform for orchestrated bioinformatics processes

PubMed Central

Margaria, Tiziana; Kubczak, Christian; Steffen, Bernhard

2008-01-01

Background With Bio-jETI, we introduce a service platform for interdisciplinary work on biological application domains and illustrate its use in a concrete application concerning statistical data processing in R and xcms for an LC/MS analysis of FAAH gene knockout. Methods Bio-jETI uses the jABC environment for service-oriented modeling and design as a graphical process modeling tool and the jETI service integration technology for remote tool execution. Conclusions As a service definition and provisioning platform, Bio-jETI has the potential to become a core technology in interdisciplinary service orchestration and technology transfer. Domain experts, like biologists not trained in computer science, directly define complex service orchestrations as process models and use efficient and complex bioinformatics tools in a simple and intuitive way. PMID:18460173
Parallel evolution of image processing tools for multispectral imagery

NASA Astrophysics Data System (ADS)

Harvey, Neal R.; Brumby, Steven P.; Perkins, Simon J.; Porter, Reid B.; Theiler, James P.; Young, Aaron C.; Szymanski, John J.; Bloch, Jeffrey J.

2000-11-01

We describe the implementation and performance of a parallel, hybrid evolutionary-algorithm-based system, which optimizes image processing tools for feature-finding tasks in multi-spectral imagery (MSI) data sets. Our system uses an integrated spatio-spectral approach and is capable of combining suitably-registered data from different sensors. We investigate the speed-up obtained by parallelization of the evolutionary process via multiple processors (a workstation cluster) and develop a model for prediction of run-times for different numbers of processors. We demonstrate our system on Landsat Thematic Mapper MSI, covering the recent Cerro Grande fire at Los Alamos, NM, USA.
Final Scientific Report: A Scalable Development Environment for Peta-Scale Computing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Karbach, Carsten; Frings, Wolfgang

2013-02-22

This document is the final scientific report of the project DE-SC000120 (A scalable Development Environment for Peta-Scale Computing). The objective of this project is the extension of the Parallel Tools Platform (PTP) for applying it to peta-scale systems. PTP is an integrated development environment for parallel applications. It comprises code analysis, performance tuning, parallel debugging and system monitoring. The contribution of the Juelich Supercomputing Centre (JSC) aims to provide a scalable solution for system monitoring of supercomputers. This includes the development of a new communication protocol for exchanging status data between the target remote system and the client running PTP. The communication has to work for high latency. PTP needs to be implemented robustly and should hide the complexity of the supercomputer's architecture in order to provide transparent access to various remote systems via a uniform user interface. This simplifies the porting of applications to different systems, because PTP functions as an abstraction layer between the parallel application developer and the compute resources. The common requirement for all PTP components is that they have to interact with the remote supercomputer. For example, applications are built remotely, and performance tools are attached to job submissions while their output data resides on the remote system. Status data has to be collected by evaluating outputs of the remote job scheduler, and the parallel debugger needs to control an application executed on the supercomputer. The challenge is to provide this functionality for peta-scale systems in real-time. The client-server architecture of the established monitoring application LLview, developed by the JSC, can be applied to PTP's system monitoring. LLview provides a well-arranged overview of the supercomputer's current status. A set of statistics, a list of running and queued jobs, and a node display mapping running jobs to their compute resources form the user display of LLview. These monitoring features have to be integrated into the development environment. Besides showing the current status, PTP's monitoring also needs to allow for submitting and canceling user jobs. Monitoring peta-scale systems is largely a matter of presenting the large amount of status data in a useful manner. Users need to be able to select arbitrary levels of detail. The monitoring views have to provide a quick overview of the system state, but also need to allow for zooming into the specific parts of the system in which the user is interested. At present, the major batch systems running on supercomputers are PBS, TORQUE, ALPS and LoadLeveler, which have to be supported by both the monitoring and the job controlling component. Finally, PTP needs to be designed as generically as possible, so that it can be extended for future batch systems.
Cascade photonic integrated circuit architecture for electro-optic in-phase quadrature/single sideband modulation or frequency conversion.

PubMed

Hasan, Mehedi; Hall, Trevor

2015-11-01

A photonic integrated circuit architecture for implementing frequency upconversion is proposed. The circuit consists of a 1×2 splitter and a 2×1 combiner interconnected by two stages of differentially driven phase modulators, with a 2×2 multimode interference coupler between the stages. A transfer matrix approach is used to model the operation of the architecture. The predictions of the model are validated by simulations performed using an industry standard software tool. The intrinsic conversion efficiency of the proposed design is improved by 6 dB over the alternative functionally equivalent circuit based on dual parallel Mach-Zehnder modulators known in the prior art. A two-tone analysis is presented to study the linearity of the proposed circuit, and a comparison with the alternative is provided. The proposed circuit is suitable for integration in any platform that offers linear electro-optic phase modulation such as LiNbO3, silicon, III-V, or hybrid technology.
FastaValidator: an open-source Java library to parse and validate FASTA formatted sequences.

PubMed

Waldmann, Jost; Gerken, Jan; Hankeln, Wolfgang; Schweer, Timmy; Glöckner, Frank Oliver

2014-06-14

Advances in sequencing technologies challenge the efficient importing and validation of FASTA formatted sequence data, which is still a prerequisite for most bioinformatic tools and pipelines. Comparative analysis of commonly used Bio*-frameworks (BioPerl, BioJava and Biopython) shows that their scalability and accuracy are hampered. FastaValidator represents a platform-independent, standardized, light-weight software library written in the Java programming language. It targets computer scientists and bioinformaticians writing software which needs to parse large amounts of sequence data quickly and accurately. For end-users FastaValidator includes an interactive out-of-the-box validation of FASTA formatted files, as well as a non-interactive mode designed for high-throughput validation in software pipelines. The accuracy and performance of the FastaValidator library qualify it for large data sets such as those commonly produced by massively parallel (NGS) technologies. It offers scientists a fast, accurate and standardized method for parsing and validating FASTA formatted sequence data.
A Secure Web Application Providing Public Access to High-Performance Data Intensive Scientific Resources - ScalaBLAST Web Application

DOE Office of Scientific and Technical Information (OSTI.GOV)

Curtis, Darren S.; Peterson, Elena S.; Oehmen, Chris S.

2008-05-04

This work presents the ScalaBLAST Web Application (SWA), a web based application implemented using the PHP script language, MySQL DBMS, and Apache web server under a GNU/Linux platform. SWA is an application built as part of the Data Intensive Computer for Complex Biological Systems (DICCBS) project at the Pacific Northwest National Laboratory (PNNL). SWA delivers accelerated throughput of bioinformatics analysis via high-performance computing through a convenient, easy-to-use web interface. This approach greatly enhances emerging fields of study in biology such as ontology-based homology, and multiple whole genome comparisons which, in the absence of a tool like SWA, require a heroic effort to overcome the computational bottleneck associated with genome analysis. The current version of SWA includes a user account management system, a web based user interface, and a backend process that generates the files necessary for the Internet scientific community to submit a ScalaBLAST parallel processing job on a dedicated cluster.
Cloud Computing for Protein-Ligand Binding Site Comparison

PubMed Central

2013-01-01

The proteome-wide analysis of protein-ligand binding sites and their interactions with ligands is important in structure-based drug design and in understanding ligand cross reactivity and toxicity. The well-known and commonly used software, SMAP, has been designed for 3D ligand binding site comparison and similarity searching of a structural proteome. SMAP can also predict drug side effects and reassign existing drugs to new indications. However, the computing scale of SMAP is limited. We have developed a high availability, high performance system that expands the comparison scale of SMAP. This cloud computing service, called Cloud-PLBS, combines the SMAP and Hadoop frameworks and is deployed on a virtual cloud computing platform. To handle the vast amount of experimental data on protein-ligand binding site pairs, Cloud-PLBS exploits the MapReduce paradigm as a management and parallelizing tool. Cloud-PLBS provides a web portal and scalability through which biologists can address a wide range of computer-intensive questions in biology and drug discovery. PMID:23762824
Cloud computing for protein-ligand binding site comparison.

PubMed

Hung, Che-Lun; Hua, Guan-Jie

2013-01-01

The proteome-wide analysis of protein-ligand binding sites and their interactions with ligands is important in structure-based drug design and in understanding ligand cross reactivity and toxicity. The well-known and commonly used software, SMAP, has been designed for 3D ligand binding site comparison and similarity searching of a structural proteome. SMAP can also predict drug side effects and reassign existing drugs to new indications. However, the computing scale of SMAP is limited. We have developed a high availability, high performance system that expands the comparison scale of SMAP. This cloud computing service, called Cloud-PLBS, combines the SMAP and Hadoop frameworks and is deployed on a virtual cloud computing platform. To handle the vast amount of experimental data on protein-ligand binding site pairs, Cloud-PLBS exploits the MapReduce paradigm as a management and parallelizing tool. Cloud-PLBS provides a web portal and scalability through which biologists can address a wide range of computer-intensive questions in biology and drug discovery.
PAUSE: A Patient-Centric Tool to Support Patient-Provider Engagement on Menopause

PubMed Central

Ashkenazy, Rebecca; Peterson, Mary Elizabeth

2018-01-01

There are powerful demographic, political, and environmental trends shaping women’s health. Increases in life expectancy, literacy, and empowerment are fueling expansions in education and advocacy. Research and development focuses on women’s health and fertility across an expanded age spectrum. There is also a cultural emphasis on antiaging and aesthetics. In parallel, the digital revolution is changing how health care is accessed by and delivered to women. A woman’s journey through menopause is at the crossroads of these transformations. Medical and social platforms encourage women to embrace menopause as a pivotal life stage. Yet, many women are reticent to discuss “the transition” due to embarrassment about its symptoms, lack of awareness of its physical manifestations, or fear of aging. We introduce a patient-centric framework to support patient-provider engagement on menopause: prevention, anxiety, urogenital symptoms, vasomotor symptoms, and education. Although not comprehensive, PAUSE represents an acronym and reminder to focus a portion of the medical interaction on menopause. PMID:29467590
Component-based integration of chemistry and optimization software.

PubMed

Kenny, Joseph P; Benson, Steven J; Alexeev, Yuri; Sarich, Jason; Janssen, Curtis L; McInnes, Lois Curfman; Krishnan, Manojkumar; Nieplocha, Jarek; Jurrus, Elizabeth; Fahlstrom, Carl; Windus, Theresa L

2004-11-15

Typical scientific software designs make rigid assumptions regarding programming language and data structures, frustrating software interoperability and scientific collaboration. Component-based software engineering is an emerging approach to managing the increasing complexity of scientific software. Component technology facilitates code interoperability and reuse. Through the adoption of methodology and tools developed by the Common Component Architecture Forum, we have developed a component architecture for molecular structure optimization. Using the NWChem and Massively Parallel Quantum Chemistry packages, we have produced chemistry components that provide capacity for energy and energy derivative evaluation. We have constructed geometry optimization applications by integrating the Toolkit for Advanced Optimization, Portable Extensible Toolkit for Scientific Computation, and Global Arrays packages, which provide optimization and linear algebra capabilities. We present a brief overview of the component development process and a description of abstract interfaces for chemical optimizations. The components conforming to these abstract interfaces allow the construction of applications using different chemistry and mathematics packages interchangeably. Initial numerical results for the component software demonstrate good performance, and highlight potential research enabled by this platform.
Advances in the Study of Heart Development and Disease Using Zebrafish

PubMed Central

Brown, Daniel R.; Samsa, Leigh Ann; Qian, Li; Liu, Jiandong

2016-01-01

Animal models of cardiovascular disease are key players in the translational medicine pipeline used to define the conserved genetic and molecular basis of disease. Congenital heart diseases (CHDs) are the most common type of human birth defect and feature structural abnormalities that arise during cardiac development and maturation. The zebrafish, Danio rerio, is a valuable vertebrate model organism, offering advantages over traditional mammalian models. These advantages include the rapid, stereotyped and external development of transparent embryos produced in large numbers from inexpensively housed adults, vast capacity for genetic manipulation, and amenability to high-throughput screening. With the help of modern genetics and a sequenced genome, zebrafish have led to insights in cardiovascular diseases ranging from CHDs to arrhythmia and cardiomyopathy. Here, we discuss the utility of zebrafish as a model system and summarize zebrafish cardiac morphogenesis with emphasis on parallels to human heart diseases. Additionally, we discuss the specific tools and experimental platforms utilized in the zebrafish model including forward screens, functional characterization of candidate genes, and high throughput applications. PMID:27335817

Cluster analysis of accelerated molecular dynamics simulations: A case study of the decahedron to icosahedron transition in Pt nanoparticles.

PubMed

Huang, Rao; Lo, Li-Ta; Wen, Yuhua; Voter, Arthur F; Perez, Danny

2017-10-21

Modern molecular-dynamics-based techniques are extremely powerful to investigate the dynamical evolution of materials. With the increase in sophistication of the simulation techniques and the ubiquity of massively parallel computing platforms, atomistic simulations now generate very large amounts of data, which have to be carefully analyzed in order to reveal key features of the underlying trajectories, including the nature and characteristics of the relevant reaction pathways. We show that clustering algorithms, such as the Perron Cluster Cluster Analysis, can provide reduced representations that greatly facilitate the interpretation of complex trajectories. To illustrate this point, clustering tools are used to identify the key kinetic steps in complex accelerated molecular dynamics trajectories exhibiting shape fluctuations in Pt nanoclusters. This analysis provides an easily interpretable coarse representation of the reaction pathways in terms of a handful of clusters, in contrast to the raw trajectory that contains thousands of unique states and tens of thousands of transitions.
Cluster analysis of accelerated molecular dynamics simulations: A case study of the decahedron to icosahedron transition in Pt nanoparticles

NASA Astrophysics Data System (ADS)

Huang, Rao; Lo, Li-Ta; Wen, Yuhua; Voter, Arthur F.; Perez, Danny

2017-10-01

Modern molecular-dynamics-based techniques are extremely powerful to investigate the dynamical evolution of materials. With the increase in sophistication of the simulation techniques and the ubiquity of massively parallel computing platforms, atomistic simulations now generate very large amounts of data, which have to be carefully analyzed in order to reveal key features of the underlying trajectories, including the nature and characteristics of the relevant reaction pathways. We show that clustering algorithms, such as the Perron Cluster Cluster Analysis, can provide reduced representations that greatly facilitate the interpretation of complex trajectories. To illustrate this point, clustering tools are used to identify the key kinetic steps in complex accelerated molecular dynamics trajectories exhibiting shape fluctuations in Pt nanoclusters. This analysis provides an easily interpretable coarse representation of the reaction pathways in terms of a handful of clusters, in contrast to the raw trajectory that contains thousands of unique states and tens of thousands of transitions.
A challenge for theranostics: is the optimal particle for therapy also optimal for diagnostics?

NASA Astrophysics Data System (ADS)

Dreifuss, Tamar; Betzer, Oshra; Shilo, Malka; Popovtzer, Aron; Motiei, Menachem; Popovtzer, Rachela

2015-09-01

Theranostics is defined as the combination of therapeutic and diagnostic capabilities in the same agent. Nanotechnology is emerging as an efficient platform for theranostics, since nanoparticle-based contrast agents are powerful tools for enhancing in vivo imaging, while therapeutic nanoparticles may overcome several limitations of conventional drug delivery systems. Theranostic nanoparticles have drawn particular interest in cancer treatment, as they offer significant advantages over both common imaging contrast agents and chemotherapeutic drugs. However, the development of platforms for theranostic applications raises critical questions; is the optimal particle for therapy also the optimal particle for diagnostics? Are the specific characteristics needed to optimize diagnostic imaging parallel to those required for treatment applications? This issue is examined in the present study, by investigating the effect of the gold nanoparticle (GNP) size on tumor uptake and tumor imaging. A series of anti-epidermal growth factor receptor conjugated GNPs of different sizes (diameter range: 20-120 nm) was synthesized, and then their uptake by human squamous cell carcinoma head and neck cancer cells, in vitro and in vivo, as well as their tumor visualization capabilities were evaluated using CT. The results showed that the size of the nanoparticle plays an instrumental role in determining its potential activity in vivo. Interestingly, we found that although the highest tumor uptake was obtained with 20 nm C225-GNPs, the highest contrast enhancement in the tumor was obtained with 50 nm C225-GNPs, thus leading to the conclusion that the optimal particle size for drug delivery is not necessarily optimal for imaging. These findings stress the importance of the investigation and design of optimal nanoparticles for theranostic applications. Electronic supplementary information (ESI) available. See DOI: 10.1039/c5nr03119b
Methodologies and Tools for Tuning Parallel Programs: 80% Art, 20% Science, and 10% Luck

NASA Technical Reports Server (NTRS)

Yan, Jerry C.; Bailey, David (Technical Monitor)

1996-01-01

The need for computing power has forced a migration from serial computation on a single processor to parallel processing on multiprocessors. However, without effective means to monitor (and analyze) program execution, tuning the performance of parallel programs becomes exponentially difficult as program complexity and machine size increase. In the past few years, the ubiquitous introduction of performance tuning tools from various supercomputer vendors (Intel's ParAide, TMC's PRISM, CRI's Apprentice, and Convex's CXtrace) seems to indicate the maturity of performance instrumentation/monitor/tuning technologies and vendors'/customers' recognition of their importance. However, a few important questions remain: What kind of performance bottlenecks can these tools detect (or correct)? How time consuming is the performance tuning process? What are some important technical issues that remain to be tackled in this area? This workshop reviews the fundamental concepts involved in analyzing and improving the performance of parallel and heterogeneous message-passing programs. Several alternative strategies will be contrasted, and for each we will describe how currently available tuning tools (e.g. AIMS, ParAide, PRISM, Apprentice, CXtrace, ATExpert, Pablo, IPS-2) can be used to facilitate the process. We will characterize the effectiveness of the tools and methodologies based on actual user experiences at NASA Ames Research Center. Finally, we will discuss their limitations and outline recent approaches taken by vendors and the research community to address them.
Novel calibration tools and validation concepts for microarray-based platforms used in molecular diagnostics and food safety control.

PubMed

Brunner, C; Hoffmann, K; Thiele, T; Schedler, U; Jehle, H; Resch-Genger, U

2015-04-01

Commercial platforms consisting of ready-to-use microarrays printed with target-specific DNA probes, a microarray scanner, and software for data analysis are available for different applications in medical diagnostics and food analysis, detecting, e.g., viral and bacteriological DNA sequences. The transfer of these tools from basic research to routine analysis, their broad acceptance in regulated areas, and their use in medical practice require suitable calibration tools for regular control of instrument performance in addition to internal assay controls. Here, we present the development of a novel assay-adapted calibration slide for a commercialized DNA-based assay platform, consisting of precisely arranged fluorescent areas of various intensities obtained by incorporating different concentrations of a "green" dye and a "red" dye in a polymer matrix. These dyes are "Cy3" and "Cy5" analogues with improved photostability, chosen because their spectroscopic properties closely match those of common labels for the green and red channels of microarray scanners. This simple tool allows the performance of the microarray scanner provided with the biochip platform to be assessed and controlled efficiently and regularly, and different scanners to be compared. It will eventually be used as a fluorescence intensity scale for referencing assay results and to enhance the overall comparability of diagnostic tests.
C to VHDL compiler

NASA Astrophysics Data System (ADS)

Berdychowski, Piotr P.; Zabolotny, Wojciech M.

2010-09-01

The main goal of the C to VHDL compiler project is to make the FPGA platform more accessible to scientists and software developers. The FPGA platform offers the unique ability to configure the hardware to implement virtually any dedicated architecture, and modern devices provide a sufficient number of hardware resources to implement parallel execution platforms with complex processing units. All this makes the FPGA platform very attractive for those looking for an efficient, heterogeneous computing environment. The current industry standard in the development of digital systems on the FPGA platform is based on HDLs. Although very effective and expressive in the hands of hardware development specialists, these languages require specific knowledge and experience that most scientists and software programmers lack. The C to VHDL compiler project attempts to remedy that by creating an application that derives an initial VHDL description of a digital system (for further compilation and synthesis) from a purely algorithmic description in the C programming language. The idea itself is not new, and the C to VHDL compiler combines the best approaches from existing solutions developed over many years with some new, unique improvements.
Design and Analysis of a Compact Precision Positioning Platform Integrating Strain Gauges and the Piezoactuator

PubMed Central

Huang, Hu; Zhao, Hongwei; Yang, Zhaojun; Fan, Zunqiang; Wan, Shunguang; Shi, Chengli; Ma, Zhichao

2012-01-01

Miniaturized precision positioning platforms are needed for in situ nanomechanical test applications. This paper proposes a compact precision positioning platform integrating strain gauges and the piezoactuator. Effects of geometric parameters of two parallel plates on Von Mises stress distribution as well as static and dynamic characteristics of the platform were studied by the finite element method. Results of the calibration experiment indicate that the strain gauge sensor has good linearity and its sensitivity is about 0.0468 mV/μm. A closed-loop control system was established to solve the problem of nonlinearity of the platform. Experimental results demonstrate that for the displacement control process, both the displacement increasing portion and the decreasing portion have good linearity, verifying that the control system is available. The developed platform has a compact structure but can realize displacement measurement with the embedded strain gauges, which is useful for the closed-loop control and structure miniaturization of piezo devices. It has potential applications in nanoindentation and nanoscratch tests, especially in the field of in situ nanomechanical testing which requires compact structures. PMID:23012566
Development of embedded real-time and high-speed vision platform

NASA Astrophysics Data System (ADS)

Ouyang, Zhenxing; Dong, Yimin; Yang, Hua

2015-12-01

Currently, high-speed vision platforms are widely used in many applications, such as robotics and the automation industry. However, traditional high-speed vision platforms rely on a personal computer (PC) for human-computer interaction, and its large size makes it unsuitable for compact systems. Therefore, this paper develops an embedded real-time and high-speed vision platform, ER-HVP Vision, which is able to work entirely without a PC. In this new platform, an embedded CPU-based board is designed as a substitute for the PC, and a DSP and FPGA board is developed for implementing parallel image algorithms in the FPGA and sequential image algorithms in the DSP. Hence, ER-HVP Vision, with a size of 320mm x 250mm x 87mm, offers this capability in a more compact form. Experimental results are also given to indicate that real-time detection and counting of a moving target at a frame rate of 200 fps at 512 x 512 pixels are feasible on this newly developed vision platform.
GPURFSCREEN: a GPU based virtual screening tool using random forest classifier.

PubMed

Jayaraj, P B; Ajay, Mathias K; Nufail, M; Gopakumar, G; Jaleel, U C A

2016-01-01

In-silico methods are an integral part of the modern drug discovery paradigm. Virtual screening, an in-silico method, is used to refine data models and reduce the chemical space on which wet lab experiments need to be performed. Virtual screening of a ligand data model requires large scale computations, making it a highly time consuming task. This process can be sped up by implementing parallelized algorithms on a Graphical Processing Unit (GPU). Random Forest is a robust classification algorithm that can be employed in virtual screening. A ligand based virtual screening tool (GPURFSCREEN) that uses random forests on GPU systems has been proposed and evaluated in this paper. This tool produces optimized results at a lower execution time for large bioassay data sets. The quality of results produced by our tool on GPU is the same as that in a regular serial environment. Considering the magnitude of data to be screened, the parallelized virtual screening has a significantly lower running time at high throughput. The proposed parallel tool outperforms its serial counterpart by successfully screening billions of molecules in training and prediction phases.
Interactive 3D visualization for theoretical virtual observatories

NASA Astrophysics Data System (ADS)

Dykes, T.; Hassan, A.; Gheller, C.; Croton, D.; Krokos, M.

2018-06-01

Virtual observatories (VOs) are online hubs of scientific knowledge. They encompass a collection of platforms dedicated to the storage and dissemination of astronomical data, from simple data archives to e-research platforms offering advanced tools for data exploration and analysis. Whilst the more mature platforms within VOs primarily serve the observational community, there are also services fulfilling a similar role for theoretical data. Scientific visualization can be an effective tool for analysis and exploration of data sets made accessible through web platforms for theoretical data, which often contain spatial dimensions and properties inherently suitable for visualization via e.g. mock imaging in 2D or volume rendering in 3D. We analyse the current state of 3D visualization for big theoretical astronomical data sets through scientific web portals and virtual observatory services. We discuss some of the challenges for interactive 3D visualization and how it can augment the workflow of users in a virtual observatory context. Finally we showcase a lightweight client-server visualization tool for particle-based data sets, allowing quantitative visualization via data filtering, highlighting two example use cases within the Theoretical Astrophysical Observatory.
Robotics in endoscopy.

PubMed

Klibansky, David; Rothstein, Richard I

2012-09-01

The increasing complexity of intralumenal and emerging translumenal endoscopic procedures has created an opportunity to apply robotics in endoscopy. Computer-assisted or direct-drive robotic technology allows the triangulation of flexible tools through telemanipulation. The creation of new flexible operative platforms, along with other emerging technology such as nanobots and steerable capsules, can be transformational for endoscopic procedures. In this review, we cover some background information on the use of robotics in surgery and endoscopy, and review the emerging literature on platforms, capsules, and mini-robotic units. The development of techniques in advanced intralumenal endoscopy (endoscopic mucosal resection and endoscopic submucosal dissection) and translumenal endoscopic procedures (NOTES) has generated a number of novel platforms, flexible tools, and devices that can apply robotic principles to endoscopy. The development of a fully flexible endoscopic surgical toolkit will enable increasingly advanced procedures to be performed through natural orifices. The application of platforms and new flexible tools to the areas of advanced endoscopy and NOTES heralds the opportunity to employ useful robotic technology. Following the examples of the utility of robotics from the field of laparoscopic surgery, we can anticipate the emerging role of robotic technology in endoscopy.
SiGe BiCMOS manufacturing platform for mmWave applications

NASA Astrophysics Data System (ADS)

Kar-Roy, Arjun; Howard, David; Preisler, Edward; Racanelli, Marco; Chaudhry, Samir; Blaschke, Volker

2010-10-01

TowerJazz offers high-volume manufacturable commercial SiGe BiCMOS technology platforms to address the mmWave market. In this paper, first, the SiGe BiCMOS process technology platforms such as SBC18 and SBC13 are described. These manufacturing platforms integrate 200 GHz fT/fMAX SiGe NPN with deep trench isolation into 0.18 μm and 0.13 μm node CMOS processes along with high-density 5.6 fF/μm² stacked MIM capacitors, high-value polysilicon resistors, high-Q metal resistors, lateral PNP transistors, and triple well isolation using deep n-well for mixed-signal integration, and multiple varactors and compact high-Q inductors for RF needs. Second, design enablement tools that maximize performance and lower costs and time to market, such as scalable PSP and HICUM models, statistical and Xsigma models, reliability modeling tools, process control model tools, inductor toolbox and transmission line models, are described. Finally, demonstrations in silicon for mmWave applications in the areas of optical networking, mobile broadband, phased array radar, collision avoidance radar and W-band imaging are listed.
Developing an Interactive Data Visualization Tool to Assess the Impact of Decision Support on Clinical Operations.

PubMed

Huber, Timothy C; Krishnaraj, Arun; Monaghan, Dayna; Gaskin, Cree M

2018-05-18

Due to mandates from recent legislation, clinical decision support (CDS) software is being adopted by radiology practices across the country. This software provides imaging study decision support for referring providers at the point of order entry. CDS systems produce a large volume of data, providing opportunities for research and quality improvement. In order to better visualize and analyze trends in this data, an interactive data visualization dashboard was created using a commercially available data visualization platform. Following the integration of a commercially available clinical decision support product into the electronic health record, a dashboard was created using a commercially available data visualization platform (Tableau, Seattle, WA). Data generated by the CDS were exported from the data warehouse, where they were stored, into the platform. This allowed for real-time visualization of the data generated by the decision support software. The creation of the dashboard allowed the output from the CDS platform to be more easily analyzed and facilitated hypothesis generation. Integrating data visualization tools into clinical decision support tools allows for easier data analysis and can streamline research and quality improvement efforts.
Parallelizing Timed Petri Net simulations

NASA Technical Reports Server (NTRS)

Nicol, David M.

1993-01-01

The possibility of using parallel processing to accelerate the simulation of Timed Petri Nets (TPN's) was studied. It was recognized that complex system development tools often transform system descriptions into TPN's or TPN-like models, which are then simulated to obtain information about system behavior. Viewed this way, it was important that the parallelization of TPN's be as automatic as possible, to admit the possibility of the parallelization being embedded in the system design tool. Later years of the grant were devoted to examining the problem of joint performance and reliability analysis, to explore whether both types of analysis could be accomplished within a single framework. In this final report, the results of our studies are summarized. We believe that the problem of parallelizing TPN's automatically for MIMD architectures has been almost completely solved for a large and important class of problems. Our initial investigations into joint performance/reliability analysis are two-fold; it was shown that Monte Carlo simulation, with importance sampling, offers promise of joint analysis in the context of a single tool, and methods for the parallel simulation of general Continuous Time Markov Chains, a model framework within which joint performance/reliability models can be cast, were developed. However, very much more work is needed to determine the scope and generality of these approaches. The results obtained in our two studies, future directions for this type of work, and a list of publications are included.
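As an illustration of the Continuous Time Markov Chain framework mentioned in the abstract above, the following is a minimal sequential sketch, not the parallel method developed in the report. It samples exponential holding times and picks the next state in proportion to the transition rates. The function names, the dictionary layout of the rate table, and the two-state "up"/"down" reliability model are assumptions made for illustration only.

```python
import random


def simulate_ctmc(rates, start, t_end, rng):
    """Simulate one CTMC trajectory up to time t_end.

    rates maps each state to a dict {next_state: rate}. This layout is an
    assumption of this sketch. Returns a list of (entry_time, state) pairs.
    """
    t, state = 0.0, start
    path = [(t, state)]
    while True:
        out = rates.get(state, {})
        total = sum(out.values())
        if total <= 0.0:
            break  # absorbing state: no outgoing transitions
        t += rng.expovariate(total)  # exponential holding time
        if t >= t_end:
            break
        # Choose the next state with probability rate / total.
        threshold = rng.random() * total
        acc = 0.0
        for nxt, rate in out.items():
            acc += rate
            if threshold < acc:
                break
        state = nxt
        path.append((t, state))
    return path


def fraction_in_state(path, state, t_end):
    """Fraction of [0, t_end] that the trajectory spends in `state`."""
    total = 0.0
    for (t0, s), (t1, _) in zip(path, path[1:] + [(t_end, None)]):
        if s == state:
            total += t1 - t0
    return total / t_end


# Hypothetical two-state model: failure rate 1.0, repair rate 10.0.
example_rates = {"up": {"down": 1.0}, "down": {"up": 10.0}}
example_path = simulate_ctmc(example_rates, "up", 50.0, random.Random(0))
example_availability = fraction_in_state(example_path, "up", 50.0)
```

Averaging fraction_in_state over many independent trajectories gives a plain Monte Carlo estimate of availability; importance sampling, as discussed in the abstract, would additionally reweight trajectories and is not shown here.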
Using e-Learning Platforms for Mastery Learning in Developmental Mathematics Courses

ERIC Educational Resources Information Center

Boggs, Stacey; Shore, Mark; Shore, JoAnna

2004-01-01

Many colleges and universities have adopted e-learning platforms to utilize computers as an instructional tool in developmental (i.e., beginning and intermediate algebra) mathematics courses. An e-learning platform is a computer program used to enhance course instruction via computers and the Internet. Allegany College of Maryland is currently…
UMAMI: A Recipe for Generating Meaningful Metrics through Holistic I/O Performance Analysis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lockwood, Glenn K.; Yoo, Wucherl; Byna, Suren

I/O efficiency is essential to productivity in scientific computing, especially as many scientific domains become more data-intensive. Many characterization tools have been used to elucidate specific aspects of parallel I/O performance, but analyzing components of complex I/O subsystems in isolation fails to provide insight into critical questions: how do the I/O components interact, what are reasonable expectations for application performance, and what are the underlying causes of I/O performance problems? To address these questions while capitalizing on existing component-level characterization tools, we propose an approach that combines on-demand, modular synthesis of I/O characterization data into a unified monitoring and metrics interface (UMAMI) to provide a normalized, holistic view of I/O behavior. We evaluate the feasibility of this approach by applying it to a month-long benchmarking study on two distinct large-scale computing platforms. We present three case studies that highlight the importance of analyzing application I/O performance in context with both contemporaneous and historical component metrics, and we provide new insights into the factors affecting I/O performance. By demonstrating the generality of our approach, we lay the groundwork for a production-grade framework for holistic I/O analysis.
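One simple way to view a measurement "in context with historical component metrics" is to express each current value as a z-score against its own history. The sketch below is only an illustration of that idea and is not the UMAMI implementation; the metric names, the sample values, and the choice of z-scores as the normalization are all assumptions.

```python
from statistics import mean, stdev


def normalize_against_history(history, current):
    """Return a z-score per metric: (current - historical mean) / historical stdev.

    history maps metric name -> list of past values (at least two per metric);
    current maps metric name -> latest value. Both layouts are assumptions.
    """
    scores = {}
    for name, value in current.items():
        past = history[name]
        mu, sigma = mean(past), stdev(past)
        # A constant history has no spread, so report 0.0 rather than divide by zero.
        scores[name] = 0.0 if sigma == 0 else (value - mu) / sigma
    return scores


# Hypothetical metrics from different components of an I/O subsystem.
example_history = {
    "app_bandwidth_gibs": [10.0, 12.0, 11.0, 9.0, 13.0],
    "server_cpu_load": [0.30, 0.35, 0.32, 0.31, 0.33],
}
example_current = {"app_bandwidth_gibs": 6.0, "server_cpu_load": 0.60}
example_scores = normalize_against_history(example_history, example_current)
```

Placing all metrics on one unitless scale makes it easier to see, for example, that a drop in application bandwidth coincides with an unusually high server load.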
Organ-On-A-Chip Platforms: A Convergence of Advanced Materials, Cells, and Microscale Technologies.

PubMed

Ahadian, Samad; Civitarese, Robert; Bannerman, Dawn; Mohammadi, Mohammad Hossein; Lu, Rick; Wang, Erika; Davenport-Huyer, Locke; Lai, Ben; Zhang, Boyang; Zhao, Yimu; Mandla, Serena; Korolj, Anastasia; Radisic, Milica

2018-01-01

Significant advances in biomaterials, stem cell biology, and microscale technologies have enabled the fabrication of biologically relevant tissues and organs. Such tissues and organs, referred to as organ-on-a-chip (OOC) platforms, have emerged as a powerful tool in tissue analysis and disease modeling for biological and pharmacological applications. A variety of biomaterials are used in tissue fabrication providing multiple biological, structural, and mechanical cues in the regulation of cell behavior and tissue morphogenesis. Cells derived from humans enable the fabrication of personalized OOC platforms. Microscale technologies are specifically helpful in providing physiological microenvironments for tissues and organs. In this review, biomaterials, cells, and microscale technologies are described as essential components to construct OOC platforms. The latest developments in OOC platforms (e.g., liver, skeletal muscle, cardiac, cancer, lung, skin, bone, and brain) are then discussed as functional tools in simulating human physiology and metabolism. Future perspectives and major challenges in the development of OOC platforms toward accelerating clinical studies of drug discovery are finally highlighted. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
WDM mid-board optics for chip-to-chip wavelength routing interconnects in the H2020 ICT-STREAMS

NASA Astrophysics Data System (ADS)

Kanellos, G. T.; Pleros, N.

2017-02-01

Multi-socket server boards have emerged to increase the processing power density at the board level and further flatten data center networks beyond leaf-spine architectures. However, scaling the number of processors per board challenges current electronic technologies, as it requires high-bandwidth interconnects and high-throughput switches with an increased number of ports that are currently unavailable. On-board optical interconnection has shown the potential to efficiently satisfy the bandwidth needs, but its use has been limited to parallel links without any smart routing functionality. With CWDM optical interconnects already a commodity, cyclical wavelength routing, proposed for rack-to-rack and board-to-board datacom communication, now becomes a promising on-board routing platform. ICT-STREAMS is a European research project that aims to combine WDM parallel on-board transceivers with a cyclical AWGR, in order to create a new board-level, chip-to-chip interconnection paradigm that will leverage WDM parallel transmission to form a powerful wavelength routing platform capable of interconnecting multiple processors with unprecedented bandwidth and throughput capacity. Direct, any-to-any, on-board interconnection of multiple processors will significantly contribute to further flattening data centers and facilitating east-west communication. In the present communication, we present the ICT-STREAMS on-board wavelength routing architecture for multiple chip-to-chip interconnections and evaluate the overall system performance in terms of throughput and latency for several schemes and traffic profiles. We also review recent advances in the key enabling technologies of the ICT-STREAMS platform, which span from Si in-plane lasers and polymer-based electro-optical circuit boards to silicon photonics transceivers and photonic-crystal amplifiers.
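The cyclical wavelength routing idea in the abstract above can be sketched with the commonly used idealized model of an N x N cyclic AWGR, in which a signal entering input port i on wavelength index w leaves at output port (i + w) mod N. The indexing convention and all function names below are assumptions for illustration and are not taken from the ICT-STREAMS design.

```python
def awgr_output_port(input_port, wavelength, n_ports):
    """Idealized cyclic AWGR: output port = (input port + wavelength index) mod N."""
    if not 0 <= input_port < n_ports:
        raise ValueError("input_port out of range")
    return (input_port + wavelength) % n_ports


def wavelength_for(src, dst, n_ports):
    """Wavelength index a source must transmit on to reach a destination port."""
    return (dst - src) % n_ports


def routing_table(n_ports):
    """table[src][dst] = wavelength index that connects src to dst."""
    return [[wavelength_for(s, d, n_ports) for d in range(n_ports)]
            for s in range(n_ports)]


# Hypothetical 4-port device: every source reaches every destination
# by selecting one of 4 wavelengths, with no active switching element.
example_table = routing_table(4)
```

In this model each destination receives a different wavelength from each source, which is why any-to-any connectivity is possible without contention at the router itself.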
Intra-Personal and Inter-Personal Kinetic Synergies During Jumping.

PubMed

Slomka, Kajetan; Juras, Grzegorz; Sobota, Grzegorz; Furmanek, Mariusz; Rzepko, Marian; Latash, Mark L

2015-12-22

We explored synergies between two legs and two subjects during preparation for a long jump into a target. Synergies were expected during one-person jumping. No such synergies were expected between two persons jumping in parallel without additional contact, while synergies were expected to emerge with haptic contact and become stronger with strong mechanical contact. Subjects performed jumps either alone (each foot standing on a separate force platform) or in dyads (parallel to each other, each person standing on a separate force platform) without any contact, with haptic contact, and with strong coupling. Strong negative correlations between pairs of force variables (strong synergies) were seen in the vertical force in one-person jumps and weaker synergies in two-person jumps with the strong contact. For other force variables, only weak synergies were present in one-person jumps and no negative correlations between pairs of force variables for two-person jumps. Pairs of moment variables from the two force platforms at steady state showed positive correlations, which were strong in one-person jumps and weaker, but still significant, in two-person jumps with the haptic and strong contact. Anticipatory synergy adjustments prior to action initiation were observed in one-person trials only. We interpret the different results for the force and moment variables at steady state as reflections of postural sway.
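The abstract above treats a strong negative correlation between a pair of force variables as a sign of synergy. A minimal sketch of that computation follows, using the Pearson correlation across trials; the function name and the force values are made up for illustration, and the study's actual analysis pipeline is not described in the abstract.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant sample")
    return sxy / sqrt(sxx * syy)


# Hypothetical vertical forces (N) under each foot across four trials.
# The two forces trade off so that their sum stays at 800 N.
left_force = [400.0, 420.0, 390.0, 410.0]
right_force = [400.0, 380.0, 410.0, 390.0]
r = pearson_r(left_force, right_force)
```

Here r is strongly negative because any increase under one foot is offset by a decrease under the other, which keeps the total force stable across trials.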
Intra-Personal and Inter-Personal Kinetic Synergies During Jumping

PubMed Central

Slomka, Kajetan; Juras, Grzegorz; Sobota, Grzegorz; Furmanek, Mariusz; Rzepko, Marian; Latash, Mark L.

2015-01-01

We explored synergies between two legs and two subjects during preparation for a long jump into a target. Synergies were expected during one-person jumping. No such synergies were expected between two persons jumping in parallel without additional contact, while synergies were expected to emerge with haptic contact and become stronger with strong mechanical contact. Subjects performed jumps either alone (each foot standing on a separate force platform) or in dyads (parallel to each other, each person standing on a separate force platform) without any contact, with haptic contact, and with strong coupling. Strong negative correlations between pairs of force variables (strong synergies) were seen in the vertical force in one-person jumps and weaker synergies in two-person jumps with the strong contact. For other force variables, only weak synergies were present in one-person jumps and no negative correlations between pairs of force variables for two-person jumps. Pairs of moment variables from the two force platforms at steady state showed positive correlations, which were strong in one-person jumps and weaker, but still significant, in two-person jumps with the haptic and strong contact. Anticipatory synergy adjustments prior to action initiation were observed in one-person trials only. We interpret the different results for the force and moment variables at steady state as reflections of postural sway. PMID:26839608
