commodity graphics accelerators: Topics by Science.gov

Sample records for commodity graphics accelerators

Evaluation of accelerated iterative x-ray CT image reconstruction using floating point graphics hardware.

PubMed

Kole, J S; Beekman, F J

2006-02-21

Statistical reconstruction methods offer possibilities to improve image quality as compared with analytical methods, but current reconstruction times prohibit routine application in clinical and micro-CT. In particular, for cone-beam x-ray CT, the use of graphics hardware has been proposed to accelerate the forward and back-projection operations, in order to reduce reconstruction times. In the past, wide application of this texture hardware mapping approach was hampered owing to limited intrinsic accuracy. Recently, however, floating point precision has become available in the latest generation commodity graphics cards. In this paper, we utilize this feature to construct a graphics hardware accelerated version of the ordered subset convex reconstruction algorithm. The aims of this paper are (i) to study the impact of using graphics hardware acceleration for statistical reconstruction on the reconstructed image accuracy and (ii) to measure the speed increase one can obtain by using graphics hardware acceleration. We compare the unaccelerated algorithm with the graphics hardware accelerated version, and for the latter we consider two different interpolation techniques. A simulation study of a micro-CT scanner with a mathematical phantom shows that at almost preserved reconstructed image accuracy, speed-ups of a factor 40 to 222 can be achieved, compared with the unaccelerated algorithm, and depending on the phantom and detector sizes. Reconstruction from physical phantom data reconfirms the usability of the accelerated algorithm for practical cases.
GPU acceleration for digitally reconstructed radiographs using bindless texture objects and CUDA/OpenGL interoperability.

PubMed

Abdellah, Marwan; Eldeib, Ayman; Owis, Mohamed I

2015-01-01

This paper features an advanced implementation of the X-ray rendering algorithm that harnesses the giant computing power of the current commodity graphics processors to accelerate the generation of high resolution digitally reconstructed radiographs (DRRs). The presented pipeline exploits the latest features of NVIDIA Graphics Processing Unit (GPU) architectures, mainly bindless texture objects and dynamic parallelism. The rendering throughput is substantially improved by exploiting the interoperability mechanisms between CUDA and OpenGL. The benchmarks of our optimized rendering pipeline reflect its capability of generating DRRs with resolutions of 2048² and 4096² at interactive and semi-interactive frame rates using an NVIDIA GeForce 970 GTX device.
Accelerated numerical processing of electronically recorded holograms with reduced speckle noise.

PubMed

Trujillo, Carlos; Garcia-Sucerquia, Jorge

2013-09-01

The numerical reconstruction of digitally recorded holograms suffers from speckle noise. An accelerated method that uses general-purpose computing in graphics processing units to reduce that noise is shown. The proposed methodology utilizes parallelized algorithms to record, reconstruct, and superimpose multiple uncorrelated holograms of a static scene. For the best tradeoff between reduction of the speckle noise and processing time, the method records, reconstructs, and superimposes six holograms of 1024 × 1024 pixels in 68 ms; for this case, the methodology reduces the speckle noise by 58% compared with that exhibited by a single hologram. The fully parallelized method running on a commodity graphics processing unit is one order of magnitude faster than the same technique implemented on a regular CPU using its multithreading capabilities. Experimental results are shown to validate the proposal.
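
The figure quoted in the abstract above can be checked against a simple idealized model. The sketch below assumes fully developed, statistically independent speckle patterns of equal mean intensity, for which the contrast of the averaged image falls as 1/sqrt(N). The function name is hypothetical and the model is a textbook idealization, not the measurement procedure used in the paper.

```python
import math


def expected_speckle_reduction(n_holograms):
    """Fractional drop in speckle contrast after averaging n uncorrelated patterns.

    Assumption: fully developed, independent speckle of equal mean intensity,
    so contrast scales as 1/sqrt(n). Not the paper's measurement method.
    """
    if n_holograms < 1:
        raise ValueError("need at least one hologram")
    return 1.0 - 1.0 / math.sqrt(n_holograms)
```

For six holograms this idealized model gives a reduction of about 59%, close to the 58% reported in the abstract.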
Development and Evaluation of Stereographic Display for Lung Cancer Screening

DTIC Science & Technology

2008-12-01

burden. Application of GPUs – With the evolution of commodity graphics processing units (GPUs) for accelerating games on personal computers, over the...units, which are designed for rendering computer games, are readily available and can be programmed to perform the kinds of real-time calculations...575-581, 1994. 12. Anderson CM, Saloner D, Tsuruda JS, Shapeero LG, Lee RE. "Artifacts in maximum-intensity-projection display of MR angiograms
Chromium: A Stream-Processing Framework for Interactive Rendering on Clusters

DOE Office of Scientific and Technical Information (OSTI.GOV)

Humphreys, G.; Houston, M.; Ng, Y.-R.

2002-01-11

We describe Chromium, a system for manipulating streams of graphics API commands on clusters of workstations. Chromium's stream filters can be arranged to create sort-first and sort-last parallel graphics architectures that, in many cases, support the same applications while using only commodity graphics accelerators. In addition, these stream filters can be extended programmatically, allowing the user to customize the stream transformations performed by nodes in a cluster. Because our stream processing mechanism is completely general, any cluster-parallel rendering algorithm can be either implemented on top of or embedded in Chromium. In this paper, we give examples of real-world applications that use Chromium to achieve good scalability on clusters of workstations, and describe other potential uses of this stream processing technology. By completely abstracting the underlying graphics architecture, network topology, and API command processing semantics, we allow a variety of applications to run in different environments.
CUDA compatible GPU cards as efficient hardware accelerators for Smith-Waterman sequence alignment

PubMed Central

Manavski, Svetlin A; Valle, Giorgio

2008-01-01

Background Searching for similarities in protein and DNA databases has become a routine procedure in Molecular Biology. The Smith-Waterman algorithm has been available for more than 25 years. It is based on a dynamic programming approach that explores all the possible alignments between two sequences; as a result it returns the optimal local alignment. Unfortunately, the computational cost is very high, requiring a number of operations proportional to the product of the length of two sequences. Furthermore, the exponential growth of protein and DNA databases makes the Smith-Waterman algorithm unrealistic for searching similarities in large sets of sequences. For these reasons heuristic approaches such as those implemented in FASTA and BLAST tend to be preferred, allowing faster execution times at the cost of reduced sensitivity. The main motivation of our work is to exploit the huge computational power of commonly available graphic cards, to develop high performance solutions for sequence alignment. Results In this paper we present what we believe is the fastest solution of the exact Smith-Waterman algorithm running on commodity hardware. It is implemented in the recently released CUDA programming environment by NVidia. CUDA allows direct access to the hardware primitives of the last-generation Graphics Processing Units (GPU) G80. Speeds of more than 3.5 GCUPS (Giga Cell Updates Per Second) are achieved on a workstation running two GeForce 8800 GTX. Exhaustive tests have been done to compare our implementation to SSEARCH and BLAST, running on a 3 GHz Intel Pentium IV processor. Our solution was also compared to a recently published GPU implementation and to a Single Instruction Multiple Data (SIMD) solution. These tests show that our implementation performs from 2 to 30 times faster than any other previous attempt available on commodity hardware. 
Conclusions The results show that graphic cards are now sufficiently advanced to be used as efficient hardware accelerators for sequence alignment. Their performance is better than any alternative available on commodity hardware platforms. The solution presented in this paper allows large scale alignments to be performed at low cost, using the exact Smith-Waterman algorithm instead of the largely adopted heuristic approaches. PMID:18387198
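
To illustrate the dynamic programming approach described in the abstract above, here is a minimal Python sketch of Smith-Waterman local alignment scoring. The linear gap penalty and the match/mismatch values are assumptions chosen for brevity, and the function name is hypothetical; the paper's CUDA implementation and its scoring scheme are not reproduced here. The nested loops show why the cost is proportional to the product of the two sequence lengths.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between sequences a and b.

    Scoring parameters are illustrative assumptions, not those of the paper.
    """
    # prev holds the previous row of the DP matrix; two rows suffice for the score.
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Extend the alignment diagonally (match or mismatch) ...
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # ... or open a gap in one of the two sequences.
            up = prev[j] + gap
            left = curr[j - 1] + gap
            # Clamping at zero is what makes the alignment local.
            curr[j] = max(0, diag, up, left)
            best = max(best, curr[j])
        prev = curr
    return best
```

Each cell depends only on its left, upper, and upper-left neighbours, so cells along an anti-diagonal are independent of one another; that independence is one common way such a recurrence can be parallelized, though the abstract does not say which strategy the authors used.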
Real-time lens distortion correction: speed, accuracy and efficiency

NASA Astrophysics Data System (ADS)

Bax, Michael R.; Shahidi, Ramin

2014-11-01

Optical lens systems suffer from nonlinear geometrical distortion. Optical imaging applications such as image-enhanced endoscopy and image-based bronchoscope tracking require correction of this distortion for accurate localization, tracking, registration, and measurement of image features. Real-time capability is desirable for interactive systems and live video. The use of a texture-mapping graphics accelerator, which is standard hardware on current motherboard chipsets and add-in video graphics cards, to perform distortion correction is proposed. Mesh generation for image tessellation, an error analysis, and performance results are presented. It is shown that distortion correction using commodity graphics hardware is substantially faster than using the main processor and can be performed at video frame rates (faster than 30 frames per second), and that the polar-based method of mesh generation proposed here is more accurate than a conventional grid-based approach. Using graphics hardware to perform distortion correction is not only fast and accurate but also efficient as it frees the main processor for other tasks, which is an important issue in some real-time applications.
GPU-Accelerated Molecular Modeling Coming Of Age

PubMed Central

Stone, John E.; Hardy, David J.; Ufimtsev, Ivan S.

2010-01-01

Graphics processing units (GPUs) have traditionally been used in molecular modeling solely for visualization of molecular structures and animation of trajectories resulting from molecular dynamics simulations. Modern GPUs have evolved into fully programmable, massively parallel co-processors that can now be exploited to accelerate many scientific computations, typically providing about one order of magnitude speedup over CPU code and in special cases providing speedups of two orders of magnitude. This paper surveys the development of molecular modeling algorithms that leverage GPU computing, the advances already made and remaining issues to be resolved, and the continuing evolution of GPU technology that promises to become even more useful to molecular modeling. Hardware acceleration with commodity GPUs is expected to benefit the overall computational biology community by bringing teraflops performance to desktop workstations and in some cases potentially changing what were formerly batch-mode computational jobs into interactive tasks. PMID:20675161
GPU-accelerated molecular modeling coming of age.

PubMed

Stone, John E; Hardy, David J; Ufimtsev, Ivan S; Schulten, Klaus

2010-09-01

Graphics processing units (GPUs) have traditionally been used in molecular modeling solely for visualization of molecular structures and animation of trajectories resulting from molecular dynamics simulations. Modern GPUs have evolved into fully programmable, massively parallel co-processors that can now be exploited to accelerate many scientific computations, typically providing about one order of magnitude speedup over CPU code and in special cases providing speedups of two orders of magnitude. This paper surveys the development of molecular modeling algorithms that leverage GPU computing, the advances already made and remaining issues to be resolved, and the continuing evolution of GPU technology that promises to become even more useful to molecular modeling. Hardware acceleration with commodity GPUs is expected to benefit the overall computational biology community by bringing teraflops performance to desktop workstations and in some cases potentially changing what were formerly batch-mode computational jobs into interactive tasks. (c) 2010 Elsevier Inc. All rights reserved.
High Performance Molecular Visualization: In-Situ and Parallel Rendering with EGL.

PubMed

Stone, John E; Messmer, Peter; Sisneros, Robert; Schulten, Klaus

2016-05-01

Large scale molecular dynamics simulations produce terabytes of data that is impractical to transfer to remote facilities. It is therefore necessary to perform visualization tasks in-situ as the data are generated, or by running interactive remote visualization sessions and batch analyses co-located with direct access to high performance storage systems. A significant challenge for deploying visualization software within clouds, clusters, and supercomputers involves the operating system software required to initialize and manage graphics acceleration hardware. Recently, it has become possible for applications to use the Embedded-system Graphics Library (EGL) to eliminate the requirement for windowing system software on compute nodes, thereby eliminating a significant obstacle to broader use of high performance visualization applications. We outline the potential benefits of this approach in the context of visualization applications used in the cloud, on commodity clusters, and supercomputers. We discuss the implementation of EGL support in VMD, a widely used molecular visualization application, and we outline benefits of the approach for molecular visualization tasks on petascale computers, clouds, and remote visualization servers. We then provide a brief evaluation of the use of EGL in VMD, with tests using developmental graphics drivers on conventional workstations and on Amazon EC2 G2 GPU-accelerated cloud instance types. We expect that the techniques described here will be of broad benefit to many other visualization applications.
High Performance Molecular Visualization: In-Situ and Parallel Rendering with EGL

PubMed Central

Stone, John E.; Messmer, Peter; Sisneros, Robert; Schulten, Klaus

2016-01-01

Large scale molecular dynamics simulations produce terabytes of data that is impractical to transfer to remote facilities. It is therefore necessary to perform visualization tasks in-situ as the data are generated, or by running interactive remote visualization sessions and batch analyses co-located with direct access to high performance storage systems. A significant challenge for deploying visualization software within clouds, clusters, and supercomputers involves the operating system software required to initialize and manage graphics acceleration hardware. Recently, it has become possible for applications to use the Embedded-system Graphics Library (EGL) to eliminate the requirement for windowing system software on compute nodes, thereby eliminating a significant obstacle to broader use of high performance visualization applications. We outline the potential benefits of this approach in the context of visualization applications used in the cloud, on commodity clusters, and supercomputers. We discuss the implementation of EGL support in VMD, a widely used molecular visualization application, and we outline benefits of the approach for molecular visualization tasks on petascale computers, clouds, and remote visualization servers. We then provide a brief evaluation of the use of EGL in VMD, with tests using developmental graphics drivers on conventional workstations and on Amazon EC2 G2 GPU-accelerated cloud instance types. We expect that the techniques described here will be of broad benefit to many other visualization applications. PMID:27747137
A survey of GPU-based acceleration techniques in MRI reconstructions

PubMed Central

Wang, Haifeng; Peng, Hanchuan; Chang, Yuchou

2018-01-01

Image reconstruction in magnetic resonance imaging (MRI) clinical applications has become increasingly complicated. However, diagnosis and treatment require very fast computational procedures. Modern competitive graphics processing unit (GPU) platforms have made high-performance parallel computation available and attractive to common consumers for solving massively parallel reconstruction problems at commodity prices. GPUs have also become increasingly important for reconstruction computations, especially as deep learning starts to be applied to MRI reconstruction. The motivation of this survey is to review the image reconstruction schemes of GPU computing for MRI applications and provide a summary reference for researchers in the MRI community. PMID:29675361
A survey of GPU-based acceleration techniques in MRI reconstructions.

PubMed

Wang, Haifeng; Peng, Hanchuan; Chang, Yuchou; Liang, Dong

2018-03-01

Image reconstruction in magnetic resonance imaging (MRI) clinical applications has become increasingly complicated. However, diagnosis and treatment require very fast computational procedures. Modern competitive graphics processing unit (GPU) platforms have made high-performance parallel computation available and attractive to common consumers for solving massively parallel reconstruction problems at commodity prices. GPUs have also become increasingly important for reconstruction computations, especially as deep learning starts to be applied to MRI reconstruction. The motivation of this survey is to review the image reconstruction schemes of GPU computing for MRI applications and provide a summary reference for researchers in the MRI community.
17 CFR 232.304 - Graphic, image, audio and video material.

Code of Federal Regulations, 2011 CFR

2011-04-01

... video material. 232.304 Section 232.304 Commodity and Securities Exchanges SECURITIES AND EXCHANGE... Submissions § 232.304 Graphic, image, audio and video material. (a) If a filer includes graphic, image, audio or video material in a document delivered to investors and others that is not reproduced in an...
17 CFR 232.304 - Graphic, image, audio and video material.

Code of Federal Regulations, 2012 CFR

2012-04-01

... video material. 232.304 Section 232.304 Commodity and Securities Exchanges SECURITIES AND EXCHANGE... Submissions § 232.304 Graphic, image, audio and video material. (a) If a filer includes graphic, image, audio or video material in a document delivered to investors and others that is not reproduced in an...
17 CFR 232.304 - Graphic, image, audio and video material.

Code of Federal Regulations, 2013 CFR

2013-04-01

... video material. 232.304 Section 232.304 Commodity and Securities Exchanges SECURITIES AND EXCHANGE... Submissions § 232.304 Graphic, image, audio and video material. (a) If a filer includes graphic, image, audio or video material in a document delivered to investors and others that is not reproduced in an...
17 CFR 232.304 - Graphic, image, audio and video material.

Code of Federal Regulations, 2010 CFR

2010-04-01

... video material. 232.304 Section 232.304 Commodity and Securities Exchanges SECURITIES AND EXCHANGE... Submissions § 232.304 Graphic, image, audio and video material. (a) If a filer includes graphic, image, audio or video material in a document delivered to investors and others that is not reproduced in an...
17 CFR 232.304 - Graphic, image, audio and video material.

Code of Federal Regulations, 2014 CFR

2014-04-01

... video material. 232.304 Section 232.304 Commodity and Securities Exchanges SECURITIES AND EXCHANGE... Submissions § 232.304 Graphic, image, audio and video material. (a) If a filer includes graphic, image, audio or video material in a document delivered to investors and others that is not reproduced in an...
Knowledge-based commodity distribution planning

NASA Technical Reports Server (NTRS)

Saks, Victor; Johnson, Ivan

1994-01-01

This paper presents an overview of a Decision Support System (DSS) that incorporates Knowledge-Based (KB) and commercial off the shelf (COTS) technology components. The Knowledge-Based Logistics Planning Shell (KBLPS) is a state-of-the-art DSS with an interactive map-oriented graphics user interface and powerful underlying planning algorithms. KBLPS was designed and implemented to support skilled Army logisticians to prepare and evaluate logistics plans rapidly, in order to support corps-level battle scenarios. KBLPS represents a substantial advance in graphical interactive planning tools, with the inclusion of intelligent planning algorithms that provide a powerful adjunct to the planning skills of commodity distribution planners.
Globally sourced mineral commodities used in U.S. Navy SEAL gear—An illustration of U.S. net import reliance

USGS Publications Warehouse

Brainard, Jamie; Nassar, Nedal T.; Gambogi, Joseph; Baker, Michael S.; Jarvis, Michael T.

2018-01-25

A U.S. Navy SEAL (an acronym for sea, air, land) carries gear containing at least 23 nonfuel mineral commodities for which the United States is greater than 50 percent net import reliant. The graphics display the leading world producers of selected nonfuel mineral commodities used to manufacture U.S. Navy SEAL gear. SEALs are members of the U.S. Navy's special operations forces.

Accelerating semantic graph databases on commodity clusters

DOE Office of Scientific and Technical Information (OSTI.GOV)

Morari, Alessandro; Castellana, Vito G.; Haglin, David J.

We are developing a full software system for accelerating semantic graph databases on commodity clusters that scales to hundreds of nodes while maintaining constant query throughput. Our framework comprises a SPARQL to C++ compiler, a library of parallel graph methods and a custom multithreaded runtime layer, which provides a Partitioned Global Address Space (PGAS) programming model with fork/join parallelism and automatic load balancing over commodity clusters. We present preliminary results for the compiler and for the runtime.
GPU accelerated fuzzy connected image segmentation by using CUDA.

PubMed

Zhuge, Ying; Cao, Yong; Miller, Robert W

2009-01-01

Image segmentation techniques using fuzzy connectedness principles have shown their effectiveness in segmenting a variety of objects in several large applications in recent years. However, one problem of these algorithms has been their excessive computational requirements when processing large image datasets. Nowadays commodity graphics hardware provides high parallel computing power. In this paper, we present a parallel fuzzy connected image segmentation algorithm on Nvidia's Compute Unified Device Architecture (CUDA) platform for segmenting large medical image data sets. Our experiments based on three data sets with small, medium, and large data size demonstrate the efficiency of the parallel algorithm, which achieves a speed-up factor of 7.2x, 7.3x, and 14.4x, correspondingly, for the three data sets over the sequential implementation of fuzzy connected image segmentation algorithm on CPU.
Fast 2D flood modelling using GPU technology - recent applications and new developments

NASA Astrophysics Data System (ADS)

Crossley, Amanda; Lamb, Rob; Waller, Simon; Dunning, Paul

2010-05-01

In recent years there has been considerable interest amongst scientists and engineers in exploiting the potential of commodity graphics hardware for desktop parallel computing. The Graphics Processing Units (GPUs) that are used in PC graphics cards have now evolved into powerful parallel co-processors that can be used to accelerate the numerical codes used for floodplain inundation modelling. We report in this paper on experience over the past two years in developing and applying two dimensional (2D) flood inundation models using GPUs to achieve significant practical performance benefits. Starting with a solution scheme for the 2D diffusion wave approximation to the 2D Shallow Water Equations (SWEs), we have demonstrated the capability to reduce model run times in 'real-world' applications using GPU hardware and programming techniques. We then present results from a GPU-based 2D finite volume SWE solver. A series of numerical test cases demonstrate that the model produces outputs that are accurate and consistent with reference results published elsewhere. In comparisons conducted for a real world test case, the GPU-based SWE model was over 100 times faster than the CPU version. We conclude with some discussion of practical experience in using the GPU technology for flood mapping applications, and for research projects investigating use of Monte Carlo simulation methods for the analysis of uncertainty in 2D flood modelling.
Graphics processing unit (GPU)-based computation of heat conduction in thermally anisotropic solids

NASA Astrophysics Data System (ADS)

Nahas, C. A.; Balasubramaniam, Krishnan; Rajagopal, Prabhu

2013-01-01

Numerical modeling of anisotropic media is a computationally intensive task since it brings additional complexity to the field problem in such a way that the physical properties are different in different directions. Largely used in the aerospace industry because of their lightweight nature, composite materials are a very good example of thermally anisotropic media. With advancements in video gaming technology, parallel processors are much cheaper today and accessibility to higher-end graphical processing devices has increased dramatically over the past couple of years. Since these massively parallel GPUs are very good in handling floating point arithmetic, they provide a new platform for engineers and scientists to accelerate their numerical models using commodity hardware. In this paper we implement a parallel finite difference model of thermal diffusion through anisotropic media using the NVIDIA CUDA (Compute Unified Device Architecture). We use the NVIDIA GeForce GTX 560 Ti as our primary computing device, which consists of 384 CUDA cores clocked at 1645 MHz, with a standard desktop PC as the host platform. We compare the results with a standard CPU implementation for accuracy and speed and draw implications for simulation using the GPU paradigm.
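
As a rough illustration of the kind of finite difference update the abstract above refers to, the following Python sketch performs one explicit time step of 2D heat diffusion with different diffusivities along x and y. It assumes the principal axes of the material are aligned with the grid (no cross-derivative term), fixed boundary values, and a forward-time, centred-space scheme; the function and parameter names are hypothetical, and the paper's actual discretization and CUDA kernel are not given in the abstract.

```python
def diffuse_step(T, kx, ky, dx, dy, dt):
    """Advance a 2D temperature grid T (list of rows) by one explicit step.

    kx, ky: thermal diffusivities along x and y (assumed grid-aligned).
    Boundary cells are left unchanged (fixed-temperature boundaries).
    Stability requires dt * (kx / dx**2 + ky / dy**2) <= 0.5.
    """
    ny, nx = len(T), len(T[0])
    new = [row[:] for row in T]
    for j in range(1, ny - 1):
        for i in range(1, nx - 1):
            # Centred second differences in each direction.
            d2x = (T[j][i + 1] - 2.0 * T[j][i] + T[j][i - 1]) / (dx * dx)
            d2y = (T[j + 1][i] - 2.0 * T[j][i] + T[j - 1][i]) / (dy * dy)
            # Direction-dependent diffusivity is what makes the medium anisotropic.
            new[j][i] = T[j][i] + dt * (kx * d2x + ky * d2y)
    return new
```

Every interior cell is computed from the previous time level only, so the updates are independent of one another; in a GPU version each cell would typically be assigned to its own thread.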
Effective correlator for RadioAstron project

NASA Astrophysics Data System (ADS)

Sergeev, Sergey

This paper presents the implementation of a software FX correlator for Very Long Baseline Interferometry, adapted for the RadioAstron project. The software correlator is implemented for heterogeneous computing systems using graphics accelerators. It is shown that graphics hardware handles the interferometry task with high efficiency. The host processor of the heterogeneous computing system forms the data flow for the graphics accelerators, the number of which corresponds to the number of frequency channels; for the RadioAstron project there are seven such channels. Each accelerator computes the correlation matrix for all baselines for a single frequency channel. The initial data are converted to floating-point format and corrected for the corresponding delay function, and the entire correlation matrix is computed simultaneously. The correlation matrix is calculated using the sliding Fourier transform. Because the problem maps well onto the architecture of graphics accelerators, a single Kepler-platform processor achieves performance on this task comparable to that of a four-node Intel computing cluster. The task scales successfully not only to a large number of graphics accelerators, but also to a large number of nodes with multiple accelerators.
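
The general FX scheme named in the abstract above can be sketched in a few lines of Python: each station's samples are Fourier transformed (the F step), and the spectra are then cross-multiplied for every station pair and accumulated (the X step). This sketch uses a plain discrete Fourier transform over non-overlapping segments rather than the sliding Fourier transform the abstract mentions, and it omits delay correction; all names are hypothetical and nothing here is taken from the RadioAstron software.

```python
import cmath


def dft(x):
    """Naive O(n^2) discrete Fourier transform of a sample list."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def fx_correlate(signals, nfft):
    """Return {(i, j): averaged cross-spectrum} for all station pairs i <= j.

    signals: list of equal-length sample lists, one per station.
    nfft: segment length used for each Fourier transform.
    """
    nst = len(signals)
    nseg = len(signals[0]) // nfft
    if nseg == 0:
        raise ValueError("signals shorter than one FFT segment")
    acc = {(i, j): [0j] * nfft for i in range(nst) for j in range(i, nst)}
    for s in range(nseg):
        # F step: transform the current segment of every station.
        spectra = [dft(sig[s * nfft:(s + 1) * nfft]) for sig in signals]
        # X step: multiply each spectrum by the conjugate of the other.
        for (i, j), vis in acc.items():
            for k in range(nfft):
                vis[k] += spectra[i][k] * spectra[j][k].conjugate()
    return {pair: [v / nseg for v in vis] for pair, vis in acc.items()}
```

The phase of a cross-spectrum bin encodes the relative delay between the two stations at that frequency, which is why the multiplication by the conjugate is the core of the correlation step.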
The application of artificial intelligence techniques to accelerator operations at McMaster University

NASA Astrophysics Data System (ADS)

Poehlman, W. F. S.; Garland, Wm. J.; Stark, J. W.

1993-06-01

In an era of downsizing and a limited pool of skilled accelerator personnel from which to draw replacements for an aging workforce, the impetus to integrate intelligent computer automation into the accelerator operator's repertoire is strong. However, successful deployment of an "Operator's Companion" is not trivial. Both graphical and human factors need to be recognized as critical areas that require extra care when formulating the Companion. They include interactive graphical user's interface that mimics, for the operator, familiar accelerator controls; knowledge acquisition phases during development must acknowledge the expert's mental model of machine operation; and automated operations must be seen as improvements to the operator's environment rather than threats of ultimate replacement. Experiences with the PACES Accelerator Operator Companion developed at two sites over the past three years are related and graphical examples are given. The scale of the work involves multi-computer control of various start-up/shutdown and tuning procedures for Model FN and KN Van de Graaff accelerators. The response from licensing agencies has been encouraging.
Optimizing a mobile robot control system using GPU acceleration

NASA Astrophysics Data System (ADS)

Tuck, Nat; McGuinness, Michael; Martin, Fred

2012-01-01

This paper describes our attempt to optimize a robot control program for the Intelligent Ground Vehicle Competition (IGVC) by running computationally intensive portions of the system on a commodity graphics processing unit (GPU). The IGVC Autonomous Challenge requires a control program that performs a number of different computationally intensive tasks ranging from computer vision to path planning. For the 2011 competition our Robot Operating System (ROS) based control system would not run comfortably on the multicore CPU on our custom robot platform. The process of profiling the ROS control program and selecting appropriate modules for porting to run on a GPU is described. A GPU-targeting compiler, Bacon, is used to speed up development and help optimize the ported modules. The impact of the ported modules on overall performance is discussed. We conclude that GPU optimization can free a significant amount of CPU resources with minimal effort for expensive user-written code, but that replacing heavily-optimized library functions is more difficult, and a much less efficient use of time.
GPU-based streaming architectures for fast cone-beam CT image reconstruction and demons deformable registration.

PubMed

Sharp, G C; Kandasamy, N; Singh, H; Folkert, M

2007-10-07

This paper shows how to significantly accelerate cone-beam CT reconstruction and 3D deformable image registration using the stream-processing model. We describe data-parallel designs for the Feldkamp, Davis and Kress (FDK) reconstruction algorithm, and the demons deformable registration algorithm, suitable for use on a commodity graphics processing unit. The streaming versions of these algorithms are implemented using the Brook programming environment and executed on an NVidia 8800 GPU. Performance results using CT data of a preserved swine lung indicate that the GPU-based implementations of the FDK and demons algorithms achieve a substantial speedup--up to 80 times for FDK and 70 times for demons when compared to an optimized reference implementation on a 2.8 GHz Intel processor. In addition, the accuracy of the GPU-based implementations was found to be excellent. Compared with CPU-based implementations, the RMS differences were less than 0.1 Hounsfield unit for reconstruction and less than 0.1 mm for deformable registration.
Distributed rendering for multiview parallax displays

NASA Astrophysics Data System (ADS)

Annen, T.; Matusik, W.; Pfister, H.; Seidel, H.-P.; Zwicker, M.

2006-02-01

3D display technology holds great promise for the future of television, virtual reality, entertainment, and visualization. Multiview parallax displays deliver stereoscopic views without glasses to arbitrary positions within the viewing zone. These systems must include a high-performance and scalable 3D rendering subsystem in order to generate multiple views at real-time frame rates. This paper describes a distributed rendering system for large-scale multiview parallax displays built with a network of PCs, commodity graphics accelerators, multiple projectors, and multiview screens. The main challenge is to render various perspective views of the scene and assign rendering tasks effectively. In this paper we investigate two different approaches: Optical multiplexing for lenticular screens and software multiplexing for parallax-barrier displays. We describe the construction of large-scale multi-projector 3D display systems using lenticular and parallax-barrier technology. We have developed different distributed rendering algorithms using the Chromium stream-processing framework and evaluate the trade-offs and performance bottlenecks. Our results show that Chromium is well suited for interactive rendering on multiview parallax displays.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Allada, Veerendra; Benjegerdes, Troy; Bode, Brett

Commodity clusters augmented with application accelerators are evolving as competitive high performance computing systems. The Graphical Processing Unit (GPU) with a very high arithmetic density and performance per price ratio is a good platform for the scientific application acceleration. In addition to the interconnect bottlenecks among the cluster compute nodes, the cost of memory copies between the host and the GPU device has to be carefully amortized to improve the overall efficiency of the application. Scientific applications also rely on efficient implementation of the Basic Linear Algebra Subroutines (BLAS), among which the General Matrix Multiply (GEMM) is considered the workhorse subroutine. In this paper, they study the performance of the memory copies and GEMM subroutines that are critical to port the computational chemistry algorithms to the GPU clusters. To that end, a benchmark based on the NetPIPE framework is developed to evaluate the latency and bandwidth of the memory copies between the host and the GPU device. The performance of the single and double precision GEMM subroutines from the NVIDIA CUBLAS 2.0 library is studied. The results have been compared with those of the BLAS routines from the Intel Math Kernel Library (MKL) to understand the computational trade-offs. The test bed is an Intel Xeon cluster equipped with NVIDIA Tesla GPUs.
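The amortization argument in this abstract can be illustrated with a simple latency-plus-bandwidth model of a host-to-device copy, set against the arithmetic cost of a GEMM. This is a minimal sketch, not the NetPIPE-based benchmark the record describes; the function names and the hardware figures (LATENCY, BANDWIDTH, FLOPS) are hypothetical placeholders, not measurements from the paper.

```python
def copy_time(num_bytes, latency_s, bandwidth_bytes_per_s):
    """Linear model of one host<->device copy: fixed latency plus size/bandwidth."""
    return latency_s + num_bytes / bandwidth_bytes_per_s


def gemm_compute_time(n, device_flops):
    """An n x n GEMM costs about 2*n^3 floating point operations."""
    return 2.0 * n ** 3 / device_flops


def transfer_fraction(n, latency_s, bandwidth_bytes_per_s, device_flops,
                      bytes_per_elem=8):
    """Share of total offload time spent copying A and B in and C out."""
    matrix_bytes = n * n * bytes_per_elem
    transfers = 3 * copy_time(matrix_bytes, latency_s, bandwidth_bytes_per_s)
    compute = gemm_compute_time(n, device_flops)
    return transfers / (transfers + compute)


# Hypothetical hardware parameters, chosen only for illustration.
LATENCY = 10e-6      # seconds per copy
BANDWIDTH = 5e9      # bytes per second across the host-device link
FLOPS = 70e9         # sustained double precision flops on the device

for n in (64, 512, 4096):
    print(n, round(transfer_fraction(n, LATENCY, BANDWIDTH, FLOPS), 3))
```

Because copy volume grows as n^2 while GEMM work grows as n^3, the copy share shrinks as the matrices grow, which is the sense in which the copies have to be amortized over enough computation.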
Performance evaluation of throughput computing workloads using multi-core processors and graphics processors

NASA Astrophysics Data System (ADS)

Dave, Gaurav P.; Sureshkumar, N.; Blessy Trencia Lincy, S. S.

2017-11-01

The current trend in processor manufacturing focuses on multi-core architectures rather than on increasing clock speed for performance improvement. Graphics processors have become commodity hardware for providing fast co-processing in computer systems. Developments in IoT, social networking web applications, and big data have created huge demand for data processing, and such throughput-intensive applications inherently contain data-level parallelism, which is better suited to SIMD-architecture-based GPUs. This paper reviews the architectural aspects of multi/many-core processors and graphics processors. Different case studies are taken to compare the performance of throughput computing applications using shared memory programming in OpenMP and CUDA API based programming.
MAPA: Implementation of the Standard Interchange Format and use for analyzing lattices

NASA Astrophysics Data System (ADS)

Shasharina, Svetlana G.; Cary, John R.

1997-05-01

MAPA (Modular Accelerator Physics Analysis) is an object oriented application for accelerator design and analysis with a Motif based graphical user interface. MAPA has been ported to AIX, Linux, HPUX, Solaris, and IRIX. MAPA provides an intuitive environment for accelerator study and design. The user can bring up windows for fully nonlinear analysis of accelerator lattices in any number of dimensions. The current graphical analysis methods of Lifetime plots and Surfaces of Section have been used to analyze the improved lattice designs of Wan, Cary, and Shasharina (this conference). MAPA can now read and write Standard Interchange Format (MAD) accelerator description files and it has a general graphical user interface for adding, changing, and deleting elements. MAPA's consistency checks prevent deletion of used elements and prevent creation of recursive beam lines. Plans include development of a richer set of modeling tools and the ability to invoke existing modeling codes through the MAPA interface. MAPA will be demonstrated on a Pentium 150 laptop running Linux.
Millisecond precision psychological research in a world of commodity computers: new hardware, new problems?

PubMed

Plant, Richard R; Turner, Garry

2009-08-01

Since the publication of Plant, Hammond, and Turner (2004), which highlighted a pressing need for researchers to pay more attention to sources of error in computer-based experiments, the landscape has undoubtedly changed, but not necessarily for the better. Readily available hardware has improved in terms of raw speed; multi core processors abound; graphics cards now have hundreds of megabytes of RAM; main memory is measured in gigabytes; drive space is measured in terabytes; ever larger thin film transistor displays capable of single-digit response times, together with newer Digital Light Processing multimedia projectors, enable much greater graphic complexity; and new 64-bit operating systems, such as Microsoft Vista, are now commonplace. However, have millisecond-accurate presentation and response timing improved, and will they ever be available in commodity computers and peripherals? In the present article, we used a Black Box ToolKit to measure the variability in timing characteristics of hardware used commonly in psychological research.
Acceleration of integral imaging based incoherent Fourier hologram capture using graphic processing unit.

PubMed

Jeong, Kyeong-Min; Kim, Hee-Seung; Hong, Sung-In; Lee, Sung-Keun; Jo, Na-Young; Kim, Yong-Soo; Lim, Hong-Gi; Park, Jae-Hyeung

2012-10-08

Speed enhancement of integral imaging based incoherent Fourier hologram capture using a graphic processing unit is reported. The integral imaging based method enables exact hologram capture of real-existing three-dimensional objects under regular incoherent illumination. In our implementation, we apply a parallel computation scheme using the graphic processing unit, accelerating the processing speed. Using the enhanced speed of hologram capture, we also implement a pseudo real-time hologram capture and optical reconstruction system. The overall operation speed is measured to be 1 frame per second.
Simple techniques for improving deep neural network outcomes on commodity hardware

NASA Astrophysics Data System (ADS)

Colina, Nicholas Christopher A.; Perez, Carlos E.; Paraan, Francis N. C.

2017-08-01

We benchmark improvements in the performance of deep neural networks (DNN) on the MNIST data test upon implementing two simple modifications to the algorithm that have little overhead computational cost. First is GPU parallelization on a commodity graphics card, and second is initializing the DNN with random orthogonal weight matrices prior to optimization. Eigenspectra analysis of the weight matrices reveals that the initially orthogonal matrices remain nearly orthogonal after training. The probability distributions from which these orthogonal matrices are drawn are also shown to significantly affect the performance of these deep neural networks.
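One common way to obtain a random orthogonal weight matrix is to orthonormalize random vectors with the Gram-Schmidt process. The sketch below assumes Gaussian starting vectors and a square matrix; the abstract does not say which construction or distribution the authors used, and the function names are hypothetical.

```python
import random


def random_orthogonal(n, seed=0):
    """Build an n x n matrix with orthonormal rows via modified Gram-Schmidt.

    Assumption: the starting vectors are drawn from a standard Gaussian.
    """
    rng = random.Random(seed)
    rows = []
    while len(rows) < n:
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Remove the components along every row accepted so far.
        for q in rows:
            dot = sum(a * b for a, b in zip(v, q))
            v = [a - dot * b for a, b in zip(v, q)]
        norm = sum(a * a for a in v) ** 0.5
        if norm > 1e-8:  # skip the (unlikely) nearly dependent draw
            rows.append([a / norm for a in v])
    return rows


def max_orthogonality_error(w):
    """Largest absolute deviation of W W^T from the identity matrix."""
    err = 0.0
    for i, wi in enumerate(w):
        for j, wj in enumerate(w):
            dot = sum(a * b for a, b in zip(wi, wj))
            target = 1.0 if i == j else 0.0
            err = max(err, abs(dot - target))
    return err


weights = random_orthogonal(8)
print(max_orthogonality_error(weights))
```

The same error measure could be applied to trained weights to check, in the spirit of the abstract, how close to orthogonal they remain.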
The Middle East Today: An Atlas of Reproducible Pages. Revised Edition.

ERIC Educational Resources Information Center

World Eagle, Inc., Wellesley, MA.

This book contains blank outline maps of the continent/region, tables and graphics depicting the size, population, resources and water, commodities, trade, cities, languages, religions, industry, energy, food and agriculture, demographic statistics, aspects of the national economies, and aspects of the national governments of the Middle East.…
DOE Office of Scientific and Technical Information (OSTI.GOV)

Liscom, W.L.

This book presents a complete graphic and statistical portrait of the dramatic shifts in global energy flows during the 1970s and the resultant transfer of economic and political power from the industrial nations to the oil-producing states. The information was extracted from government-source documents and compiled in a computer data base. Computer graphics were combined with the data base to produce over 400 full-color graphs. The energy commodities covered are oil, natural gas, coal, nuclear, and conventional electric-power generation. Also included are data on hydroelectric and geothermal power, oil shale, tar sands, and other alternative energy sources. 72 references.
Graphics Processing Unit Acceleration of Gyrokinetic Turbulence Simulations

NASA Astrophysics Data System (ADS)

Hause, Benjamin; Parker, Scott

2012-10-01

We find a substantial increase in on-node performance using Graphics Processing Unit (GPU) acceleration in gyrokinetic delta-f particle-in-cell simulation. Optimization is performed on a two-dimensional slab gyrokinetic particle simulation using the Portland Group Fortran compiler with the GPU accelerator compiler directives. We have implemented the GPU acceleration on a Core I7 gaming PC with an NVIDIA GTX 580 GPU. We find comparable, or better, acceleration relative to the NERSC DIRAC cluster with the NVIDIA Tesla C2050 computing processor. The Tesla C2050 is about 2.6 times more expensive than the GTX 580 gaming GPU. Optimization strategies and comparisons between DIRAC and the gaming PC will be presented. We will also discuss progress on optimizing the comprehensive three dimensional general geometry GEM code.
Acceleration of GPU-based Krylov solvers via data transfer reduction

DOE PAGES

Anzt, Hartwig; Tomov, Stanimire; Luszczek, Piotr; ...

2015-04-08

Krylov subspace iterative solvers are often the method of choice when solving large sparse linear systems. At the same time, hardware accelerators such as graphics processing units continue to offer significant floating point performance gains for matrix and vector computations through easy-to-use libraries of computational kernels. However, as these libraries are usually composed of a well optimized but limited set of linear algebra operations, applications that use them often fail to reduce certain data communications, and hence fail to leverage the full potential of the accelerator. In this study, we target the acceleration of Krylov subspace iterative methods for graphics processing units, and in particular the Biconjugate Gradient Stabilized solver, showing that significant improvement can be achieved by reformulating the method to reduce data communications through application-specific kernels instead of using the generic BLAS kernels, e.g. as provided by NVIDIA’s cuBLAS library, and by designing a graphics processing unit specific sparse matrix-vector product kernel that is able to more efficiently use the graphics processing unit’s computing power. Furthermore, we derive a model estimating the performance improvement, and use experimental data to validate the expected runtime savings. Finally, considering that the derived implementation achieves significantly higher performance, we assert that similar optimizations addressing algorithm structure, as well as sparse matrix-vector products, are crucial for the subsequent development of high-performance graphics processing unit accelerated Krylov subspace iterative methods.
Accelerating epistasis analysis in human genetics with consumer graphics hardware.

PubMed

Sinnott-Armstrong, Nicholas A; Greene, Casey S; Cancare, Fabio; Moore, Jason H

2009-07-24

Human geneticists are now capable of measuring more than one million DNA sequence variations from across the human genome. The new challenge is to develop computationally feasible methods capable of analyzing these data for associations with common human disease, particularly in the context of epistasis. Epistasis describes the situation where multiple genes interact in a complex non-linear manner to determine an individual's disease risk and is thought to be ubiquitous for common diseases. Multifactor Dimensionality Reduction (MDR) is an algorithm capable of detecting epistasis. An exhaustive analysis with MDR is often computationally expensive, particularly for high order interactions. This challenge has previously been met with parallel computation and expensive hardware. The option we examine here exploits commodity hardware designed for computer graphics. In modern computers Graphics Processing Units (GPUs) have more memory bandwidth and computational capability than Central Processing Units (CPUs) and are well suited to this problem. Advances in the video game industry have led to an economy of scale creating a situation where these powerful components are readily available at very low cost. Here we implement and evaluate the performance of the MDR algorithm on GPUs. Of primary interest are the time required for an epistasis analysis and the price to performance ratio of available solutions. We found that using MDR on GPUs consistently increased performance per machine over both a feature rich Java software package and a C++ cluster implementation. The performance of a GPU workstation running a GPU implementation reduces computation time by a factor of 160 compared to an 8-core workstation running the Java implementation on CPUs. This GPU workstation performs similarly to 150 cores running an optimized C++ implementation on a Beowulf cluster. 
Furthermore this GPU system provides extremely cost effective performance while leaving the CPU available for other tasks. The GPU workstation containing three GPUs costs $2000 while obtaining similar performance on a Beowulf cluster requires 150 CPU cores which, including the added infrastructure and support cost of the cluster system, cost approximately $82,500. Graphics hardware based computing provides a cost effective means to perform genetic analysis of epistasis using MDR on large datasets without the infrastructure of a computing cluster.
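The price comparison in this abstract reduces to a short piece of arithmetic. The sketch below uses only the figures quoted above ($2000, $82,500, 150 cores, three GPUs); the variable names are illustrative.

```python
# Figures quoted in the abstract.
gpu_workstation_cost = 2000     # USD, workstation with three GPUs
cluster_cost = 82500            # USD, Beowulf cluster incl. infrastructure
cluster_cores = 150             # CPU cores giving similar performance
gpus = 3

cost_ratio = cluster_cost / gpu_workstation_cost   # cluster cost per workstation cost
cost_per_core = cluster_cost / cluster_cores       # USD per cluster core
cores_per_gpu = cluster_cores / gpus               # cluster cores matched per GPU

print(cost_ratio, cost_per_core, cores_per_gpu)
```

On these figures the cluster costs roughly 41 times as much as the GPU workstation for similar MDR throughput.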

SIMD Optimization of Linear Expressions for Programmable Graphics Hardware

PubMed Central

Bajaj, Chandrajit; Ihm, Insung; Min, Jungki; Oh, Jinsang

2009-01-01

The increased programmability of graphics hardware allows efficient graphical processing unit (GPU) implementations of a wide range of general computations on commodity PCs. An important factor in such implementations is how to fully exploit the SIMD computing capacities offered by modern graphics processors. Linear expressions in the form of ȳ = Ax̄ + b̄, where A is a matrix, and x̄, ȳ and b̄ are vectors, constitute one of the most basic operations in many scientific computations. In this paper, we propose a SIMD code optimization technique that enables efficient shader codes to be generated for evaluating linear expressions. It is shown that performance can be improved considerably by efficiently packing arithmetic operations into four-wide SIMD instructions through reordering of the operations in linear expressions. We demonstrate that the presented technique can be used effectively for programming both vertex and pixel shaders for a variety of mathematical applications, including integrating differential equations and solving a sparse linear system of equations using iterative methods. PMID:19946569
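To make the idea of packing a linear expression into four-wide operations concrete, the sketch below evaluates y = Ax + b four output rows at a time, using one modelled four-wide multiply-add per matrix column. It is a naive packing written for illustration only, not the operation-reordering optimization proposed in the paper; `mad4` and `linear_expr_simd4` are hypothetical names.

```python
def mad4(a, b, c):
    """Model of a four-wide multiply-add instruction: a*b + c, componentwise."""
    return [a[i] * b[i] + c[i] for i in range(4)]


def linear_expr_simd4(A, x, b):
    """Evaluate y = A x + b in groups of four rows.

    Returns (y, ops), where ops counts the four-wide multiply-adds issued.
    Rows are zero-padded so the row count is a multiple of four.
    """
    m, n = len(A), len(x)
    pad = (-m) % 4
    rows = A + [[0.0] * n for _ in range(pad)]
    bias = list(b) + [0.0] * pad
    y, ops = [], 0
    for r in range(0, len(rows), 4):
        acc = bias[r:r + 4]                      # start from the b entries
        for j in range(n):
            column = [rows[r + k][j] for k in range(4)]
            acc = mad4(column, [x[j]] * 4, acc)  # acc += column * x_j
            ops += 1
        y.extend(acc)
    return y[:m], ops


A = [[1, 2], [3, 4], [5, 6]]
y, ops = linear_expr_simd4(A, [1, 1], [0, 1, 2])
print(y, ops)
```

For this 3 x 2 example the packed version issues two four-wide multiply-adds where a scalar loop would issue six, at the price of one padded lane.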
Accelerated Adaptive MGS Phase Retrieval

NASA Technical Reports Server (NTRS)

Lam, Raymond K.; Ohara, Catherine M.; Green, Joseph J.; Bikkannavar, Siddarayappa A.; Basinger, Scott A.; Redding, David C.; Shi, Fang

2011-01-01

The Modified Gerchberg-Saxton (MGS) algorithm is an image-based wavefront-sensing method that can turn any science instrument focal plane into a wavefront sensor. MGS characterizes optical systems by estimating the wavefront errors in the exit pupil using only intensity images of a star or other point source of light. This innovative implementation of MGS significantly accelerates the MGS phase retrieval algorithm by using stream-processing hardware on conventional graphics cards. Stream processing is a relatively new, yet powerful, paradigm to allow parallel processing of certain applications that apply single instructions to multiple data (SIMD). These stream processors are designed specifically to support large-scale parallel computing on a single graphics chip. Computationally intensive algorithms, such as the Fast Fourier Transform (FFT), are particularly well suited for this computing environment. This high-speed version of MGS exploits commercially available hardware to accomplish the same objective in a fraction of the original time. The exploit involves performing matrix calculations in nVidia graphic cards. The graphical processor unit (GPU) is hardware that is specialized for computationally intensive, highly parallel computation. From the software perspective, a parallel programming model is used, called CUDA, to transparently scale multicore parallelism in hardware. This technology gives computationally intensive applications access to the processing power of the nVidia GPUs through a C/C++ programming interface. The AAMGS (Accelerated Adaptive MGS) software takes advantage of these advanced technologies, to accelerate the optical phase error characterization. With a single PC that contains four nVidia GTX-280 graphic cards, the new implementation can process four images simultaneously to produce a JWST (James Webb Space Telescope) wavefront measurement 60 times faster than the previous code.
Stable stress‐drop measurements and their variability: Implications for ground‐motion prediction

USGS Publications Warehouse

Hanks, Thomas C.; Baltay, Annemarie S.; Beroza, Gregory C.

2013-01-01

We estimate the a_rms stress drop (Hanks, 1979) using acceleration time records of 59 earthquakes from two earthquake sequences in eastern Honshu, Japan. These acceleration‐based static stress drops compare well to stress drops calculated for the same events by Baltay et al. (2011) using an empirical Green’s function (eGf) approach. This agreement supports the assumption that earthquake acceleration time histories in the bandwidth between the corner frequency and a maximum observed frequency can be considered white, Gaussian noise. Although the a_rms stress drop is computationally simpler than the eGf‐based stress drop, and is used as the “stress parameter” to describe the earthquake source in ground‐motion prediction equations, we find that it only compares well to the eGf‐based stress drop at source‐station distances of ∼20 km or less because there is no consideration of whole‐path anelastic attenuation or scattering. In these circumstances, the correlation between the two stress‐drop measures is strong. Events with high and low stress drops obtained through the eGf method have similarly high and low a_rms stress drops. We find that the inter‐event standard deviation of stress drop, for the population of earthquakes considered, is similar for both methods, 0.40 for one and 0.42 for the other in log10 units, provided we apply the ∼20 km distance restriction to the a_rms stress drop. This indicates that the observed variability is inherent to the source, rather than attributable to uncertainties in stress‐drop estimates.
A prototype of a beam steering assistant tool for accelerator operations

DOE Office of Scientific and Technical Information (OSTI.GOV)

M. Bickley; P. Chevtsov

2006-10-24

The CEBAF accelerator provides nuclear physics experiments at Jefferson Lab with high quality electron beams. Three experimental end stations can simultaneously receive the beams with different energies and intensities. For each operational mode, the accelerator setup procedures are complicated and require very careful checking of beam spot sizes and positions on multiple beam viewers. To simplify these procedures and make them reproducible, a beam steering assistant GUI tool has been created. The tool is implemented as a multi-window control screen. The screen has an interactive graphical object window, which is an overlay on top of a digitized live video image from a beam viewer. It allows a user to easily create and edit any graphical objects consisting of text, ellipses, and lines, right above the live beam viewer image and then save them in a file that is called a beam steering template. The template can show, for example, the area within which the beam must always be on the viewer. Later, this template can be loaded in the interactive graphical object window to help accelerator operators steer the beam to the specified area on the viewer.
Fast image interpolation for motion estimation using graphics hardware

NASA Astrophysics Data System (ADS)

Kelly, Francis; Kokaram, Anil

2004-05-01

Motion estimation and compensation is the key to high quality video coding. Block matching motion estimation is used in most video codecs, including MPEG-2, MPEG-4, H.263 and H.26L. Motion estimation is also a key component in the digital restoration of archived video and for post-production and special effects in the movie industry. Sub-pixel accurate motion vectors can improve the quality of the vector field and lead to more efficient video coding. However, sub-pixel accuracy requires interpolation of the image data. Image interpolation is a key requirement of many image processing algorithms. Often interpolation can be a bottleneck in these applications, especially in motion estimation due to the large number of pixels involved. In this paper we propose using commodity computer graphics hardware for fast image interpolation. We use the full search block matching algorithm to illustrate the problems and limitations of using graphics hardware in this way.
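The interplay between full search block matching and interpolation can be sketched on the CPU as follows. This is a minimal illustration assuming bilinear interpolation, a sum-of-absolute-differences cost and half-pixel steps; the abstract does not specify these choices, and the function names are hypothetical.

```python
def bilinear(img, x, y):
    """Sample img (a list of rows) at real-valued (x, y), clamping at the border."""
    h, w = len(img), len(img[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block in cur and a shifted block in ref."""
    total = 0.0
    for j in range(size):
        for i in range(size):
            total += abs(cur[by + j][bx + i]
                         - bilinear(ref, bx + i + dx, by + j + dy))
    return total


def full_search(cur, ref, bx, by, size=4, search=2, step=0.5):
    """Try every displacement in [-search, search] at the given sub-pixel step."""
    steps = int(round(search / step))
    best = None
    for sy in range(-steps, steps + 1):
        for sx in range(-steps, steps + 1):
            dx, dy = sx * step, sy * step
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best  # (cost, dx, dy)


# Synthetic test images: cur is ref shifted by one pixel in x.
ref = [[x * x + 3 * y * y for x in range(12)] for y in range(12)]
cur = [[(x + 1) * (x + 1) + 3 * y * y for x in range(12)] for y in range(12)]
print(full_search(cur, ref, 4, 4))
```

Every candidate displacement triggers one interpolated sample per block pixel, which is why interpolation dominates the cost and why offloading it to texture hardware is attractive.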
3D graphics hardware accelerator programming methods for real-time visualization systems

NASA Astrophysics Data System (ADS)

Souetov, Andrew E.

2001-02-01

The paper deals with new approaches in software design for creating real-time applications that use modern graphics acceleration hardware. The growing complexity of this type of software compels programmers to use different types of CASE systems in the design and development process. The subject under discussion is integration of such systems in a development process, their effective use, and the combination of these new methods with the necessity to produce optimal codes. A method of simulation integration and modeling tools in real-time software development cycle is described.
3D graphics hardware accelerator programming methods for real-time visualization systems

NASA Astrophysics Data System (ADS)

Souetov, Andrew E.

2000-02-01

The paper deals with new approaches in software design for creating real-time applications that use modern graphics acceleration hardware. The growing complexity of this type of software compels programmers to use different types of CASE systems in the design and development process. The subject under discussion is integration of such systems in a development process, their effective use, and the combination of these new methods with the necessity to produce optimal codes. A method of simulation integration and modeling tools in real-time software development cycle is described.
High-performance dynamic quantum clustering on graphics processors

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wittek, Peter, E-mail: peterwittek@acm.org

2013-01-15

Clustering methods in machine learning may benefit from borrowing metaphors from physics. Dynamic quantum clustering associates a Gaussian wave packet with the multidimensional data points and regards them as eigenfunctions of the Schroedinger equation. The clustering structure emerges by letting the system evolve and the visual nature of the algorithm has been shown to be useful in a range of applications. Furthermore, the method only uses matrix operations, which readily lend themselves to parallelization. In this paper, we develop an implementation on graphics hardware and investigate how this approach can accelerate the computations. We achieve a speedup of up to two orders of magnitude over a multicore CPU implementation, which proves that quantum-like methods and acceleration by graphics processing units have a great relevance to machine learning.
Maximum Acceleration Recording Circuit

NASA Technical Reports Server (NTRS)

Bozeman, Richard J., Jr.

1995-01-01

Coarsely digitized maximum levels recorded in blown fuses. Circuit feeds power to accelerometer and makes nonvolatile record of maximum level to which output of accelerometer rises during measurement interval. In comparison with inertia-type single-preset-trip-point mechanical maximum-acceleration-recording devices, circuit weighs less, occupies less space, and records accelerations within narrower bands of uncertainty. In comparison with prior electronic data-acquisition systems designed for same purpose, circuit simpler, less bulky, consumes less power, costs and analysis of data recorded in magnetic or electronic memory devices. Circuit used, for example, to record accelerations to which commodities subjected during transportation on trucks.
The Effects of Observation Coaching on Children's Graphic Representations

ERIC Educational Resources Information Center

Vlach, Haley A.; Carver, Sharon M.

2008-01-01

Education programs have fostered advanced levels of graphic representation ability in young children but have not detailed the specific mechanisms responsible for the accelerated growth. Research suggests that between 6 and 8 years of age children begin to observe more carefully before drawing and that observation prompts aid children's…
Three-dimensional photoacoustic tomography based on graphics-processing-unit-accelerated finite element method.

PubMed

Peng, Kuan; He, Ling; Zhu, Ziqiang; Tang, Jingtian; Xiao, Jiaying

2013-12-01

Compared with commonly used analytical reconstruction methods, the frequency-domain finite element method (FEM) based approach has proven to be an accurate and flexible algorithm for photoacoustic tomography. However, the FEM-based algorithm is computationally demanding, especially for three-dimensional cases. To enhance the algorithm's efficiency, in this work a parallel computational strategy is implemented in the framework of the FEM-based reconstruction algorithm using a graphic-processing-unit parallel frame named the "compute unified device architecture." A series of simulation experiments is carried out to test the accuracy and accelerating effect of the improved method. The results obtained indicate that the parallel calculation does not change the accuracy of the reconstruction algorithm, while its computational cost is significantly reduced by a factor of 38.9 with a GTX 580 graphics card using the improved method.
Wet Lab Accelerator: A Web-Based Application Democratizing Laboratory Automation for Synthetic Biology.

PubMed

Bates, Maxwell; Berliner, Aaron J; Lachoff, Joe; Jaschke, Paul R; Groban, Eli S

2017-01-20

Wet Lab Accelerator (WLA) is a cloud-based tool that allows a scientist to conduct biology via robotic control without the need for any programming knowledge. A drag and drop interface provides a convenient and user-friendly method of generating biological protocols. Graphically developed protocols are turned into programmatic instruction lists required to conduct experiments at the cloud laboratory Transcriptic. Prior to the development of WLA, biologists were required to write in a programming language called "Autoprotocol" in order to work with Transcriptic. WLA relies on a new abstraction layer we call "Omniprotocol" to convert the graphical experimental description into lower level Autoprotocol language, which then directs robots at Transcriptic. While WLA has only been tested at Transcriptic, the conversion of graphically laid out experimental steps into Autoprotocol is generic, allowing extension of WLA into other cloud laboratories in the future. WLA hopes to democratize biology by bringing automation to general biologists.
Production Level CFD Code Acceleration for Hybrid Many-Core Architectures

NASA Technical Reports Server (NTRS)

Duffy, Austen C.; Hammond, Dana P.; Nielsen, Eric J.

2012-01-01

In this work, a novel graphics processing unit (GPU) distributed sharing model for hybrid many-core architectures is introduced and employed in the acceleration of a production-level computational fluid dynamics (CFD) code. The latest generation graphics hardware allows multiple processor cores to simultaneously share a single GPU through concurrent kernel execution. This feature has allowed the NASA FUN3D code to be accelerated in parallel with up to four processor cores sharing a single GPU. For codes to scale and fully use resources on these and the next generation machines, codes will need to employ some type of GPU sharing model, as presented in this work. Findings include the effects of GPU sharing on overall performance. A discussion of the inherent challenges that parallel unstructured CFD codes face in accelerator-based computing environments is included, with considerations for future generation architectures. This work was completed by the author in August 2010, and reflects the analysis and results of the time.
Employing OpenCL to Accelerate Ab Initio Calculations on Graphics Processing Units.

PubMed

Kussmann, Jörg; Ochsenfeld, Christian

2017-06-13

We present an extension of our graphics processing units (GPU)-accelerated quantum chemistry package to employ OpenCL compute kernels, which can be executed on a wide range of computing devices like CPUs, Intel Xeon Phi, and AMD GPUs. Here, we focus on the use of AMD GPUs and discuss differences as compared to CUDA-based calculations on NVIDIA GPUs. First illustrative timings are presented for hybrid density functional theory calculations using serial as well as parallel compute environments. The results show that AMD GPUs are as fast as or faster than comparable NVIDIA GPUs and provide a viable alternative for quantum chemical applications.
Grace: A cross-platform micromagnetic simulator on graphics processing units

NASA Astrophysics Data System (ADS)

Zhu, Ru

2015-12-01

A micromagnetic simulator running on graphics processing units (GPUs) is presented. Different from GPU implementations of other research groups which are predominantly running on NVidia's CUDA platform, this simulator is developed with C++ Accelerated Massive Parallelism (C++ AMP) and is hardware platform independent. It runs on GPUs from vendors including NVidia, AMD and Intel, and achieves a significant performance boost as compared to previous central processing unit (CPU) simulators, up to two orders of magnitude. The simulator paved the way for running large size micromagnetic simulations on both high-end workstations with dedicated graphics cards and low-end personal computers with integrated graphics cards, and is freely available to download.
Strategic Considerations of the Sino-Cuban Relationship as the United States Renews Relations with Cuba

DTIC Science & Technology

2016-05-26

obtained for the inclusion of pictures, maps, graphics, and any other works incorporated into this manuscript. A work of the United States Government is not...in 2004 by Venezuela and Cuba. The eleven members are Antigua and Barbuda, Bolivia, Cuba, Dominica, Ecuador , Grenada, Nicaragua, Saint Kitts and...region to the US alone in trade relations. China needs natural resources and commodities such as oil and soybeans and many Latin American governments and
Execution of a parallel edge-based Navier-Stokes solver on commodity graphics processor units

NASA Astrophysics Data System (ADS)

Corral, Roque; Gisbert, Fernando; Pueblas, Jesus

2017-02-01

The implementation of an edge-based three-dimensional Reynolds-averaged Navier-Stokes solver for unstructured grids able to run on multiple graphics processing units (GPUs) is presented. Loops over edges, which are the most time-consuming part of the solver, have been written to exploit the massively parallel capabilities of GPUs. Non-blocking communications between parallel processes and between the GPU and the central processor unit (CPU) have been used to enhance code scalability. The code is written using a mixture of C++ and OpenCL, to allow the execution of the source code on GPUs. The Message Passing Interface (MPI) library is used to allow the parallel execution of the solver on multiple GPUs. A comparative study of the solver parallel performance is carried out using a cluster of CPUs and another of GPUs. It is shown that a single GPU is up to 64 times faster than a single CPU core. The parallel scalability of the solver is mainly degraded due to the loss of computing efficiency of the GPU when the size of the case decreases. However, for large enough grid sizes, the scalability is strongly improved. A cluster featuring commodity GPUs and a high bandwidth network is ten times less costly and consumes 33% less energy than a CPU-based cluster with an equivalent computational power.
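The structure of a loop over edges can be sketched with a toy residual: each edge computes one flux and scatters it with opposite signs to its two end nodes. The flux below is a simple difference of nodal values, not a Navier-Stokes flux, and the greedy edge grouping is one generic way to avoid two edges writing to the same node at once; neither is taken from the paper, and all names are hypothetical.

```python
def edge_residual(num_nodes, edges, u, weight=1.0):
    """Accumulate a conservative edge flux into nodal residuals."""
    res = [0.0] * num_nodes
    for i, j in edges:
        flux = weight * (u[j] - u[i])   # placeholder flux, an assumption
        res[i] += flux                  # what leaves one node...
        res[j] -= flux                  # ...enters the other
    return res


def color_edges(edges):
    """Greedily group edges so that no two edges in a group share a node.

    Edges inside one group could then be processed concurrently without
    write conflicts on the nodal residual array.
    """
    groups = []  # list of (nodes_used, member_edges)
    for i, j in edges:
        for nodes, members in groups:
            if i not in nodes and j not in nodes:
                nodes.update((i, j))
                members.append((i, j))
                break
        else:
            groups.append(({i, j}, [(i, j)]))
    return [members for _, members in groups]


triangle = [(0, 1), (1, 2), (2, 0)]
print(edge_residual(3, triangle, [0, 1, 3]))
print(color_edges(triangle))
```

Because every flux is added to one node and subtracted from the other, the residuals sum to zero, which gives a quick consistency check on any parallel version of the loop.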
galario: Gpu Accelerated Library for Analyzing Radio Interferometer Observations

NASA Astrophysics Data System (ADS)

Tazzari, Marco; Beaujean, Frederik; Testi, Leonardo

2017-10-01

The galario library exploits the computing power of modern graphics cards (GPUs) to accelerate the comparison of model predictions to radio interferometer observations. It speeds up the computation of the synthetic visibilities given a model image (or an axisymmetric brightness profile) and their comparison to the observations.
Ice-sheet modelling accelerated by graphics cards

NASA Astrophysics Data System (ADS)

Brædstrup, Christian Fredborg; Damsgaard, Anders; Egholm, David Lundbek

2014-11-01

Studies of glaciers and ice sheets have increased the demand for high performance numerical ice flow models over the past decades. When exploring the highly non-linear dynamics of fast flowing glaciers and ice streams, or when coupling multiple flow processes for ice, water, and sediment, researchers are often forced to use super-computing clusters. As an alternative to conventional high-performance computing hardware, the Graphical Processing Unit (GPU) is capable of massively parallel computing while retaining a compact design and low cost. In this study, we present a strategy for accelerating a higher-order ice flow model using a GPU. By applying the newest GPU hardware, we achieve up to 180× speedup compared to a similar but serial CPU implementation. Our results suggest that GPU acceleration is a competitive option for ice-flow modelling when compared to CPU-optimised algorithms parallelised by the OpenMP or Message Passing Interface (MPI) protocols.
Short-time windowed covariance: A metric for identifying non-stationary, event-related covariant cortical sites

PubMed Central

Blakely, Timothy; Ojemann, Jeffrey G.; Rao, Rajesh P.N.

2014-01-01

Background: Electrocorticography (ECoG) signals can provide high spatio-temporal resolution and high signal to noise ratio recordings of local neural activity from the surface of the brain. Previous studies have shown that broad-band, spatially focal, high-frequency increases in ECoG signals are highly correlated with movement and other cognitive tasks and can be volitionally modulated. However, significant additional information may be present in inter-electrode interactions, but adding additional higher order inter-electrode interactions can be impractical from a computational aspect, if not impossible. New method: In this paper we present a new method of calculating high frequency interactions between electrodes called Short-Time Windowed Covariance (STWC) that builds on mathematical techniques currently used in neural signal analysis, along with an implementation that accelerates the algorithm by orders of magnitude by leveraging commodity, off-the-shelf graphics processing unit (GPU) hardware. Results: Using the hardware-accelerated implementation of STWC, we identify many types of event-related inter-electrode interactions from human ECoG recordings on global and local scales that have not been identified by previous methods. Unique temporal patterns are observed for digit flexion in both low- (10 mm spacing) and high-resolution (3 mm spacing) electrode arrays. Comparison with existing methods: Covariance is a commonly used metric for identifying correlated signals, but the standard covariance calculations do not allow for temporally varying covariance. In contrast, STWC allows and identifies event-driven changes in covariance without identifying spurious noise correlations. Conclusions: STWC can be used to identify event-related neural interactions whose high computational load is well suited to GPU capabilities. PMID:24211499
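The idea of a covariance that varies over time can be sketched with a plain sliding-window covariance between two channels. This is only a simplified stand-in, assuming a rectangular window and zero lag; the abstract does not give the STWC definition, so the function name, window length, and synthetic signals below are hypothetical.

```python
import numpy as np

def short_time_covariance(x, y, window, step=1):
    """Covariance of x and y inside each sliding window of `window` samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    starts = range(0, len(x) - window + 1, step)
    cov = np.empty(len(starts))
    for i, s in enumerate(starts):
        xs = x[s:s + window]
        ys = y[s:s + window]
        # Population covariance of the two windowed segments.
        cov[i] = np.mean((xs - xs.mean()) * (ys - ys.mean()))
    return cov

# Synthetic example: y is silent for the first 100 samples, then follows x.
t = np.arange(200)
x = np.sin(2.0 * np.pi * t / 20.0)
y = np.where(t < 100, 0.0, x)
cov = short_time_covariance(x, y, window=20)
```

Each window is independent of the others, which is why this kind of computation maps naturally onto many GPU threads; a single covariance over the whole record would hide the change at sample 100.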

Clinical engineering department strategic graphical dashboard to enhance maintenance planning and asset management.

PubMed

Sloane, Elliot; Rosow, Eric; Adam, Joe; Shine, Dave

2005-01-01

The Clinical Engineering (a.k.a. Biomedical Engineering) Department has heretofore lagged in adoption of some of the leading-edge information system tools used in other industries. This present application is part of a DOD-funded SBIR grant to improve the overall management of medical technology, and describes the capabilities that Strategic Graphical Dashboards (SGDs) can afford. This SGD is built on top of an Oracle database, and uses custom-written graphic objects like gauges, fuel tanks, and Geographic Information System (GIS) maps to improve and accelerate decision making.
Fischer Indole Synthesis in the Gas Phase, the Solution Phase, and at the Electrospray Droplet Interface.

PubMed

Bain, Ryan M; Ayrton, Stephen T; Cooks, R Graham

2017-07-01

Previous reports have shown that reactions occurring in the microdroplets formed during electrospray ionization can, under the right conditions, exhibit significantly greater rates than the corresponding bulk solution-phase reactions. The observed acceleration under electrospray ionization could result from a solution-phase, a gas-phase, or an interfacial reaction. This study shows that a gas-phase ion/molecule (or ion/ion) reaction is not responsible for the observed rate enhancement in the particular case of the Fischer indole synthesis. The results show that the accelerated reaction proceeds in the microdroplets, and evidence is provided that an interfacial process is involved.
Stereoscopic 3D graphics generation

NASA Astrophysics Data System (ADS)

Li, Zhi; Liu, Jianping; Zan, Y.

1997-05-01

Stereoscopic display technology is a key technique in areas such as simulation, multimedia, entertainment, and virtual reality, and stereoscopic 3D graphics generation is an important part of a stereoscopic 3D display system. In this paper, we first describe the principle of stereoscopic display and summarize some methods for generating stereoscopic 3D graphics. Second, to overcome the problems of methods based on user-defined models (such as inconvenience and long modification periods), we put forward a method based on vector graphics files. This makes it possible to design more directly, modify the model simply and easily, generate graphics more conveniently, and make full use of a graphics accelerator card. Finally, we discuss how to speed up the generation.
Using a commodity high-definition television for collaborative structural biology

PubMed Central

Yennamalli, Ragothaman; Arangarasan, Raj; Bryden, Aaron; Gleicher, Michael; Phillips, George N.

2014-01-01

Visualization of protein structures using stereoscopic systems is frequently needed by structural biologists working to understand a protein’s structure–function relationships. Often several scientists are working as a team and need simultaneous interaction with each other and the graphics representations. Most existing molecular visualization tools support single-user tasks, which are not suitable for a collaborative group. Expensive caves, domes or geowalls have been developed, but the availability and low cost of high-definition televisions (HDTVs) and game controllers in the commodity entertainment market provide an economically attractive option to achieve a collaborative environment. This paper describes a low-cost environment, using standard consumer game controllers and commercially available stereoscopic HDTV monitors with appropriate signal converters for structural biology collaborations employing existing binary distributions of commonly used software packages like Coot, PyMOL, Chimera, VMD, O, Olex2 and others. PMID:24904249
Forward and adjoint spectral-element simulations of seismic wave propagation using hardware accelerators

NASA Astrophysics Data System (ADS)

Peter, Daniel; Videau, Brice; Pouget, Kevin; Komatitsch, Dimitri

2015-04-01

Improving the resolution of tomographic images is crucial to answer important questions on the nature of Earth's subsurface structure and internal processes. Seismic tomography is the most prominent approach where seismic signals from ground-motion records are used to infer physical properties of internal structures such as compressional- and shear-wave speeds, anisotropy and attenuation. Recent advances in regional- and global-scale seismic inversions move towards full-waveform inversions which require accurate simulations of seismic wave propagation in complex 3D media, providing access to the full 3D seismic wavefields. However, these numerical simulations are computationally very expensive and need high-performance computing (HPC) facilities for further improving the current state of knowledge. During recent years, many-core architectures such as graphics processing units (GPUs) have been added to available large HPC systems. Such GPU-accelerated computing together with advances in multi-core central processing units (CPUs) can greatly accelerate scientific applications. There are mainly two possible choices of language support for GPU cards, the CUDA programming environment and the OpenCL language standard. CUDA software development targets NVIDIA graphics cards while OpenCL was adopted mainly by AMD graphics cards. In order to employ such hardware accelerators for seismic wave propagation simulations, we incorporated the code generation tool BOAST into the existing spectral-element code package SPECFEM3D_GLOBE. This allows us to use meta-programming of computational kernels and generate optimized source code for both CUDA and OpenCL languages, running simulations on either CUDA or OpenCL hardware accelerators. We show here applications of forward and adjoint seismic wave propagation on CUDA/OpenCL GPUs, validating results and comparing performances for different simulations and hardware usages.
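The meta-programming idea, one kernel description emitted as either CUDA or OpenCL source, can be illustrated with a toy generator. This is not BOAST and does not use its interface; the function name, the `scale` kernel, and the shared body string are hypothetical, chosen only to show that the two targets differ mainly in qualifiers and in how a thread obtains its index.

```python
# Kernel body shared by both targets: scale each element of x by a.
KERNEL_BODY = "if (i < n) x[i] *= a;"

def generate_scale_kernel(target):
    """Return source text for a toy 'scale' kernel in CUDA or OpenCL."""
    if target == "cuda":
        return (
            "__global__ void scale(float *x, float a, int n) {\n"
            "    int i = blockIdx.x * blockDim.x + threadIdx.x;\n"
            f"    {KERNEL_BODY}\n"
            "}\n"
        )
    if target == "opencl":
        return (
            "__kernel void scale(__global float *x, float a, int n) {\n"
            "    int i = get_global_id(0);\n"
            f"    {KERNEL_BODY}\n"
            "}\n"
        )
    raise ValueError(f"unknown target: {target}")

cuda_source = generate_scale_kernel("cuda")
opencl_source = generate_scale_kernel("opencl")
```

Keeping the arithmetic in one place and generating the language-specific wrapper around it means a numerical change only has to be made once for both back ends.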
A daily living activity remote monitoring system for solitary elderly people.

PubMed

Maki, Hiromichi; Ogawa, Hidekuni; Matsuoka, Shingo; Yonezawa, Yoshiharu; Caldwell, W Morton

2011-01-01

A daily living activity remote monitoring system has been developed for supporting solitary elderly people. The monitoring system consists of a tri-axis accelerometer, six low-power active filters, a low-power 8-bit microcontroller (MC), a 1GB SD memory card (SDMC) and a 2.4 GHz low transmitting power mobile phone (PHS). The tri-axis accelerometer attached to the subject's chest can simultaneously measure dynamic and static acceleration forces produced by heart sound, respiration, posture and behavior. The heart rate, respiration rate, activity, posture and behavior are detected from the dynamic and static acceleration forces. These data are stored in the SDMC. The MC sends the data to the server computer every hour. The server computer stores the data and makes a graphic chart from the data. When the caregiver calls from his/her mobile phone to the server computer, the server computer sends the graphical chart via the PHS. The caregiver's mobile phone displays the chart to the monitor graphically.
Modified atmosphere packaging of fruits and vegetables.

PubMed

Kader, A A; Zagory, D; Kerbel, E L

1989-01-01

Modified atmospheres (MA), i.e., elevated concentrations of carbon dioxide and reduced levels of oxygen and ethylene, can be useful supplements to provide optimum temperature and relative humidity in maintaining the quality of fresh fruits and vegetables after harvest. MA benefits include reduced respiration, ethylene production, and sensitivity to ethylene; retarded softening and compositional changes; alleviation of certain physiological disorders; and reduced decay. Subjecting fresh produce to too low an oxygen concentration and/or to too high a carbon dioxide level can result in MA stress, which is manifested by accelerated deterioration. Packaging fresh produce in polymeric films can result in a commodity-generated MA. Atmosphere modification within such packages depends on film permeability, commodity respiration rate and gas diffusion characteristics, and initial free volume and atmospheric composition within the package. Temperature, relative humidity, and air movement around the package can influence the permeability of the film. Temperature also affects the metabolic activity of the commodity and consequently the rate of attaining the desired MA. All these factors must be considered in developing a mathematical model for selecting the most suitable film for each commodity.
Performance and scalability of Fourier domain optical coherence tomography acceleration using graphics processing units.

PubMed

Li, Jian; Bloch, Pavel; Xu, Jing; Sarunic, Marinko V; Shannon, Lesley

2011-05-01

Fourier domain optical coherence tomography (FD-OCT) provides faster line rates, better resolution, and higher sensitivity for noninvasive, in vivo biomedical imaging compared to traditional time domain OCT (TD-OCT). However, because the signal processing for FD-OCT is computationally intensive, real-time FD-OCT applications demand powerful computing platforms to deliver acceptable performance. Graphics processing units (GPUs) have been used as coprocessors to accelerate FD-OCT by leveraging their relatively simple programming model to exploit thread-level parallelism. Unfortunately, GPUs do not "share" memory with their host processors, requiring additional data transfers between the GPU and CPU. In this paper, we implement a complete FD-OCT accelerator on a consumer grade GPU/CPU platform. Our data acquisition system uses spectrometer-based detection and a dual-arm interferometer topology with numerical dispersion compensation for retinal imaging. We demonstrate that the maximum line rate is dictated by the memory transfer time and not the processing time due to the GPU platform's memory model. Finally, we discuss how the performance trends of GPU-based accelerators compare to the expected future requirements of FD-OCT data rates.
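The claim that memory transfer rather than processing limits the line rate can be checked with a back-of-the-envelope throughput model. The sketch below assumes lines are handled in batches and that copies and kernels either overlap perfectly or not at all; the function name and the timing figures are hypothetical and are not measurements from the paper.

```python
def max_line_rate(lines_per_batch, transfer_s, compute_s, overlapped=True):
    """Upper bound on processed lines per second for one batch pipeline."""
    if overlapped:
        # With copies hidden behind kernels (or vice versa), the slower
        # stage sets the pace.
        batch_time = max(transfer_s, compute_s)
    else:
        # Without overlap the two stages simply add up.
        batch_time = transfer_s + compute_s
    return lines_per_batch / batch_time

# Hypothetical batch: 1024 lines, 4 ms of host-device copies, 1 ms of GPU work.
rate_overlapped = max_line_rate(1024, 4e-3, 1e-3)
rate_serial = max_line_rate(1024, 4e-3, 1e-3, overlapped=False)
```

In this made-up case the transfer stage is four times slower than the compute stage, so making the kernels faster would barely change the achievable line rate.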
A real-time GNSS-R system based on software-defined radio and graphics processing units

NASA Astrophysics Data System (ADS)

Hobiger, Thomas; Amagai, Jun; Aida, Masanori; Narita, Hideki

2012-04-01

Reflected signals of the Global Navigation Satellite System (GNSS) from the sea or land surface can be utilized to deduce and monitor physical and geophysical parameters of the reflecting area. Unlike most other remote sensing techniques, GNSS-Reflectometry (GNSS-R) operates as a passive radar that takes advantage of the increasing number of navigation satellites that broadcast their L-band signals. Most GNSS-R receiver architectures are based on dedicated hardware solutions. Software-defined radio (SDR) technology has advanced in recent years and enabled signal processing in real time, which makes it an ideal candidate for the realization of a flexible GNSS-R system. Additionally, modern commodity graphics cards, which offer massively parallel computing performance, make it possible to handle the whole signal processing chain without interfering with the PC's CPU. Thus, this paper describes a GNSS-R system which has been developed on the principles of software-defined radio supported by General Purpose Graphics Processing Units (GPGPUs), and presents results from initial field tests which confirm the anticipated capability of the system.
Viewpoints: A High-Performance High-Dimensional Exploratory Data Analysis Tool

NASA Astrophysics Data System (ADS)

Gazis, P. R.; Levit, C.; Way, M. J.

2010-12-01

Scientific data sets continue to increase in both size and complexity. In the past, dedicated graphics systems at supercomputing centers were required to visualize large data sets, but as the price of commodity graphics hardware has dropped and its capability has increased, it is now possible, in principle, to view large complex data sets on a single workstation. To do this in practice, an investigator will need software that is written to take advantage of the relevant graphics hardware. The Viewpoints visualization package described herein is an example of such software. Viewpoints is an interactive tool for exploratory visual analysis of large high-dimensional (multivariate) data. It leverages the capabilities of modern graphics boards (GPUs) to run on a single workstation or laptop. Viewpoints is minimalist: it attempts to do a small set of useful things very well (or at least very quickly) in comparison with similar packages today. Its basic feature set includes linked scatter plots with brushing, dynamic histograms, normalization, and outlier detection/removal. Viewpoints was originally designed for astrophysicists, but it has since been used in a variety of fields that range from astronomy, quantum chemistry, fluid dynamics, machine learning, bioinformatics, and finance to information technology server log mining. In this article, we describe the Viewpoints package and show examples of its usage.
Introduction of Parallel GPGPU Acceleration Algorithms for the Solution of Radiative Transfer

NASA Technical Reports Server (NTRS)

Godoy, William F.; Liu, Xu

2011-01-01

General-purpose computing on graphics processing units (GPGPU) is a recent technique that allows the parallel graphics processing unit (GPU) to accelerate calculations performed sequentially by the central processing unit (CPU). To introduce GPGPU to radiative transfer, the Gauss-Seidel solution of the well-known expressions for 1-D and 3-D homogeneous, isotropic media is selected as a test case. Different algorithms are introduced to balance memory and GPU-CPU communication, critical aspects of GPGPU. Results show that speed-ups of one to two orders of magnitude are obtained when compared to sequential solutions. The underlying value of GPGPU is its potential extension in radiative solvers (e.g., Monte Carlo, discrete ordinates) at a minimal learning curve.
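For readers unfamiliar with the test case, a plain sequential Gauss-Seidel sweep is sketched below on a small linear system. It is a generic textbook version, not the radiative transfer solver from the report; the function name and the 2×2 system are hypothetical.

```python
import numpy as np

def gauss_seidel(A, b, iterations=50):
    """Solve A x = b by repeated Gauss-Seidel sweeps, starting from zero."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    x = np.zeros_like(b)
    for _ in range(iterations):
        for i in range(len(b)):
            # Sum of off-diagonal terms; x[:i] already holds this sweep's
            # updated values, which is what makes the method sequential.
            sigma = A[i] @ x - A[i, i] * x[i]
            x[i] = (b[i] - sigma) / A[i, i]
    return x

# Hypothetical diagonally dominant system, for which the sweeps converge.
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = gauss_seidel(A, b)
```

The dependence of each update on the ones before it is the obstacle to parallelising the method directly, which is consistent with the report introducing different algorithms to balance memory use and GPU-CPU communication.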
The Study of Two-Dimensional Oscillations Using a Smartphone Acceleration Sensor: Example of Lissajous Curves

ERIC Educational Resources Information Center

Tuset-Sanchis, Luis; Castro-Palacio, Juan C.; Gómez-Tejedor, José A.; Manjón, Francisco J.; Monsoriu, Juan A.

2015-01-01

A smartphone acceleration sensor is used to study two-dimensional harmonic oscillations. The data recorded by the free android application, Accelerometer Toy, is used to determine the periods of oscillation by graphical analysis. Different patterns of the Lissajous curves resulting from the superposition of harmonic motions are illustrated for…
7 CFR 1436.13 - Loan installments, delinquency, and acceleration of maturity date.

Code of Federal Regulations, 2014 CFR

2014-01-01

... each 12-month period of each of the partial and final loan disbursements, until the principal plus...) COMMODITY CREDIT CORPORATION, DEPARTMENT OF AGRICULTURE LOANS, PURCHASES, AND OTHER OPERATIONS FARM STORAGE... financial plan submitted by the debtor, CCC may send two subsequent written demands at approximately 30...
7 CFR 1436.13 - Loan installments, delinquency, and acceleration of maturity date.

Code of Federal Regulations, 2011 CFR

2011-01-01

... each 12-month period of each of the partial and final loan disbursements, until the principal plus...) COMMODITY CREDIT CORPORATION, DEPARTMENT OF AGRICULTURE LOANS, PURCHASES, AND OTHER OPERATIONS FARM STORAGE... financial plan submitted by the debtor, CCC may send two subsequent written demands at approximately 30...
7 CFR 1436.13 - Loan installments, delinquency, and acceleration of maturity date.

Code of Federal Regulations, 2012 CFR

2012-01-01

... each 12-month period of each of the partial and final loan disbursements, until the principal plus...) COMMODITY CREDIT CORPORATION, DEPARTMENT OF AGRICULTURE LOANS, PURCHASES, AND OTHER OPERATIONS FARM STORAGE... financial plan submitted by the debtor, CCC may send two subsequent written demands at approximately 30...
7 CFR 1436.13 - Loan installments, delinquency, and acceleration of maturity date.

Code of Federal Regulations, 2013 CFR

2013-01-01

... each 12-month period of each of the partial and final loan disbursements, until the principal plus...) COMMODITY CREDIT CORPORATION, DEPARTMENT OF AGRICULTURE LOANS, PURCHASES, AND OTHER OPERATIONS FARM STORAGE... financial plan submitted by the debtor, CCC may send two subsequent written demands at approximately 30...
Real-Time GPS-Alternative Navigation Using Commodity Hardware

DTIC Science & Technology

2007-06-01

4.1 Test Plan and Setup ... 4.1.1 Component and...improvements planned, the most influential for navigation are additional signals, frequencies, and improved signal strength. These improvements will... planned and implemented to provide maximum extensibility for additional sensors and functionality without disturbing the core GPU-accelerated
Simulating economics and environmental impacts of beef and soybean systems in Brazil's Pampas and Amazon Biomes

USDA-ARS's Scientific Manuscript database

Recent reductions in the deforestation of the Amazon biome have highlighted the need for the sustainable intensification of beef and commodity crop production in Brazil to increase agricultural productivity without accelerating adverse environmental impacts related to greenhouse gas emissions, eutro...
The study of two-dimensional oscillations using a smartphone acceleration sensor: example of Lissajous curves

NASA Astrophysics Data System (ADS)

Tuset-Sanchis, Luis; Castro-Palacio, Juan C.; Gómez-Tejedor, José A.; Manjón, Francisco J.; Monsoriu, Juan A.

2015-08-01

A smartphone acceleration sensor is used to study two-dimensional harmonic oscillations. The data recorded by the free android application, Accelerometer Toy, is used to determine the periods of oscillation by graphical analysis. Different patterns of the Lissajous curves resulting from the superposition of harmonic motions are illustrated for three experiments. This work introduces an example of how two-dimensional oscillations can be easily studied with a smartphone acceleration sensor.
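A Lissajous curve is the path traced when two perpendicular harmonic motions are superposed. The sketch below generates such a curve from assumed sinusoids; the function name, the parameters, and the equal-frequency example are hypothetical and unrelated to the recorded smartphone data.

```python
import numpy as np

def lissajous(freq_x, freq_y, phase, amp_x=1.0, amp_y=1.0,
              samples=1000, duration=1.0):
    """Return x(t), y(t) for two perpendicular harmonic oscillations."""
    t = np.linspace(0.0, duration, samples)
    x = amp_x * np.sin(2.0 * np.pi * freq_x * t)
    y = amp_y * np.sin(2.0 * np.pi * freq_y * t + phase)
    return x, y

# Equal frequencies, equal amplitudes, quarter-period phase shift:
# the curve is a circle of radius 1.
x, y = lissajous(1.0, 1.0, np.pi / 2.0)
```

Changing the frequency ratio (for example 1:2 or 2:3) or the phase produces the other familiar figures, which is how the pattern of a measured curve reveals the ratio of the two periods.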
Graphics-processing-unit-accelerated finite-difference time-domain simulation of the interaction between ultrashort laser pulses and metal nanoparticles

NASA Astrophysics Data System (ADS)

Nikolskiy, V. P.; Stegailov, V. V.

2018-01-01

Metal nanoparticles (NPs) serve as important tools for many modern technologies. However, the proper microscopic models of the interaction between ultrashort laser pulses and metal NPs are currently not very well developed in many cases. One part of the problem is the description of the warm dense matter that is formed in NPs after intense irradiation. Another part of the problem is the description of the electromagnetic waves around NPs. Description of wave propagation requires the solution of Maxwell’s equations and the finite-difference time-domain (FDTD) method is the classic approach for solving them. There are many commercial and free implementations of FDTD, including the open source software that supports graphics processing unit (GPU) acceleration. In this report we present the results on the FDTD calculations for different cases of the interaction between ultrashort laser pulses and metal nanoparticles. Following our previous results, we analyze the efficiency of the GPU acceleration of the FDTD algorithm.

On accelerated flow of MHD Powell-Eyring fluid via homotopy analysis method

NASA Astrophysics Data System (ADS)

Salah, Faisal; Viswanathan, K. K.; Aziz, Zainal Abdul

2017-09-01

The aim of this article is to obtain the approximate analytical solution for incompressible magnetohydrodynamic (MHD) flow of a Powell-Eyring fluid induced by an accelerated plate. Both constant and variable acceleration cases are investigated. The approximate analytical solution in each case is obtained by using the Homotopy Analysis Method (HAM). The resulting nonlinear analysis is carried out to generate the series solution. Finally, graphical results showing the effect of different values of the material constants on the velocity field are discussed and analyzed.
Using FastX on the Peregrine System | High-Performance Computing | NREL

Science.gov Websites

with full 3D hardware acceleration. The traditional method of displaying graphics applications to a remote X server (indirect rendering) supports 3D hardware acceleration, but this approach causes all of the OpenGL commands and 3D data to be sent over the network to be rendered on the client machine. With
Graphics Processing Unit Acceleration of Gyrokinetic Turbulence Simulations

NASA Astrophysics Data System (ADS)

Hause, Benjamin; Parker, Scott; Chen, Yang

2013-10-01

We find a substantial increase in on-node performance using Graphics Processing Unit (GPU) acceleration in gyrokinetic delta-f particle-in-cell simulation. Optimization is performed on a two-dimensional slab gyrokinetic particle simulation using the Portland Group Fortran compiler with the OpenACC compiler directives and CUDA Fortran. Mixed implementation of both OpenACC and CUDA is demonstrated. CUDA is required for optimizing the particle deposition algorithm. We have implemented the GPU acceleration on a third generation Core i7 gaming PC with two NVIDIA GTX 680 GPUs. We find comparable, or better, acceleration relative to the NERSC DIRAC cluster with the NVIDIA Tesla C2050 computing processor. The Tesla C2050 is about 2.6 times more expensive than the GTX 580 gaming GPU. We also see enormous speedups (a factor of 10 or more) on the Titan supercomputer at Oak Ridge with Kepler K20 GPUs. Results show speed-ups comparable to or better than those of OpenMP models utilizing multiple cores. The use of hybrid OpenACC, CUDA Fortran, and MPI models across many nodes will also be discussed. Optimization strategies will be presented. We will discuss progress on optimizing the comprehensive three-dimensional general geometry GEM code.
Accelerating sino-atrium computer simulations with graphic processing units.

PubMed

Zhang, Hong; Xiao, Zheng; Lin, Shien-fong

2015-01-01

Sino-atrial node cells (SANCs) play a significant role in rhythmic firing. To investigate their role in arrhythmia and interactions with the atrium, computer simulations based on cellular dynamic mathematical models are generally used. However, the large-scale computation usually makes research difficult, given the limited computational power of Central Processing Units (CPUs). In this paper, an accelerating approach with Graphic Processing Units (GPUs) is proposed in a simulation consisting of the SAN tissue and the adjoining atrium. By using the operator splitting method, the computational task was made parallel. Three parallelization strategies were then put forward. The strategy with the shortest running time was further optimized by considering block size, data transfer and partition. The results showed that for a simulation with 500 SANCs and 30 atrial cells, the execution time taken by the non-optimized program decreased by 62% with respect to a serial program running on a CPU. The execution time decreased by 80% after the program was optimized. The larger the tissue was, the more significant the acceleration became. The results demonstrated the effectiveness of the proposed GPU-accelerating methods and their promising applications in more complicated biological simulations.
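The percentages above are reductions in execution time, which is not the same scale as the speedup factors quoted in other records. Assuming both figures are measured against the same serial baseline, the conversion is speedup = 1 / (1 - reduction); the helper below is a hypothetical illustration of that arithmetic.

```python
def speedup_from_time_reduction(reduction_percent):
    """Convert a percentage decrease in run time into a speedup factor."""
    remaining_fraction = 1.0 - reduction_percent / 100.0
    return 1.0 / remaining_fraction

# The two reductions reported in the abstract.
non_optimized = speedup_from_time_reduction(62.0)  # about 2.6 times faster
optimized = speedup_from_time_reduction(80.0)      # about 5 times faster
```

So the reported 62% and 80% reductions correspond to speedups of roughly 2.6 and 5 respectively.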
A fast CT reconstruction scheme for a general multi-core PC.

PubMed

Zeng, Kai; Bai, Erwei; Wang, Ge

2007-01-01

Expensive computational cost is a severe limitation in CT reconstruction for clinical applications that need real-time feedback. A primary example is bolus-chasing computed tomography (CT) angiography (BCA) that we have been developing for the past several years. To accelerate the reconstruction process using the filtered backprojection (FBP) method, specialized hardware or graphics cards can be used. However, specialized hardware is expensive and not flexible. The graphics processing unit (GPU) in a current graphics card can only reconstruct images at reduced precision and is not easy to program. In this paper, an acceleration scheme is proposed based on a multi-core PC. In the proposed scheme, several techniques are integrated, including utilization of geometric symmetry, optimization of data structures, single-instruction multiple-data (SIMD) processing, multithreaded computation, and an Intel C++ compiler. Our scheme maintains the original precision and involves no data exchange between the GPU and CPU. The merits of our scheme are demonstrated in numerical experiments against the traditional implementation. Our scheme achieves a speedup of about 40, which can be further improved severalfold using the latest quad-core processors.
A Fast CT Reconstruction Scheme for a General Multi-Core PC

PubMed Central

Zeng, Kai; Bai, Erwei; Wang, Ge

2007-01-01

Expensive computational cost is a severe limitation in CT reconstruction for clinical applications that need real-time feedback. A primary example is bolus-chasing computed tomography (CT) angiography (BCA) that we have been developing for the past several years. To accelerate the reconstruction process using the filtered backprojection (FBP) method, specialized hardware or graphics cards can be used. However, specialized hardware is expensive and not flexible. The graphics processing unit (GPU) in a current graphics card can only reconstruct images at reduced precision and is not easy to program. In this paper, an acceleration scheme is proposed based on a multi-core PC. In the proposed scheme, several techniques are integrated, including utilization of geometric symmetry, optimization of data structures, single-instruction multiple-data (SIMD) processing, multithreaded computation, and an Intel C++ compiler. Our scheme maintains the original precision and involves no data exchange between the GPU and CPU. The merits of our scheme are demonstrated in numerical experiments against the traditional implementation. Our scheme achieves a speedup of about 40, which can be further improved severalfold using the latest quad-core processors. PMID:18256731
Hardware Testing and System Evaluation: Procedures to Evaluate Commodity Hardware for Production Clusters

DOE Office of Scientific and Technical Information (OSTI.GOV)

Goebel, J

2004-02-27

Without stable hardware any program will fail. The frustration and expense of supporting bad hardware can drain an organization, delay progress, and frustrate everyone involved. At Stanford Linear Accelerator Center (SLAC), we have created a testing method that helps our group, SLAC Computer Services (SCS), weed out potentially bad hardware and purchase the best hardware at the best possible cost. Commodity hardware changes often, so new evaluations happen periodically each time we purchase systems and minor re-evaluations happen for revised systems for our clusters, about twice a year. This general framework helps SCS perform correct, efficient evaluations. This article outlines SCS's computer testing methods and our system acceptance criteria. We expanded the basic ideas to other evaluations such as storage, and we think the methods outlined in this article have helped us choose hardware that is much more stable and supportable than our previous purchases. We have found that commodity hardware ranges in quality, so a systematic method and tools for hardware evaluation were necessary. This article is based on one instance of a hardware purchase, but the guidelines apply to the general problem of purchasing commodity computer systems for production computational work.
Accelerating Molecular Dynamic Simulation on Graphics Processing Units

PubMed Central

Friedrichs, Mark S.; Eastman, Peter; Vaidyanathan, Vishal; Houston, Mike; Legrand, Scott; Beberg, Adam L.; Ensign, Daniel L.; Bruns, Christopher M.; Pande, Vijay S.

2009-01-01

We describe a complete implementation of all-atom protein molecular dynamics running entirely on a graphics processing unit (GPU), including all standard force field terms, integration, constraints, and implicit solvent. We discuss the design of our algorithms and important optimizations needed to fully take advantage of a GPU. We evaluate its performance, and show that it can be more than 700 times faster than a conventional implementation running on a single CPU core. PMID:19191337
Graphics processing unit accelerated intensity-based optical coherence tomography angiography using differential frames with real-time motion correction.

PubMed

Watanabe, Yuuki; Takahashi, Yuhei; Numazawa, Hiroshi

2014-02-01

We demonstrate intensity-based optical coherence tomography (OCT) angiography using the squared difference of two sequential frames with bulk-tissue-motion (BTM) correction. This motion correction was performed by minimization of the sum of the pixel values using axial- and lateral-pixel-shifted structural OCT images. We extract the BTM-corrected image from a total of 25 calculated OCT angiographic images. Image processing was accelerated by a graphics processing unit (GPU) with many stream processors to optimize the parallel processing procedure. The GPU processing rate was faster than that of a line scan camera (46.9 kHz). Our OCT system provides the means of displaying structural OCT images and BTM-corrected OCT angiographic images in real time.
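The motion-correction step described in this abstract can be illustrated with a small sketch. This is not the authors' GPU code: it is a plain-Python illustration that assumes a search window of plus or minus 2 pixels in each direction (which happens to give 25 candidate images) and compares candidates over a common interior region so that every sum covers the same number of pixels. The function name `angiogram` and the `max_shift` parameter are hypothetical.

```python
def angiogram(frame_a, frame_b, max_shift=2):
    """Return (total, (dz, dx), image) for the shift with the smallest pixel sum.

    frame_a and frame_b are 2-D lists of intensities from two sequential frames.
    For every axial shift dz and lateral shift dx in [-max_shift, max_shift],
    the squared-difference image is computed on the interior region that stays
    inside both frames; the candidate with the smallest sum is kept.
    """
    rows, cols = len(frame_a), len(frame_a[0])
    m = max_shift
    best = None
    for dz in range(-m, m + 1):
        for dx in range(-m, m + 1):
            image = [[(frame_a[z][x] - frame_b[z + dz][x + dx]) ** 2
                      for x in range(m, cols - m)]
                     for z in range(m, rows - m)]
            total = sum(map(sum, image))
            if best is None or total < best[0]:
                best = (total, (dz, dx), image)
    return best
```

Each candidate shift is independent of the others, which is the kind of structure that maps naturally onto parallel GPU threads.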
High-performance image reconstruction in fluorescence tomography on desktop computers and graphics hardware.

PubMed

Freiberger, Manuel; Egger, Herbert; Liebmann, Manfred; Scharfetter, Hermann

2011-11-01

Image reconstruction in fluorescence optical tomography is a three-dimensional nonlinear ill-posed problem governed by a system of partial differential equations. In this paper we demonstrate that a combination of state-of-the-art numerical algorithms and a careful hardware-optimized implementation makes it possible to solve this large-scale inverse problem in a few seconds on standard desktop PCs with modern graphics hardware. In particular, we present methods to solve not only the forward but also the nonlinear inverse problem by massively parallel programming on graphics processors. A comparison of optimized CPU and GPU implementations shows that the reconstruction can be accelerated by factors of about 15 through the use of the graphics hardware without compromising the accuracy in the reconstructed images.
Graphics performance in rich Internet applications.

PubMed

Hoetzlein, Rama C

2012-01-01

Rendering performance for rich Internet applications (RIAs) has recently focused on the debate between using Flash and HTML5 for streaming video and gaming on mobile devices. A key area not widely explored, however, is the scalability of raw bitmap graphics performance for RIAs. Does Flash render animated sprites faster than HTML5? How much faster is WebGL than Flash? Answers to these questions are essential for developing large-scale data visualizations, online games, and truly dynamic websites. A new test methodology analyzes graphics performance across RIA frameworks and browsers, revealing specific performance outliers in existing frameworks. The results point toward a future in which all online experiences might be GPU accelerated.
Spectral-element Seismic Wave Propagation on CUDA/OpenCL Hardware Accelerators

NASA Astrophysics Data System (ADS)

Peter, D. B.; Videau, B.; Pouget, K.; Komatitsch, D.

2015-12-01

Seismic wave propagation codes are essential tools to investigate a variety of wave phenomena in the Earth. Furthermore, they can now be used for seismic full-waveform inversions in regional- and global-scale adjoint tomography. Although these seismic wave propagation solvers are crucial ingredients to improve the resolution of tomographic images to answer important questions about the nature of Earth's internal processes and subsurface structure, their practical application is often limited due to high computational costs. They thus need high-performance computing (HPC) facilities to improve the current state of knowledge. At present, numerous large HPC systems embed many-core architectures such as graphics processing units (GPUs) to enhance numerical performance. Such hardware accelerators can be programmed using either the CUDA programming environment or the OpenCL language standard. CUDA software development targets NVIDIA graphics cards, while OpenCL was adopted by additional hardware accelerators, such as AMD graphics cards, ARM-based processors, and Intel Xeon Phi coprocessors. For seismic wave propagation simulations using the open-source spectral-element code package SPECFEM3D_GLOBE, we incorporated an automatic source-to-source code generation tool (BOAST) which allows us to use meta-programming of all computational kernels for forward and adjoint runs. Using our BOAST kernels, we generate optimized source code for both CUDA and OpenCL languages within the source code package. Thus, seismic wave simulations can now fully utilize CUDA and OpenCL hardware accelerators. We show benchmarks of forward seismic wave propagation simulations using SPECFEM3D_GLOBE on CUDA/OpenCL GPUs, validating results and comparing performances for different simulations and hardware usages.
Field: a new meta-authoring platform for data-intensive scientific visualization

NASA Astrophysics Data System (ADS)

Downie, M.; Ameres, E.; Fox, P. A.; Goebel, J.; Graves, A.; Hendler, J.

2012-12-01

This presentation will demonstrate a new platform for data-intensive scientific visualization, called Field, that rethinks the problem of visual data exploration. Several new opportunities for scientific visualization present themselves at this moment in time. We believe that when taken together they may catalyze a transformation of the practice of science and begin to seed a technical culture within science that fuses data analysis, programming and myriad visual strategies. It is at integrative levels that the principal challenges exist, for many fundamental technical components of our field are now well understood and widely available. File formats from CSV through HDF all have broad library support; low-level high-performance graphics APIs (OpenGL) are in a period of stable growth; and a dizzying ecosystem of analysis and machine learning libraries abounds. The hardware of computer graphics offers unprecedented computing power within commodity components; programming languages and platforms are coalescing around a core set of umbrella runtimes. Each of these trends is set to continue: computer graphics hardware is developing at a super-Moore-law rate, and trends in publication and dissemination point only towards an increasing amount of access to code and data. The critical opportunity here for scientific visualization is, we maintain, not in developing a new statistical library, nor a new tool centered on a particular technique, but rather a new visual, "live" programming environment that is promiscuous in its scope. We can identify the necessary methodological practices and traditions required here not in science or engineering but in the "live-coding" practices prevalent in the fields of digital art and design. We can define this practice as an approach to programming that is live, iterative, integrative, speculative and exploratory.
"Live" because it is exclusively practiced in real time (often during performance); "iterative" because intermediate programs and their visual results are constantly being made and remade en route; "speculative" because these programs and images result from a mode of inquiry into image-making not unlike that of hypothesis formation and testing; "integrative" because this style draws deeply upon the libraries of algorithms and materials available online today; and "exploratory" because the results of these speculations are inherently open to the data and unforeseen at the outset. To this end our development environment, Field, comprises a minimal core and a powerful plug-in system that can be extended from within the environment itself. By providing a hybrid text editor that can incorporate text-based programming alongside graphical user-interface elements, its flexible and extensible interface provides space as necessary for notation, visualization, interface construction, and introspection. In addition, it provides an advanced GPU-accelerated graphics system ideal for large-scale data visualization. Since Field was created in the context of widely divergent interdisciplinary projects, its aim is to give its users not only the ability to work rapidly, but to shape their Field environment extensively and flexibly for their own demands.
Experiences modeling ocean circulation problems on a 30 node commodity cluster with 3840 GPU processor cores.

NASA Astrophysics Data System (ADS)

Hill, C.

2008-12-01

Low-cost graphics cards today use many relatively simple compute cores to deliver memory bandwidth of more than 100 GB/s and theoretical floating point performance of more than 500 GFlop/s. Right now this performance is, however, only accessible to highly parallel algorithm implementations that (i) can use a hundred or more concurrently executing 32-bit floating point cores, (ii) can work with graphics memory that resides on the graphics card side of the graphics bus and (iii) can be partially expressed in a language that can be compiled by a graphics programming tool. In this talk we describe our experiences implementing a complete, but relatively simple, time-dependent shallow-water equations simulation targeting a cluster of 30 computers each hosting one graphics card. The implementation takes into account the considerations (i), (ii) and (iii) listed previously. We code our algorithm as a series of numerical kernels. Each kernel is designed to be executed by multiple threads of a single process. Kernels are passed memory blocks to compute over, which can be persistent blocks of memory on a graphics card. Each kernel is individually implemented using the NVidia CUDA language but driven from a higher-level supervisory code that is almost identical to a standard model driver. The supervisory code controls the overall simulation timestepping, but is written to minimize data transfer between main memory and graphics memory (a massive performance bottleneck on current systems). Using the recipe outlined we can boost the performance of our cluster by nearly an order of magnitude, relative to the same algorithm executing only on the cluster CPUs. Achieving this performance boost requires that many threads are available to each graphics processor for execution within each numerical kernel and that the simulation's working set of data can fit into the graphics card memory. 
As we describe, this puts interesting upper and lower bounds on the problem sizes for which this technology is currently most useful. However, many interesting problems fit within this envelope. Looking forward, we extrapolate our experience to estimate full-scale ocean model performance and applicability. Finally we describe preliminary hybrid mixed 32-bit and 64-bit experiments with graphics cards that support 64-bit arithmetic, albeit at a lower performance.
Transverse emittance and phase space program developed for use at the Fermilab A0 Photoinjector

DOE Office of Scientific and Technical Information (OSTI.GOV)

Thurman-Keup, R.; Johnson, A.S.; Lumpkin, A.H.

2011-03-01

The Fermilab A0 Photoinjector is a 16 MeV high intensity, high brightness electron linac developed for advanced accelerator R&D. One of the key parameters for the electron beam is the transverse beam emittance. Here we report on a newly developed MATLAB based GUI program used for transverse emittance measurements using the multi-slit technique. This program combines the image acquisition and post-processing tools for determining the transverse phase space parameters with uncertainties. An integral part of accelerator research is a measurement of the beam phase space. Measurements of the transverse phase space can be accomplished by a variety of methods including multiple screens separated by drift spaces, or by sampling phase space via pepper pots or slits. In any case, the measurement of the phase space parameters, in particular the emittance, can be drastically simplified and sped up by automating the measurement in an intuitive fashion utilizing a graphical interface. At the A0 Photoinjector (A0PI), the control system is DOOCS, which originated at DESY. In addition, there is a library for interfacing to MATLAB, a graphically capable numerical analysis package sold by The Mathworks. It is this graphical package which was chosen as the basis for a graphical phase space measurement system due to its combination of analysis and display capabilities.
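For readers unfamiliar with the quantity being measured, the sketch below computes the standard statistical (rms) emittance, the square root of ⟨x²⟩⟨x′²⟩ − ⟨xx′⟩², from sampled phase-space points. The abstract does not say how the MATLAB program computes emittance from slit images, so this is only the textbook definition applied to hypothetical `(x, x')` samples, written in Python rather than MATLAB.

```python
import math

def rms_emittance(x, xp):
    """RMS emittance from positions x and divergences xp (equal-length lists).

    Uses centred second moments: sqrt(<x^2><x'^2> - <x x'>^2).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_xp = sum(xp) / n
    sxx = sum((a - mean_x) ** 2 for a in x) / n
    spp = sum((b - mean_xp) ** 2 for b in xp) / n
    sxp = sum((a - mean_x) * (b - mean_xp) for a, b in zip(x, xp)) / n
    # Clamp tiny negative values caused by floating-point rounding.
    return math.sqrt(max(sxx * spp - sxp ** 2, 0.0))
```

A perfectly correlated distribution (all points on a line in phase space) gives zero emittance, while an uncorrelated one gives the product of the rms size and rms divergence.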
Performance Theory of Diagonal Conducting Wall MHD Accelerators

NASA Technical Reports Server (NTRS)

Litchford, R. J.

2003-01-01

The theoretical performance of diagonal conducting wall crossed field accelerators is examined on the basis of an infinite segmentation assumption using a cross-plane averaged generalized Ohm's law for a partially ionized gas, including ion slip. The desired accelerator performance relationships are derived from the cross-plane averaged Ohm's law by imposing appropriate configuration and loading constraints. A current dependent effective voltage drop model is also incorporated to account for cold-wall boundary layer effects including gasdynamic variations, discharge constriction, and electrode falls. Definition of dimensionless electric fields and current densities lead to the construction of graphical performance diagrams, which further illuminate the rudimentary behavior of crossed field accelerator operation.
Computer generated hologram from point cloud using graphics processor.

PubMed

Chen, Rick H-Y; Wilkinson, Timothy D

2009-12-20

Computer generated holography is an extremely demanding and complex task when it comes to providing realistic reconstructions with full parallax, occlusion, and shadowing. We present an algorithm designed for data-parallel computing on modern graphics processing units to alleviate the computational burden. We apply Gaussian interpolation to create a continuous surface representation from discrete input object points. The algorithm maintains a potential occluder list for each individual hologram plane sample to keep the number of visibility tests to a minimum. We experimented with two approximations that simplify and accelerate occlusion computation. It is observed that letting several neighboring hologram plane samples share visibility information on object points leads to significantly faster computation without causing noticeable artifacts in the reconstructed images. Computing a reduced sample set via nonuniform sampling is also found to be an effective acceleration technique.
BarraCUDA - a fast short read sequence aligner using graphics processing units

PubMed Central

2012-01-01

Background With the maturation of next-generation DNA sequencing (NGS) technologies, the throughput of DNA sequencing reads has soared to over 600 gigabases from a single instrument run. General purpose computing on graphics processing units (GPGPU) extracts the computing power from hundreds of parallel stream processors within graphics processing cores and provides a cost-effective and energy-efficient alternative to traditional high-performance computing (HPC) clusters. In this article, we describe the implementation of BarraCUDA, a GPGPU sequence alignment software that is based on BWA, to accelerate the alignment of sequencing reads generated by these instruments to a reference DNA sequence. Findings Using the NVIDIA Compute Unified Device Architecture (CUDA) software development environment, we ported the most computationally intensive alignment component of BWA to GPU to take advantage of the massive parallelism. As a result, BarraCUDA offers an order of magnitude performance boost in alignment throughput when compared to a CPU core while delivering the same level of alignment fidelity. The software is also capable of supporting multiple CUDA devices in parallel to further accelerate the alignment throughput. Conclusions BarraCUDA is designed to take advantage of the parallelism of GPU to accelerate the alignment of millions of sequencing reads generated by NGS instruments. By doing this, we could, at least in part, streamline the current bioinformatics pipeline such that the wider scientific community could benefit from the sequencing technology. BarraCUDA is currently available from http://seqbarracuda.sf.net PMID:22244497
A Linux Workstation for High Performance Graphics

NASA Technical Reports Server (NTRS)

Geist, Robert; Westall, James

2000-01-01

The primary goal of this effort was to provide a low-cost method of obtaining high-performance 3-D graphics using an industry standard library (OpenGL) on PC class computers. Previously, users interested in doing substantial visualization or graphical manipulation were constrained to using specialized, custom hardware most often found in computers from Silicon Graphics (SGI). We provided an alternative to expensive SGI hardware by taking advantage of third-party, 3-D graphics accelerators that have now become available at very affordable prices. To make use of this hardware our goal was to provide a free, redistributable, and fully-compatible OpenGL work-alike library so that existing bodies of code could simply be recompiled for PC class machines running a free version of Unix. This should allow substantial cost savings while greatly expanding the population of people with access to a serious graphics development and viewing environment. This should offer a means for NASA to provide a spectrum of graphics performance to its scientists, supplying high-end specialized SGI hardware for high-performance visualization while fulfilling the requirements of medium and lower performance applications with generic, off-the-shelf components and still maintaining compatibility between the two.
GPU-accelerated low-latency real-time searches for gravitational waves from compact binary coalescence

NASA Astrophysics Data System (ADS)

Liu, Yuan; Du, Zhihui; Chung, Shin Kee; Hooper, Shaun; Blair, David; Wen, Linqing

2012-12-01

We present a graphics processing unit (GPU)-accelerated time-domain low-latency algorithm to search for gravitational waves (GWs) from coalescing binaries of compact objects based on the summed parallel infinite impulse response (SPIIR) filtering technique. The aim is to facilitate fast detection of GWs with a minimum delay to allow prompt electromagnetic follow-up observations. To maximize the GPU acceleration, we apply an efficient batched parallel computing model that significantly reduces the number of synchronizations in SPIIR and optimizes the usage of the memory and hardware resource. Our code is tested on the CUDA ‘Fermi’ architecture in a GTX 480 graphics card and its performance is compared with a single core of Intel Core i7 920 (2.67 GHz). A 58-fold speedup is achieved while giving results in close agreement with the CPU implementation. Our result indicates that it is possible to conduct a full search for GWs from compact binary coalescence in real time with only one desktop computer equipped with a Fermi GPU card for the initial LIGO detectors which in the past required more than 100 CPUs.
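The summed parallel IIR idea named in this abstract can be sketched as a bank of first-order recursive filters whose outputs are added together. The sketch below assumes the commonly used form y_k[n] = a_k·y_k[n−1] + b_k·x[n−d_k]; the coefficient values, the function name `spiir`, and the tuple layout are illustrative assumptions and are not taken from the paper's code.

```python
def spiir(x, filters):
    """Sum the outputs of parallel first-order IIR filters.

    x       : list of input samples
    filters : list of (a, b, delay) tuples, one per filter, where
              y[n] = a * y[n-1] + b * x[n - delay]
    Returns a list of complex output samples.
    """
    out = [0j] * len(x)
    for a, b, delay in filters:
        y = 0j  # state of this filter; independent of every other filter
        for n in range(len(x)):
            xin = x[n - delay] if n >= delay else 0.0
            y = a * y + b * xin
            out[n] += y
    return out
```

Because each filter keeps its own state and only the final sum couples them, the outer loop is the part a GPU implementation could plausibly distribute across threads.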

Parallel Implementation of MAFFT on CUDA-Enabled Graphics Hardware.

PubMed

Zhu, Xiangyuan; Li, Kenli; Salah, Ahmad; Shi, Lin; Li, Keqin

2015-01-01

Multiple sequence alignment (MSA) constitutes an extremely powerful tool for many biological applications including phylogenetic tree estimation, secondary structure prediction, and critical residue identification. However, aligning large biological sequences with popular tools such as MAFFT requires long runtimes on sequential architectures. Due to the ever increasing sizes of sequence databases, there is increasing demand to accelerate this task. In this paper, we demonstrate how graphic processing units (GPUs), powered by the compute unified device architecture (CUDA), can be used as an efficient computational platform to accelerate the MAFFT algorithm. To fully exploit the GPU's capabilities for accelerating MAFFT, we have optimized the sequence data organization to eliminate the bandwidth bottleneck of memory access, designed a memory allocation and reuse strategy to make full use of limited memory of GPUs, proposed a new modified-run-length encoding (MRLE) scheme to reduce memory consumption, and used high-performance shared memory to speed up I/O operations. Our implementation tested in three NVIDIA GPUs achieves speedup up to 11.28 on a Tesla K20m GPU compared to the sequential MAFFT 7.015.
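The abstract does not spell out how the modified run-length encoding (MRLE) differs from the ordinary scheme, so the sketch below shows only plain run-length encoding as background. It illustrates why gap-heavy aligned sequences compress well: long runs of the same symbol collapse to a single (symbol, count) pair. The function names are hypothetical.

```python
from itertools import groupby

def rle_encode(seq):
    """Collapse runs of identical symbols into (symbol, run_length) pairs."""
    return [(sym, len(list(run))) for sym, run in groupby(seq)]

def rle_decode(pairs):
    """Expand (symbol, run_length) pairs back into the original string."""
    return "".join(sym * count for sym, count in pairs)
```

For example, `rle_encode("ACC----GT")` yields five pairs for nine characters, and decoding the pairs restores the original string.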
A GPU-Based Wide-Band Radio Spectrometer

NASA Astrophysics Data System (ADS)

Chennamangalam, Jayanth; Scott, Simon; Jones, Glenn; Chen, Hong; Ford, John; Kepley, Amanda; Lorimer, D. R.; Nie, Jun; Prestage, Richard; Roshi, D. Anish; Wagner, Mark; Werthimer, Dan

2014-12-01

The graphics processing unit has become an integral part of astronomical instrumentation, enabling high-performance online data reduction and accelerated online signal processing. In this paper, we describe a wide-band reconfigurable spectrometer built using an off-the-shelf graphics processing unit card. This spectrometer, when configured as a polyphase filter bank, supports a dual-polarisation bandwidth of up to 1.1 GHz (or a single-polarisation bandwidth of up to 2.2 GHz) on the latest generation of graphics processing units. On the other hand, when configured as a direct fast Fourier transform, the spectrometer supports a dual-polarisation bandwidth of up to 1.4 GHz (or a single-polarisation bandwidth of up to 2.8 GHz).
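As a rough illustration of the polyphase filter bank mode mentioned above, the sketch below computes one output spectrum: the input block is weighted by a prototype low-pass filter, folded into one channel-length segment, and Fourier transformed. It is a plain-Python sketch under stated assumptions, not the instrument's GPU code: the Hann-windowed sinc prototype, the naive DFT in place of an FFT, and the names `pfb_spectrum`, `n_chan` and `n_taps` are all choices made here for illustration.

```python
import cmath
import math

def dft(v):
    """Naive discrete Fourier transform (stand-in for an FFT)."""
    n = len(v)
    return [sum(v[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def pfb_spectrum(samples, n_chan, n_taps):
    """Power spectrum of the first n_chan * n_taps samples via a polyphase filter bank."""
    length = n_chan * n_taps
    proto = []
    for i in range(length):
        t = (i - (length - 1) / 2) / n_chan           # time in units of one channel period
        sinc = 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)
        hann = 0.5 - 0.5 * math.cos(2 * math.pi * (i + 0.5) / length)
        proto.append(sinc * hann)
    # Weight the block, then sum the n_taps segments sample by sample.
    folded = [sum(samples[p * n_chan + c] * proto[p * n_chan + c]
                  for p in range(n_taps))
              for c in range(n_chan)]
    return [abs(z) ** 2 for z in dft(folded)]
```

Setting `n_taps` to 1 and dropping the prototype weighting reduces this to the direct Fourier transform mode, which trades channel isolation for lower computational cost.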
Three-dimensional structural analysis using interactive graphics

NASA Technical Reports Server (NTRS)

Biffle, J.; Sumlin, H. A.

1975-01-01

The application of computer interactive graphics to three-dimensional structural analysis was described, with emphasis on the following aspects: (1) structural analysis, and (2) generation and checking of input data and examination of the large volume of output data (stresses, displacements, velocities, accelerations). Handling of three-dimensional input processing with a special MESH3D computer program was explained. Similarly, a special code PLTZ may be used to perform all the needed tasks for output processing from a finite element code. Examples were illustrated.
Quantum optimal control with automatic differentiation using graphics processors

NASA Astrophysics Data System (ADS)

Leung, Nelson; Abdelhafez, Mohamed; Chakram, Srivatsan; Naik, Ravi; Groszkowski, Peter; Koch, Jens; Schuster, David

We implement quantum optimal control based on automatic differentiation and harness the acceleration afforded by graphics processing units (GPUs). Automatic differentiation allows us to specify advanced optimization criteria and incorporate them into the optimization process with ease. We will describe efficient techniques to optimally control weakly anharmonic systems that are commonly encountered in circuit QED, including coupled superconducting transmon qubits and multi-cavity circuit QED systems. These systems allow for a rich variety of control schemes that quantum optimal control is well suited to explore.
RSTensorFlow: GPU Enabled TensorFlow for Deep Learning on Commodity Android Devices

PubMed Central

Alzantot, Moustafa; Wang, Yingnan; Ren, Zhengshuang; Srivastava, Mani B.

2018-01-01

Mobile devices have become an essential part of our daily lives. By virtue of both their increasing computing power and the recent progress made in AI, mobile devices have evolved to act as intelligent assistants in many tasks rather than a mere way of making phone calls. However, popular and commonly used tools and frameworks for machine intelligence still lack the ability to make proper use of the available heterogeneous computing resources on mobile devices. In this paper, we study the benefits of utilizing the heterogeneous (CPU and GPU) computing resources available on commodity Android devices while running deep learning models. We leveraged the heterogeneous computing framework RenderScript to accelerate the execution of deep learning models on commodity Android devices. Our system is implemented as an extension to the popular open-source framework TensorFlow. By integrating our acceleration framework tightly into TensorFlow, machine learning engineers can now easily benefit from the heterogeneous computing resources on mobile devices without the need for any extra tools. We evaluate our system on different Android phone models to study the trade-offs of running different neural network operations on the GPU. We also compare the performance of running different model architectures, such as convolutional and recurrent neural networks, on CPU only vs. using heterogeneous computing resources. Our results show that GPUs on the phones are capable of offering substantial performance gain in matrix multiplication on mobile devices. Therefore, models that involve multiplication of large matrices can run much faster (approx. 3 times faster in our experiments) due to GPU support. PMID:29629431
RSTensorFlow: GPU Enabled TensorFlow for Deep Learning on Commodity Android Devices.

PubMed

Alzantot, Moustafa; Wang, Yingnan; Ren, Zhengshuang; Srivastava, Mani B

2017-06-01

Mobile devices have become an essential part of our daily lives. By virtue of both their increasing computing power and the recent progress made in AI, mobile devices have evolved to act as intelligent assistants in many tasks rather than a mere way of making phone calls. However, popular and commonly used tools and frameworks for machine intelligence still lack the ability to make proper use of the available heterogeneous computing resources on mobile devices. In this paper, we study the benefits of utilizing the heterogeneous (CPU and GPU) computing resources available on commodity Android devices while running deep learning models. We leveraged the heterogeneous computing framework RenderScript to accelerate the execution of deep learning models on commodity Android devices. Our system is implemented as an extension to the popular open-source framework TensorFlow. By integrating our acceleration framework tightly into TensorFlow, machine learning engineers can now easily benefit from the heterogeneous computing resources on mobile devices without the need for any extra tools. We evaluate our system on different Android phone models to study the trade-offs of running different neural network operations on the GPU. We also compare the performance of running different model architectures, such as convolutional and recurrent neural networks, on CPU only vs. using heterogeneous computing resources. Our results show that GPUs on the phones are capable of offering substantial performance gain in matrix multiplication on mobile devices. Therefore, models that involve multiplication of large matrices can run much faster (approx. 3 times faster in our experiments) due to GPU support.
Accelerating image recognition on mobile devices using GPGPU

NASA Astrophysics Data System (ADS)

Bordallo López, Miguel; Nykänen, Henri; Hannuksela, Jari; Silvén, Olli; Vehviläinen, Markku

2011-01-01

The future multi-modal user interfaces of battery-powered mobile devices are expected to require computationally costly image analysis techniques. The use of Graphic Processing Units for computing is very well suited for parallel processing and the addition of programmable stages and high precision arithmetic provide for opportunities to implement energy-efficient complete algorithms. At the moment the first mobile graphics accelerators with programmable pipelines are available, enabling the GPGPU implementation of several image processing algorithms. In this context, we consider a face tracking approach that uses efficient gray-scale invariant texture features and boosting. The solution is based on the Local Binary Pattern (LBP) features and makes use of the GPU on the pre-processing and feature extraction phase. We have implemented a series of image processing techniques in the shader language of OpenGL ES 2.0, compiled them for a mobile graphics processing unit and performed tests on a mobile application processor platform (OMAP3530). In our contribution, we describe the challenges of designing on a mobile platform, present the performance achieved and provide measurement results for the actual power consumption in comparison to using the CPU (ARM) on the same platform.
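The Local Binary Pattern feature mentioned in this abstract can be sketched for a single pixel as follows. This shows the basic 8-neighbour operator in plain Python, not the authors' OpenGL ES shader; the neighbour ordering and the use of `>=` for the comparison are conventions assumed here, and `lbp_code` is a hypothetical name.

```python
def lbp_code(img, r, c):
    """8-neighbour Local Binary Pattern code of the pixel at row r, column c.

    img is a 2-D list of gray levels; (r, c) must not lie on the image border.
    Each neighbour contributes one bit: 1 if it is at least as bright as the centre.
    """
    center = img[r][c]
    # Clockwise from the top-left neighbour.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code
```

Because only comparisons with the centre pixel matter, adding a constant to every gray level leaves the code unchanged, which is the gray-scale invariance the abstract refers to. Each pixel's code depends only on its 3x3 neighbourhood, so the operator suits a per-fragment shader.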
GPUmotif: An Ultra-Fast and Energy-Efficient Motif Analysis Program Using Graphics Processing Units

PubMed Central

Zandevakili, Pooya; Hu, Ming; Qin, Zhaohui

2012-01-01

Computational detection of TF binding patterns has become an indispensable tool in functional genomics research. With the rapid advance of new sequencing technologies, large amounts of protein-DNA interaction data have been produced. Analyzing this data can provide substantial insight into the mechanisms of transcriptional regulation. However, the massive amount of sequence data presents daunting challenges. In our previous work, we have developed a novel algorithm called Hybrid Motif Sampler (HMS) that enables more scalable and accurate motif analysis. Despite much improvement, HMS is still time-consuming due to the requirement to calculate matching probabilities position-by-position. Using the NVIDIA CUDA toolkit, we developed a graphics processing unit (GPU)-accelerated motif analysis program named GPUmotif. We proposed a "fragmentation" technique to hide data transfer time between memories. Performance comparison studies showed that commonly-used model-based motif scan and de novo motif finding procedures such as HMS can be dramatically accelerated when running GPUmotif on NVIDIA graphics cards. As a result, energy consumption can also be greatly reduced when running motif analysis using GPUmotif. The GPUmotif program is freely available at http://sourceforge.net/projects/gpumotif/ PMID:22662128
MuSim, a Graphical User Interface for Multiple Simulation Programs

DOE Office of Scientific and Technical Information (OSTI.GOV)

Roberts, Thomas; Cummings, Mary Anne; Johnson, Rolland

2016-06-01

MuSim is a new user-friendly program designed to interface to many different particle simulation codes, regardless of their data formats or geometry descriptions. It presents the user with a compelling graphical user interface that includes a flexible 3-D view of the simulated world plus powerful editing and drag-and-drop capabilities. All aspects of the design can be parametrized so that parameter scans and optimizations are easy. It is simple to create plots and display events in the 3-D viewer (with a slider to vary the transparency of solids), allowing for an effortless comparison of different simulation codes. Simulation codes: G4beamline, MAD-X, and MCNP; more coming. Many accelerator design tools and beam optics codes were written long ago, with primitive user interfaces by today's standards. MuSim is specifically designed to make it easy to interface to such codes, providing a common user experience for all, and permitting the construction and exploration of models with very little overhead. For today's technology-driven students, graphical interfaces meet their expectations far better than text-based tools, and education in accelerator physics is one of our primary goals.
GPUmotif: an ultra-fast and energy-efficient motif analysis program using graphics processing units.

PubMed

Zandevakili, Pooya; Hu, Ming; Qin, Zhaohui

2012-01-01

Computational detection of TF binding patterns has become an indispensable tool in functional genomics research. With the rapid advance of new sequencing technologies, large amounts of protein-DNA interaction data have been produced. Analyzing this data can provide substantial insight into the mechanisms of transcriptional regulation. However, the massive amount of sequence data presents daunting challenges. In our previous work, we have developed a novel algorithm called Hybrid Motif Sampler (HMS) that enables more scalable and accurate motif analysis. Despite much improvement, HMS is still time-consuming due to the requirement to calculate matching probabilities position-by-position. Using the NVIDIA CUDA toolkit, we developed a graphics processing unit (GPU)-accelerated motif analysis program named GPUmotif. We proposed a "fragmentation" technique to hide data transfer time between memories. Performance comparison studies showed that commonly-used model-based motif scan and de novo motif finding procedures such as HMS can be dramatically accelerated when running GPUmotif on NVIDIA graphics cards. As a result, energy consumption can also be greatly reduced when running motif analysis using GPUmotif. The GPUmotif program is freely available at http://sourceforge.net/projects/gpumotif/
Mercury BLASTP: Accelerating Protein Sequence Alignment

PubMed Central

Jacob, Arpith; Lancaster, Joseph; Buhler, Jeremy; Harris, Brandon; Chamberlain, Roger D.

2008-01-01

Large-scale protein sequence comparison is an important but compute-intensive task in molecular biology. BLASTP is the most popular tool for comparative analysis of protein sequences. In recent years, an exponential increase in the size of protein sequence databases has required either exponentially more running time or a cluster of machines to keep pace. To address this problem, we have designed and built a high-performance FPGA-accelerated version of BLASTP, Mercury BLASTP. In this paper, we describe the architecture of the portions of the application that are accelerated in the FPGA, and we also describe the integration of these FPGA-accelerated portions with the existing BLASTP software. We have implemented Mercury BLASTP on a commodity workstation with two Xilinx Virtex-II 6000 FPGAs. We show that the new design runs 11-15 times faster than software BLASTP on a modern CPU while delivering close to 99% identical results. PMID:19492068
Geowall: Investigations into low-cost stereo display technologies

USGS Publications Warehouse

Steinwand, Daniel R.; Davis, Brian; Weeks, Nathan

2003-01-01

Recently, the combination of new projection technology, fast, low-cost graphics cards, and Linux-powered personal computers has made it possible to provide a stereoprojection and stereoviewing system that is much more affordable than previous commercial solutions. These Geowall systems are low-cost visualization systems built with commodity off-the-shelf components, run on open-source (and other) operating systems, and use open-source applications software. In short, they are "Beowulf-class" visualization systems that provide a cost-effective way for the U.S. Geological Survey to broaden participation in the visualization community and view stereoimagery and three-dimensional models.
Launch Pad Physics: Accelerate Interest With Model Rocketry.

ERIC Educational Resources Information Center

Key, LeRoy F.

1982-01-01

Student activities in an interdisciplinary, model rocket science program are described, including the construction of an Ohio Scientific computer system with graphic capabilities for use in the program and cooperative efforts with the Rocket Research Institute. (JN)
Performance Theory of Diagonal Conducting Wall Magnetohydrodynamic Accelerators

NASA Technical Reports Server (NTRS)

Litchford, R. J.

2004-01-01

The theoretical performance of diagonal conducting wall crossed-field accelerators is examined on the basis of an infinite segmentation assumption using a cross-plane averaged generalized Ohm's law for a partially ionized gas, including ion slip. The desired accelerator performance relationships are derived from the cross-plane averaged Ohm's law by imposing appropriate configuration and loading constraints. A current-dependent effective voltage drop model is also incorporated to account for cold-wall boundary layer effects, including gasdynamic variations, discharge constriction, and electrode falls. Definition of dimensionless electric fields and current densities leads to the construction of graphical performance diagrams, which further illuminate the rudimentary behavior of crossed-field accelerator operation.
Parallel mutual information estimation for inferring gene regulatory networks on GPUs

PubMed Central

2011-01-01

Background Mutual information is a measure of similarity between two variables. It has been widely used in various application domains including computational biology, machine learning, statistics, image processing, and financial computing. Previously used simple histogram based mutual information estimators lack the precision in quality compared to kernel based methods. The recently introduced B-spline function based mutual information estimation method is competitive to the kernel based methods in terms of quality but at a lower computational complexity. Results We present a new approach to accelerate the B-spline function based mutual information estimation algorithm with commodity graphics hardware. To derive an efficient mapping onto this type of architecture, we have used the Compute Unified Device Architecture (CUDA) programming model to design and implement a new parallel algorithm. Our implementation, called CUDA-MI, can achieve speedups of up to 82 using double precision on a single GPU compared to a multi-threaded implementation on a quad-core CPU for large microarray datasets. We have used the results obtained by CUDA-MI to infer gene regulatory networks (GRNs) from microarray data. The comparisons to existing methods including ARACNE and TINGe show that CUDA-MI produces GRNs of higher quality in less time. Conclusions CUDA-MI is publicly available open-source software, written in CUDA and C++ programming languages. It obtains significant speedup over sequential multi-threaded implementation by fully exploiting the compute capability of commonly used CUDA-enabled low-cost GPUs. PMID:21672264
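For orientation, the simple histogram-based estimator that the abstract contrasts with B-spline and kernel methods can be written in a few lines. The sketch below is that baseline only, not the CUDA-MI B-spline algorithm; the equal-width binning and the function name `histogram_mutual_information` are assumptions.

```python
import math
from collections import Counter


def histogram_mutual_information(xs, ys, bins=4):
    """Estimate I(X;Y) in bits from equal-width histograms (baseline method)."""

    def to_bins(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # guard against constant input
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = to_bins(xs), to_bins(ys)
    n = len(xs)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)

    mi = 0.0
    for (i, j), count in joint.items():
        p_ij = count / n
        # p_ij / (p_i * p_j), written with raw counts.
        mi += p_ij * math.log2(p_ij * n * n / (px[i] * py[j]))
    return mi


identical = histogram_mutual_information([0, 1, 2, 3] * 5, [0, 1, 2, 3] * 5)
independent = histogram_mutual_information([0, 0, 1, 1], [0, 1, 0, 1], bins=2)
```

In a gene-network setting this function would be evaluated for every pair of genes, and those pairwise evaluations are independent of one another, which is where the parallelism comes from.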
Migrating EO/IR sensors to cloud-based infrastructure as service architectures

NASA Astrophysics Data System (ADS)

Berglie, Stephen T.; Webster, Steven; May, Christopher M.

2014-06-01

The Night Vision Image Generator (NVIG), a product of US Army RDECOM CERDEC NVESD, is a visualization tool used widely throughout Army simulation environments to provide fully attributed synthesized, full motion video using physics-based sensor and environmental effects. The NVIG relies heavily on contemporary hardware-based acceleration and GPU processing techniques, which push the envelope of both enterprise and commodity-level hypervisor support for providing virtual machines with direct access to hardware resources. The NVIG has successfully been integrated into fully virtual environments where system architectures leverage cloud-based technologies to various extents in order to streamline infrastructure and service management. This paper details the challenges presented to engineers seeking to migrate GPU-bound processes, such as the NVIG, to virtual machines and, ultimately, Cloud-Based IAS architectures. In addition, it presents the path that led to success for the NVIG. A brief overview of Cloud-Based infrastructure management tool sets is provided, and several virtual desktop solutions are outlined. A distinction is drawn between general purpose virtual desktop technologies and technologies that expose GPU-specific capabilities, including direct rendering and hardware-based video encoding. Candidate hypervisor/virtual machine configurations that nominally satisfy the virtualized hardware-level GPU requirements of the NVIG are presented, and each is subsequently reviewed in light of its implications on higher-level Cloud management techniques. Implementation details are included from the hardware level, through the operating system, to the 3D graphics APIs required by the NVIG and similar GPU-bound tools.
Protein-protein docking on hardware accelerators: comparison of GPU and MIC architectures

PubMed Central

2015-01-01

Background The hardware accelerators will provide solutions to computationally complex problems in bioinformatics fields. However, the effect of acceleration depends on the nature of the application, thus selection of an appropriate accelerator requires some consideration. Results In the present study, we compared the effects of acceleration using graphics processing unit (GPU) and many integrated core (MIC) on the speed of fast Fourier transform (FFT)-based protein-protein docking calculation. The GPU implementation performed the protein-protein docking calculations approximately five times faster than the MIC offload mode implementation. The MIC native mode implementation has the advantage in the implementation costs. However, the performance was worse with larger protein pairs because of memory limitations. Conclusion The results suggest that GPU is more suitable than MIC for accelerating FFT-based protein-protein docking applications. PMID:25707855
Phytosanitary irradiation of fresh tropical commodities in Hawaii: Generic treatments, commercial adoption, and current issues

NASA Astrophysics Data System (ADS)

Follett, Peter A.; Weinert, Eric D.

2012-08-01

Hawaii is a pioneer in the use of phytosanitary irradiation. The commercial X-ray irradiation facility, Hawaii Pride LLC, has been shipping papaya and other tropical fruits and vegetables to the United States mainland using irradiation for 11 years. Irradiation is an approved treatment to control quarantine pests in 17 fruits and 7 vegetables for export from Hawaii to the US mainland. Hawaiian purple sweet potato is the highest volume product with annual exports of more than 12 million lbs (5500 t). The advent of generic radiation treatments for tephritid fruit flies (150 Gy) and other insects (400 Gy) will accelerate commodity export approvals and facilitate worldwide adoption. Lowering doses for specific pests and commodities can lower treatment costs and increase capacity owing to shorter treatment times, and will minimize any quality problems. Current impediments to wider adoption include the 1 kGy limit for fresh horticultural products, the labeling requirement, and non-acceptance of phytosanitary irradiation in Japan, the European Union, and elsewhere. Irradiation has potential as a treatment for unregulated imports to prevent new pest incursions.
Accelerating molecular dynamic simulation on the cell processor and Playstation 3.

PubMed

Luttmann, Edgar; Ensign, Daniel L; Vaidyanathan, Vishal; Houston, Mike; Rimon, Noam; Øland, Jeppe; Jayachandran, Guha; Friedrichs, Mark; Pande, Vijay S

2009-01-30

Implementation of molecular dynamics (MD) calculations on novel architectures will vastly increase its power to calculate the physical properties of complex systems. Herein, we detail algorithmic advances developed to accelerate MD simulations on the Cell processor, a commodity processor found in PlayStation 3 (PS3). In particular, we discuss issues regarding memory access versus computation and the types of calculations which are best suited for streaming processors such as the Cell, focusing on implicit solvation models. We conclude with a comparison of improved performance on the PS3's Cell processor over more traditional processors. (c) 2008 Wiley Periodicals, Inc.
Accurate and efficient spin integration for particle accelerators

DOE PAGES

Abell, Dan T.; Meiser, Dominic; Ranjbar, Vahid H.; ...

2015-02-01

Accurate spin tracking is a valuable tool for understanding spin dynamics in particle accelerators and can help improve the performance of an accelerator. In this paper, we present a detailed discussion of the integrators in the spin tracking code GPUSPINTRACK. We have implemented orbital integrators based on drift-kick, bend-kick, and matrix-kick splits. On top of the orbital integrators, we have implemented various integrators for the spin motion. These integrators use quaternions and Romberg quadratures to accelerate both the computation and the convergence of spin rotations. We evaluate their performance and accuracy in quantitative detail for individual elements as well as for the entire RHIC lattice. We exploit the inherently data-parallel nature of spin tracking to accelerate our algorithms on graphics processing units.
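The role quaternions play in composing spin rotations can be sketched as follows. This is a generic illustration of quaternion rotation and composition, not code from GPUSPINTRACK; the function names and the example rotation angles are assumptions.

```python
import math


def quat_multiply(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )


def rotation_quat(axis, angle):
    """Unit quaternion for a rotation by `angle` radians about `axis`."""
    norm = math.sqrt(sum(c * c for c in axis))
    s = math.sin(angle / 2.0) / norm
    return (math.cos(angle / 2.0), axis[0] * s, axis[1] * s, axis[2] * s)


def rotate_spin(q, spin):
    """Rotate a 3-vector by the unit quaternion q via q * s * conj(q)."""
    w, x, y, z = q
    conj = (w, -x, -y, -z)
    _, sx, sy, sz = quat_multiply(quat_multiply(q, (0.0, *spin)), conj)
    return (sx, sy, sz)


# Two successive 45-degree rotations about z collapse into one quaternion.
step = rotation_quat((0.0, 0.0, 1.0), math.pi / 4.0)
total = quat_multiply(step, step)
rotated = rotate_spin(total, (1.0, 0.0, 0.0))
```

Accumulating the rotations of many elements into a single quaternion and applying it once per particle is cheap and independent for every particle, which matches the data-parallel structure the abstract describes.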

An investigation into the probabilistic combination of quasi-static and random accelerations

NASA Technical Reports Server (NTRS)

Schock, R. W.; Tuell, L. P.

1984-01-01

The development of design load factors for aerospace and aircraft components and experiment support structures, which are subject to a simultaneous vehicle dynamic vibration (quasi-static) and acoustically generated random vibration, requires the selection of a combination methodology. Typically, the procedure is to define the quasi-static and the random generated response separately, and arithmetically add or root sum square to get combined accelerations. Since the combination of a probabilistic and a deterministic function yields a probabilistic function, a viable alternate approach would be to determine the characteristics of the combined acceleration probability density function and select an appropriate percentile level for the combined acceleration. The following paper develops this mechanism and provides graphical data to select combined accelerations for most popular percentile levels.
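The three combination rules mentioned above can be compared numerically. The sketch below makes strong simplifying assumptions that are not taken from the paper: the quasi-static term is treated as a fixed value, the random response as zero-mean Gaussian with a given RMS, and the peak random level as `k_sigma` times that RMS. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def combined_accelerations(quasi_static, random_rms, percentile=0.9987, k_sigma=3.0):
    """Return (arithmetic sum, root sum square, percentile level).

    Assumes a deterministic quasi-static value plus a zero-mean Gaussian
    random response, so the combined acceleration is Gaussian with mean
    `quasi_static` and standard deviation `random_rms`.
    """
    peak_random = k_sigma * random_rms
    arithmetic = quasi_static + peak_random
    rss = math.sqrt(quasi_static ** 2 + peak_random ** 2)
    level = NormalDist(mu=quasi_static, sigma=random_rms).inv_cdf(percentile)
    return arithmetic, rss, level


arithmetic, rss, median = combined_accelerations(4.0, 1.0, percentile=0.5)
```

Under these assumptions the percentile level is an explicit design choice rather than a by-product of the combination rule, which is the point the abstract argues for.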
Porting of the transfer-matrix method for multilayer thin-film computations on graphics processing units

NASA Astrophysics Data System (ADS)

Limmer, Steffen; Fey, Dietmar

2013-07-01

Thin-film computations are often a time-consuming task during optical design. An efficient way to accelerate these computations with the help of graphics processing units (GPUs) is described. It turned out that significant speed-ups can be achieved. We investigate the circumstances under which the best speed-up values can be expected. Therefore we compare different GPUs among themselves and with a modern CPU. Furthermore, the effect of thickness modulation on the speed-up and the runtime behavior depending on the input data is examined.
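As a reminder of what the transfer-matrix method computes, here is a minimal CPU sketch for normal incidence using the standard characteristic-matrix formulation. It is not the authors' GPU implementation; the function name, the argument layout, and the restriction to normal incidence are assumptions.

```python
import cmath
import math


def reflectance(layers, n_incident, n_substrate, wavelength):
    """Reflectance of a multilayer stack at normal incidence.

    `layers` is a list of (refractive_index, thickness) pairs ordered from
    the incident medium towards the substrate; thickness and wavelength
    share the same unit.
    """
    m11, m12, m21, m22 = 1.0, 0.0, 0.0, 1.0
    for n, d in layers:
        delta = 2.0 * math.pi * n * d / wavelength  # phase thickness
        c, s = cmath.cos(delta), cmath.sin(delta)
        l11, l12, l21, l22 = c, 1j * s / n, 1j * n * s, c
        # Multiply the running product by this layer's characteristic matrix.
        m11, m12, m21, m22 = (
            m11 * l11 + m12 * l21,
            m11 * l12 + m12 * l22,
            m21 * l11 + m22 * l21,
            m21 * l12 + m22 * l22,
        )
    b = m11 + m12 * n_substrate
    c_field = m21 + m22 * n_substrate
    r = (n_incident * b - c_field) / (n_incident * b + c_field)
    return abs(r) ** 2


bare = reflectance([], 1.0, 1.5, 550.0)
n_coat = math.sqrt(1.5)
coated = reflectance([(n_coat, 550.0 / (4.0 * n_coat))], 1.0, 1.5, 550.0)
```

A design sweep evaluates this function for many wavelengths and candidate stacks, and every evaluation is independent, so the work maps directly onto one GPU thread per (wavelength, design) pair.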
GPU Acceleration of DSP for Communication Receivers.

PubMed

Gunther, Jake; Gunther, Hyrum; Moon, Todd

2017-09-01

Graphics processing unit (GPU) implementations of signal processing algorithms can outperform CPU-based implementations. This paper describes the GPU implementation of several algorithms encountered in a wide range of high-data rate communication receivers including filters, multirate filters, numerically controlled oscillators, and multi-stage digital down converters. These structures are tested by processing the 20 MHz wide FM radio band (88-108 MHz). Two receiver structures are explored: a single channel receiver and a filter bank channelizer. Both run in real time on NVIDIA GeForce GTX 1080 graphics card.
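Two of the building blocks listed above, a numerically controlled oscillator and a single-stage digital down converter, can be sketched on the CPU as follows. This is an illustrative sketch only: the boxcar filter, the function names, and the test tone are assumptions, and a real receiver would use a properly designed FIR filter.

```python
import cmath
import math


def nco(frequency, sample_rate, count):
    """Numerically controlled oscillator driven by a phase accumulator."""
    step = frequency / sample_rate  # phase advance in cycles per sample
    phase = 0.0
    out = []
    for _ in range(count):
        out.append(cmath.exp(-2j * math.pi * phase))
        phase = (phase + step) % 1.0  # wrap to keep the accumulator bounded
    return out


def down_convert(samples, frequency, sample_rate, decimation):
    """Mix to baseband, low-pass with a boxcar average, then decimate."""
    oscillator = nco(frequency, sample_rate, len(samples))
    mixed = [x * lo for x, lo in zip(samples, oscillator)]
    last_start = len(mixed) - decimation
    return [
        sum(mixed[i:i + decimation]) / decimation
        for i in range(0, last_start + 1, decimation)
    ]


# A 100 Hz complex tone sampled at 1 kHz should land at DC after mixing.
tone = [cmath.exp(2j * math.pi * 100.0 * n / 1000.0) for n in range(64)]
baseband = down_convert(tone, 100.0, 1000.0, 8)
```

Each output sample depends only on its own block of input samples, so the mixing and filtering stages parallelize across output samples.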
Advantages of GPU technology in DFT calculations of intercalated graphene

NASA Astrophysics Data System (ADS)

Pešić, J.; Gajić, R.

2014-09-01

Over the past few years, the expansion of general-purpose graphic-processing unit (GPGPU) technology has had a great impact on computational science. GPGPU is the utilization of a graphics-processing unit (GPU) to perform calculations in applications usually handled by the central processing unit (CPU). Use of GPGPUs as a way to increase computational power in the material sciences has significantly decreased computational costs in already highly demanding calculations. The level of acceleration and parallelization depends on the problem itself. Some problems can benefit from GPU acceleration and parallelization, such as the finite-difference time-domain algorithm (FDTD) and density-functional theory (DFT), while others cannot take advantage of these modern technologies. A number of GPU-supported applications have emerged in the past several years (www.nvidia.com/object/gpu-applications.html). Quantum Espresso (QE) is reported as an integrated suite of open source computer codes for electronic-structure calculations and materials modeling at the nano-scale. It is based on DFT, the use of a plane-waves basis and a pseudopotential approach. Since the QE 5.0 version, it has been implemented as a plug-in component for standard QE packages that allows exploiting the capabilities of Nvidia GPU graphic cards (www.qe-forge.org/gf/proj). In this study, we have examined the impact of the usage of GPU acceleration and parallelization on the numerical performance of DFT calculations. Graphene has been attracting attention worldwide and has already shown some remarkable properties. We have studied an intercalated graphene, using the QE package PHonon, which employs GPU. The term ‘intercalation’ refers to a process whereby foreign adatoms are inserted onto a graphene lattice. In addition, by intercalating different atoms between graphene layers, it is possible to tune their physical properties. Our experiments have shown there are benefits from using GPUs, and we reached an acceleration of several times compared to standard CPU calculations.
Mapping the Information Trace in Local Field Potentials by a Computational Method of Two-Dimensional Time-Shifting Synchronization Likelihood Based on Graphic Processing Unit Acceleration.

PubMed

Zhao, Zi-Fang; Li, Xue-Zhu; Wan, You

2017-12-01

The local field potential (LFP) is a signal reflecting the electrical activity of neurons surrounding the electrode tip. Synchronization between LFP signals provides important details about how neural networks are organized. Synchronization between two distant brain regions is hard to detect using linear synchronization algorithms like correlation and coherence. Synchronization likelihood (SL) is a non-linear synchronization-detecting algorithm widely used in studies of neural signals from two distant brain areas. One drawback of non-linear algorithms is the heavy computational burden. In the present study, we proposed a graphic processing unit (GPU)-accelerated implementation of an SL algorithm with optional 2-dimensional time-shifting. We tested the algorithm with both artificial data and raw LFP data. The results showed that this method revealed detailed information from original data with the synchronization values of two temporal axes, delay time and onset time, and thus can be used to reconstruct the temporal structure of a neural network. Our results suggest that this GPU-accelerated method can be extended to other algorithms for processing time-series signals (like EEG and fMRI) using similar recording techniques.
Monte Carlo-based fluorescence molecular tomography reconstruction method accelerated by a cluster of graphic processing units.

PubMed

Quan, Guotao; Gong, Hui; Deng, Yong; Fu, Jianwei; Luo, Qingming

2011-02-01

High-speed fluorescence molecular tomography (FMT) reconstruction for 3-D heterogeneous media is still one of the most challenging problems in diffusive optical fluorescence imaging. In this paper, we propose a fast FMT reconstruction method that is based on Monte Carlo (MC) simulation and accelerated by a cluster of graphics processing units (GPUs). Based on the Message Passing Interface standard, we modified the MC code for fast FMT reconstruction, and different Green's functions representing the flux distribution in media are calculated simultaneously by different GPUs in the cluster. A load-balancing method was also developed to increase the computational efficiency. By applying the Fréchet derivative, a Jacobian matrix is formed to reconstruct the distribution of the fluorochromes using the calculated Green's functions. Phantom experiments have shown that only 10 min are required to get reconstruction results with a cluster of 6 GPUs, rather than 6 h with a cluster of multiple dual opteron CPU nodes. Because of the advantages of high accuracy and suitability for 3-D heterogeneity media with refractive-index-unmatched boundaries from the MC simulation, the GPU cluster-accelerated method provides a reliable approach to high-speed reconstruction for FMT imaging.
GPU-powered Shotgun Stochastic Search for Dirichlet process mixtures of Gaussian Graphical Models

PubMed Central

Mukherjee, Chiranjit; Rodriguez, Abel

2016-01-01

Gaussian graphical models are popular for modeling high-dimensional multivariate data with sparse conditional dependencies. A mixture of Gaussian graphical models extends this model to the more realistic scenario where observations come from a heterogenous population composed of a small number of homogeneous sub-groups. In this paper we present a novel stochastic search algorithm for finding the posterior mode of high-dimensional Dirichlet process mixtures of decomposable Gaussian graphical models. Further, we investigate how to harness the massive thread-parallelization capabilities of graphical processing units to accelerate computation. The computational advantages of our algorithms are demonstrated with various simulated data examples in which we compare our stochastic search with a Markov chain Monte Carlo algorithm in moderate dimensional data examples. These experiments show that our stochastic search largely outperforms the Markov chain Monte Carlo algorithm in terms of computing-times and in terms of the quality of the posterior mode discovered. Finally, we analyze a gene expression dataset in which Markov chain Monte Carlo algorithms are too slow to be practically useful. PMID:28626348
GPU-powered Shotgun Stochastic Search for Dirichlet process mixtures of Gaussian Graphical Models.

PubMed

Mukherjee, Chiranjit; Rodriguez, Abel

2016-01-01

Gaussian graphical models are popular for modeling high-dimensional multivariate data with sparse conditional dependencies. A mixture of Gaussian graphical models extends this model to the more realistic scenario where observations come from a heterogenous population composed of a small number of homogeneous sub-groups. In this paper we present a novel stochastic search algorithm for finding the posterior mode of high-dimensional Dirichlet process mixtures of decomposable Gaussian graphical models. Further, we investigate how to harness the massive thread-parallelization capabilities of graphical processing units to accelerate computation. The computational advantages of our algorithms are demonstrated with various simulated data examples in which we compare our stochastic search with a Markov chain Monte Carlo algorithm in moderate dimensional data examples. These experiments show that our stochastic search largely outperforms the Markov chain Monte Carlo algorithm in terms of computing-times and in terms of the quality of the posterior mode discovered. Finally, we analyze a gene expression dataset in which Markov chain Monte Carlo algorithms are too slow to be practically useful.
GPU-BSM: A GPU-Based Tool to Map Bisulfite-Treated Reads

PubMed Central

Manconi, Andrea; Orro, Alessandro; Manca, Emanuele; Armano, Giuliano; Milanesi, Luciano

2014-01-01

Cytosine DNA methylation is an epigenetic mark implicated in several biological processes. Bisulfite treatment of DNA is acknowledged as the gold standard technique to study methylation. This technique introduces changes in the genomic DNA by converting cytosines to uracils while 5-methylcytosines remain nonreactive. During PCR amplification 5-methylcytosines are amplified as cytosine, whereas uracils and thymines as thymine. To detect the methylation levels, reads treated with the bisulfite must be aligned against a reference genome. Mapping these reads to a reference genome represents a significant computational challenge mainly due to the increased search space and the loss of information introduced by the treatment. To deal with this computational challenge we devised GPU-BSM, a tool based on modern Graphics Processing Units. Graphics Processing Units are hardware accelerators that are increasingly being used successfully to accelerate general-purpose scientific applications. GPU-BSM is a tool able to map bisulfite-treated reads from whole genome bisulfite sequencing and reduced representation bisulfite sequencing, and to estimate methylation levels, with the goal of detecting methylation. Due to the massive parallelization obtained by exploiting graphics cards, GPU-BSM aligns bisulfite-treated reads faster than other cutting-edge solutions, while outperforming most of them in terms of unique mapped reads. PMID:24842718
Distributed shared memory for roaming large volumes.

PubMed

Castanié, Laurent; Mion, Christophe; Cavin, Xavier; Lévy, Bruno

2006-01-01

We present a cluster-based volume rendering system for roaming very large volumes. This system allows a gigabyte-sized probe to be moved inside a total volume of several tens or hundreds of gigabytes in real-time. While the size of the probe is limited by the total amount of texture memory on the cluster, the size of the total data set has no theoretical limit. The cluster is used as a distributed graphics processing unit that aggregates both graphics power and graphics memory. A hardware-accelerated volume renderer runs in parallel on the cluster nodes and the final image compositing is implemented using a pipelined sort-last rendering algorithm. Meanwhile, volume bricking and volume paging allow efficient data caching. On each rendering node, a distributed hierarchical cache system implements a global software-based distributed shared memory on the cluster. In case of a cache miss, this system first checks page residency on the other cluster nodes instead of directly accessing local disks. Using two Gigabit Ethernet network interfaces per node, we accelerate data fetching by a factor of 4 compared to directly accessing local disks. The system also implements asynchronous disk access and texture loading, which makes it possible to overlap data loading, volume slicing and rendering for optimal volume roaming.
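The lookup order described for a cache miss (local memory, then other nodes, then local disk) can be sketched with a toy model. Everything here is hypothetical: the `NodeCache` class, the dictionary standing in for a disk, and the absence of eviction, networking, and asynchronous loading are simplifications and not the authors' design.

```python
class NodeCache:
    """Toy model of one rendering node's brick cache."""

    def __init__(self, name, disk):
        self.name = name
        self.disk = disk    # dict: brick id -> data, stands in for local disk
        self.pages = {}     # bricks currently resident in this node's memory
        self.peers = []     # other NodeCache instances in the cluster

    def fetch(self, brick_id):
        """Return (data, source) where source is 'local', 'remote' or 'disk'."""
        if brick_id in self.pages:
            return self.pages[brick_id], "local"
        # On a miss, check residency on the other nodes before touching disk.
        for peer in self.peers:
            if brick_id in peer.pages:
                self.pages[brick_id] = peer.pages[brick_id]
                return self.pages[brick_id], "remote"
        self.pages[brick_id] = self.disk[brick_id]
        return self.pages[brick_id], "disk"


disk = {7: b"brick-7"}
node_a, node_b = NodeCache("a", disk), NodeCache("b", disk)
node_a.peers, node_b.peers = [node_b], [node_a]

sources = [node_a.fetch(7)[1], node_b.fetch(7)[1], node_b.fetch(7)[1]]
```

The ordering only pays off when a remote fetch is faster than a disk read, which is the situation the abstract reports for its dual Gigabit Ethernet setup.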
Acceleration of Radiance for Lighting Simulation by Using Parallel Computing with OpenCL

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zuo, Wangda; McNeil, Andrew; Wetter, Michael

2011-09-06

We report on the acceleration of annual daylighting simulations for fenestration systems in the Radiance ray-tracing program. The algorithm was optimized to reduce both the redundant data input/output operations and the floating-point operations. To further accelerate the simulation speed, the calculation for matrix multiplications was implemented using parallel computing on a graphics processing unit. We used OpenCL, which is a cross-platform parallel programming language. Numerical experiments show that the combination of the above measures can speed up the annual daylighting simulations 101.7 times or 28.6 times when the sky vector has 146 or 2306 elements, respectively.
Accelerating Advanced MRI Reconstructions on GPUs

PubMed Central

Stone, S.S.; Haldar, J.P.; Tsao, S.C.; Hwu, W.-m.W.; Sutton, B.P.; Liang, Z.-P.

2008-01-01

Computational acceleration on graphics processing units (GPUs) can make advanced magnetic resonance imaging (MRI) reconstruction algorithms attractive in clinical settings, thereby improving the quality of MR images across a broad spectrum of applications. This paper describes the acceleration of such an algorithm on NVIDIA’s Quadro FX 5600. The reconstruction of a 3D image with 128^3 voxels achieves up to 180 GFLOPS and requires just over one minute on the Quadro, while reconstruction on a quad-core CPU is twenty-one times slower. Furthermore, relative to the true image, the error exhibited by the advanced reconstruction is only 12%, while conventional reconstruction techniques incur error of 42%. PMID:21796230
Accelerating Advanced MRI Reconstructions on GPUs.

PubMed

Stone, S S; Haldar, J P; Tsao, S C; Hwu, W-M W; Sutton, B P; Liang, Z-P

2008-10-01

Computational acceleration on graphics processing units (GPUs) can make advanced magnetic resonance imaging (MRI) reconstruction algorithms attractive in clinical settings, thereby improving the quality of MR images across a broad spectrum of applications. This paper describes the acceleration of such an algorithm on NVIDIA's Quadro FX 5600. The reconstruction of a 3D image with 128^3 voxels achieves up to 180 GFLOPS and requires just over one minute on the Quadro, while reconstruction on a quad-core CPU is twenty-one times slower. Furthermore, relative to the true image, the error exhibited by the advanced reconstruction is only 12%, while conventional reconstruction techniques incur error of 42%.
Creation Stations.

ERIC Educational Resources Information Center

Sauer, Jeff; Murphy, Sam

1997-01-01

In this comparison, NewMedia lab looks at 10 Pentium II workstations preconfigured for demanding three dimensional and multimedia work with OpenGL cards and fast Ultra SCSI hard drives. Highlights include costs, tests with Photoshop, technical support, and a sidebar that explains Accelerated Graphics Port. (Author/LRW)
GPU accelerated Monte Carlo simulation of Brownian motors dynamics with CUDA

NASA Astrophysics Data System (ADS)

Spiechowicz, J.; Kostur, M.; Machura, L.

2015-06-01

This work presents an updated and extended guide on methods of a proper acceleration of the Monte Carlo integration of stochastic differential equations with the commonly available NVIDIA Graphics Processing Units using the CUDA programming environment. We outline the general aspects of the scientific computing on graphics cards and demonstrate them with two models of a well known phenomenon of the noise induced transport of Brownian motors in periodic structures. As a source of fluctuations in the considered systems we selected the three most commonly occurring noises: the Gaussian white noise, the white Poissonian noise and the dichotomous process also known as a random telegraph signal. The detailed discussion on various aspects of the applied numerical schemes is also presented. The measured speedup can be of the astonishing order of about 3000 when compared to a typical CPU. This number significantly expands the range of problems solvable by use of stochastic simulations, allowing even an interactive research in some cases.
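The kind of trajectory integration being accelerated can be sketched on the CPU with an Euler-Maruyama step for an overdamped particle in a periodic potential driven by Gaussian white noise. This is not the authors' CUDA code: the ratchet potential, the parameter values, and the function names are assumptions chosen for illustration.

```python
import math
import random


def euler_maruyama(force, x0, noise_intensity, dt, steps, rng):
    """Integrate dx = force(x) dt + sqrt(2 D) dW with the Euler-Maruyama scheme."""
    x = x0
    scale = math.sqrt(2.0 * noise_intensity * dt)
    for _ in range(steps):
        x += force(x) * dt + scale * rng.gauss(0.0, 1.0)
    return x


def ratchet_force(x):
    """-dV/dx for the assumed potential V(x) = sin(2 pi x) + 0.25 sin(4 pi x)."""
    return -2.0 * math.pi * math.cos(2.0 * math.pi * x) - math.pi * math.cos(4.0 * math.pi * x)


def ensemble_mean(paths, seed=0):
    """Average final position over independent trajectories."""
    rng = random.Random(seed)
    finals = [
        euler_maruyama(ratchet_force, 0.0, 0.1, 1e-3, 1000, rng)
        for _ in range(paths)
    ]
    return sum(finals) / paths


# With the noise switched off the scheme reduces to explicit Euler.
decay = euler_maruyama(lambda x: -x, 1.0, 0.0, 0.1, 10, random.Random(1))
mean_position = ensemble_mean(4)
```

Each trajectory is independent of the others, so on a GPU one thread can own one trajectory together with its own random number stream.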
High performance hybrid functional Petri net simulations of biological pathway models on CUDA.

PubMed

Chalkidis, Georgios; Nagasaki, Masao; Miyano, Satoru

2011-01-01

Hybrid functional Petri nets are a wide-spread tool for representing and simulating biological models. Due to their potential of providing virtual drug testing environments, biological simulations have a growing impact on pharmaceutical research. Continuous research advancements in biology and medicine lead to exponentially increasing simulation times, thus raising the demand for performance accelerations by efficient and inexpensive parallel computation solutions. Recent developments in the field of general-purpose computation on graphics processing units (GPGPU) enabled the scientific community to port a variety of compute intensive algorithms onto the graphics processing unit (GPU). This work presents the first scheme for mapping biological hybrid functional Petri net models, which can handle both discrete and continuous entities, onto compute unified device architecture (CUDA) enabled GPUs. GPU accelerated simulations are observed to run up to 18 times faster than sequential implementations. Simulating the cell boundary formation by Delta-Notch signaling on a CUDA enabled GPU results in a speedup of approximately 7x for a model containing 1,600 cells.
Efficient Acceleration of the Pair-HMMs Forward Algorithm for GATK HaplotypeCaller on Graphics Processing Units.

PubMed

Ren, Shanshan; Bertels, Koen; Al-Ars, Zaid

2018-01-01

GATK HaplotypeCaller (HC) is a popular variant caller, which is widely used to identify variants in complex genomes. However, due to its high variants detection accuracy, it suffers from long execution time. In GATK HC, the pair-HMMs forward algorithm accounts for a large percentage of the total execution time. This article proposes to accelerate the pair-HMMs forward algorithm on graphics processing units (GPUs) to improve the performance of GATK HC. This article presents several GPU-based implementations of the pair-HMMs forward algorithm. It also analyzes the performance bottlenecks of the implementations on an NVIDIA Tesla K40 card with various data sets. Based on these results and the characteristics of GATK HC, we are able to identify the GPU-based implementations with the highest performance for the various analyzed data sets. Experimental results show that the GPU-based implementations of the pair-HMMs forward algorithm achieve a speedup of up to 5.47× over existing GPU-based implementations.
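For readers unfamiliar with the kernel being accelerated, the following is a simplified pair-HMM forward recursion with match, insertion, and deletion states. It is a sketch only: the fixed error and gap probabilities, the free-start initialization, and the function name are assumptions, whereas production implementations use per-base quality scores and different numerical handling.

```python
def pair_hmm_forward(read, hap, base_error=0.01, gap_open=0.01, gap_extend=0.1):
    """Likelihood of `read` given haplotype `hap` under a toy pair-HMM."""
    t_mm = 1.0 - 2.0 * gap_open       # match -> match
    t_mi = t_md = gap_open            # match -> insertion / deletion
    t_ii = t_dd = gap_extend          # gap extension
    t_im = t_dm = 1.0 - gap_extend    # gap -> match

    rows, cols = len(read) + 1, len(hap) + 1
    match = [[0.0] * cols for _ in range(rows)]
    insert = [[0.0] * cols for _ in range(rows)]
    delete = [[0.0] * cols for _ in range(rows)]

    # Assumed convention: the read may start anywhere along the haplotype.
    for j in range(cols):
        delete[0][j] = 1.0 / len(hap)

    for i in range(1, rows):
        for j in range(1, cols):
            same = read[i - 1] == hap[j - 1]
            emit = 1.0 - base_error if same else base_error / 3.0
            match[i][j] = emit * (
                t_mm * match[i - 1][j - 1]
                + t_im * insert[i - 1][j - 1]
                + t_dm * delete[i - 1][j - 1]
            )
            # An inserted read base is emitted with uniform probability 0.25.
            insert[i][j] = 0.25 * (t_mi * match[i - 1][j] + t_ii * insert[i - 1][j])
            delete[i][j] = t_md * match[i][j - 1] + t_dd * delete[i][j - 1]

    # Sum over every haplotype position where the read may end.
    return sum(match[-1][j] + insert[-1][j] for j in range(1, cols))


p_match = pair_hmm_forward("ACGT", "ACGT")
p_mismatch = pair_hmm_forward("ACTT", "ACGT")
```

Each cell depends only on its upper, left, and upper-left neighbours, so all cells on one anti-diagonal can be computed at the same time; that dependency pattern is what GPU implementations of the forward algorithm typically exploit.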
Application of graphics processing units to search pipelines for gravitational waves from coalescing binaries of compact objects

NASA Astrophysics Data System (ADS)

Chung, Shin Kee; Wen, Linqing; Blair, David; Cannon, Kipp; Datta, Amitava

2010-07-01

We report a novel application of a graphics processing unit (GPU) for the purpose of accelerating the search pipelines for gravitational waves from coalescing binaries of compact objects. A speed-up of 16-fold in total has been achieved with an NVIDIA GeForce 8800 Ultra GPU card compared with one core of a 2.5 GHz Intel Q9300 central processing unit (CPU). We show that substantial improvements are possible and discuss the reduction in CPU count required for the detection of inspiral sources afforded by the use of GPUs.
A graphics-card implementation of Monte-Carlo simulations for cosmic-ray transport

NASA Astrophysics Data System (ADS)

Tautz, R. C.

2016-05-01

A graphics card implementation of a test-particle simulation code is presented that is based on the CUDA extension of the C/C++ programming language. The original CPU version has been developed for the calculation of cosmic-ray diffusion coefficients in artificial Kolmogorov-type turbulence. In the new implementation, the magnetic turbulence generation, which is the most time-consuming part, is separated from the particle transport and is performed on a graphics card. In this article, the modification of the basic approach of integrating test particle trajectories to employ the SIMD (single instruction, multiple data) model is presented and verified. The efficiency of the new code is tested and several language-specific accelerating factors are discussed. For the example of isotropic magnetostatic turbulence, sample results are shown and a comparison to the results of the CPU implementation is performed.
Accelerated Molecular Dynamics Simulations with the AMOEBA Polarizable Force Field on Graphics Processing Units

PubMed Central

2013-01-01

The accelerated molecular dynamics (aMD) method has recently been shown to enhance the sampling of biomolecules in molecular dynamics (MD) simulations, often by several orders of magnitude. Here, we describe an implementation of the aMD method for the OpenMM application layer that takes full advantage of graphics processing unit (GPU) computing. The aMD method is shown to work in combination with the AMOEBA polarizable force field (AMOEBA-aMD), allowing the simulation of long time-scale events with a polarizable force field. Benchmarks are provided to show that the AMOEBA-aMD method is efficiently implemented and produces accurate results in its standard parametrization. For the BPTI protein, we demonstrate that the protein structure described with AMOEBA remains stable even on the extended time scales accessed at high levels of acceleration. For the DNA repair metalloenzyme endonuclease IV, we show that the use of the AMOEBA force field is a significant improvement over fixed-charge models for describing the enzyme active site. The new AMOEBA-aMD method is publicly available (http://wiki.simtk.org/openmm/VirtualRepository) and promises to be interesting for studying complex systems that can benefit from both the use of a polarizable force field and enhanced sampling. PMID:24634618

Determination of the needed power of an electric motor on the basis of acceleration time of the electric car

NASA Astrophysics Data System (ADS)

Sapundzhiev, M.; Evtimov, I.; Ivanov, R.

2017-10-01

The paper presents an upgraded methodology for determining the required electric motor power from the acceleration time. The influence of the speed factor of the electric motor on the required power at the same acceleration time is studied. Calculations based on a real vehicle were made, and the numerical and graphical results are given. They show that the required power decreases as the speed factor of the motor increases, because a high speed factor allows the use of a larger range of the characteristic with the maximum power of the motor. The methodology was verified experimentally.
Ukraine: The Lingering Soviet Headache and 25+ Years of Hybrid Rule

DTIC Science & Technology

2017-06-01

much time and money is invested by the United States and international organizations for the development of democracies around the world. The amount of...relationship between the government and economy. The intermingling of money and politics continues even after reforms accelerated in the 2000s, giving...prices for basic commodities. Ukraine’s economy is far from a free market model found in most well-developed democracies. As a result of this
Fast direct fourier reconstruction of radial and PROPELLER MRI data using the chirp transform algorithm on graphics hardware.

PubMed

Feng, Yanqiu; Song, Yanli; Wang, Cong; Xin, Xuegang; Feng, Qianjin; Chen, Wufan

2013-10-01

To develop and test a new algorithm for fast direct Fourier transform (DrFT) reconstruction of MR data on non-Cartesian trajectories composed of lines with equally spaced points. The DrFT, which is normally used as a reference in evaluating the accuracy of other reconstruction methods, can reconstruct images directly from non-Cartesian MR data without interpolation. However, DrFT reconstruction is computationally intensive, which makes the DrFT impractical for routine clinical applications. In this article, the Chirp transform algorithm was introduced to accelerate the DrFT reconstruction of radial and Periodically Rotated Overlapping ParallEL Lines with Enhanced Reconstruction (PROPELLER) MRI data located on trajectories that are composed of lines with equally spaced points. The performance of the proposed Chirp transform algorithm-DrFT algorithm was evaluated by using simulation and in vivo MRI data. After implementing the algorithm on a graphics processing unit, the proposed Chirp transform algorithm-DrFT algorithm achieved an acceleration of approximately one order of magnitude, and the speed-up factor was further increased to approximately three orders of magnitude compared with the traditional single-thread DrFT reconstruction. Implemented on the graphics processing unit, the Chirp transform algorithm-DrFT algorithm can efficiently calculate the DrFT reconstruction of radial and PROPELLER MRI data. Copyright © 2012 Wiley Periodicals, Inc.
GPU acceleration of Runge Kutta-Fehlberg and its comparison with Dormand-Prince method

NASA Astrophysics Data System (ADS)

Seen, Wo Mei; Gobithaasan, R. U.; Miura, Kenjiro T.

2014-07-01

The emergence of Graphic Processing Units (GPUs) has brought a significant reduction in processing time and a speedup in computer graphics performance. GPUs have been developed to surpass the Central Processing Unit (CPU) in terms of performance and processing speed. This evolution has opened up a new area in computing and research in which highly parallel GPUs are used for non-graphical algorithms. Physical or phenomenal simulations and modelling can be accelerated through General Purpose Graphic Processing Units (GPGPU) and Compute Unified Device Architecture (CUDA) implementations. These phenomena can be represented with mathematical models in the form of Ordinary Differential Equations (ODEs), which capture the rate of change between independent and dependent variables. ODEs are numerically integrated over time in order to simulate these behaviours. The classical Runge-Kutta (RK) scheme is the common method used to numerically solve ODEs. The Runge-Kutta-Fehlberg (RKF) scheme has been specially developed to provide an estimate of the principal local truncation error at each step, known as the embedding estimate technique. This paper delves into the implementation of the RKF scheme for GPU devices and compares its results with the Dormand-Prince method. A pseudo code is developed to show the implementation in detail. Hence, practitioners will be able to understand the data allocation in the GPU, the formation of RKF kernels and the flow of data to/from GPU-CPU upon RKF kernel evaluation. The pseudo code is then written in C Language and two ODE models are executed to show the achievable speedup as compared to a CPU implementation. The accuracy and efficiency of the proposed implementation method are discussed in the final section of this paper.
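The embedded error estimate mentioned in the abstract above can be illustrated with a minimal CPU-only sketch. This is not the paper's pseudo code or its GPU kernels: it is a scalar Runge-Kutta-Fehlberg 4(5) step using the standard Fehlberg coefficients, and the names `rkf45_step` and `integrate`, the tolerance, and the step-size controller constants are assumptions chosen for illustration.

```python
import math


def rkf45_step(f, t, y, h):
    """One Runge-Kutta-Fehlberg 4(5) step for a scalar ODE y' = f(t, y).

    Returns the 4th-order solution and the embedded error estimate,
    i.e. the absolute difference between the 5th- and 4th-order results.
    """
    k1 = h * f(t, y)
    k2 = h * f(t + h / 4, y + k1 / 4)
    k3 = h * f(t + 3 * h / 8, y + 3 * k1 / 32 + 9 * k2 / 32)
    k4 = h * f(t + 12 * h / 13,
               y + 1932 * k1 / 2197 - 7200 * k2 / 2197 + 7296 * k3 / 2197)
    k5 = h * f(t + h,
               y + 439 * k1 / 216 - 8 * k2 + 3680 * k3 / 513 - 845 * k4 / 4104)
    k6 = h * f(t + h / 2,
               y - 8 * k1 / 27 + 2 * k2 - 3544 * k3 / 2565
               + 1859 * k4 / 4104 - 11 * k5 / 40)
    y4 = y + 25 * k1 / 216 + 1408 * k3 / 2565 + 2197 * k4 / 4104 - k5 / 5
    y5 = (y + 16 * k1 / 135 + 6656 * k3 / 12825 + 28561 * k4 / 56430
          - 9 * k5 / 50 + 2 * k6 / 55)
    return y4, abs(y5 - y4)


def integrate(f, t0, y0, t_end, h=0.1, tol=1e-8):
    """Adaptive integration from t0 to t_end (hypothetical driver)."""
    t, y = t0, y0
    while t_end - t > 1e-12:
        h = min(h, t_end - t)          # do not step past the end point
        y_new, err = rkf45_step(f, t, y, h)
        if err <= tol:                 # accept the step
            t += h
            y = y_new
        # Grow or shrink the step from the error estimate (assumed controller).
        scale = 0.9 * (tol / err) ** 0.2 if err > 0 else 2.0
        h *= min(2.0, max(0.2, scale))
    return y
```

For example, `integrate(lambda t, y: -y, 0.0, 1.0, 1.0)` integrates the decay equation y' = -y over one time unit and can be compared against `math.exp(-1.0)`. On a GPU, one plausible mapping is to run many such independent integrations in parallel, but that design is an assumption here, not a description of the paper's kernels.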
Graphics Processing Unit-Accelerated Nonrigid Registration of MR Images to CT Images During CT-Guided Percutaneous Liver Tumor Ablations.

PubMed

Tokuda, Junichi; Plishker, William; Torabi, Meysam; Olubiyi, Olutayo I; Zaki, George; Tatli, Servet; Silverman, Stuart G; Shekher, Raj; Hata, Nobuhiko

2015-06-01

Accuracy and speed are essential for the intraprocedural nonrigid magnetic resonance (MR) to computed tomography (CT) image registration in the assessment of tumor margins during CT-guided liver tumor ablations. Although both accuracy and speed can be improved by limiting the registration to a region of interest (ROI), manual contouring of the ROI prolongs the registration process substantially. To achieve accurate and fast registration without the use of an ROI, we combined a nonrigid registration technique on the basis of volume subdivision with hardware acceleration using a graphics processing unit (GPU). We compared the registration accuracy and processing time of GPU-accelerated volume subdivision-based nonrigid registration technique to the conventional nonrigid B-spline registration technique. Fourteen image data sets of preprocedural MR and intraprocedural CT images for percutaneous CT-guided liver tumor ablations were obtained. Each set of images was registered using the GPU-accelerated volume subdivision technique and the B-spline technique. Manual contouring of ROI was used only for the B-spline technique. Registration accuracies (Dice similarity coefficient [DSC] and 95% Hausdorff distance [HD]) and total processing time including contouring of ROIs and computation were compared using a paired Student t test. Accuracies of the GPU-accelerated registrations and B-spline registrations, respectively, were 88.3 ± 3.7% versus 89.3 ± 4.9% (P = .41) for DSC and 13.1 ± 5.2 versus 11.4 ± 6.3 mm (P = .15) for HD. Total processing time of the GPU-accelerated registration and B-spline registration techniques was 88 ± 14 versus 557 ± 116 seconds (P < .000000002), respectively; there was no significant difference in computation time despite the difference in the complexity of the algorithms (P = .71). The GPU-accelerated volume subdivision technique was as accurate as the B-spline technique and required significantly less processing time. The GPU-accelerated volume subdivision technique may enable the implementation of nonrigid registration into routine clinical practice. Copyright © 2015 AUR. Published by Elsevier Inc. All rights reserved.
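The Dice similarity coefficient used as an accuracy measure in the abstract above is defined as DSC = 2|A ∩ B| / (|A| + |B|) for two segmented regions A and B. A minimal sketch follows, assuming each region is given as a collection of voxel indices; the function name `dice_coefficient` and this set-based representation are hypothetical and not taken from the study.

```python
def dice_coefficient(region_a, region_b):
    """Dice similarity coefficient of two regions given as voxel-index collections.

    Returns a value between 0.0 (no overlap) and 1.0 (identical regions).
    Two empty regions are treated as identical (an assumed convention).
    """
    a, b = set(region_a), set(region_b)
    if not a and not b:
        return 1.0
    return 2.0 * len(a & b) / (len(a) + len(b))
```

For two regions of three voxels each that share two voxels, the coefficient is 2·2 / (3 + 3), or about 0.67; a value such as the 88.3% reported above corresponds to 0.883 on this scale.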
Project Physics Text 1, Concepts of Motion.

ERIC Educational Resources Information Center

Harvard Univ., Cambridge, MA. Harvard Project Physics.

Fundamental concepts of motion are presented in this first unit of the Project Physics textbook. Descriptions of motion are made in connection with speeds, accelerations, and their graphical representation. Free-fall bodies are analyzed by using Aristotle's theory and Galileo's work. Dynamics aspects are discussed with a background of mass, force,…
LOD-Sprite Technique for Accelerated Terrain Rendering

DTIC Science & Technology

1999-01-01

includes limited parallax, is possible. Another category samples the full plenoptic function, resulting in 3D, 4D or even 5D image sprites [13, 10... Plenoptic modeling: An image-based rendering system. Computer Graphics (Proc. SIGGRAPH ’95), pages 39–46, 1995. [19] P. Rademacher and G. Bishop
Dual Enrollment--An Opportunity for Acceleration of Education.

ERIC Educational Resources Information Center

DeLuca, James P.

1978-01-01

The chairman of the Department of Graphic Arts and Advertising Technology at New York City Community College describes the department's dual enrollment program, which offers qualified high school seniors the opportunity to take college courses in an occupational curriculum as part-time matriculated college students while completing their high…
Muon Accelerator Program (MAP) | Neutrino Factory | Research Goals

Science.gov Websites

; Committees Research Goals Research & Development Design & Simulation Technology Development Systems Demonstrations Activities MASS Muon Cooling MuCool Test Area MICE Experiment MERIT Muon Collider Research Goals Why Muons at the Energy Frontier? How does it work? Graphics Animation Neutrino Factory Research Goals
Statistical evaluation of accelerated stability data obtained at a single temperature. I. Effect of experimental errors in evaluation of stability data obtained.

PubMed

Yoshioka, S; Aso, Y; Takeda, Y

1990-06-01

Accelerated stability data obtained at a single temperature is statistically evaluated, and the utility of such data for assessment of stability is discussed focussing on the chemical stability of solution-state dosage forms. The probability that the drug content of a product is observed to be within the lower specification limit in the accelerated test is interpreted graphically. This probability depends on experimental errors in the assay and temperature control, as well as the true degradation rate and activation energy. Therefore, the observation that the drug content meets the specification in the accelerated testing can provide only limited information on the shelf-life of the drug, without the knowledge of the activation energy and the accuracy and precision of the assay and temperature control.
Implementation of Multipattern String Matching Accelerated with GPU for Intrusion Detection System

NASA Astrophysics Data System (ADS)

Nehemia, Rangga; Lim, Charles; Galinium, Maulahikmah; Rinaldi Widianto, Ahmad

2017-04-01

As Internet-related security threats continue to increase in volume and sophistication, existing Intrusion Detection Systems (IDS) are challenged to keep pace with current Internet development. A Multi Pattern String Matching algorithm accelerated with a Graphical Processing Unit (GPU) is utilized to improve the packet scanning performance of the IDS. This paper implements a Multi Pattern String Matching algorithm, also called Parallel Failureless Aho Corasick, accelerated with a GPU to improve the performance of the IDS. The OpenCL library is used to allow the IDS to support various GPUs, including popular GPUs such as those from NVIDIA and AMD, which were used in our research. The experimental results show that applying Multi Pattern String Matching on a GPU-accelerated platform provides a speedup of up to 141% in terms of throughput compared to the previous research.
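The idea behind Parallel Failureless Aho Corasick, as it is commonly described, can be sketched on the CPU: the patterns are stored in a trie without failure links, and each GPU thread starts at one input position and walks the trie until no transition exists. The sketch below simulates the threads with an ordinary loop. It is not the paper's OpenCL implementation, and the names `build_trie`, `pfac_thread` and `pfac_scan` are hypothetical.

```python
def build_trie(patterns):
    """Build a trie with no failure links.

    trie[state] maps a character to the next state; accept maps an
    accepting state to the index of the pattern that ends there.
    """
    trie = [{}]
    accept = {}
    for index, pattern in enumerate(patterns):
        state = 0
        for ch in pattern:
            if ch not in trie[state]:
                trie.append({})
                trie[state][ch] = len(trie) - 1
            state = trie[state][ch]
        accept[state] = index
    return trie, accept


def pfac_thread(trie, accept, text, start):
    """Work of one simulated thread: match only patterns beginning at `start`."""
    matches = []
    state = 0
    for pos in range(start, len(text)):
        nxt = trie[state].get(text[pos])
        if nxt is None:        # no transition: the thread simply terminates
            break
        state = nxt
        if state in accept:
            matches.append((start, accept[state]))
    return matches


def pfac_scan(patterns, text):
    """Run one simulated thread per input position and collect all matches."""
    trie, accept = build_trie(patterns)
    results = []
    for start in range(len(text)):   # on a GPU these iterations run in parallel
        results.extend(pfac_thread(trie, accept, text, start))
    return results
```

Scanning the text "ushers" for the patterns "he", "she", "his" and "hers" reports "she" at position 1 and both "he" and "hers" at position 2. Because no thread depends on another, the outer loop maps directly onto one work-item per input byte.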
Graphics Processing Units for HEP trigger systems

NASA Astrophysics Data System (ADS)

Ammendola, R.; Bauce, M.; Biagioni, A.; Chiozzi, S.; Cotta Ramusino, A.; Fantechi, R.; Fiorini, M.; Giagu, S.; Gianoli, A.; Lamanna, G.; Lonardo, A.; Messina, A.; Neri, I.; Paolucci, P. S.; Piandani, R.; Pontisso, L.; Rescigno, M.; Simula, F.; Sozzi, M.; Vicini, P.

2016-07-01

General-purpose computing on GPUs (Graphics Processing Units) is emerging as a new paradigm in several fields of science, although so far applications have been tailored to the specific strengths of such devices as accelerators in offline computation. With the steady reduction of GPU latencies, and the increase in link and memory throughput, the use of such devices for real-time applications in high-energy physics data acquisition and trigger systems is becoming ripe. We will discuss the use of online parallel computing on GPUs for a synchronous low-level trigger, focusing on the CERN NA62 experiment trigger system. The use of GPUs in higher-level trigger systems is also briefly considered.
Design and validation of an improved graphical user interface with the 'Tool ball'.

PubMed

Lee, Kuo-Wei; Lee, Ying-Chu

2012-01-01

The purpose of this research is to introduce the design of an improved graphical user interface (GUI) and to verify the operational efficiency of the proposed interface. Until now, clicking the toolbar with the mouse has been the usual way to operate software functions. In our research, we designed an improved graphical user interface - a tool ball that is operated by a mouse wheel to perform software functions. Several experiments were conducted to measure the time needed to operate certain software functions with the traditional combination of "mouse click + tool button" and the proposed integration of "mouse wheel + tool ball". The results indicate that the tool ball design can accelerate the speed of operating software functions, decrease the number of icons on the screen, and enlarge the applications of the mouse wheel. Copyright © 2011 Elsevier Ltd and The Ergonomics Society. All rights reserved.
Accelerating 3D Hall MHD Magnetosphere Simulations with Graphics Processing Units

NASA Astrophysics Data System (ADS)

Bard, C.; Dorelli, J.

2017-12-01

The resolution required to simulate planetary magnetospheres with Hall magnetohydrodynamics results in program sizes approaching several hundred million grid cells. These would take years to run on a single computational core and require hundreds or thousands of computational cores to complete in a reasonable time. However, this requires access to the largest supercomputers. Graphics processing units (GPUs) provide a viable alternative: one GPU can do the work of roughly 100 cores, bringing Hall MHD simulations of Ganymede within reach of modest GPU clusters (8 GPUs). We report our progress in developing a GPU-accelerated, three-dimensional Hall magnetohydrodynamic code and present Hall MHD simulation results for both Ganymede (run on 8 GPUs) and Mercury (56 GPUs). We benchmark our Ganymede simulation with previous results for the Galileo G8 flyby, namely that adding the Hall term to ideal MHD simulations changes the global convection pattern within the magnetosphere. Additionally, we present new results for the G1 flyby as well as initial results from Hall MHD simulations of Mercury and compare them with the corresponding ideal MHD runs.
A new approach to fluid-structure interaction within graphics hardware accelerated smooth particle hydrodynamics considering heterogeneous particle size distribution

NASA Astrophysics Data System (ADS)

Eghtesad, Adnan; Knezevic, Marko

2018-07-01

A corrective smooth particle method (CSPM) within smooth particle hydrodynamics (SPH) is used to study the deformation of an aircraft structure under high-velocity water-ditching impact load. The CSPM-SPH method features a new approach for the prediction of two-way fluid-structure interaction coupling. Results indicate that the implementation is well suited for modeling the deformation of structures under high-velocity impact into water as evident from the predicted stress and strain localizations in the aircraft structure as well as the integrity of the impacted interfaces, which show no artificial particle penetrations. To reduce the simulation time, a heterogeneous particle size distribution over a complex three-dimensional geometry is used. The variable particle size is achieved from a finite element mesh with variable element size and, as a result, variable nodal (i.e., SPH particle) spacing. To further accelerate the simulations, the SPH code is ported to a graphics processing unit using the OpenACC standard. The implementation and simulation results are described and discussed in this paper.
Accelerating Monte Carlo simulations of photon transport in a voxelized geometry using a massively parallel graphics processing unit.

PubMed

Badal, Andreu; Badano, Aldo

2009-11-01

It is a known fact that Monte Carlo simulations of radiation transport are computationally intensive and may require long computing times. The authors introduce a new paradigm for the acceleration of Monte Carlo simulations: The use of a graphics processing unit (GPU) as the main computing device instead of a central processing unit (CPU). A GPU-based Monte Carlo code that simulates photon transport in a voxelized geometry with the accurate physics models from PENELOPE has been developed using the CUDA™ programming model (NVIDIA Corporation, Santa Clara, CA). An outline of the new code and a sample x-ray imaging simulation with an anthropomorphic phantom are presented. A remarkable 27-fold speed up factor was obtained using a GPU compared to a single core CPU. The reported results show that GPUs are currently a good alternative to CPUs for the simulation of radiation transport. Since the performance of GPUs is currently increasing at a faster pace than that of CPUs, the advantages of GPU-based software are likely to be more pronounced in the future.
Acceleration of High Angular Momentum Electron Repulsion Integrals and Integral Derivatives on Graphics Processing Units.

PubMed

Miao, Yipu; Merz, Kenneth M

2015-04-14

We present an efficient implementation of ab initio self-consistent field (SCF) energy and gradient calculations that run on Compute Unified Device Architecture (CUDA) enabled graphical processing units (GPUs) using recurrence relations. We first discuss the machine-generated code that calculates the electron-repulsion integrals (ERIs) for different ERI types. Next we describe the porting of the SCF gradient calculation to GPUs, which results in an acceleration of the computation of the first-order derivative of the ERIs. However, only s, p, and d ERIs and s and p derivatives could be executed simultaneously on GPUs using the current version of CUDA and generation of NVidia GPUs using a previously described algorithm [Miao and Merz J. Chem. Theory Comput. 2013, 9, 965-976.]. Hence, we developed an algorithm to compute f type ERIs and d type ERI derivatives on GPUs. Our benchmarks show that the GPU-enabled ERI and ERI derivative computations yielded speedups of 10-18 times relative to traditional CPU execution. An accuracy analysis using double-precision calculations demonstrates that the overall accuracy is satisfactory for most applications.
A new approach to fluid-structure interaction within graphics hardware accelerated smooth particle hydrodynamics considering heterogeneous particle size distribution

NASA Astrophysics Data System (ADS)

Eghtesad, Adnan; Knezevic, Marko

2017-12-01

A corrective smooth particle method (CSPM) within smooth particle hydrodynamics (SPH) is used to study the deformation of an aircraft structure under high-velocity water-ditching impact load. The CSPM-SPH method features a new approach for the prediction of two-way fluid-structure interaction coupling. Results indicate that the implementation is well suited for modeling the deformation of structures under high-velocity impact into water as evident from the predicted stress and strain localizations in the aircraft structure as well as the integrity of the impacted interfaces, which show no artificial particle penetrations. To reduce the simulation time, a heterogeneous particle size distribution over a complex three-dimensional geometry is used. The variable particle size is achieved from a finite element mesh with variable element size and, as a result, variable nodal (i.e., SPH particle) spacing. To further accelerate the simulations, the SPH code is ported to a graphics processing unit using the OpenACC standard. The implementation and simulation results are described and discussed in this paper.
Accelerated rescaling of single Monte Carlo simulation runs with the Graphics Processing Unit (GPU).

PubMed

Yang, Owen; Choi, Bernard

2013-01-01

To interpret fiber-based and camera-based measurements of remitted light from biological tissues, researchers typically use analytical models, such as the diffusion approximation to light transport theory, or stochastic models, such as Monte Carlo modeling. To achieve rapid (ideally real-time) measurement of tissue optical properties, especially in clinical situations, there is a critical need to accelerate Monte Carlo simulation runs. In this manuscript, we report on our approach using the Graphics Processing Unit (GPU) to accelerate rescaling of single Monte Carlo runs to rapidly calculate diffuse reflectance values for different sets of tissue optical properties. We selected MATLAB to enable non-specialists in C and CUDA-based programming to use the generated open-source code. We developed a software package with four abstraction layers. To calculate a set of diffuse reflectance values from a simulated tissue with homogeneous optical properties, our rescaling GPU-based approach achieves a reduction in computation time of several orders of magnitude as compared to other GPU-based approaches. Specifically, our GPU-based approach generated a diffuse reflectance value in 0.08 ms. The transfer time from CPU to GPU memory is currently a limiting factor with GPU-based calculations. However, for calculation of multiple diffuse reflectance values, our GPU-based approach can still lead to processing that is ~3400 times faster than other GPU-based approaches.
Higher-order ice-sheet modelling accelerated by multigrid on graphics cards

NASA Astrophysics Data System (ADS)

Brædstrup, Christian; Egholm, David

2013-04-01

Higher-order ice flow modelling is a very computer intensive process owing primarily to the nonlinear influence of the horizontal stress coupling. When applied for simulating long-term glacial landscape evolution, the ice-sheet models must consider very long time series, while both high temporal and spatial resolution are needed to resolve small effects. Higher-order and full Stokes models have therefore seen very limited use in this field. However, recent advances in graphics card (GPU) technology for high performance computing have proven extremely efficient in accelerating many large-scale scientific computations. The general purpose GPU (GPGPU) technology is cheap, has a low power consumption and fits into a normal desktop computer. It could therefore provide a powerful tool for many glaciologists working on ice flow models. Our current research focuses on utilising the GPU as a tool in ice-sheet and glacier modelling. To this end we have implemented the Integrated Second-Order Shallow Ice Approximation (iSOSIA) equations on the device using the finite difference method. To accelerate the computations, the GPU solver uses a non-linear Red-Black Gauss-Seidel iterator coupled with a Full Approximation Scheme (FAS) multigrid setup to further aid convergence. The GPU finite difference implementation provides the inherent parallelization that scales from hundreds to several thousands of cores on newer cards. We demonstrate the efficiency of the GPU multigrid solver using benchmark experiments.
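The Red-Black Gauss-Seidel ordering named in the abstract above can be illustrated on a much simpler problem. The sketch below is an assumption-laden stand-in: it solves the linear Poisson equation -∇²u = f on a square grid with a five-point stencil, not the non-linear iSOSIA equations, and it omits the FAS multigrid cycle entirely. The names `red_black_sweep` and `solve_laplace` are hypothetical.

```python
def red_black_sweep(u, f, h):
    """One red-black Gauss-Seidel sweep for -laplace(u) = f (5-point stencil).

    Points with (i + j) even are updated first, then points with (i + j) odd.
    Within one colour no point depends on another point of the same colour,
    so each half-sweep could be executed fully in parallel on a GPU.
    """
    n = len(u)
    for colour in (0, 1):
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                if (i + j) % 2 == colour:
                    u[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                      + u[i][j - 1] + u[i][j + 1]
                                      + h * h * f[i][j])


def solve_laplace(boundary, n, sweeps):
    """Solve laplace(u) = 0 on the unit square with Dirichlet boundary values.

    `boundary(x, y)` gives the value on the edge; interior points start at 0.
    """
    h = 1.0 / (n - 1)
    u = [[boundary(i * h, j * h) if i in (0, n - 1) or j in (0, n - 1) else 0.0
          for j in range(n)] for i in range(n)]
    f = [[0.0] * n for _ in range(n)]
    for _ in range(sweeps):
        red_black_sweep(u, f, h)
    return u
```

With the linear boundary function `lambda x, y: x + 2.0 * y`, the exact discrete solution is the same linear function, which makes convergence easy to check. The two-colour ordering is what removes the sequential dependency of ordinary Gauss-Seidel; multigrid would additionally accelerate the slow decay of smooth error components that is visible on fine grids.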

Software package for modeling spin-orbit motion in storage rings

NASA Astrophysics Data System (ADS)

Zyuzin, D. V.

2015-12-01

A software package providing a graphical user interface for computer experiments on the motion of charged particle beams in accelerators, as well as analysis of obtained data, is presented. The software package was tested in the framework of the international project on electric dipole moment measurement JEDI (Jülich Electric Dipole moment Investigations). The specific features of particle spin motion imply the requirement to use a cyclic accelerator (storage ring) consisting of electrostatic elements, which makes it possible to preserve horizontal polarization for a long time. Computer experiments study the dynamics of 10^6-10^9 particles in a beam during 10^9 turns in an accelerator (about 10^12-10^15 integration steps for the equations of motion). For designing an optimal accelerator structure, a large number of computer experiments on polarized beam dynamics are required. The numerical core of the package is COSY Infinity, a program for modeling spin-orbit dynamics.
FPGA acceleration of rigid-molecule docking codes

PubMed Central

Sukhwani, B.; Herbordt, M.C.

2011-01-01

Modelling the interactions of biological molecules, or docking, is critical both to understanding basic life processes and to designing new drugs. The field programmable gate array (FPGA) based acceleration of a recently developed, complex, production docking code is described. The authors found that it is necessary to extend their previous three-dimensional (3D) correlation structure in several ways, most significantly to support simultaneous computation of several correlation functions. The result for small-molecule docking is a 100-fold speed-up of a section of the code that represents over 95% of the original run-time. An additional 2% is accelerated through a previously described method, yielding a total acceleration of 36× over a single core and 10× over a quad-core. This approach is found to be an ideal complement to graphics processing unit (GPU) based docking, which excels in the protein–protein domain. PMID:21857870
GAPD: a GPU-accelerated atom-based polychromatic diffraction simulation code.

PubMed

E, J C; Wang, L; Chen, S; Zhang, Y Y; Luo, S N

2018-03-01

GAPD, a graphics-processing-unit (GPU)-accelerated atom-based polychromatic diffraction simulation code for direct, kinematics-based, simulations of X-ray/electron diffraction of large-scale atomic systems with mono-/polychromatic beams and arbitrary plane detector geometries, is presented. This code implements GPU parallel computation via both real- and reciprocal-space decompositions. With GAPD, direct simulations are performed of the reciprocal lattice node of ultralarge systems (∼5 billion atoms) and diffraction patterns of single-crystal and polycrystalline configurations with mono- and polychromatic X-ray beams (including synchrotron undulator sources), and validation, benchmark and application cases are presented.
Dispersion relation and electron acceleration in the combined circular and elliptical metallic-dielectric waveguide filled by plasma

NASA Astrophysics Data System (ADS)

Abdoli-Arani, A.; Montazeri, M. M.

2018-04-01

Two special types of metallic waveguide having dielectric cladding and plasma core including the combined circular and elliptical structure are studied. Longitudinal and transverse field components in the different regions are obtained. Applying the boundary conditions, dispersion relations of the electromagnetic waves in the structures are obtained and then plotted. The acceleration of an injected external relativistic electron in the considered waveguides is studied. The obtained differential equations related to electron motion are solved by the fourth-order Runge-Kutta method. Numerical computations are made, and the results are graphically presented.
Accelerator System Model (ASM) user manual with physics and engineering model documentation. ASM version 1.0

DOE Office of Scientific and Technical Information (OSTI.GOV)

NONE

1993-07-01

The Accelerator System Model (ASM) is a computer program developed to model proton radiofrequency accelerators and to carry out system level trade studies. The ASM FORTRAN subroutines are incorporated into an intuitive graphical user interface which provides for the "construction" of the accelerator in a window on the computer screen. The interface is based on the Shell for Particle Accelerator Related Codes (SPARC) software technology written for the Macintosh operating system in the C programming language. This User Manual describes the operation and use of the ASM application within the SPARC interface. The Appendix provides a detailed description of the physics and engineering models used in ASM. ASM Version 1.0 is a joint project of G. H. Gillespie Associates, Inc. and the Accelerator Technology (AT) Division of the Los Alamos National Laboratory. Neither the ASM Version 1.0 software nor this ASM Documentation may be reproduced without the expressed written consent of both the Los Alamos National Laboratory and G. H. Gillespie Associates, Inc.
Residual acceleration data on IML-1: Development of a data reduction and dissemination plan

NASA Technical Reports Server (NTRS)

Rogers, Melissa J. B.; Alexander, J. Iwan D.; Wolf, Randy

1992-01-01

The main thrust of our work in the third year of contract NAG8-759 was the development and analysis of various data processing techniques that may be applicable to residual acceleration data. Our goal is the development of a data processing guide that low gravity principal investigators can use to assess their need for accelerometer data and then formulate an acceleration data analysis strategy. The work focused on the flight of the first International Microgravity Laboratory (IML-1) mission. We are also developing a data base management system to handle large quantities of residual acceleration data. This type of system should be an integral tool in the detailed analysis of accelerometer data. The system will manage a large graphics data base in the support of supervised and unsupervised pattern recognition. The goal of the pattern recognition phase is to identify specific classes of accelerations so that these classes can be easily recognized in any data base. The data base management system is being tested on the Spacelab 3 (SL3) residual acceleration data.
NASA's Agricultural Program: A USDA/Grower Partnership

NASA Technical Reports Server (NTRS)

McKellip, Rodney; Thomas, Michael

2002-01-01

Ag20/20 is a partnership between USDA, NASA, and four national commodity associations. It is driven by the information needs of U.S. farmers. Ag20/20 is focused on utilization of earth science and remote sensing for decision-making and oriented toward economically viable operational solutions. Its purpose is to accelerate the use of remote sensing and other geospatial technologies on the farm to: 1) Increase the production efficiency of the American farmer; 2) Reduce crop production risks; 3) Improve environmental stewardship tools for agricultural production.
GPU-accelerated non-uniform fast Fourier transform-based compressive sensing spectral domain optical coherence tomography.

PubMed

Xu, Daguang; Huang, Yong; Kang, Jin U

2014-06-16

We implemented graphics processing unit (GPU) accelerated compressive sensing (CS) spectral domain optical coherence tomography (SD OCT) with non-uniform sampling in k-space. The Kaiser-Bessel (KB) function and the Gaussian function are used independently as the convolution kernel in the gridding-based non-uniform fast Fourier transform (NUFFT) algorithm with different oversampling ratios and kernel widths. Our implementation is compared with the GPU-accelerated modified non-uniform discrete Fourier transform (MNUDFT) matrix-based CS SD OCT and the GPU-accelerated fast Fourier transform (FFT)-based CS SD OCT. It was found that our implementation has comparable performance to the GPU-accelerated MNUDFT-based CS SD OCT in terms of image quality while providing more than 5 times speed enhancement. When compared to the GPU-accelerated FFT-based CS SD OCT, it shows smaller background noise and fewer side lobes while eliminating the need for the cumbersome k-space grid filling and the k-linear calibration procedure. Finally, we demonstrated that by using a conventional desktop computer architecture having three GPUs, real-time B-mode imaging can be obtained in excess of 30 fps for the GPU-accelerated NUFFT-based CS SD OCT with frame size 2048 (axial) × 1,000 (lateral).
Best bang for your buck: GPU nodes for GROMACS biomolecular simulations

PubMed Central

Páll, Szilárd; Fechner, Martin; Esztermann, Ansgar; de Groot, Bert L.; Grubmüller, Helmut

2015-01-01

The molecular dynamics simulation package GROMACS runs efficiently on a wide variety of hardware from commodity workstations to high performance computing clusters. Hardware features are well‐exploited with a combination of single instruction multiple data, multithreading, and message passing interface (MPI)‐based single program multiple data/multiple program multiple data parallelism while graphics processing units (GPUs) can be used as accelerators to compute interactions off‐loaded from the CPU. Here, we evaluate which hardware produces trajectories with GROMACS 4.6 or 5.0 in the most economical way. We have assembled and benchmarked compute nodes with various CPU/GPU combinations to identify optimal compositions in terms of raw trajectory production rate, performance‐to‐price ratio, energy efficiency, and several other criteria. Although hardware prices are naturally subject to trends and fluctuations, general tendencies are clearly visible. Adding any type of GPU significantly boosts a node's simulation performance. For inexpensive consumer‐class GPUs this improvement equally reflects in the performance‐to‐price ratio. Although memory issues in consumer‐class GPUs could pass unnoticed as these cards do not support error checking and correction memory, unreliable GPUs can be sorted out with memory checking tools. Apart from the obvious determinants for cost‐efficiency like hardware expenses and raw performance, the energy consumption of a node is a major cost factor. Over the typical hardware lifetime until replacement of a few years, the costs for electrical power and cooling can become larger than the costs of the hardware itself. Taking that into account, nodes with a well‐balanced ratio of CPU and consumer‐class GPU resources produce the maximum amount of GROMACS trajectory over their lifetime. © 2015 The Authors. Journal of Computational Chemistry Published by Wiley Periodicals, Inc. PMID:26238484
Best bang for your buck: GPU nodes for GROMACS biomolecular simulations.

PubMed

Kutzner, Carsten; Páll, Szilárd; Fechner, Martin; Esztermann, Ansgar; de Groot, Bert L; Grubmüller, Helmut

2015-10-05

The molecular dynamics simulation package GROMACS runs efficiently on a wide variety of hardware from commodity workstations to high performance computing clusters. Hardware features are well-exploited with a combination of single instruction multiple data, multithreading, and message passing interface (MPI)-based single program multiple data/multiple program multiple data parallelism while graphics processing units (GPUs) can be used as accelerators to compute interactions off-loaded from the CPU. Here, we evaluate which hardware produces trajectories with GROMACS 4.6 or 5.0 in the most economical way. We have assembled and benchmarked compute nodes with various CPU/GPU combinations to identify optimal compositions in terms of raw trajectory production rate, performance-to-price ratio, energy efficiency, and several other criteria. Although hardware prices are naturally subject to trends and fluctuations, general tendencies are clearly visible. Adding any type of GPU significantly boosts a node's simulation performance. For inexpensive consumer-class GPUs this improvement equally reflects in the performance-to-price ratio. Although memory issues in consumer-class GPUs could pass unnoticed as these cards do not support error checking and correction memory, unreliable GPUs can be sorted out with memory checking tools. Apart from the obvious determinants for cost-efficiency like hardware expenses and raw performance, the energy consumption of a node is a major cost factor. Over the typical hardware lifetime until replacement of a few years, the costs for electrical power and cooling can become larger than the costs of the hardware itself. Taking that into account, nodes with a well-balanced ratio of CPU and consumer-class GPU resources produce the maximum amount of GROMACS trajectory over their lifetime. © 2015 The Authors. Journal of Computational Chemistry Published by Wiley Periodicals, Inc.
Equalizer: a scalable parallel rendering framework.

PubMed

Eilemann, Stefan; Makhinya, Maxim; Pajarola, Renato

2009-01-01

Continuing improvements in CPU and GPU performances as well as increasing multi-core processor and cluster-based parallelism demand for flexible and scalable parallel rendering solutions that can exploit multipipe hardware accelerated graphics. In fact, to achieve interactive visualization, scalable rendering systems are essential to cope with the rapid growth of data sets. However, parallel rendering systems are non-trivial to develop and often only application specific implementations have been proposed. The task of developing a scalable parallel rendering framework is even more difficult if it should be generic to support various types of data and visualization applications, and at the same time work efficiently on a cluster with distributed graphics cards. In this paper we introduce a novel system called Equalizer, a toolkit for scalable parallel rendering based on OpenGL which provides an application programming interface (API) to develop scalable graphics applications for a wide range of systems ranging from large distributed visualization clusters and multi-processor multipipe graphics systems to single-processor single-pipe desktop machines. We describe the system architecture, the basic API, discuss its advantages over previous approaches, present example configurations and usage scenarios as well as scalability results.
Virtual suturing simulation based on commodity physics engine for medical learning.

PubMed

Choi, Kup-Sze; Chan, Sze-Ho; Pang, Wai-Man

2012-06-01

Development of virtual-reality medical applications is usually a complicated and labour-intensive task. This paper explores the feasibility of using a commodity physics engine to develop a suturing simulator prototype for manual skills training in the fields of nursing and medicine, so as to enjoy the benefits of rapid development and hardware-accelerated computation. In the prototype, spring-connected boxes of finite dimension are used to simulate soft tissues, whereas needle and thread are modelled with chained segments. Spherical joints are used to simulate the suture's flexibility and to facilitate thread cutting. An algorithm is developed to simulate needle insertion and thread advancement through the tissue. Two-handed manipulations and force feedback are enabled with two haptic devices. Experiments on the closure of a wound show that the prototype is able to simulate suturing procedures at interactive rates. The simulator is also used to study a curvature-adaptive suture modelling technique. Issues and limitations of the proposed approach and future development are discussed.
A Subdivision-Based Representation for Vector Image Editing.

PubMed

Liao, Zicheng; Hoppe, Hugues; Forsyth, David; Yu, Yizhou

2012-11-01

Vector graphics has been employed in a wide variety of applications due to its scalability and editability. Editability is a high priority for artists and designers who wish to produce vector-based graphical content with user interaction. In this paper, we introduce a new vector image representation based on piecewise smooth subdivision surfaces, which is a simple, unified and flexible framework that supports a variety of operations, including shape editing, color editing, image stylization, and vector image processing. These operations effectively create novel vector graphics by reusing and altering existing image vectorization results. Because image vectorization yields an abstraction of the original raster image, controlling the level of detail of this abstraction is highly desirable. To this end, we design a feature-oriented vector image pyramid that offers multiple levels of abstraction simultaneously. Our new vector image representation can be rasterized efficiently using GPU-accelerated subdivision. Experiments indicate that our vector image representation achieves high visual quality and better supports editing operations than existing representations.
Unsteady Blood Flow with Nanoparticles Through Stenosed Arteries in the Presence of Periodic Body Acceleration

NASA Astrophysics Data System (ADS)

Fatin Jamil, Dzuliana; Roslan, Rozaini; Abdulhameed, Mohammed; Che-Him, Norziha; Sufahani, Suliadi; Mohamad, Mahathir; Ghazali Kamardan, Muhamad

2018-04-01

The effects of nanoparticles such as Fe3O4, TiO2, and Cu on blood flow inside a stenosed artery are studied. In this study, blood was modelled as a non-Newtonian Bingham plastic fluid subjected to periodic body acceleration and slip velocity. The flow governing equations were solved analytically by using the perturbation method. Using numerical approaches, the physiological parameters were analyzed, and the blood flow velocity distributions were generated graphically and discussed. The results show that the flow speed increases as the slip velocity increases and decreases as the yield stress increases.
High-throughput sequence alignment using Graphics Processing Units

PubMed Central

Schatz, Michael C; Trapnell, Cole; Delcher, Arthur L; Varshney, Amitabh

2007-01-01

Background: The recent availability of new, less expensive high-throughput DNA sequencing technologies has yielded a dramatic increase in the volume of sequence data that must be analyzed. These data are being generated for several purposes, including genotyping, genome resequencing, metagenomics, and de novo genome assembly projects. Sequence alignment programs such as MUMmer have proven essential for analysis of these data, but researchers will need ever faster, high-throughput alignment tools running on inexpensive hardware to keep up with new sequence technologies. Results: This paper describes MUMmerGPU, an open-source high-throughput parallel pairwise local sequence alignment program that runs on commodity Graphics Processing Units (GPUs) in common workstations. MUMmerGPU uses the new Compute Unified Device Architecture (CUDA) from nVidia to align multiple query sequences against a single reference sequence stored as a suffix tree. By processing the queries in parallel on the highly parallel graphics card, MUMmerGPU achieves more than a 10-fold speedup over a serial CPU version of the sequence alignment kernel, and outperforms the exact alignment component of MUMmer on a high end CPU by 3.5-fold in total application time when aligning reads from recent sequencing projects using Solexa/Illumina, 454, and Sanger sequencing technologies. Conclusion: MUMmerGPU is a low cost, ultra-fast sequence alignment program designed to handle the increasing volume of data produced by new, high-throughput sequencing technologies. MUMmerGPU demonstrates that even memory-intensive applications can run significantly faster on the relatively low-cost GPU than on the CPU. PMID:18070356
Total-dose radiation effects data for semiconductor devices (1989 supplement)

NASA Technical Reports Server (NTRS)

Martin, Keith E.; Coss, James R.; Goben, Charles A.; Shaw, David C.; Farmanesh, Sam; Davarpanah, Michael M.; Craft, Leroy H.; Price, William E.

1990-01-01

Steady state, total dose radiation test data are provided for electronic designers and other personnel using semiconductor devices in a radiation environment. The data are presented in graphic and narrative formats. Two primary radiation source types were used: Cobalt-60 gamma rays and a Dynamitron electron accelerator capable of delivering 2.5 MeV electrons at a steady rate.
ChalkBoard: Mapping Functions to Polygons

NASA Astrophysics Data System (ADS)

Matlage, Kevin; Gill, Andy

ChalkBoard is a domain specific language for describing images. The ChalkBoard language is uncompromisingly functional and encourages the use of modern functional idioms. ChalkBoard uses off-the-shelf graphics cards to speed up rendering of functional descriptions. In this paper, we describe the design of the core ChalkBoard language, and the architecture of our static image generation accelerator.
Accelerating the Gillespie Exact Stochastic Simulation Algorithm using hybrid parallel execution on graphics processing units.

PubMed

Komarov, Ivan; D'Souza, Roshan M

2012-01-01

The Gillespie Stochastic Simulation Algorithm (GSSA) and its variants are cornerstone techniques to simulate reaction kinetics in situations where the concentration of the reactant is too low to allow deterministic techniques such as differential equations. The inherent limitations of the GSSA include the time required for executing a single run and the need for multiple runs for parameter sweep exercises due to the stochastic nature of the simulation. Even very efficient variants of GSSA are prohibitively expensive to compute and perform parameter sweeps. Here we present a novel variant of the exact GSSA that is amenable to acceleration by using graphics processing units (GPUs). We parallelize the execution of a single realization across threads in a warp (fine-grained parallelism). A warp is a collection of threads that are executed synchronously on a single multi-processor. Warps executing in parallel on different multi-processors (coarse-grained parallelism) simultaneously generate multiple trajectories. Novel data-structures and algorithms reduce memory traffic, which is the bottleneck in computing the GSSA. Our benchmarks show an 8×-120× performance gain over various state-of-the-art serial algorithms when simulating different types of models.
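For orientation, the serial baseline that such GPU variants accelerate can be sketched as follows. This is a minimal sketch of the textbook direct-method SSA, not the authors' warp-parallel variant; the function name, argument layout, and the single-reaction decay model in the usage note are illustrative assumptions.

```python
import math
import random

def gillespie_direct(x, stoich, propensity_fns, t_end, rng):
    """Simulate one trajectory with the direct-method SSA up to time t_end.

    x              -- initial species counts
    stoich         -- stoich[j] is the state change caused by reaction j
    propensity_fns -- propensity_fns[j](x) gives the propensity of reaction j
    rng            -- a random.Random instance
    Returns (time of the last reaction fired, final state).
    """
    t = 0.0
    x = list(x)
    while True:
        a = [f(x) for f in propensity_fns]
        a0 = sum(a)
        if a0 <= 0.0:
            break  # no reaction can fire any more
        # Waiting time to the next reaction is exponential with rate a0.
        tau = -math.log(1.0 - rng.random()) / a0
        if t + tau > t_end:
            break
        t += tau
        # Choose reaction j with probability a[j] / a0.
        r = rng.random() * a0
        acc = 0.0
        j = len(a) - 1
        for idx, aj in enumerate(a):
            acc += aj
            if r < acc:
                j = idx
                break
        x = [xi + d for xi, d in zip(x, stoich[j])]
    return t, x
```

As a usage example under these assumptions, a pure decay reaction A → ∅ with hypothetical rate constant 0.5 can be run as `gillespie_direct([50], [[-1]], [lambda s: 0.5 * s[0]], 1e9, random.Random(0))`; the trajectory ends once all 50 molecules have decayed. Each such run is one realization, which is why parameter sweeps need many independent trajectories.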
Efficient molecular dynamics simulations with many-body potentials on graphics processing units

NASA Astrophysics Data System (ADS)

Fan, Zheyong; Chen, Wei; Vierimaa, Ville; Harju, Ari

2017-09-01

Graphics processing units have been extensively used to accelerate classical molecular dynamics simulations. However, there is much less progress on the acceleration of force evaluations for many-body potentials compared to pairwise ones. In the conventional force evaluation algorithm for many-body potentials, the force, virial stress, and heat current for a given atom are accumulated within different loops, which could result in write conflict between different threads in a CUDA kernel. In this work, we provide a new force evaluation algorithm, which is based on an explicit pairwise force expression for many-body potentials derived recently (Fan et al., 2015). In our algorithm, the force, virial stress, and heat current for a given atom can be accumulated within a single thread and is free of write conflicts. We discuss the formulations and algorithms and evaluate their performance. A new open-source code, GPUMD, is developed based on the proposed formulations. For the Tersoff many-body potential, the double precision performance of GPUMD using a Tesla K40 card is equivalent to that of the LAMMPS (Large-scale Atomic/Molecular Massively Parallel Simulator) molecular dynamics code running with about 100 CPU cores (Intel Xeon CPU X5670 @ 2.93 GHz).
Field size dependent mapping of medical linear accelerator radiation leakage

NASA Astrophysics Data System (ADS)

Vũ Bezin, Jérémi; Veres, Attila; Lefkopoulos, Dimitri; Chavaudra, Jean; Deutsch, Eric; de Vathaire, Florent; Diallo, Ibrahima

2015-03-01

The purpose of this study was to investigate the suitability of a graphics library based model for the assessment of linear accelerator radiation leakage. Transmission through the shielding elements was evaluated using the build-up factor corrected exponential attenuation law and the contribution from the electron guide was estimated using the approximation of a linear isotropic radioactive source. Model parameters were estimated by a fitting series of thermoluminescent dosimeter leakage measurements, achieved up to 100 cm from the beam central axis along three directions. The distribution of leakage data at the patient plane reflected the architecture of the shielding elements. Thus, the maximum leakage dose was found under the collimator when only one jaw shielded the primary beam and was about 0.08% of the dose at isocentre. Overall, we observe that the main contributor to leakage dose according to our model was the electron beam guide. Concerning the discrepancies between the measurements used to calibrate the model and the calculations from the model, the average difference was about 7%. Finally, graphics library modelling is a readily and suitable way to estimate leakage dose distribution on a personal computer. Such data could be useful for dosimetric evaluations in late effect studies.
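The build-up factor corrected exponential attenuation law mentioned above has the general form D = D0 · B · exp(-μx). The snippet below is a minimal sketch of that formula only; the function name and any numerical values passed to it are hypothetical and are not parameters fitted in the study.

```python
import math

def transmitted_fraction(mu, thickness, buildup=1.0):
    """Fraction of dose transmitted through a shield: B * exp(-mu * x).

    mu        -- linear attenuation coefficient (1/cm), hypothetical value
    thickness -- shield thickness x (cm)
    buildup   -- build-up factor B accounting for scattered radiation
    """
    return buildup * math.exp(-mu * thickness)
```

For example, `transmitted_fraction(0.5, 2.0)` returns exp(-1), and a build-up factor of 2 doubles that fraction.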

GPU accelerated simulations of 3D deterministic particle transport using discrete ordinates method

NASA Astrophysics Data System (ADS)

Gong, Chunye; Liu, Jie; Chi, Lihua; Huang, Haowei; Fang, Jingyue; Gong, Zhenghu

2011-07-01

The Graphics Processing Unit (GPU), originally developed for real-time, high-definition 3D graphics in computer games, now provides great capability for solving scientific applications. The basis of particle transport simulation is the time-dependent, multi-group, inhomogeneous Boltzmann transport equation. The numerical solution to the Boltzmann equation involves the discrete ordinates (Sn) method and the procedure of source iteration. In this paper, we present a GPU accelerated simulation of one energy group time-independent deterministic discrete ordinates particle transport in 3D Cartesian geometry (Sweep3D). The performance of the GPU simulations is reported for simulations with the vacuum boundary condition. The relative advantages and disadvantages of the GPU implementation, simulation on multiple GPUs, the programming effort, and code portability are also discussed. The results show that the overall performance speedup of one NVIDIA Tesla M2050 GPU ranges from 2.56 compared with one Intel Xeon X5670 chip to 8.14 compared with one Intel Core Q6600 chip for no flux fixup. The simulation with flux fixup on one M2050 is 1.23 times faster than on one X5670.
Kinematic modelling of disc galaxies using graphics processing units

NASA Astrophysics Data System (ADS)

Bekiaris, G.; Glazebrook, K.; Fluke, C. J.; Abraham, R.

2016-01-01

With large-scale integral field spectroscopy (IFS) surveys of thousands of galaxies currently underway or planned, the astronomical community is in need of methods, techniques and tools that will allow the analysis of huge amounts of data. We focus on the kinematic modelling of disc galaxies and investigate the potential use of massively parallel architectures, such as the graphics processing unit (GPU), as an accelerator for the computationally expensive model-fitting procedure. We review the algorithms involved in model-fitting and evaluate their suitability for GPU implementation. We employ different optimization techniques, including the Levenberg-Marquardt and nested sampling algorithms, but also a naive brute-force approach based on nested grids. We find that the GPU can accelerate the model-fitting procedure up to a factor of ~100 when compared to a single-threaded CPU, and up to a factor of ~10 when compared to a multithreaded dual CPU configuration. Our method's accuracy, precision and robustness are assessed by successfully recovering the kinematic properties of simulated data, and also by verifying the kinematic modelling results of galaxies from the GHASP and DYNAMO surveys as found in the literature. The resulting GBKFIT code is available for download from: http://supercomputing.swin.edu.au/gbkfit.
Accelerating Monte Carlo simulations of photon transport in a voxelized geometry using a massively parallel graphics processing unit

DOE Office of Scientific and Technical Information (OSTI.GOV)

Badal, Andreu; Badano, Aldo

Purpose: It is a known fact that Monte Carlo simulations of radiation transport are computationally intensive and may require long computing times. The authors introduce a new paradigm for the acceleration of Monte Carlo simulations: The use of a graphics processing unit (GPU) as the main computing device instead of a central processing unit (CPU). Methods: A GPU-based Monte Carlo code that simulates photon transport in a voxelized geometry with the accurate physics models from PENELOPE has been developed using the CUDA programming model (NVIDIA Corporation, Santa Clara, CA). Results: An outline of the new code and a sample x-ray imaging simulation with an anthropomorphic phantom are presented. A remarkable 27-fold speed up factor was obtained using a GPU compared to a single core CPU. Conclusions: The reported results show that GPUs are currently a good alternative to CPUs for the simulation of radiation transport. Since the performance of GPUs is currently increasing at a faster pace than that of CPUs, the advantages of GPU-based software are likely to be more pronounced in the future.
Multilevel Summation of Electrostatic Potentials Using Graphics Processing Units*

PubMed Central

Hardy, David J.; Stone, John E.; Schulten, Klaus

2009-01-01

Physical and engineering practicalities involved in microprocessor design have resulted in flat performance growth for traditional single-core microprocessors. The urgent need for continuing increases in the performance of scientific applications requires the use of many-core processors and accelerators such as graphics processing units (GPUs). This paper discusses GPU acceleration of the multilevel summation method for computing electrostatic potentials and forces for a system of charged atoms, which is a problem of paramount importance in biomolecular modeling applications. We present and test a new GPU algorithm for the long-range part of the potentials that computes a cutoff pair potential between lattice points, essentially convolving a fixed 3-D lattice of “weights” over all sub-cubes of a much larger lattice. The implementation exploits the different memory subsystems provided on the GPU to stream optimally sized data sets through the multiprocessors. We demonstrate for the full multilevel summation calculation speedups of up to 26 using a single GPU and 46 using multiple GPUs, enabling the computation of a high-resolution map of the electrostatic potential for a system of 1.5 million atoms in under 12 seconds. PMID:20161132
OpenCL Implementation of NeuroIsing

NASA Astrophysics Data System (ADS)

Zapart, C. A.

Recent advances in graphics card hardware combined with an introduction of the OpenCL standard promise to accelerate numerical simulations across diverse scientific disciplines. One such field benefiting from new hardware/software paradigms is econophysics. The paper describes an OpenCL implementation of a selected econophysics model: NeuroIsing, which has been designed to execute in parallel on a vendor-independent graphics card. Originally introduced in [C. A. Zapart, "Econophysics in Financial Time Series Prediction", PhD thesis, Graduate University for Advanced Studies, Japan (2009)], it was at first implemented on a CELL processor running inside a SONY PS3 games console. The NeuroIsing framework can be applied to predicting and trading foreign exchange as well as stock market index futures.
JDiffraction: A GPGPU-accelerated JAVA library for numerical propagation of scalar wave fields

NASA Astrophysics Data System (ADS)

Piedrahita-Quintero, Pablo; Trujillo, Carlos; Garcia-Sucerquia, Jorge

2017-05-01

JDiffraction, a GPGPU-accelerated JAVA library for numerical propagation of scalar wave fields, is presented. Angular spectrum, Fresnel transform, and Fresnel-Bluestein transform are the numerical algorithms implemented in the methods and functions of the library to compute the scalar propagation of the complex wavefield. The functionality of the library is tested with the modeling of easy to forecast numerical experiments and also with the numerical reconstruction of a digitally recorded hologram. The performance of JDiffraction is contrasted with a library written for C++, showing great competitiveness in the apparently less complex environment of JAVA language. JDiffraction also includes JAVA easy-to-use methods and functions that take advantage of the computation power of the graphic processing units to accelerate the processing times of 2048×2048 pixel images up to 74 frames per second.
GPU-accelerated phase extraction algorithm for interferograms: a real-time application

NASA Astrophysics Data System (ADS)

Zhu, Xiaoqiang; Wu, Yongqian; Liu, Fengwei

2016-11-01

Optical testing, having the merits of non-destruction and high sensitivity, provides a vital guideline for optical manufacturing. But the testing process is often computationally intensive and expensive, usually taking up to a few seconds, which is unacceptable for dynamic testing. In this paper, a GPU-accelerated phase extraction algorithm is proposed, which is based on the advanced iterative algorithm. The accelerated algorithm can extract the correct phase distribution from thirteen 1024x1024 fringe patterns with arbitrary phase shifts in 233 milliseconds on average using an NVIDIA Quadro 4000 graphics card, a 12.7x speedup over the same algorithm executed on the CPU and a 6.6x speedup over Matlab on a DWANING W5801 workstation. The performance improvement can fulfill the demands of computational accuracy and real-time application.
Software package for modeling spin–orbit motion in storage rings

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zyuzin, D. V., E-mail: d.zyuzin@fz-juelich.de

2015-12-15

A software package providing a graphical user interface for computer experiments on the motion of charged particle beams in accelerators, as well as analysis of obtained data, is presented. The software package was tested in the framework of the international project on electric dipole moment measurement JEDI (Jülich Electric Dipole moment Investigations). The specific features of particle spin motion imply the requirement to use a cyclic accelerator (storage ring) consisting of electrostatic elements, which makes it possible to preserve horizontal polarization for a long time. Computer experiments study the dynamics of 10^6–10^9 particles in a beam during 10^9 turns in an accelerator (about 10^12–10^15 integration steps for the equations of motion). For designing an optimal accelerator structure, a large number of computer experiments on polarized beam dynamics are required. The numerical core of the package is COSY Infinity, a program for modeling spin–orbit dynamics.
GPU accelerated manifold correction method for spinning compact binaries

NASA Astrophysics Data System (ADS)

Ran, Chong-xi; Liu, Song; Zhong, Shuang-ying

2018-04-01

The graphics processing unit (GPU) acceleration of the manifold correction algorithm based on the compute unified device architecture (CUDA) technology is designed to simulate the dynamic evolution of the Post-Newtonian (PN) Hamiltonian formulation of spinning compact binaries. The feasibility and the efficiency of parallel computation on GPU have been confirmed by various numerical experiments. The numerical comparisons show that the accuracy of the manifold correction method executed on the GPU agrees well with that of the CPU-only implementation. The speedup of the GPU implementation can be increased considerably through shared memory and register optimization techniques without additional hardware costs; the speedup is nearly 13 times compared with the codes executed on CPU for a phase space scan (including 314 × 314 orbits). In addition, the GPU-accelerated manifold correction method is used to numerically study how dynamics are affected by the spin-induced quadrupole-monopole interaction for a black hole binary system.
LAPACKrc: Fast linear algebra kernels/solvers for FPGA accelerators

NASA Astrophysics Data System (ADS)

Gonzalez, Juan; Núñez, Rafael C.

2009-07-01

We present LAPACKrc, a family of FPGA-based linear algebra solvers able to achieve more than 100x speedup per commodity processor on certain problems. LAPACKrc subsumes some of the LAPACK and ScaLAPACK functionalities, and it also incorporates sparse direct and iterative matrix solvers. Current LAPACKrc prototypes demonstrate between 40x-150x speedup compared against top-of-the-line hardware/software systems. A technology roadmap is in place to validate current performance of LAPACKrc in HPC applications, and to increase the computational throughput by factors of hundreds within the next few years.
Manufacturing’s Contribution to Pakistan’s Economic Expansion: Commodity- or Service-Led Growth

DTIC Science & Technology

1994-12-01

private sector from regulation and artificial price distortions. In addition, a complementary privatisation programme was launched with the aim of reducing the role of the public sector in manufacturing and services. As a side benefit, the programme was seen as alleviating the government’s financial and administrative burden and creating new opportunities for the private sector. While growth in large-scale manufacturing output has not accelerated in recent years (nor has its overall contribution to GDP growth increased), there is hope
APRON: A Cellular Processor Array Simulation and Hardware Design Tool

NASA Astrophysics Data System (ADS)

Barr, David R. W.; Dudek, Piotr

2009-12-01

We present a software environment for the efficient simulation of cellular processor arrays (CPAs). This software (APRON) is used to explore algorithms that are designed for massively parallel fine-grained processor arrays, topographic multilayer neural networks, vision chips with SIMD processor arrays, and related architectures. The software uses a highly optimised core combined with a flexible compiler to provide the user with tools for the design of new processor array hardware architectures and the emulation of existing devices. We present performance benchmarks for the software processor array implemented on standard commodity microprocessors. APRON can be configured to use additional processing hardware if necessary and can be used as a complete graphical user interface and development environment for new or existing CPA systems, allowing more users to develop algorithms for CPA systems.
Graphical and PC-software analysis of volcano eruption precursors according to the Materials Failure Forecast Method (FFM)

NASA Astrophysics Data System (ADS)

Cornelius, Reinold R.; Voight, Barry

1995-03-01

The Materials Failure Forecasting Method for volcanic eruptions (FFM) analyses the rate of precursory phenomena. Time of eruption onset is derived from the time of "failure" implied by accelerating rate of deformation. The approach attempts to fit data, Ω, to the differential relationship $\ddot{\Omega} = A\dot{\Omega}^{\alpha}$, where the dot superscript represents the time derivative, and the data Ω may be any of several parameters describing the accelerating deformation or energy release of the volcanic system. Rate coefficients, A and α, may be derived from appropriate data sets to provide an estimate of time to "failure". As the method is still an experimental technique, it should be used with appropriate judgment during times of volcanic crisis. Limitations of the approach are identified and discussed. Several kinds of eruption precursory phenomena, all simulating accelerating creep during the mechanical deformation of the system, can be used with FFM. Among these are tilt data, slope-distance measurements, crater fault movements and seismicity. The use of seismic coda, seismic amplitude-derived energy release and time-integrated amplitudes or coda lengths are examined. Usage of cumulative coda length directly has some practical advantages to more rigorously derived parameters, and RSAM and SSAM technologies appear to be well suited to real-time applications. One graphical and four numerical techniques of applying FFM are discussed. The graphical technique is based on an inverse representation of rate versus time. For α = 2, the inverse rate plot is linear; it is concave upward for α < 2 and concave downward for α > 2. The eruption time is found by simple extrapolation of the data set toward the time axis. Three numerical techniques are based on linear least-squares fits to linearized data sets. The "linearized least-squares technique" is most robust and is expected to be the most practical numerical technique.
This technique is based on an iterative linearization of the given rate-time series. The hindsight technique is disadvantaged by a bias favouring a too early eruption time in foresight applications. The "log rate versus log acceleration technique", utilizing a logarithmic representation of the fundamental differential equation, is disadvantaged by large data scatter after interpolation of accelerations. One further numerical technique, a nonlinear least-squares fit to rate data, requires special and more complex software. PC-oriented computer codes were developed for data manipulation, application of the three linearizing numerical methods, and curve fitting. Separate software is required for graphing purposes. All three linearizing techniques facilitate an eruption window based on a data envelope according to the linear least-squares fit, at a specific level of confidence, and an estimated rate at time of failure.
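The graphical inverse-rate technique described in this abstract can be illustrated with a short sketch. For α = 2 the differential relationship integrates to 1/Ω̇(t) = 1/Ω̇(t0) − A(t − t0), so the inverse rate falls on a straight line that reaches zero at the inferred time of "failure". The Python below is a minimal illustration under that assumption only; the function name and the synthetic, noise-free data are hypothetical and are not taken from the authors' PC codes.

```python
def forecast_failure_time(times, rates):
    """Fit a straight line to inverse rate versus time by least squares and
    return the time at which the line reaches zero.

    Assumes alpha = 2, the case in which the inverse-rate plot is linear.
    """
    inv = [1.0 / r for r in rates]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(inv) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, inv))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    if slope >= 0:
        raise ValueError("rates are not accelerating; no forecast possible")
    # The fitted line intercept + slope * t crosses zero here.
    return -intercept / slope


# Synthetic data (illustrative only): inverse rate = 0.1 * (10 - t)
times = [0.0, 2.0, 4.0, 6.0, 8.0]
rates = [1.0 / (0.1 * (10.0 - t)) for t in times]
t_fail = forecast_failure_time(times, rates)
print(round(t_fail, 6))
```

With real, noisy precursor data the fit would also need the confidence envelope the abstract mentions; this sketch omits it.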
Evaluating virtual hosted desktops for graphics-intensive astronomy

NASA Astrophysics Data System (ADS)

Meade, B. F.; Fluke, C. J.

2018-04-01

Visualisation of data is critical to understanding astronomical phenomena. Today, many instruments produce datasets that are too big to be downloaded to a local computer, yet many of the visualisation tools used by astronomers are deployed only on desktop computers. Cloud computing is increasingly used to provide a computation and simulation platform in astronomy, but it also offers great potential as a visualisation platform. Virtual hosted desktops, with graphics processing unit (GPU) acceleration, allow interactive, graphics-intensive desktop applications to operate co-located with astronomy datasets stored in remote data centres. By combining benchmarking and user experience testing, with a cohort of 20 astronomers, we investigate the viability of replacing physical desktop computers with virtual hosted desktops. In our work, we compare two Apple MacBook computers (one old and one new, representing hardware at opposite ends of the useful lifetime) with two virtual hosted desktops: one commercial (Amazon Web Services) and one in a private research cloud (the Australian NeCTAR Research Cloud). For two-dimensional image-based tasks and graphics-intensive three-dimensional operations - typical of astronomy visualisation workflows - we found that benchmarks do not necessarily provide the best indication of performance. When compared to typical laptop computers, virtual hosted desktops can provide a better user experience, even with lower performing graphics cards. We also found that virtual hosted desktops are equally simple to use, provide greater flexibility in choice of configuration, and may actually be a more cost-effective option for typical usage profiles.
Effects of a Story Map on Accelerated Reader Postreading Test Scores in Students with High-Functioning Autism

ERIC Educational Resources Information Center

Stringfield, Suzanne Griggs; Luscre, Deanna; Gast, David L.

2011-01-01

In this study, three elementary-aged boys with high-functioning autism (HFA) were taught to use a graphic organizer called a Story Map as a postreading tool during language arts instruction. Students learned to accurately complete the Story Map. The effect of the intervention on story recall was assessed within the context of a multiple-baseline…
An elementary approach to the gravitational Doppler shift

NASA Astrophysics Data System (ADS)

Wörner, C. H.; Rojas, Roberto

2017-01-01

In college physics courses, treatment of the Doppler effect is usually done far from the first introduction to kinematics. This paper aims to apply a graphical treatment to describe the gravitational redshift, by considering the Doppler effect in two accelerated reference frames and exercising the equivalence principle. This approach seems appropriate to discuss with beginner students and could serve to enrich the didactic processes.
GPU applications for data processing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Vladymyrov, Mykhailo, E-mail: mykhailo.vladymyrov@cern.ch; Aleksandrov, Andrey; INFN sezione di Napoli, I-80125 Napoli

2015-12-31

Modern experiments that use nuclear photoemulsion rely on fast and efficient data acquisition from the emulsion. New approaches to developing scanning systems require real-time processing of large amounts of data. Methods that use Graphical Processing Unit (GPU) computing power for emulsion data processing are presented here. It is shown how GPU-accelerated emulsion processing helped us to raise the scanning speed by a factor of nine.
Software Accelerates Computing Time for Complex Math

NASA Technical Reports Server (NTRS)

2014-01-01

Ames Research Center awarded Newark, Delaware-based EM Photonics Inc. SBIR funding to utilize graphics processing unit (GPU) technology, traditionally used for computer video games, to develop high-performance computing software called CULA. The software gives users the ability to run complex algorithms on personal computers with greater speed. As a result of the NASA collaboration, the number of employees at the company has increased 10 percent.
Accelerating Malware Detection via a Graphics Processing Unit

DTIC Science & Technology

2010-09-01

Processing Unit ... 4; PE: Portable Executable ... 4; COFF: Common Object File Format ... operating systems for the future [Szo05]. The PE format is an updated version of the common object file format (COFF) [Mic06]. Microsoft released a new ... NAs02]. These alerts can be costly in terms of time and resources for individuals and organizations to investigate each misidentified file [YWL07] [Vak10
Cognitive ergonomics of operational tools

NASA Astrophysics Data System (ADS)

Lüdeke, A.

2012-10-01

Control systems have become increasingly more powerful over the past decades. The availability of high data throughput and sophisticated graphical interactions has opened a variety of new possibilities. But has this helped to provide intuitive, easy to use applications to simplify the operation of modern large scale accelerator facilities? We will discuss what makes an application useful to operation and what is necessary to make a tool easy to use. We will show that even the implementation of a small number of simple application design rules can help to create ergonomic operational tools. The author is convinced that such tools do indeed help to achieve higher beam availability and better beam performance at accelerator facilities.

The Research and Test of Fast Radio Burst Real-time Search Algorithm Based on GPU Acceleration

NASA Astrophysics Data System (ADS)

Wang, J.; Chen, M. Z.; Pei, X.; Wang, Z. Q.

2017-03-01

To meet the research needs of the Nanshan 25 m radio telescope of Xinjiang Astronomical Observatory (XAO) and to study the key technology of the planned QiTai radio Telescope (QTT), the receiver group of XAO studied a GPU (Graphics Processing Unit) based real-time FRB search algorithm, developed from the original CPU (Central Processing Unit) based FRB search algorithm, and built a real-time FRB search system. A comparison of the GPU system and the CPU system shows that, while preserving the accuracy of the search, the GPU-accelerated algorithm is 35-45 times faster than the CPU algorithm.
GPU-accelerated computation of electron transfer.

PubMed

Höfinger, Siegfried; Acocella, Angela; Pop, Sergiu C; Narumi, Tetsu; Yasuoka, Kenji; Beu, Titus; Zerbetto, Francesco

2012-11-05

Electron transfer is a fundamental process that can be studied with the help of computer simulation. The underlying quantum mechanical description renders the problem a computationally intensive application. In this study, we probe the graphics processing unit (GPU) for suitability to this type of problem. Time-critical components are identified via profiling of an existing implementation and several different variants are tested involving the GPU at increasing levels of abstraction. A publicly available library supporting basic linear algebra operations on the GPU turns out to accelerate the computation approximately 50-fold with minor dependence on actual problem size. The performance gain does not compromise numerical accuracy and is of significant value for practical purposes. Copyright © 2012 Wiley Periodicals, Inc.
GPU Accelerated Prognostics

NASA Technical Reports Server (NTRS)

Gorospe, George E., Jr.; Daigle, Matthew J.; Sankararaman, Shankar; Kulkarni, Chetan S.; Ng, Eley

2017-01-01

Prognostic methods enable operators and maintainers to predict the future performance for critical systems. However, these methods can be computationally expensive and may need to be performed each time new information about the system becomes available. In light of these computational requirements, we have investigated the application of graphics processing units (GPUs) as a computational platform for real-time prognostics. Recent advances in GPU technology have reduced cost and increased the computational capability of these highly parallel processing units, making them more attractive for the deployment of prognostic software. We present a survey of model-based prognostic algorithms with considerations for leveraging the parallel architecture of the GPU and a case study of GPU-accelerated battery prognostics with computational performance results.
International trade drives biodiversity threats in developing nations.

PubMed

Lenzen, M; Moran, D; Kanemoto, K; Foran, B; Lobefaro, L; Geschke, A

2012-06-06

Human activities are causing Earth's sixth major extinction event: an accelerating decline of the world's stocks of biological diversity at rates 100 to 1,000 times pre-human levels. Historically, low-impact intrusion into species habitats arose from local demands for food, fuel and living space. However, in today's increasingly globalized economy, international trade chains accelerate habitat degradation far removed from the place of consumption. Although adverse effects of economic prosperity and economic inequality have been confirmed, the importance of international trade as a driver of threats to species is poorly understood. Here we show that a significant number of species are threatened as a result of international trade along complex routes, and that, in particular, consumers in developed countries cause threats to species through their demand for commodities that are ultimately produced in developing countries. We linked 25,000 Animalia species threat records from the International Union for Conservation of Nature Red List to more than 15,000 commodities produced in 187 countries and evaluated more than 5 billion supply chains in terms of their biodiversity impacts. Excluding invasive species, we found that 30% of global species threats are due to international trade. In many developed countries, the consumption of imported coffee, tea, sugar, textiles, fish and other manufactured items causes a biodiversity footprint that is larger abroad than at home. Our results emphasize the importance of examining biodiversity loss as a global systemic phenomenon, instead of looking at the degrading or polluting producers in isolation. We anticipate that our findings will facilitate better regulation, sustainable supply-chain certification and consumer product labelling.
Total-dose radiation effects data for semiconductor devices. 1985 supplement. Volume 2, part A

NASA Technical Reports Server (NTRS)

Martin, K. E.; Gauthier, M. K.; Coss, J. R.; Dantas, A. R. V.; Price, W. E.

1986-01-01

Steady-state, total-dose radiation test data are provided in graphic format for use by electronic designers and other personnel using semiconductor devices in a radiation environment. The data were generated by JPL for various NASA space programs. This volume provides data on integrated circuits. The data are presented in graphic, tabular, and/or narrative format, depending on the complexity of the integrated circuit. Most tests were done using the JPL or Boeing electron accelerator (Dynamitron) which provides a steady-state 2.5 MeV electron beam. However, some radiation exposures were made with a Cobalt-60 gamma ray source, the results of which should be regarded as only an approximate measure of the radiation damage that would be incurred by an equivalent electron dose.
Medical image processing on the GPU - past, present and future.

PubMed

Eklund, Anders; Dufort, Paul; Forsberg, Daniel; LaConte, Stephen M

2013-12-01

Graphics processing units (GPUs) are used today in a wide range of applications, mainly because they can dramatically accelerate parallel computing, are affordable and energy efficient. In the field of medical imaging, GPUs are in some cases crucial for enabling practical use of computationally demanding algorithms. This review presents the past and present work on GPU accelerated medical image processing, and is meant to serve as an overview and introduction to existing GPU implementations. The review covers GPU acceleration of basic image processing operations (filtering, interpolation, histogram estimation and distance transforms), the most commonly used algorithms in medical imaging (image registration, image segmentation and image denoising) and algorithms that are specific to individual modalities (CT, PET, SPECT, MRI, fMRI, DTI, ultrasound, optical imaging and microscopy). The review ends by highlighting some future possibilities and challenges. Copyright © 2013 Elsevier B.V. All rights reserved.
Accelerating Time Integration for the Shallow Water Equations on the Sphere Using GPUs

DOE PAGES

Archibald, R.; Evans, K. J.; Salinger, A.

2015-06-01

The push towards larger and larger computational platforms has made it possible for climate simulations to resolve climate dynamics across multiple spatial and temporal scales. This direction in climate simulation has created a strong need to develop scalable timestepping methods capable of accelerating throughput on high performance computing. This study details the recent advances in the implementation of implicit time stepping of the spectral element dynamical core within the United States Department of Energy (DOE) Accelerated Climate Model for Energy (ACME) on graphical processing units (GPU) based machines. We demonstrate how solvers in the Trilinos project are interfaced with ACME and GPU kernels to increase computational speed of the residual calculations in the implicit time stepping method for the atmosphere dynamics. We demonstrate the optimization gains and data structure reorganization that facilitates the performance improvements.
GPU COMPUTING FOR PARTICLE TRACKING

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nishimura, Hiroshi; Song, Kai; Muriki, Krishna

2011-03-25

This is a feasibility study of using a modern Graphics Processing Unit (GPU) to parallelize the accelerator particle tracking code. To demonstrate the massive parallelization features provided by GPU computing, a simplified TracyGPU program is developed for dynamic aperture calculation. Performances, issues, and challenges from introducing GPU are also discussed. General Purpose computation on Graphics Processing Units (GPGPU) brings massively parallel computing capabilities to numerical calculation. However, the unique architecture of the GPU requires a comprehensive understanding of the hardware and programming model to optimize existing applications well. In the field of accelerator physics, the dynamic aperture calculation of a storage ring, which is often the most time-consuming part of accelerator modeling and simulation, can benefit from the GPU because it is embarrassingly parallel, which fits well with the GPU programming model. In this paper, we use the Tesla C2050 GPU, which consists of 14 multiprocessors (MP) with 32 cores on each MP, for a total of 448 cores, to host thousands of threads dynamically. A thread is a logical execution unit of the program on the GPU. In the GPU programming model, threads are grouped into a collection of blocks. Within each block, multiple threads share the same code and up to 48 KB of shared memory. Multiple thread blocks form a grid, which is executed as a GPU kernel. A simplified code that is a subset of Tracy++ [2] is developed to demonstrate the possibility of using the GPU to speed up the dynamic aperture calculation by having each thread track a particle.
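The grid, block and thread hierarchy described in this abstract, with one particle per thread, can be sketched as follows. This is a sequential Python emulation of the indexing scheme, not CUDA and not the TracyGPU code: the block size, the function names and the one-turn map (a plain phase-space rotation) are all hypothetical stand-ins chosen for illustration.

```python
import math

THREADS_PER_BLOCK = 32  # assumption: block size chosen only for illustration


def track_particle(x, xp, turns, mu=2.0 * math.pi * 0.31):
    """Hypothetical one-turn map: a linear rotation in (x, x') phase space.
    A real tracking code would apply the full lattice here."""
    c, s = math.cos(mu), math.sin(mu)
    for _ in range(turns):
        x, xp = c * x + s * xp, -s * x + c * xp
    return x, xp


def kernel(block_idx, thread_idx, particles, turns, out):
    """Body each GPU thread would run: select one particle by global index."""
    i = block_idx * THREADS_PER_BLOCK + thread_idx
    if i < len(particles):  # guard threads that fall past the end of the data
        out[i] = track_particle(*particles[i], turns)


def launch(particles, turns):
    """Emulate a grid launch one thread at a time; on a GPU the threads of
    all blocks would run concurrently."""
    n_blocks = (len(particles) + THREADS_PER_BLOCK - 1) // THREADS_PER_BLOCK
    out = [None] * len(particles)
    for b in range(n_blocks):
        for t in range(THREADS_PER_BLOCK):
            kernel(b, t, particles, turns, out)
    return out


# 100 particles launched at increasing amplitude, as in an aperture scan
particles = [(1e-3 * k, 0.0) for k in range(100)]
result = launch(particles, turns=1000)
```

Because no thread reads another thread's particle, the loop nest in `launch` can be reordered or parallelized freely, which is the embarrassingly parallel property the abstract refers to.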
A Graphics Processing Unit Accelerated Motion Correction Algorithm and Modular System for Real-time fMRI

PubMed Central

Scheinost, Dustin; Hampson, Michelle; Qiu, Maolin; Bhawnani, Jitendra; Constable, R. Todd; Papademetris, Xenophon

2013-01-01

Real-time functional magnetic resonance imaging (rt-fMRI) has recently gained interest as a possible means to facilitate the learning of certain behaviors. However, rt-fMRI is limited by processing speed and available software, and continued development is needed for rt-fMRI to progress further and become feasible for clinical use. In this work, we present an open-source rt-fMRI system for biofeedback powered by a novel Graphics Processing Unit (GPU) accelerated motion correction strategy as part of the BioImage Suite project (www.bioimagesuite.org). Our system contributes to the development of rt-fMRI by presenting a motion correction algorithm that provides an estimate of motion with essentially no processing delay as well as a modular rt-fMRI system design. Using empirical data from rt-fMRI scans, we assessed the quality of motion correction in this new system. The present algorithm performed comparably to standard (non real-time) offline methods and outperformed other real-time methods based on zero order interpolation of motion parameters. The modular approach to the rt-fMRI system allows the system to be flexible to the experiment and feedback design, a valuable feature for many applications. We illustrate the flexibility of the system by describing several of our ongoing studies. Our hope is that continuing development of open-source rt-fMRI algorithms and software will make this new technology more accessible and adaptable, and will thereby accelerate its application in the clinical and cognitive neurosciences. PMID:23319241
A graphics processing unit accelerated motion correction algorithm and modular system for real-time fMRI.

PubMed

Scheinost, Dustin; Hampson, Michelle; Qiu, Maolin; Bhawnani, Jitendra; Constable, R Todd; Papademetris, Xenophon

2013-07-01

Real-time functional magnetic resonance imaging (rt-fMRI) has recently gained interest as a possible means to facilitate the learning of certain behaviors. However, rt-fMRI is limited by processing speed and available software, and continued development is needed for rt-fMRI to progress further and become feasible for clinical use. In this work, we present an open-source rt-fMRI system for biofeedback powered by a novel Graphics Processing Unit (GPU) accelerated motion correction strategy as part of the BioImage Suite project ( www.bioimagesuite.org ). Our system contributes to the development of rt-fMRI by presenting a motion correction algorithm that provides an estimate of motion with essentially no processing delay as well as a modular rt-fMRI system design. Using empirical data from rt-fMRI scans, we assessed the quality of motion correction in this new system. The present algorithm performed comparably to standard (non real-time) offline methods and outperformed other real-time methods based on zero order interpolation of motion parameters. The modular approach to the rt-fMRI system allows the system to be flexible to the experiment and feedback design, a valuable feature for many applications. We illustrate the flexibility of the system by describing several of our ongoing studies. Our hope is that continuing development of open-source rt-fMRI algorithms and software will make this new technology more accessible and adaptable, and will thereby accelerate its application in the clinical and cognitive neurosciences.
Specialized Computer Systems for Environment Visualization

NASA Astrophysics Data System (ADS)

Al-Oraiqat, Anas M.; Bashkov, Evgeniy A.; Zori, Sergii A.

2018-06-01

The need for real-time image generation of landscapes arises in various fields as part of tasks solved by virtual and augmented reality systems, as well as geographic information systems. Such systems provide opportunities for collecting, storing, analyzing and graphically visualizing geographic data. Algorithmic and hardware-software tools for increasing the realism and efficiency of environment visualization in 3D visualization systems are proposed. This paper discusses a modified path tracing algorithm with a two-level hierarchy of bounding volumes and intersection tests against Axis-Aligned Bounding Boxes. The proposed algorithm eliminates branching and hence is more suitable for implementation on multi-threaded CPUs and GPUs. A modified ROAM algorithm is used to solve the problem of high-quality visualization of reliefs and landscapes. The algorithm is implemented on parallel systems: clusters and Compute Unified Device Architecture networks. Results show that the implementation on MPI clusters is more efficient than on Graphics Processing Unit/Graphics Processing Clusters and allows real-time synthesis. The organization and algorithms of the parallel GPU system for 3D pseudo stereo image/video synthesis are proposed. After analyzing the feasibility of realizing each stage on a parallel GPU architecture, 3D pseudo stereo synthesis is performed. An experimental prototype of a specialized hardware-software system for 3D pseudo stereo imaging and video was developed on the CPU/GPU. The experimental results show that the proposed adaptation of 3D pseudo stereo imaging to the architecture of GPU systems is efficient. It also accelerates the computational procedures of 3D pseudo-stereo synthesis for the anaglyph and anamorphic formats of the 3D stereo frame without performing optimization procedures. The acceleration is on average 11 and 54 times for the test GPUs.
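As an illustration of a ray test against an Axis-Aligned Bounding Box with little branching, the sketch below uses the widely known slab method, in which the per-axis work reduces to `min` and `max` operations. It is not claimed to be the authors' modified algorithm; the function names are hypothetical, and the explicit zero check exists only because Python raises on division by zero, whereas IEEE floating-point division on a GPU would yield infinity directly.

```python
import math


def _inv(d):
    # Reciprocal of a direction component. A zero component maps to +inf so
    # that the matching slab imposes no finite bound on a ray inside it.
    return 1.0 / d if d != 0.0 else math.inf


def ray_hits_aabb(origin, direction, box_min, box_max):
    """Slab test: clip the ray's parameter interval [0, inf) against the
    slab of each axis and report whether anything is left."""
    t_near, t_far = 0.0, math.inf
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        inv = _inv(d)
        t1, t2 = (lo - o) * inv, (hi - o) * inv
        t_near = max(t_near, min(t1, t2))
        t_far = min(t_far, max(t1, t2))
    return t_near <= t_far


unit_min, unit_max = (0.0, 0.0, 0.0), (1.0, 1.0, 1.0)
hit = ray_hits_aabb((-1.0, 0.5, 0.5), (1.0, 0.0, 0.0), unit_min, unit_max)
miss = ray_hits_aabb((-1.0, 2.0, 0.5), (1.0, 0.0, 0.0), unit_min, unit_max)
```

One caveat of this sketch: an axis-parallel ray whose origin lies exactly on a slab plane produces 0 × inf = NaN, which a production implementation would need to handle.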
DROIDS 1.20: A GUI-Based Pipeline for GPU-Accelerated Comparative Protein Dynamics.

PubMed

Babbitt, Gregory A; Mortensen, Jamie S; Coppola, Erin E; Adams, Lily E; Liao, Justin K

2018-03-13

Traditional informatics in comparative genomics work only with static representations of biomolecules (i.e., sequence and structure), thereby ignoring the molecular dynamics (MD) of proteins that define function in the cell. A comparative approach applied to MD would connect this very short timescale process, defined in femtoseconds, to one of the longest in the universe: molecular evolution measured in millions of years. Here, we leverage advances in graphics-processing-unit-accelerated MD simulation software to develop a comparative method of MD analysis and visualization that can be applied to any two homologous Protein Data Bank structures. Our open-source pipeline, DROIDS (Detecting Relative Outlier Impacts in Dynamic Simulations), works in conjunction with existing molecular modeling software to convert any Linux gaming personal computer into a "comparative computational microscope" for observing the biophysical effects of mutations and other chemical changes in proteins. DROIDS implements structural alignment and Benjamini-Hochberg-corrected Kolmogorov-Smirnov statistics to compare nanosecond-scale atom bond fluctuations on the protein backbone, color mapping the significant differences identified in protein MD with single-amino-acid resolution. DROIDS is simple to use, incorporating graphical user interface control for Amber16 MD simulations, cpptraj analysis, and the final statistical and visual representations in R graphics and UCSF Chimera. We demonstrate that DROIDS can be utilized to visually investigate molecular evolution and disease-related functional changes in MD due to genetic mutation and epigenetic modification. DROIDS can also be used to potentially investigate binding interactions of pharmaceuticals, toxins, or other biomolecules in a functional evolutionary context as well. Copyright © 2018 Biophysical Society. Published by Elsevier Inc. All rights reserved.
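The Benjamini-Hochberg correction named in this abstract is a standard procedure for controlling the false discovery rate across many tests. A minimal sketch follows; the p-values are hypothetical placeholders for per-residue Kolmogorov-Smirnov results, and the code is not taken from the DROIDS pipeline, which performs its statistics in R.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values never
    # increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values, one per amino-acid position
p = [0.01, 0.04, 0.03, 0.005]
q = benjamini_hochberg(p)
significant = [qi <= 0.05 for qi in q]
```

Positions flagged in `significant` are the ones that a tool of this kind would color-map as differing between the two dynamics runs.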
Large-scale virtual screening on public cloud resources with Apache Spark.

PubMed

Capuccini, Marco; Ahmed, Laeeq; Schaal, Wesley; Laure, Erwin; Spjuth, Ola

2017-01-01

Structure-based virtual screening is an in-silico method to screen a target receptor against a virtual molecular library. Applying docking-based screening to large molecular libraries can be computationally expensive; however, it constitutes a trivially parallelizable task. Most of the available parallel implementations are based on message passing interface, relying on low failure rate hardware and fast network connection. Google's MapReduce revolutionized large-scale analysis, enabling the processing of massive datasets on commodity hardware and cloud resources, providing transparent scalability and fault tolerance at the software level. Open source implementations of MapReduce include Apache Hadoop and the more recent Apache Spark. We developed a method to run existing docking-based screening software on distributed cloud resources, utilizing the MapReduce approach. We benchmarked our method, which is implemented in Apache Spark, docking a publicly available target receptor against [Formula: see text]2.2 M compounds. The performance experiments show a good parallel efficiency (87%) when running in a public cloud environment. Our method enables parallel Structure-based virtual screening on public cloud resources or commodity computer clusters. The degree of scalability that we achieve allows for trying out our method on relatively small libraries first and then to scale to larger libraries. Our implementation is named Spark-VS and it is freely available as open source from GitHub (https://github.com/mcapuccini/spark-vs).
Parallel Computer System for 3D Visualization Stereo on GPU

NASA Astrophysics Data System (ADS)

Al-Oraiqat, Anas M.; Zori, Sergii A.

2018-03-01

This paper proposes the organization of a parallel computer system based on Graphics Processing Units (GPUs) for 3D stereo image synthesis. The development is based on the modified ray tracing method developed by the authors for fast search of tracing rays intersections with scene objects. The system allows a significant increase in productivity for 3D stereo synthesis of photorealistic quality. The generalized procedure of 3D stereo image synthesis on the Graphics Processing Unit/Graphics Processing Clusters (GPU/GPC) is proposed. The efficiency of the proposed solutions by GPU implementation is compared with single-threaded and multithreaded implementations on the CPU. The achieved average acceleration in multi-thread implementation on the test GPU and CPU is about 7.5 and 1.6 times, respectively. Studying the influence of choosing the size and configuration of the computational Compute Unified Device Architecture (CUDA) network on the computational speed shows the importance of their correct selection. The obtained experimental estimations can be significantly improved by new GPUs with a large number of processing cores and multiprocessors, as well as optimized configuration of the computing CUDA network.
Quantum Chemical Calculations Using Accelerators: Migrating Matrix Operations to the NVIDIA Kepler GPU and the Intel Xeon Phi.

PubMed

Leang, Sarom S; Rendell, Alistair P; Gordon, Mark S

2014-03-11

Increasingly, modern computer systems comprise a multicore general-purpose processor augmented with a number of special purpose devices or accelerators connected via an external interface such as a PCI bus. The NVIDIA Kepler Graphical Processing Unit (GPU) and the Intel Phi are two examples of such accelerators. Accelerators offer peak performances that can be well above those of the host processor. How to exploit this heterogeneous environment for legacy application codes is not, however, straightforward. This paper considers how matrix operations in typical quantum chemical calculations can be migrated to the GPU and Phi systems. Double precision general matrix multiply operations are endemic in electronic structure calculations, especially methods that include electron correlation, such as density functional theory, second order perturbation theory, and coupled cluster theory. The use of approaches that automatically determine whether to use the host or an accelerator, based on problem size, is explored, with computations that are occurring on the accelerator and/or the host. For data-transfers over PCI-e, the GPU provides the best overall performance for data sizes up to 4096 MB with consistent upload and download rates between 5-5.6 GB/s and 5.4-6.3 GB/s, respectively. The GPU outperforms the Phi for both square and nonsquare matrix multiplications.
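The idea of choosing host or accelerator from the problem size can be made concrete with a simple cost model: offloading pays only when the accelerator's compute time plus the PCI-e transfer time is below the host's compute time. The sketch below is an assumption-laden illustration, not the authors' selection scheme. The default transfer rates are the lower ends of the ranges quoted in the abstract, while the function name and the sustained GFLOP/s figures passed in are hypothetical, and transfer latency is ignored.

```python
def offload_pays(m, n, k, host_gflops, accel_gflops,
                 upload_gbs=5.0, download_gbs=5.4):
    """Estimate whether a double-precision (m x k) by (k x n) multiply
    finishes sooner on the accelerator, PCI-e transfers included."""
    flops = 2.0 * m * n * k                  # multiply and add per inner step
    upload_bytes = 8.0 * (m * k + k * n)     # send A and B (8 bytes each)
    download_bytes = 8.0 * m * n             # fetch the result C
    host_time = flops / (host_gflops * 1e9)
    accel_time = (flops / (accel_gflops * 1e9)
                  + upload_bytes / (upload_gbs * 1e9)
                  + download_bytes / (download_gbs * 1e9))
    return accel_time < host_time


# Hypothetical sustained rates: 100 GFLOP/s host, 1000 GFLOP/s accelerator
large = offload_pays(4096, 4096, 4096, host_gflops=100.0, accel_gflops=1000.0)
small = offload_pays(16, 16, 16, host_gflops=100.0, accel_gflops=1000.0)
```

Under these assumed rates the large multiply favors the accelerator and the small one stays on the host, because compute grows with the cube of the dimension while transferred data grows with the square.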
Improving GPU-accelerated adaptive IDW interpolation algorithm using fast kNN search.

PubMed

Mei, Gang; Xu, Nengxiong; Xu, Liangliang

2016-01-01

This paper presents an efficient parallel Adaptive Inverse Distance Weighting (AIDW) interpolation algorithm for modern Graphics Processing Units (GPUs). The presented algorithm improves our previous GPU-accelerated AIDW algorithm by adopting a fast k-nearest neighbors (kNN) search. AIDW first finds several nearest neighboring data points for each interpolated point in order to adaptively determine the power parameter; the desired prediction value of the interpolated point is then obtained by weighted interpolation using that power parameter. In this work, we develop a fast kNN search approach based on a space-partitioning data structure, the even grid, to improve the previous GPU-accelerated AIDW algorithm. The improved algorithm is composed of two stages: kNN search and weighted interpolation. To evaluate the performance of the improved algorithm, we perform five groups of experimental tests. The experimental results indicate: (1) the improved algorithm can achieve a speedup of up to 1017 over the corresponding serial algorithm; (2) the improved algorithm is at least two times faster than our previous GPU-accelerated AIDW algorithm; and (3) the utilization of fast kNN search can significantly improve the computational efficiency of the entire GPU-accelerated AIDW algorithm.
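The two stages named in this abstract, kNN search on an even grid followed by weighted interpolation, can be sketched serially as follows. This is a minimal illustration rather than the authors' GPU code: the function names are hypothetical, the points are assumed to be 2D tuples (x, y, value), and the power parameter is held fixed instead of being chosen adaptively from the neighbor distances as AIDW does.

```python
import math
from collections import defaultdict


def build_grid(points, cell):
    """Bucket 2D points (x, y, value) into square cells of side `cell`."""
    grid = defaultdict(list)
    for idx, (x, y, _) in enumerate(points):
        grid[(math.floor(x / cell), math.floor(y / cell))].append(idx)
    return grid


def knn_grid(points, grid, cell, qx, qy, k):
    """Return the k nearest points to (qx, qy) as sorted (distance, index) pairs."""
    k = min(k, len(points))
    if k == 0:
        return []
    cx, cy = math.floor(qx / cell), math.floor(qy / cell)
    found = []
    ring = 0
    while True:
        # Visit only the cells on the border of the (2*ring+1)^2 square.
        for i in range(cx - ring, cx + ring + 1):
            for j in range(cy - ring, cy + ring + 1):
                if max(abs(i - cx), abs(j - cy)) != ring:
                    continue
                for idx in grid.get((i, j), ()):
                    x, y, _ = points[idx]
                    found.append((math.hypot(x - qx, y - qy), idx))
        found.sort()
        # Every point within ring*cell of the query lies in rings <= ring,
        # so the search can stop once the k-th candidate is that close.
        if len(found) >= k and found[k - 1][0] <= ring * cell:
            return found[:k]
        ring += 1


def idw(points, grid, cell, qx, qy, k=4, power=2.0):
    """Inverse distance weighted prediction from the k nearest data points."""
    num = den = 0.0
    for dist, idx in knn_grid(points, grid, cell, qx, qy, k):
        if dist == 0.0:
            return points[idx][2]  # query coincides with a data point
        weight = dist ** -power
        num += weight * points[idx][2]
        den += weight
    return num / den
```

On a GPU, each interpolated point would typically be handled by its own thread, since the predictions are independent of one another.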
Acceleration of the Smith-Waterman algorithm using single and multiple graphics processors

NASA Astrophysics Data System (ADS)

Khajeh-Saeed, Ali; Poole, Stephen; Blair Perot, J.

2010-06-01

Finding regions of similarity between two very long data streams is a computationally intensive problem referred to as sequence alignment. Alignment algorithms must allow for imperfect sequence matching with different starting locations and some gaps and errors between the two data sequences. Perhaps the most well known application of sequence matching is the testing of DNA or protein sequences against genome databases. The Smith-Waterman algorithm is a method for precisely characterizing how well two sequences can be aligned and for determining the optimal alignment of those two sequences. Like many applications in computational science, the Smith-Waterman algorithm is constrained by the memory access speed and can be accelerated significantly by using graphics processors (GPUs) as the compute engine. In this work we show that effective use of the GPU requires a novel reformulation of the Smith-Waterman algorithm. The performance of this new version of the algorithm is demonstrated using the SSCA#1 (Bioinformatics) benchmark running on one GPU and on up to four GPUs executing in parallel. The results indicate that for large problems a single GPU is up to 45 times faster than a CPU for this application, and the parallel implementation shows linear speed up on up to 4 GPUs.
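For reference, the standard serial Smith-Waterman recurrence can be sketched as below. This is the textbook formulation with a linear gap penalty, not the GPU reformulation the abstract refers to; the function name and the scoring values (match +2, mismatch -1, gap -1) are illustrative assumptions.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score of strings a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Each cell depends on its diagonal, upper, and left neighbors;
            # the floor at 0 lets an alignment restart anywhere.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("ACACACTA", "AGCACACA"))
```

The dependence of every cell on its left neighbor in the same row is what serializes the inner loop, which is why a direct port of this form maps poorly onto many parallel threads.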
High-performance image processing on the desktop

NASA Astrophysics Data System (ADS)

Jordan, Stephen D.

1996-04-01

The suitability of computers to the task of medical image visualization for the purposes of primary diagnosis and treatment planning depends on three factors: speed, image quality, and price. To be widely accepted the technology must increase the efficiency of the diagnostic and planning processes. This requires processing and displaying medical images of various modalities in real-time, with accuracy and clarity, on an affordable system. Our approach to meeting this challenge began with market research to understand customer image processing needs. These needs were translated into system-level requirements, which in turn were used to determine which image processing functions should be implemented in hardware. The result is a computer architecture for 2D image processing that is both high-speed and cost-effective. The architectural solution is based on the high-performance PA-RISC workstation with an HCRX graphics accelerator. The image processing enhancements are incorporated into the image visualization accelerator (IVX) which attaches to the HCRX graphics subsystem. The IVX includes a custom VLSI chip which has a programmable convolver, a window/level mapper, and an interpolator supporting nearest-neighbor, bi-linear, and bi-cubic modes. This combination of features can be used to enable simultaneous convolution, pan, zoom, rotate, and window/level control into 1 k by 1 k by 16-bit medical images at 40 frames/second.
Acceleration of spiking neural network based pattern recognition on NVIDIA graphics processors.

PubMed

Han, Bing; Taha, Tarek M

2010-04-01

There is currently a strong push in the research community to develop biological scale implementations of neuron based vision models. Systems at this scale are computationally demanding and generally utilize more accurate neuron models, such as the Izhikevich and the Hodgkin-Huxley models, in favor of the more popular integrate and fire model. We examine the feasibility of using graphics processing units (GPUs) to accelerate a spiking neural network based character recognition network to enable such large scale systems. Two versions of the network utilizing the Izhikevich and Hodgkin-Huxley models are implemented. Three NVIDIA general-purpose (GP) GPU platforms are examined, including the GeForce 9800 GX2, the Tesla C1060, and the Tesla S1070. Our results show that the GPGPUs can provide significant speedup over conventional processors. In particular, the fastest GPGPU utilized, the Tesla S1070, provided a speedup of 5.6 and 84.4 over highly optimized implementations on the fastest central processing unit (CPU) tested, a quadcore 2.67 GHz Xeon processor, for the Izhikevich and the Hodgkin-Huxley models, respectively. The CPU implementation utilized all four cores and the vector data parallelism offered by the processor. The results indicate that GPUs are well suited for this application domain.
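As background for the neuron models mentioned here, a single Izhikevich neuron can be simulated in a few lines. The sketch below uses forward Euler integration and the commonly cited regular-spiking parameters (a = 0.02, b = 0.2, c = -65, d = 8); the function name, time step, and input currents are assumptions for illustration and are not taken from the paper's implementation.

```python
def simulate_izhikevich(current, duration_ms=1000.0, dt=0.5,
                        a=0.02, b=0.2, c=-65.0, d=8.0):
    """Return spike times (ms) of one Izhikevich neuron under constant input."""
    v = c          # membrane potential (mV)
    u = b * v      # recovery variable
    spikes = []
    steps = int(duration_ms / dt)
    for step in range(steps):
        if v >= 30.0:
            # Spike: record it, then reset the potential and bump recovery.
            spikes.append(step * dt)
            v = c
            u += d
        dv = 0.04 * v * v + 5.0 * v + 140.0 - u + current
        du = a * (b * v - u)
        v += dt * dv
        u += dt * du
    return spikes


if __name__ == "__main__":
    print(len(simulate_izhikevich(10.0)), "spikes in 1 s at I = 10")
```

In a network simulation each neuron performs this same update every time step, which is the data-parallel pattern that GPUs handle well.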
Scalable large format 3D displays

NASA Astrophysics Data System (ADS)

Chang, Nelson L.; Damera-Venkata, Niranjan

2010-02-01

We present a general framework for the modeling and optimization of scalable large format 3-D displays using multiple projectors. Based on this framework, we derive algorithms that can robustly optimize the visual quality of an arbitrary combination of projectors (e.g. tiled, superimposed, combinations of the two) without manual adjustment. The framework creates for the first time a new unified paradigm that is agnostic to a particular configuration of projectors yet robustly optimizes for the brightness, contrast, and resolution of that configuration. In addition, we demonstrate that our algorithms support high resolution stereoscopic video at real-time interactive frame rates achieved on commodity graphics hardware. Through complementary polarization, the framework creates high quality multi-projector 3-D displays at low hardware and operational cost for a variety of applications including digital cinema, visualization, and command-and-control walls.

CPU-GPU hybrid accelerating the Zuker algorithm for RNA secondary structure prediction applications.

PubMed

Lei, Guoqing; Dou, Yong; Wan, Wen; Xia, Fei; Li, Rongchun; Ma, Meng; Zou, Dan

2012-01-01

Prediction of ribonucleic acid (RNA) secondary structure remains one of the most important research areas in bioinformatics. The Zuker algorithm is one of the most popular methods of free energy minimization for RNA secondary structure prediction. Thus far, few studies have been reported on the acceleration of the Zuker algorithm on general-purpose processors or on extra accelerators such as Field Programmable Gate-Array (FPGA) and Graphics Processing Units (GPU). To the best of our knowledge, no implementation combines both CPU and extra accelerators, such as GPUs, to accelerate the Zuker algorithm applications. In this paper, a CPU-GPU hybrid computing system that accelerates Zuker algorithm applications for RNA secondary structure prediction is proposed. The computing tasks are allocated between CPU and GPU for cooperative parallel execution. Performance differences between the CPU and the GPU in the task-allocation scheme are considered to obtain workload balance. To improve the hybrid system performance, the Zuker algorithm is optimally implemented with special methods for CPU and GPU architecture. Speedup of 15.93× over optimized multi-core SIMD CPU implementation and performance advantage of 16% over optimized GPU implementation are shown in the experimental results. More than 14% of the sequences are executed on CPU in the hybrid system. The system combining CPU and GPU to accelerate the Zuker algorithm is proven to be promising and can be applied to other bioinformatics applications.
Implementing Molecular Dynamics for Hybrid High Performance Computers - 1. Short Range Forces

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brown, W Michael; Wang, Peng; Plimpton, Steven J

The use of accelerators such as general-purpose graphics processing units (GPGPUs) has become popular in scientific computing applications due to their low cost, impressive floating-point capabilities, high memory bandwidth, and low electrical power requirements. Hybrid high performance computers, machines with more than one type of floating-point processor, are now becoming more prevalent due to these advantages. In this work, we discuss several important issues in porting a large molecular dynamics code for use on parallel hybrid machines: 1) choosing a hybrid parallel decomposition that works on central processing units (CPUs) with distributed memory and accelerator cores with shared memory, 2) minimizing the amount of code that must be ported for efficient acceleration, 3) utilizing the available processing power from both many-core CPUs and accelerators, and 4) choosing a programming model for acceleration. We present our solution to each of these issues for short-range force calculation in the molecular dynamics package LAMMPS. We describe algorithms for efficient short range force calculation on hybrid high performance machines. We describe a new approach for dynamic load balancing of work between CPU and accelerator cores. We describe the Geryon library that allows a single code to compile with both CUDA and OpenCL for use on a variety of accelerators. Finally, we present results on a parallel test cluster containing 32 Fermi GPGPUs and 180 CPU cores.
Solar/hydrogen systems assessment. Volume 1: Solar/hydrogen systems for the 1985 - 2000 time frame

NASA Technical Reports Server (NTRS)

Foster, R. W.; Tison, R. R.; Escher, W. J. D.; Hanson, J. A.

1980-01-01

Opportunities for commercialization of systems capable of producing hydrogen from solar energy were studied. The hydrogen product costs that might be achieved by the four selected candidate systems were compared with the pricing structure and practices of the commodity gas market. Subsequently, a match between product cost and market price was noted to exist in the small user sector of the hydrogen marketplace. Barriers to, and historical time lags in, commercialization of new technologies are reviewed. Recommendations for development and demonstration programs designed to accelerate the commercialization of the candidate systems are presented.
World food and nutrition: the scientific and technological base.

PubMed

Wortman, S

1980-07-04

Alleviation of world hunger and poverty will require the accelerated development and application in each low-income country of a broad spectrum of technologies based on advances in the biological, social, and physical sciences. They will range from improved cropping systems for farmers or small labor-intensive enterprises (small and beautiful) to nationwide transportation and communications systems, power grids, and other distribution and marketing capabilities (big and beautiful). Concerted action through a combination of commodity production campaigns, area development efforts, and overhaul of outdated national agencies offers the best prospect for overcoming both hunger and poverty.
Solar/hydrogen systems assessment. Volume 1: Solar/hydrogen systems for the 1985 - 2000 time frame

NASA Astrophysics Data System (ADS)

Foster, R. W.; Tison, R. R.; Escher, W. J. D.; Hanson, J. A.

1980-06-01

Opportunities for commercialization of systems capable of producing hydrogen from solar energy were studied. The hydrogen product costs that might be achieved by the four selected candidate systems were compared with the pricing structure and practices of the commodity gas market. Subsequently, a match between product cost and market price was noted to exist in the small user sector of the hydrogen marketplace. Barriers to, and historical time lags in, commercialization of new technologies are reviewed. Recommendations for development and demonstration programs designed to accelerate the commercialization of the candidate systems are presented.
The Graphical Cadastre Problem in Turkey: The Case of Trabzon Province.

PubMed

Demir, Osman; Çoruhlu, Yakup Emre

2008-09-11

Cadastral projects in Turkey have been accelerated in recent years by the involvement of the private sector. These projects aim at completing the country's cadastre, along with producing bases in standards that could be a foundation for Land Registry and Cadastre Information System (LRCIS). It is possible to produce cadastral data with today's technological means. In this context, three dimensional cadastre data can be properly produced, especially in digital cadastre projects with the required point accuracy. Nevertheless, this is not enough for LRCIS. The cadastre bases that have been produced so far by different methods with different scales and bases, with or without coordinates, should also be converted into digital form based on National Basic GPS Network of Turkey (NBGN) in required point-location accuracy. When the graphical cadastre bases produced without coordinates were evaluated together with actual land measurements and information obtained from sheets and field book data, it was found that there are significant base problems in the graphical maps. These bases, comprising 20% of Turkey's cadastre, constitute the most important bottleneck in completing the country's cadastre. In the scope of this paper, the possibilities of converting the field book measurement values of graphic cadastre bases into digital forms in national coordinate system by comparing them with actual land measurements are investigated, along with Turkey's Cadastre and its problems.
The Graphical Cadastre Problem in Turkey: The Case of Trabzon Province

PubMed Central

Demir, Osman; Çoruhlu, Yakup Emre

2008-01-01

Cadastral projects in Turkey have been accelerated in recent years by the involvement of the private sector. These projects aim at completing the country's cadastre, along with producing bases in standards that could be a foundation for Land Registry and Cadastre Information System (LRCIS). It is possible to produce cadastral data with today's technological means. In this context, three dimensional cadastre data can be properly produced, especially in digital cadastre projects with the required point accuracy. Nevertheless, this is not enough for LRCIS. The cadastre bases that have been produced so far by different methods with different scales and bases, with or without coordinates, should also be converted into digital form based on National Basic GPS Network of Turkey (NBGN) in required point-location accuracy. When the graphical cadastre bases produced without coordinates were evaluated together with actual land measurements and information obtained from sheets and field book data, it was found that there are significant base problems in the graphical maps. These bases, comprising 20% of Turkey's cadastre, constitute the most important bottleneck in completing the country's cadastre. In the scope of this paper, the possibilities of converting the field book measurement values of graphic cadastre bases into digital forms in national coordinate system by comparing them with actual land measurements are investigated, along with Turkey's Cadastre and its problems. PMID:27873830
Speedup for quantum optimal control from automatic differentiation based on graphics processing units

NASA Astrophysics Data System (ADS)

Leung, Nelson; Abdelhafez, Mohamed; Koch, Jens; Schuster, David

2017-04-01

We implement a quantum optimal control algorithm based on automatic differentiation and harness the acceleration afforded by graphics processing units (GPUs). Automatic differentiation allows us to specify advanced optimization criteria and incorporate them in the optimization process with ease. We show that the use of GPUs can speed up calculations by more than an order of magnitude. Our strategy facilitates efficient numerical simulations on affordable desktop computers and exploration of a host of optimization constraints and system parameters relevant to real-life experiments. We demonstrate optimization of quantum evolution based on fine-grained evaluation of performance at each intermediate time step, thus enabling more intricate control on the evolution path, suppression of departures from the truncated model subspace, as well as minimization of the physical time needed to perform high-fidelity state preparation and unitary gates.
Real-Space Density Functional Theory on Graphical Processing Units: Computational Approach and Comparison to Gaussian Basis Set Methods.

PubMed

Andrade, Xavier; Aspuru-Guzik, Alán

2013-10-08

We discuss the application of graphical processing units (GPUs) to accelerate real-space density functional theory (DFT) calculations. To make our implementation efficient, we have developed a scheme to expose the data parallelism available in the DFT approach; this is applied to the different procedures required for a real-space DFT calculation. We present results for current-generation GPUs from AMD and Nvidia, which show that our scheme, implemented in the free code Octopus, can reach a sustained performance of up to 90 GFlops for a single GPU, representing a significant speed-up when compared to the CPU version of the code. Moreover, for some systems, our implementation can outperform a GPU Gaussian basis set code, showing that the real-space approach is a competitive alternative for DFT simulations on GPUs.
GPU-accelerated computational tool for studying the effectiveness of asteroid disruption techniques

NASA Astrophysics Data System (ADS)

Zimmerman, Ben J.; Wie, Bong

2016-10-01

This paper presents the development of a new Graphics Processing Unit (GPU) accelerated computational tool for asteroid disruption techniques. Numerical simulations are completed using the high-order spectral difference (SD) method. Due to the compact nature of the SD method, it is well suited for implementation with the GPU architecture; hence, solutions are generated orders of magnitude faster than with the Central Processing Unit (CPU) counterpart. A multiphase model integrated with the SD method is introduced, and several asteroid disruption simulations are conducted, including kinetic-energy impactors, multi-kinetic energy impactor systems, and nuclear options. Results illustrate the benefits of using multi-kinetic energy impactor systems when compared to a single impactor system. In addition, the effectiveness of nuclear options is observed.
GPU accelerated FDTD solver and its application in MRI.

PubMed

Chi, J; Liu, F; Jin, J; Mason, D G; Crozier, S

2010-01-01

The finite difference time domain (FDTD) method is a popular technique for computational electromagnetics (CEM). The large computational power often required, however, has been a limiting factor for its applications. In this paper, we present a graphics processing unit (GPU)-based parallel FDTD solver and its successful application to the investigation of a novel B1 shimming scheme for high-field magnetic resonance imaging (MRI). The optimized shimming scheme exhibits considerably improved transmit B1 profiles. The GPU implementation dramatically shortened the runtime of the FDTD simulation of the electromagnetic field compared with its CPU counterpart. The acceleration in runtime has made such investigation possible, and will pave the way for other studies of large-scale computational electromagnetic problems in modern MRI which were previously impractical.
GPU Accelerated Ultrasonic Tomography Using Propagation and Back Propagation Method

DTIC Science & Technology

2015-09-28

the medical imaging field using GPUs has been done for many years. In [1], Copeland et al. used 2D images, obtained by X-ray projections, to... Index Terms: Medical Imaging, Ultrasonic Tomography, GPU, CUDA, Parallel Computing I. INTRODUCTION GRAPHIC Processing Units (GPUs) are computation... Imaging Algorithm The process of reconstructing images from ultrasonic information starts with the following acoustical wave equation: ∂²/∂t² u(x ...
JView Visualization for Next Generation Air Transportation System

DTIC Science & Technology

2011-01-01

hardware graphics acceleration. JView relies on concrete Object Oriented Design (OOD) and programming techniques to provide a robust and venue non... visibility priority of a texture set. A good example of this is you have translucent images that should always be visible over the other textures... elements present in the scene. • Capture Alpha. Allows the alpha color channel (translucency) to be saved when capturing images or movies of a 3D scene
17 CFR 4.41 - Advertising by commodity pool operators, commodity trading advisors, and the principals thereof.

Code of Federal Regulations, 2010 CFR

2010-04-01

... operators, commodity trading advisors, and the principals thereof. 4.41 Section 4.41 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION COMMODITY POOL OPERATORS AND COMMODITY TRADING ADVISORS Advertising § 4.41 Advertising by commodity pool operators, commodity trading advisors, and the...
Accelerating Wright–Fisher Forward Simulations on the Graphics Processing Unit

PubMed Central

Lawrie, David S.

2017-01-01

Forward Wright–Fisher simulations are powerful in their ability to model complex demography and selection scenarios, but suffer from slow execution on the Central Processor Unit (CPU), thus limiting their usefulness. However, the single-locus Wright–Fisher forward algorithm is exceedingly parallelizable, with many steps that are so-called “embarrassingly parallel,” consisting of a vast number of individual computations that are all independent of each other and thus capable of being performed concurrently. The rise of modern Graphics Processing Units (GPUs) and programming languages designed to leverage the inherent parallel nature of these processors have allowed researchers to dramatically speed up many programs that have such high arithmetic intensity and intrinsic concurrency. The presented GPU Optimized Wright–Fisher simulation, or “GO Fish” for short, can be used to simulate arbitrary selection and demographic scenarios while running over 250-fold faster than its serial counterpart on the CPU. Even modest GPU hardware can achieve an impressive speedup of over two orders of magnitude. With simulations so accelerated, one can not only do quick parametric bootstrapping of previously estimated parameters, but also use simulated results to calculate the likelihoods and summary statistics of demographic and selection models against real polymorphism data, all without restricting the demographic and selection scenarios that can be modeled or requiring approximations to the single-locus forward algorithm for efficiency. Further, as many of the parallel programming techniques used in this simulation can be applied to other computationally intensive algorithms important in population genetics, GO Fish serves as an exciting template for future research into accelerating computation in evolution. GO Fish is part of the Parallel PopGen Package available at: http://dl42.github.io/ParallelPopGen/. PMID:28768689
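To make the underlying model concrete, a serial single-locus Wright-Fisher forward simulation can be sketched as follows. This is a minimal haploid, biallelic illustration with a constant population size and a fixed selection coefficient; it is not the GO Fish implementation, and the function name and parameters are hypothetical.

```python
import random


def wright_fisher(pop_size, init_freq, generations, s=0.0, seed=None):
    """Simulate one biallelic haploid locus; return the allele frequency per generation."""
    rng = random.Random(seed)
    freq = init_freq
    trajectory = [freq]
    for _ in range(generations):
        # Selection: expected frequency when the allele has relative fitness 1 + s.
        expected = freq * (1.0 + s) / (1.0 + s * freq)
        # Drift: binomial sampling of pop_size offspring.
        count = sum(1 for _ in range(pop_size) if rng.random() < expected)
        freq = count / pop_size
        trajectory.append(freq)
    return trajectory


if __name__ == "__main__":
    print(wright_fisher(pop_size=1000, init_freq=0.1, generations=20, s=0.05, seed=42))
```

Because each locus evolves independently under this model, many such trajectories can be advanced concurrently, which is the kind of independence the abstract describes as embarrassingly parallel.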
Integration of an open interface PC scene generator using COTS DVI converter hardware

NASA Astrophysics Data System (ADS)

Nordland, Todd; Lyles, Patrick; Schultz, Bret

2006-05-01

Commercial-Off-The-Shelf (COTS) personal computer (PC) hardware is increasingly capable of computing high dynamic range (HDR) scenes for military sensor testing at high frame rates. New electro-optical and infrared (EO/IR) scene projectors feature electrical interfaces that can accept the DVI output of these PC systems. However, military Hardware-in-the-loop (HWIL) facilities such as those at the US Army Aviation and Missile Research Development and Engineering Center (AMRDEC) utilize a sizeable inventory of existing projection systems that were designed to use the Silicon Graphics Incorporated (SGI) digital video port (DVP, also known as DVP2 or DD02) interface. To mate the new DVI-based scene generation systems to these legacy projection systems, CG2 Inc., a Quantum3D Company (CG2), has developed a DVI-to-DVP converter called Delta DVP. This device takes progressive scan DVI input, converts it to digital parallel data, and combines and routes color components to derive a 16-bit wide luminance channel replicated on a DVP output interface. The HWIL Functional Area of AMRDEC has developed a suite of modular software to perform deterministic real-time, wave band-specific rendering of sensor scenes, leveraging the features of commodity graphics hardware and open source software. Together, these technologies enable sensor simulation and test facilities to integrate scene generation and projection components with diverse pedigrees.
Efficient gaussian density formulation of volume and surface areas of macromolecules on graphical processing units.

PubMed

Zhang, Baofeng; Kilburg, Denise; Eastman, Peter; Pande, Vijay S; Gallicchio, Emilio

2017-04-15

We present an algorithm to efficiently compute accurate volumes and surface areas of macromolecules on graphical processing unit (GPU) devices using an analytic model which represents atomic volumes by continuous Gaussian densities. The volume of the molecule is expressed by means of the inclusion-exclusion formula, which is based on the summation of overlap integrals among multiple atomic densities. The surface area of the molecule is obtained by differentiation of the molecular volume with respect to atomic radii. The many-body nature of the model makes a port to GPU devices challenging. To our knowledge, this is the first reported full implementation of this model on GPU hardware. To accomplish this, we have used recursive strategies to construct the tree of overlaps and to accumulate volumes and their gradients on the tree data structures so as to minimize memory contention. The algorithm is used in the formulation of a surface area-based non-polar implicit solvent model implemented as an open source plug-in (named GaussVol) for the popular OpenMM library for molecular mechanics modeling. GaussVol is 50 to 100 times faster than our best optimized implementation for the CPUs, achieving speeds in excess of 100 ns/day with 1 fs time-step for protein-sized systems on commodity GPUs. © 2017 Wiley Periodicals, Inc.
Electrosynthesis of Commodity Chemicals by an Autotrophic Microbial Community

PubMed Central

Marshall, Christopher W.; Ross, Daniel E.; Fichot, Erin B.; Norman, R. Sean

2012-01-01

A microbial community originating from brewery waste produced methane, acetate, and hydrogen when selected on a granular graphite cathode poised at −590 mV versus the standard hydrogen electrode (SHE) with CO2 as the only carbon source. This is the first report on the simultaneous electrosynthesis of these commodity chemicals and the first description of electroacetogenesis by a microbial community. Deep sequencing of the active community 16S rRNA revealed a dynamic microbial community composed of an invariant Archaea population of Methanobacterium spp. and a shifting Bacteria population. Acetobacterium spp. were the most abundant Bacteria on the cathode when acetogenesis dominated. Methane was generally the dominant product with rates increasing from <1 to 7 mM day−1 (per cathode liquid volume) and was concomitantly produced with acetate and hydrogen. Acetogenesis increased to >4 mM day−1 (accumulated to 28.5 mM over 12 days), and methanogenesis ceased following the addition of 2-bromoethanesulfonic acid. Traces of hydrogen accumulated during initial selection and subsequently accelerated to >11 mM day−1 (versus 0.045 mM day−1 abiotic production). The hypothesis of electrosynthetic biocatalysis occurring at the microbe-electrode interface was supported by a catalytic wave (midpoint potential of −460 mV versus SHE) in cyclic voltammetry scans of the biocathode, the lack of redox active components in the medium, and the generation of comparatively high amounts of products (even after medium exchange). In addition, the volumetric production rates of these three commodity chemicals are marked improvements for electrosynthesis, advancing the process toward economic feasibility. PMID:23001672
CPU-GPU hybrid accelerating the Zuker algorithm for RNA secondary structure prediction applications

PubMed Central

2012-01-01

Background: Prediction of ribonucleic acid (RNA) secondary structure remains one of the most important research areas in bioinformatics. The Zuker algorithm is one of the most popular methods of free energy minimization for RNA secondary structure prediction. Thus far, few studies have been reported on the acceleration of the Zuker algorithm on general-purpose processors or on extra accelerators such as Field Programmable Gate-Array (FPGA) and Graphics Processing Units (GPU). To the best of our knowledge, no implementation combines both CPU and extra accelerators, such as GPUs, to accelerate the Zuker algorithm applications. Results: In this paper, a CPU-GPU hybrid computing system that accelerates Zuker algorithm applications for RNA secondary structure prediction is proposed. The computing tasks are allocated between CPU and GPU for cooperative parallel execution. Performance differences between the CPU and the GPU in the task-allocation scheme are considered to obtain workload balance. To improve the hybrid system performance, the Zuker algorithm is optimally implemented with special methods for CPU and GPU architecture. Conclusions: Speedup of 15.93× over optimized multi-core SIMD CPU implementation and performance advantage of 16% over optimized GPU implementation are shown in the experimental results. More than 14% of the sequences are executed on CPU in the hybrid system. The system combining CPU and GPU to accelerate the Zuker algorithm is proven to be promising and can be applied to other bioinformatics applications. PMID:22369626
A graphical approach to radio frequency quadrupole design

NASA Astrophysics Data System (ADS)

Turemen, G.; Unel, G.; Yasatekin, B.

2015-07-01

The design of a radio frequency quadrupole, an important section of all ion accelerators, and the calculation of its beam dynamics properties can be achieved using the existing computational tools. These programs, originally designed in the 1980s, show effects of aging in their user interfaces and in their output. The authors believe there is room for improvement both in design techniques using a graphical approach and in the amount of analytical calculation done before going into CPU-burning finite element analysis techniques. Additionally, an emphasis on the graphical method of controlling the evolution of the relevant parameters using the drag-to-change paradigm is bound to be beneficial to the designer. A computer code, named DEMIRCI, has been written in C++ to demonstrate these ideas. This tool has been used in the design of Turkish Atomic Energy Authority (TAEK)'s 1.5 MeV proton beamline at Saraykoy Nuclear Research and Training Center (SANAEM). DEMIRCI starts with a simple analytical model, calculates the RFQ behavior and produces 3D design files that can be fed to a milling machine. The paper discusses the experience gained during the design process of the SANAEM Project Prometheus (SPP) RFQ and underlines some of DEMIRCI's capabilities.

Enhanced Graphics for Extended Scale Range

NASA Technical Reports Server (NTRS)

Hanson, Andrew J.; Chi-Wing Fu, Philip

2012-01-01

Enhanced Graphics for Extended Scale Range is a computer program for rendering fly-through views of scene models that include visible objects differing in size by large orders of magnitude. An example would be a scene showing a person in a park at night with the moon, stars, and galaxies in the background sky. Prior graphical computer programs exhibit arithmetic and other anomalies when rendering scenes containing objects that differ enormously in scale and distance from the viewer. The present program dynamically repartitions distance scales of objects in a scene during rendering to eliminate almost all such anomalies in a way compatible with implementation in other software and in hardware accelerators. By assigning depth ranges corresponding to rendering precision requirements, either automatically or under program control, this program spaces out object scales to match the precision requirements of the rendering arithmetic. This action includes an intelligent partition of the depth buffer ranges to avoid known anomalies from this source. The program is written in C++, using OpenGL, GLUT, and GLUI standard libraries, and NVIDIA GeForce Vertex Shader extensions. The program has been shown to work on several computers running UNIX and Windows operating systems.
Accelerating image reconstruction in dual-head PET system by GPU and symmetry properties.

PubMed

Chou, Cheng-Ying; Dong, Yun; Hung, Yukai; Kao, Yu-Jiun; Wang, Weichung; Kao, Chien-Min; Chen, Chin-Tu

2012-01-01

Positron emission tomography (PET) is an important imaging modality in both clinical usage and research studies. We have developed a compact high-sensitivity PET system consisting of two large-area panel PET detector heads, which produce more than 224 million lines of response and thus impose heavy computational demands. In this work, we employed a state-of-the-art graphics processing unit (GPU), NVIDIA Tesla C2070, to yield an efficient reconstruction process. Our approaches integrate the symmetry properties of the imaging system with features of the GPU architecture, including block/warp/thread assignments and effective memory usage, to accelerate the computations for ordered subset expectation maximization (OSEM) image reconstruction. The OSEM reconstruction algorithms were implemented employing both CPU-based and GPU-based codes, and their computational performance was quantitatively analyzed and compared. The results showed that the GPU-accelerated scheme can drastically reduce the reconstruction time and thus can largely expand the applicability of the dual-head PET system.
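
For readers unfamiliar with OSEM, the following is a minimal CPU-only sketch of one pass of the standard ordered-subset multiplicative update on a toy problem. The system matrix, the subset choice, and all names are made up for illustration; the symmetry exploitation and the GPU block/warp/thread mapping described in the abstract are not reproduced here.

```python
def osem_pass(x, A, y, subsets):
    """One OSEM pass: an EM-style multiplicative update per subset.

    x: current image estimate (list of floats)
    A: system matrix, A[i][j] = weight of voxel j in line of response i
    y: measured counts per line of response
    subsets: list of lists of row indices (the ordered subsets)
    """
    x = list(x)
    for rows in subsets:
        # Forward projection restricted to the current subset.
        proj = {i: sum(A[i][j] * x[j] for j in range(len(x))) for i in rows}
        new_x = []
        for j in range(len(x)):
            sens = sum(A[i][j] for i in rows)  # subset sensitivity of voxel j
            if sens == 0.0:
                new_x.append(x[j])  # voxel not seen by this subset
                continue
            # Back-project the ratio of measured to estimated counts.
            back = sum(A[i][j] * y[i] / proj[i] for i in rows if proj[i] > 0.0)
            new_x.append(x[j] * back / sens)
        x = new_x
    return x


# Toy problem with made-up numbers: 4 lines of response, 2 voxels.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 1.0]]
x_true = [2.0, 3.0]
y = [sum(a * v for a, v in zip(row, x_true)) for row in A]
subsets = [[0, 2], [1, 3]]

x = [1.0, 1.0]
for _ in range(50):
    x = osem_pass(x, A, y, subsets)
print(x)
```

Each subset update touches only a fraction of the lines of response, which is why OSEM maps naturally onto hardware that processes many independent lines in parallel.
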
GPU-accelerated phase-field simulation of dendritic solidification in a binary alloy

NASA Astrophysics Data System (ADS)

Yamanaka, Akinori; Aoki, Takayuki; Ogawa, Satoi; Takaki, Tomohiro

2011-03-01

The phase-field simulation for dendritic solidification of a binary alloy has been accelerated by using a graphic processing unit (GPU). To perform the phase-field simulation of the alloy solidification on GPU, a program code was developed with computer unified device architecture (CUDA). In this paper, the implementation technique of the phase-field model on GPU is presented. Also, we evaluated the acceleration performance of the three-dimensional solidification simulation by using a single NVIDIA TESLA C1060 GPU and the developed program code. The results showed that the GPU calculation for 576³ computational grids achieved the performance of 170 GFLOPS by utilizing the shared memory as a software-managed cache. Furthermore, it can be demonstrated that the computation with the GPU is 100 times faster than that with a single CPU core. From the obtained results, we confirmed the feasibility of realizing a real-time full three-dimensional phase-field simulation of microstructure evolution on a personal desktop computer.
Quantum supercharger library: hyper-parallelism of the Hartree-Fock method.

PubMed

Fernandes, Kyle D; Renison, C Alicia; Naidoo, Kevin J

2015-07-05

We present here a set of algorithms that completely rewrites the Hartree-Fock (HF) computations common to many legacy electronic structure packages (such as GAMESS-US, GAMESS-UK, and NWChem) into a massively parallel compute scheme that takes advantage of hardware accelerators such as Graphical Processing Units (GPUs). The HF compute algorithm is core to a library of routines that we name the Quantum Supercharger Library (QSL). We briefly evaluate the QSL's performance and report that it accelerates a HF 6-31G Self-Consistent Field (SCF) computation by up to 20 times for medium sized molecules (such as a buckyball) when compared with mature Central Processing Unit algorithms available in the legacy codes in regular use by researchers. It achieves this acceleration by massive parallelization of the one- and two-electron integrals and optimization of the SCF and Direct Inversion in the Iterative Subspace routines through the use of GPU linear algebra libraries. © 2015 Wiley Periodicals, Inc.
Low Gravity Guidance System for Airborne Microgravity Research

NASA Technical Reports Server (NTRS)

Rieke, W. J.; Emery, E. F.; Boyer, E. O.; Hegedus, C.; ODonoghue, D. P.

1996-01-01

Microgravity research techniques have been established to achieve a greater understanding of the role of gravity in the fundamentals of a variety of physical phenomena and material processing. One technique in use at the NASA Lewis Research Center involves flying Keplerian trajectories with a modified Lear Jet and DC-9 aircraft to achieve a highly accurate microgravity environment by neutralizing accelerations in all three axes of the aircraft. The Low Gravity Guidance System (LGGS) assists the pilot and copilot in flying the trajectories by displaying the aircraft acceleration data in a graphical display format. The Low Gravity Guidance System is a microprocessor based system that acquires and displays the aircraft acceleration information. This information is presented using an electroluminescent display mounted over the pilot's instrument panel. The pilot can select the microgravity range that is required for a given research event. This paper describes the characteristics, design, calibration and testing of the Low Gravity Guidance System Phase 3, significant lessons from earlier systems and the developmental work on future systems.
The General-Use Nodal Network Solver (GUNNS) Modeling Package for Space Vehicle Flow System Simulation

NASA Technical Reports Server (NTRS)

Harvey, Jason; Moore, Michael

2013-01-01

The General-Use Nodal Network Solver (GUNNS) is a modeling software package that combines nodal analysis and the hydraulic-electric analogy to simulate fluid, electrical, and thermal flow systems. GUNNS is developed by L-3 Communications under the TS21 (Training Systems for the 21st Century) project for NASA Johnson Space Center (JSC), primarily for use in space vehicle training simulators at JSC. It has sufficient compactness and fidelity to model the fluid, electrical, and thermal aspects of space vehicles in real-time simulations running on commodity workstations, for vehicle crew and flight controller training. It has a reusable and flexible component and system design, and a Graphical User Interface (GUI), providing capability for rapid GUI-based simulator development, ease of maintenance, and associated cost savings. GUNNS is optimized for NASA's Trick simulation environment, but can be run independently of Trick.
Framework for Development and Distribution of Hardware Acceleration

NASA Astrophysics Data System (ADS)

Thomas, David B.; Luk, Wayne W.

2002-07-01

This paper describes IGOL, a framework for developing reconfigurable data processing applications. While IGOL was originally designed to target imaging and graphics systems, its structure is sufficiently general to support a broad range of applications. IGOL adopts a four-layer architecture: application layer, operation layer, appliance layer and configuration layer. This architecture is intended to separate and co-ordinate both the development and execution of hardware and software components. Hardware developers can use IGOL as an instance testbed for verification and benchmarking, as well as for distribution. Software application developers can use IGOL to discover hardware accelerated data processors, and to access them in a transparent, non-hardware specific manner. IGOL provides extensive support for the RC1000-PP board via the Handel-C language, and a wide selection of image processing filters have been developed. IGOL also supplies plug-ins to enable such filters to be incorporated in popular applications such as Premiere, Winamp, VirtualDub and DirectShow. Moreover, IGOL allows the automatic use of multiple cards to accelerate an application, demonstrated using DirectShow. To enable transparent acceleration without sacrificing performance, a three-tiered COM (Component Object Model) API has been designed and implemented. This API provides a well-defined and extensible interface which facilitates the development of hardware data processors that can accelerate multiple applications.
Accelerating EPI distortion correction by utilizing a modern GPU-based parallel computation.

PubMed

Yang, Yao-Hao; Huang, Teng-Yi; Wang, Fu-Nien; Chuang, Tzu-Chao; Chen, Nan-Kuei

2013-04-01

The combination of phase demodulation and field mapping is a practical method to correct echo planar imaging (EPI) geometric distortion. However, since phase dispersion accumulates in each phase-encoding step, the calculation complexity of phase modulation is Ny-fold higher than conventional image reconstructions. Thus, correcting EPI images via phase demodulation is generally a time-consuming task. Parallel computing by employing general-purpose calculations on graphics processing units (GPU) can accelerate scientific computing if the algorithm is parallelized. This study proposes a method that incorporates the GPU-based technique into phase demodulation calculations to reduce computation time. The proposed parallel algorithm was applied to a PROPELLER-EPI diffusion tensor data set. The GPU-based phase demodulation method reduced the EPI distortion correctly, and accelerated the computation. The total reconstruction time of the 16-slice PROPELLER-EPI diffusion tensor images with matrix size of 128 × 128 was reduced from 1,754 seconds to 101 seconds by utilizing the parallelized 4-GPU program. GPU computing is a promising method to accelerate EPI geometric correction. The resulting reduction in computation time of phase demodulation should accelerate postprocessing for studies performed with EPI, and should effectuate the PROPELLER-EPI technique for clinical practice. Copyright © 2011 by the American Society of Neuroimaging.
An Adynamical, Graphical Approach to Quantum Gravity and Unification

NASA Astrophysics Data System (ADS)

Stuckey, W. M.; Silberstein, Michael; McDevitt, Timothy

We use graphical field gradients in an adynamical, background independent fashion to propose a new approach to quantum gravity (QG) and unification. Our proposed reconciliation of general relativity (GR) and quantum field theory (QFT) is based on a modification of their graphical instantiations, i.e. Regge calculus and lattice gauge theory (LGT), respectively, which we assume are fundamental to their continuum counterparts. Accordingly, the fundamental structure is a graphical amalgam of space, time, and sources (in parlance of QFT) called a "space-time source element". These are fundamental elements of space, time, and sources, not source elements in space and time. The transition amplitude for a space-time source element is computed using a path integral with discrete graphical action. The action for a space-time source element is constructed from a difference matrix K and source vector J on the graph, as in lattice gauge theory. K is constructed from graphical field gradients so that it contains a non-trivial null space and J is then restricted to the row space of K, so that it is divergence-free and represents a conserved exchange of energy-momentum. This construct of K and J represents an adynamical global constraint (AGC) between sources, the space-time metric, and the energy-momentum content of the element, rather than a dynamical law for time-evolved entities. In this view, one manifestation of quantum gravity becomes evident when, for example, a single space-time source element spans adjoining simplices of the Regge calculus graph. Thus, energy conservation for the space-time source element includes contributions to the deficit angles between simplices. This idea is used to correct proper distance in the Einstein-de Sitter (EdS) cosmology model yielding a fit of the Union2 Compilation supernova data that matches ΛCDM without having to invoke accelerating expansion or dark energy. A similar modification to LGT results in an adynamical account of quantum interference.
Spatial-spectral preprocessing for endmember extraction on GPU's

NASA Astrophysics Data System (ADS)

Jimenez, Luis I.; Plaza, Javier; Plaza, Antonio; Li, Jun

2016-10-01

Spectral unmixing focuses on the identification of spectrally pure signatures, called endmembers, and their corresponding abundances in each pixel of a hyperspectral image. Mainly focused on the spectral information contained in the hyperspectral images, endmember extraction techniques have recently included spatial information to achieve more accurate results. Several algorithms have been developed for automatic or semi-automatic identification of endmembers using spatial and spectral information, including the spectral-spatial endmember extraction (SSEE) where, within a preprocessing step in the technique, both sources of information are extracted from the hyperspectral image and equally used for this purpose. Previous works have implemented the SSEE technique in four main steps: 1) local eigenvectors calculation in each sub-region in which the original hyperspectral image is divided; 2) computation of the maxima and minima projection of all eigenvectors over the entire hyperspectral image in order to obtain a candidate pixel set; 3) expansion and averaging of the signatures of the candidate set; 4) ranking based on the spectral angle distance (SAD). The result of this method is a list of candidate signatures from which the endmembers can be extracted using various spectral-based techniques, such as orthogonal subspace projection (OSP), vertex component analysis (VCA) or N-FINDR. Considering the large volume of data and the complexity of the calculations, there is a need for efficient implementations. Latest-generation hardware accelerators such as commodity graphics processing units (GPUs) offer a good chance for improving the computational performance in this context. In this paper, we develop two different implementations of the SSEE algorithm using GPUs. Both are based on the eigenvectors computation within each sub-region of the first step, one using the singular value decomposition (SVD) and another one using principal component analysis (PCA). Based on our experiments with hyperspectral data sets, high computational performance is observed in both cases.
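
Step 4 of the SSEE description above relies on the spectral angle distance. A minimal sketch of that metric follows, using the usual definition (the angle between two spectra treated as vectors). The `rank_by_sad` helper, which orders candidates against a single reference spectrum, is a hypothetical illustration; the abstract does not specify how the ranking is organized.

```python
import math


def spectral_angle(a, b):
    """Spectral angle distance in radians between two spectra."""
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("spectral angle is undefined for a zero spectrum")
    # Clamp to [-1, 1] to guard against rounding just outside the domain.
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cosine)


def rank_by_sad(candidates, reference):
    """Hypothetical helper: sort candidates by angle to a reference spectrum."""
    return sorted(candidates, key=lambda s: spectral_angle(s, reference))


print(rank_by_sad([[0.0, 1.0], [1.0, 1.0], [1.0, 0.0]], [1.0, 0.0]))
```

Because the angle ignores vector length, two spectra that differ only by a brightness scale factor have a distance of approximately zero.
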
Dissecting the COW

DOE Office of Scientific and Technical Information (OSTI.GOV)

Linstadt, E.

1985-10-01

The COW, or Console On Wheels, is the primary operator interface to the SLC accelerator control system. A hardware and software description of the COW, a microcomputer based system with a color graphics display output and touchpanel and knob inputs, is given. The ease of development and expandability, due to both the modular nature of the hardware and the multitasking, interrupt driven software running in the COW, are described. Integration of the COW into the SLCNET communications network and SLC Control system is detailed.
Beam transport program for FEL project

NASA Astrophysics Data System (ADS)

Sugimoto, Masayoshi; Takao, Masaru

1992-07-01

A beam transport program is developed to design the beam transport line of the free electron laser system at JAERI and to assist the beam diagnosis. The program traces a beam matrix through the elements in the beam transport line and the accelerators. The graphical user interface is employed to access the parameters and to represent the results. The basic computational method is based on the LANL-TRACE program and it is rewritten for personal computers in Pascal.
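
Tracing a beam matrix through a line of elements, as described above, is commonly done by applying sigma -> M sigma M^T for each element's transfer matrix M. The sketch below shows this for one transverse plane with two textbook elements (a drift and a thin-lens quadrupole). The function names and numbers are assumptions for illustration and are not taken from the JAERI program or from TRACE.

```python
def matmul(a, b):
    """Product of two small matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def drift(length):
    """Transfer matrix of a field-free drift (one transverse plane)."""
    return [[1.0, length], [0.0, 1.0]]


def thin_quad(focal_length):
    """Transfer matrix of a thin-lens quadrupole (one transverse plane)."""
    return [[1.0, 0.0], [-1.0 / focal_length, 1.0]]


def transport(sigma, elements):
    """Propagate the 2x2 beam matrix: sigma -> M sigma M^T per element."""
    for m in elements:
        sigma = matmul(matmul(m, sigma), transpose(m))
    return sigma


sigma0 = [[1.0, 0.0], [0.0, 1.0]]  # made-up initial beam matrix
print(transport(sigma0, [drift(2.0)]))  # [[5.0, 2.0], [2.0, 1.0]]
```

The determinant of the beam matrix, which is related to the emittance, is unchanged by these elements because each transfer matrix has unit determinant.
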
Efficient algorithms and implementations of entropy-based moment closures for rarefied gases

NASA Astrophysics Data System (ADS)

Schaerer, Roman Pascal; Bansal, Pratyuksh; Torrilhon, Manuel

2017-07-01

We present efficient algorithms and implementations of the 35-moment system equipped with the maximum-entropy closure in the context of rarefied gases. While closures based on the principle of entropy maximization have been shown to yield very promising results for moderately rarefied gas flows, the computational cost of these closures is in general much higher than for closure theories with explicit closed-form expressions of the closing fluxes, such as Grad's classical closure. Following a similar approach as Garrett et al. (2015) [13], we investigate efficient implementations of the computationally expensive numerical quadrature method used for the moment evaluations of the maximum-entropy distribution by exploiting its inherent fine-grained parallelism with the parallelism offered by multi-core processors and graphics cards. We show that using a single graphics card as an accelerator allows speed-ups of two orders of magnitude when compared to a serial CPU implementation. To accelerate the time-to-solution for steady-state problems, we propose a new semi-implicit time discretization scheme. The resulting nonlinear system of equations is solved with a Newton type method in the Lagrange multipliers of the dual optimization problem in order to reduce the computational cost. Additionally, fully explicit time-stepping schemes of first and second order accuracy are presented. We investigate the accuracy and efficiency of the numerical schemes for several numerical test cases, including a steady-state shock-structure problem.
cellVIEW: a Tool for Illustrative and Multi-Scale Rendering of Large Biomolecular Datasets

PubMed Central

Le Muzic, Mathieu; Autin, Ludovic; Parulek, Julius; Viola, Ivan

2017-01-01

In this article we introduce cellVIEW, a new system to interactively visualize large biomolecular datasets on the atomic level. Our tool is unique and has been specifically designed to match the ambitions of our domain experts to model and interactively visualize structures comprised of several billion atoms. The cellVIEW system integrates acceleration techniques to allow for real-time graphics performance of 60 Hz display rate on datasets representing large viruses and bacterial organisms. Inspired by the work of scientific illustrators, we propose a level-of-detail scheme whose purpose is two-fold: accelerating the rendering and reducing visual clutter. The main part of our datasets is made out of macromolecules, but it also comprises nucleic acid strands which are stored as sets of control points. For that specific case, we extend our rendering method to support the dynamic generation of DNA strands directly on the GPU. It is noteworthy that our tool has been directly implemented inside a game engine. We chose to rely on a third party engine to reduce software development work-load and to make bleeding-edge graphics techniques more accessible to the end-users. To our knowledge cellVIEW is the only suitable solution for interactive visualization of large biomolecular landscapes on the atomic level and is freely available to use and extend. PMID:29291131
Graphics Processing Unit (GPU) Acceleration of the Goddard Earth Observing System Atmospheric Model

NASA Technical Reports Server (NTRS)

Putnam, William

2011-01-01

The Goddard Earth Observing System 5 (GEOS-5) is the atmospheric model used by the Global Modeling and Assimilation Office (GMAO) for a variety of applications, from long-term climate prediction at relatively coarse resolution, to data assimilation and numerical weather prediction, to very high-resolution cloud-resolving simulations. GEOS-5 is being ported to a graphics processing unit (GPU) cluster at the NASA Center for Climate Simulation (NCCS). By utilizing GPU co-processor technology, we expect to increase the throughput of GEOS-5 by at least an order of magnitude, and accelerate the process of scientific exploration across all scales of global modeling, including: the large-scale, high-end application of non-hydrostatic, global, cloud-resolving modeling at 10- to 1-kilometer (km) global resolutions; intermediate-resolution seasonal climate and weather prediction at 50- to 25-km on small clusters of GPUs; and long-range, coarse-resolution climate modeling, enabled on a small box of GPUs for the individual researcher. After being ported to the GPU cluster, the primary physics components and the dynamical core of GEOS-5 have demonstrated a potential speedup of 15-40 times over conventional processor cores. Performance improvements of this magnitude reduce the required scalability of 1-km, global, cloud-resolving models from an unfathomable 6 million cores to an attainable 200,000 GPU-enabled cores.
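
A short worked check of the figures quoted above: the drop from 6 million conventional cores to 200,000 GPU-enabled cores corresponds to a 30-fold reduction, which lies inside the reported 15-40 times speedup range. The variable names are illustrative only; the numbers come from the abstract.

```python
cpu_cores = 6_000_000        # conventional cores quoted in the abstract
gpu_enabled_cores = 200_000  # GPU-enabled cores quoted in the abstract
speedup_low, speedup_high = 15, 40  # reported potential speedup range

reduction = cpu_cores / gpu_enabled_cores
in_range = speedup_low <= reduction <= speedup_high
print(reduction, in_range)  # 30.0 True
```
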
Using Graphical Processing Units to Accelerate Orthorectification, Atmospheric Correction and Transformations for Big Data

NASA Astrophysics Data System (ADS)

O'Connor, A. S.; Justice, B.; Harris, A. T.

2013-12-01

Graphics Processing Units (GPUs) are high-performance multiple-core processors capable of very high computational speeds and large data throughput. Modern GPUs are inexpensive and widely available commercially. These are general-purpose parallel processors with support for a variety of programming interfaces, including industry standard languages such as C. GPU implementations of algorithms that are well suited for parallel processing can often achieve speedups of several orders of magnitude over optimized CPU codes. Significant improvements in speeds for imagery orthorectification, atmospheric correction, target detection and image transformations like Independent Components Analysis (ICA) have been achieved using GPU-based implementations. Additional optimizations, when factored in with GPU processing capabilities, can provide 50x - 100x reduction in the time required to process large imagery. Exelis Visual Information Solutions (VIS) has implemented a CUDA based GPU processing framework for accelerating ENVI and IDL processes that can best take advantage of parallelization. Testing Exelis VIS has performed shows that orthorectification can take as long as two hours with a WorldView1 35,000 x 35,000 pixel image. With GPU orthorectification, the same orthorectification process takes three minutes. Faster image processing lets first responders and scientists making rapid discoveries use near-real-time imagery, and provides an operational component to data centers needing to quickly process and disseminate data.
Accelerating atomistic calculations of quantum energy eigenstates on graphic cards

NASA Astrophysics Data System (ADS)

Rodrigues, Walter; Pecchia, A.; Lopez, M.; Auf der Maur, M.; Di Carlo, A.

2014-10-01

Electronic properties of nanoscale materials require the calculation of eigenvalues and eigenvectors of large matrices. This bottleneck can be overcome by parallel computing techniques or the introduction of faster algorithms. In this paper we report a custom implementation of the Lanczos algorithm with simple restart, optimized for graphical processing units (GPUs). The whole algorithm has been developed using CUDA and runs entirely on the GPU, with a specialized implementation that spares memory and minimizes machine-to-device data transfers. Furthermore, parallel distribution over several GPUs has been attained using the standard message passing interface (MPI). Benchmark calculations performed on a GaN/AlGaN wurtzite quantum dot with up to 600,000 atoms are presented. The empirical tight-binding (ETB) model with an sp3d5s∗+spin-orbit parametrization has been used to build the system Hamiltonian (H).
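
To make the Lanczos idea concrete, here is a minimal CPU-only sketch of the plain Lanczos three-term recurrence for a symmetric matrix, followed by a Sturm-sequence bisection that extracts the lowest eigenvalue of the resulting tridiagonal matrix. It is not the authors' implementation: the restart step, the CUDA kernels, and the MPI distribution are all omitted, the 4x4 diagonal test matrix is made up, and every function name is an assumption.

```python
import math


def lanczos(matvec, v0, m):
    """Run up to m Lanczos steps; return the diagonals (alpha, beta) of T."""
    norm = math.sqrt(sum(x * x for x in v0))
    v = [x / norm for x in v0]
    v_prev = [0.0] * len(v)
    alpha, beta, b = [], [], 0.0
    for _ in range(m):
        w = matvec(v)
        a = sum(wi * vi for wi, vi in zip(w, v))
        # Three-term recurrence: remove components along v and v_prev.
        w = [wi - a * vi - b * pi for wi, vi, pi in zip(w, v, v_prev)]
        alpha.append(a)
        b = math.sqrt(sum(x * x for x in w))
        if b < 1e-12:  # invariant subspace reached, stop early
            break
        beta.append(b)
        v_prev, v = v, [x / b for x in w]
    return alpha, beta[:len(alpha) - 1]


def count_below(alpha, beta, x):
    """Number of eigenvalues of T smaller than x (Sturm sequence count)."""
    count, d = 0, 1.0
    for i, a in enumerate(alpha):
        off = beta[i - 1] ** 2 if i > 0 else 0.0
        d = a - x - off / d
        if d == 0.0:
            d = 1e-300  # avoid division by zero in the next step
        if d < 0.0:
            count += 1
    return count


def smallest_eigenvalue(alpha, beta, tol=1e-10):
    """Bisection for the lowest eigenvalue of T inside its Gershgorin bounds."""
    n = len(alpha)
    radius = [(beta[i - 1] if i > 0 else 0.0) + (beta[i] if i < n - 1 else 0.0)
              for i in range(n)]
    lo = min(a - r for a, r in zip(alpha, radius))
    hi = max(a + r for a, r in zip(alpha, radius))
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if count_below(alpha, beta, mid) >= 1:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


# Made-up test operator: a diagonal "Hamiltonian" with eigenvalues 1, 2, 3, 4.
diag = [1.0, 2.0, 3.0, 4.0]
alpha, beta = lanczos(lambda v: [d * x for d, x in zip(diag, v)],
                      [1.0, 1.0, 1.0, 1.0], m=4)
print(smallest_eigenvalue(alpha, beta))
```

The only operation that touches the large matrix is `matvec`, which is why the method suits sparse tight-binding Hamiltonians and maps well onto GPUs.
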
SPACEBAR: Kinematic design by computer graphics

NASA Technical Reports Server (NTRS)

Ricci, R. J.

1975-01-01

The interactive graphics computer program SPACEBAR, conceived to reduce the time and complexity associated with the development of kinematic mechanisms on the design board, was described. This program allows the direct design and analysis of mechanisms right at the terminal screen. All input variables, including linkage geometry, stiffness, and applied loading conditions, can be fed into or changed at the terminal and may be displayed in three dimensions. All mechanism configurations can be cycled through their range of travel and viewed in their various geometric positions. Output data includes geometric positioning in orthogonal coordinates of each node point in the mechanism, velocity and acceleration of the node points, and internal loads and displacements of the node points and linkages. All analysis calculations take at most a few seconds to complete. Output data can be viewed at the scope and also printed at the discretion of the user.
Platform for Automated Real-Time High Performance Analytics on Medical Image Data.

PubMed

Allen, William J; Gabr, Refaat E; Tefera, Getaneh B; Pednekar, Amol S; Vaughn, Matthew W; Narayana, Ponnada A

2018-03-01

Biomedical data are quickly growing in volume and in variety, providing clinicians an opportunity for better clinical decision support. Here, we demonstrate a robust platform that uses software automation and high performance computing (HPC) resources to achieve real-time analytics of clinical data, specifically magnetic resonance imaging (MRI) data. We used the Agave application programming interface to facilitate communication, data transfer, and job control between an MRI scanner and an off-site HPC resource. In this use case, Agave executed the graphical pipeline tool GRAphical Pipeline Environment (GRAPE) to perform automated, real-time, quantitative analysis of MRI scans. Same-session image processing will open the door for adaptive scanning and real-time quality control, potentially accelerating the discovery of pathologies and minimizing patient callbacks. We envision this platform can be adapted to other medical instruments, HPC resources, and analytics tools.
HEPLIB `91: International users meeting on the support and environments of high energy physics computing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Johnstad, H.

The purpose of this meeting is to discuss the current and future HEP computing support and environments from the perspective of new horizons in accelerator, physics, and computing technologies. Topics of interest to the Meeting include (but are not limited to): the forming of the HEPLIB world user group for High Energy Physics computing; mandate, desirables, coordination, organization, funding; user experience, international collaboration; the roles of national labs, universities, and industry; range of software, Monte Carlo, mathematics, physics, interactive analysis, text processors, editors, graphics, data base systems, code management tools; program libraries, frequency of updates, distribution; distributed and interactive computing, data base systems, user interface, UNIX operating systems, networking, compilers, Xlib, X-Graphics; documentation, updates, availability, distribution; code management in large collaborations, keeping track of program versions; and quality assurance, testing, conventions, standards.

Near-realtime simulations of bioelectric activity in small mammalian hearts using graphical processing units

PubMed Central

Vigmond, Edward J.; Boyle, Patrick M.; Leon, L. Joshua; Plank, Gernot

2014-01-01

Simulations of cardiac bioelectric phenomena remain a significant challenge despite continual advancements in computational machinery. Spanning large temporal and spatial ranges demands millions of nodes to accurately depict geometry, and a comparable number of timesteps to capture dynamics. This study explores a new hardware computing paradigm, the graphics processing unit (GPU), to accelerate cardiac models, and analyzes results in the context of simulating a small mammalian heart in real time. The ODEs associated with membrane ionic flow were computed on traditional CPU and compared to GPU performance, for one to four parallel processing units. The scalability of solving the PDE responsible for tissue coupling was examined on a cluster using up to 128 cores. Results indicate that the GPU implementation was between 9 and 17 times faster than the CPU implementation and scaled similarly. Solving the PDE was still 160 times slower than real time. PMID:19964295
General purpose graphic processing unit implementation of adaptive pulse compression algorithms

NASA Astrophysics Data System (ADS)

Cai, Jingxiao; Zhang, Yan

2017-07-01

This study introduces a practical approach to implement real-time signal processing algorithms for general surveillance radar based on NVIDIA graphical processing units (GPUs). The pulse compression algorithms are implemented using compute unified device architecture (CUDA) libraries such as CUDA basic linear algebra subroutines and CUDA fast Fourier transform library, which are adopted from open source libraries and optimized for the NVIDIA GPUs. For more advanced, adaptive processing algorithms such as adaptive pulse compression, customized kernel optimization is needed and investigated. A statistical optimization approach is developed for this purpose without needing much knowledge of the physical configurations of the kernels. It was found that the kernel optimization approach can significantly improve the performance. Benchmark performance is compared with the CPU performance in terms of processing accelerations. The proposed implementation framework can be used in various radar systems including ground-based phased array radar, airborne sense and avoid radar, and aerospace surveillance radar.
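
As background for the pulse compression discussed above, the sketch below shows the basic (non-adaptive) form: correlating the received samples with the transmitted code, so that the echo collapses into a sharp peak at its delay. It runs in the time domain on the CPU with a Barker-13 code and a made-up noiseless echo. The paper's FFT-based CUDA library calls and its adaptive pulse compression are not reproduced, and the function name is an assumption.

```python
def matched_filter(received, code):
    """Correlate the received samples with the transmitted code."""
    n = len(code)
    return [sum(received[k + i] * code[i] for i in range(n))
            for k in range(len(received) - n + 1)]


# Barker-13 phase code: autocorrelation sidelobes have magnitude at most 1.
barker13 = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]

delay = 5  # made-up target delay in samples
received = [0] * delay + barker13 + [0] * 10  # noiseless echo

out = matched_filter(received, barker13)
peak = max(range(len(out)), key=lambda k: abs(out[k]))
print(peak, out[peak])  # 5 13
```

The 13-sample pulse is compressed into a single-sample peak of height 13, which is what lets a long, low-power pulse achieve the range resolution of a short one.
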
Real-time digital holographic microscopy using the graphic processing unit.

PubMed

Shimobaba, Tomoyoshi; Sato, Yoshikuni; Miura, Junya; Takenouchi, Mai; Ito, Tomoyoshi

2008-08-04

Digital holographic microscopy (DHM) is a well-known powerful method allowing both the amplitude and phase of a specimen to be simultaneously observed. In order to obtain a reconstructed image from a hologram, numerous calculations for the Fresnel diffraction are required. The Fresnel diffraction can be accelerated by the FFT (Fast Fourier Transform) algorithm. However, real-time reconstruction from a hologram is difficult even if we use a recent central processing unit (CPU) to calculate the Fresnel diffraction by the FFT algorithm. In this paper, we describe a real-time DHM system using a graphic processing unit (GPU) with many stream processors, which allows use as a highly parallel processor. The computational speed of the Fresnel diffraction using the GPU is faster than that of recent CPUs. The real-time DHM system can obtain reconstructed images from holograms whose size is 512 x 512 grids at 24 frames per second.
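
The FFT-based evaluation of Fresnel diffraction mentioned above can be sketched in one dimension: transform the field, multiply by the Fresnel transfer function exp(-i pi lambda z fx^2), and transform back. This is a CPU-only illustration with a hand-written radix-2 FFT. The constant phase factor exp(ikz) is dropped, the optical parameters are made-up values, and the function names are assumptions rather than anything from the paper.

```python
import cmath
import math


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def fresnel_1d(field, wavelength, z, pitch):
    """Propagate a sampled 1-D field by distance z (transfer-function form)."""
    n = len(field)
    spectrum = fft(field)
    for k in range(n):
        # Spatial frequency of FFT bin k, with negative frequencies wrapped.
        fx = (k if k < n // 2 else k - n) / (n * pitch)
        spectrum[k] *= cmath.exp(-1j * math.pi * wavelength * z * fx * fx)
    return [v / n for v in fft(spectrum, inverse=True)]


# Made-up example: a 4-sample-wide aperture on an 8-sample grid.
field = [0j, 0j, 1 + 0j, 1 + 0j, 1 + 0j, 1 + 0j, 0j, 0j]
out = fresnel_1d(field, wavelength=633e-9, z=0.05, pitch=10e-6)
print([round(abs(v) ** 2, 4) for v in out])
```

Because the transfer function has unit magnitude, the total energy of the field is preserved by the propagation, which gives a convenient sanity check. A 2-D version applies the same steps with a 2-D FFT, which is the part that benefits most from a GPU.
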
A Survey Of Techniques for Managing and Leveraging Caches in GPUs

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mittal, Sparsh

2014-09-01

Initially introduced as special-purpose accelerators for graphics applications, graphics processing units (GPUs) have now emerged as general purpose computing platforms for a wide range of applications. To address the requirements of these applications, modern GPUs include sizable hardware-managed caches. However, several factors, such as unique architecture of GPU, rise of CPU–GPU heterogeneous computing, etc., demand effective management of caches to achieve high performance and energy efficiency. Recently, several techniques have been proposed for this purpose. In this paper, we survey several architectural and system-level techniques proposed for managing and leveraging GPU caches. We also discuss the importance and challenges of cache management in GPUs. The aim of this paper is to provide the readers insights into cache management techniques for GPUs and motivate them to propose even better techniques for leveraging the full potential of caches in the GPUs of tomorrow.
Tempest: Accelerated MS/MS Database Search Software for Heterogeneous Computing Platforms.

PubMed

Adamo, Mark E; Gerber, Scott A

2016-09-07

MS/MS database search algorithms derive a set of candidate peptide sequences from in silico digest of a protein sequence database, and compute theoretical fragmentation patterns to match these candidates against observed MS/MS spectra. The original Tempest publication described these operations mapped to a CPU-GPU model, in which the CPU (central processing unit) generates peptide candidates that are asynchronously sent to a discrete GPU (graphics processing unit) to be scored against experimental spectra in parallel. The current version of Tempest expands this model, incorporating OpenCL to offer seamless parallelization across multicore CPUs, GPUs, integrated graphics chips, and general-purpose coprocessors. Three protocols describe how to configure and run a Tempest search, including discussion of how to leverage Tempest's unique feature set to produce optimal results. Copyright © 2016 John Wiley & Sons, Inc.
Accelerating a Particle-in-Cell Simulation Using a Hybrid Counting Sort

NASA Astrophysics Data System (ADS)

Bowers, K. J.

2001-11-01

In this article, performance limitations of the particle advance in a particle-in-cell (PIC) simulation are discussed. It is shown that the memory subsystem and cache-thrashing severely limit the speed of such simulations. Methods to implement a PIC simulation under such conditions are explored. An algorithm based on a counting sort is developed which effectively eliminates PIC simulation cache thrashing. Sustained performance gains of 40 to 70 percent are measured on commodity workstations for a minimal 2d2v electrostatic PIC simulation. More complete simulations are expected to have even better results as larger simulations are usually even more memory subsystem limited.
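The idea of sorting particles by cell so that neighbours in memory touch the same grid data can be sketched with a plain counting sort. This is only an illustration under stated assumptions: the abstract does not specify the details of its hybrid variant, and the function name `counting_sort_by_cell` and the tiny five-particle input are hypothetical.

```python
from typing import List, Sequence


def counting_sort_by_cell(cell_of: Sequence[int], num_cells: int) -> List[int]:
    """Return particle indices ordered by cell index (stable counting sort)."""
    # counts[c + 1] first holds the number of particles in cell c.
    counts = [0] * (num_cells + 1)
    for c in cell_of:
        counts[c + 1] += 1

    # Prefix sum: counts[c] becomes the first output slot for cell c.
    for c in range(num_cells):
        counts[c + 1] += counts[c]

    # Scatter each particle index into its cell's next free slot.
    order = [0] * len(cell_of)
    for particle, c in enumerate(cell_of):
        order[counts[c]] = particle
        counts[c] += 1
    return order


# Hypothetical example: five particles spread over three cells.
cells = [2, 0, 1, 2, 0]
order = counting_sort_by_cell(cells, num_cells=3)
print(order)  # [1, 4, 2, 0, 3]
```

Because the sort runs in time linear in the number of particles plus cells, it can be repeated during a simulation at modest cost; after reordering, particles in the same cell are processed consecutively, which is what reduces cache thrashing.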
Dietary shifts and implications for US agriculture.

PubMed

O'Brien, P

1995-06-01

Changes to healthier dietary patterns similar to those of traditional Mediterranean diets or those of the US government's dietary guidelines and food guide pyramid would require significant changes in American agricultural practices. The volume, mix, production, and marketing of agricultural commodities would need to be modified. Because differences between actual and recommended intakes for major food groups are quite large and affect a broad range of products, adjustments in supply and demand could overshadow past experience in dealing with such changes. New food and agriculture policies may well be needed to ease and accelerate agricultural adjustments, to improve nutritional characteristics of popular foods, and to promote desirable changes in consumers' food choices.
Scaling Semantic Graph Databases in Size and Performance

DOE Office of Scientific and Technical Information (OSTI.GOV)

Morari, Alessandro; Castellana, Vito G.; Villa, Oreste

In this paper we present SGEM, a full software system for accelerating large-scale semantic graph databases on commodity clusters. Unlike current approaches, SGEM addresses semantic graph databases by only employing graph methods at all the levels of the stack. On one hand, this allows exploiting the space efficiency of graph data structures and the inherent parallelism of graph algorithms. These features adapt well to the increasing system memory and core counts of modern commodity clusters. On the other hand, however, these systems are optimized for regular computation and batched data transfers, while graph methods usually are irregular and generate fine-grained data accesses with poor spatial and temporal locality. Our framework comprises a SPARQL to data parallel C compiler, a library of parallel graph methods and a custom, multithreaded runtime system. We introduce our stack, motivate its advantages with respect to other solutions and show how we solved the challenges posed by irregular behaviors. We present the result of our software stack on the Berlin SPARQL benchmarks with datasets up to 10 billion triples (a triple corresponds to a graph edge), demonstrating scaling in dataset size and in performance as more nodes are added to the cluster.
gWEGA: GPU-accelerated WEGA for molecular superposition and shape comparison.

PubMed

Yan, Xin; Li, Jiabo; Gu, Qiong; Xu, Jun

2014-06-05

Virtual screening of a large chemical library for drug lead identification requires searching/superimposing a large number of three-dimensional (3D) chemical structures. This article reports a graphic processing unit (GPU)-accelerated weighted Gaussian algorithm (gWEGA) that expedites shape or shape-feature similarity score-based virtual screening. With 86 GPU nodes (each node has one GPU card), gWEGA can screen 110 million conformations derived from an entire ZINC drug-like database with diverse antidiabetic agents as query structures within 2 s (i.e., screening more than 55 million conformations per second). The rapid screening speed was accomplished through the massive parallelization on multiple GPU nodes and rapid prescreening of 3D structures (based on their shape descriptors and pharmacophore feature compositions). Copyright © 2014 Wiley Periodicals, Inc.
Explicit integration with GPU acceleration for large kinetic networks

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brock, Benjamin; Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN 37830; Belt, Andrew

2015-12-01

We demonstrate the first implementation of recently-developed fast explicit kinetic integration algorithms on modern graphics processing unit (GPU) accelerators. Taking as a generic test case a Type Ia supernova explosion with an extremely stiff thermonuclear network having 150 isotopic species and 1604 reactions coupled to hydrodynamics using operator splitting, we demonstrate the capability to solve of order 100 realistic kinetic networks in parallel in the same time that standard implicit methods can solve a single such network on a CPU. This orders-of-magnitude decrease in computation time for solving systems of realistic kinetic networks implies that important coupled, multiphysics problems in various scientific and technical fields that were intractable, or could be simulated only with highly schematic kinetic networks, are now computationally feasible.
Light scattering microscopy measurements of single nuclei compared with GPU-accelerated FDTD simulations

NASA Astrophysics Data System (ADS)

Stark, Julian; Rothe, Thomas; Kieß, Steffen; Simon, Sven; Kienle, Alwin

2016-04-01

Single cell nuclei were investigated using two-dimensional angularly and spectrally resolved scattering microscopy. We show that even for a qualitative comparison of experimental and theoretical data, the standard Mie model of a homogeneous sphere proves to be insufficient. Hence, an accelerated finite-difference time-domain method using a graphics processor unit and domain decomposition was implemented to analyze the experimental scattering patterns. The measured cell nuclei were modeled as single spheres with randomly distributed spherical inclusions of different size and refractive index representing the nucleoli and clumps of chromatin. Taking into account the nuclear heterogeneity of a large number of inclusions yields a qualitative agreement between experimental and theoretical spectra and illustrates the impact of the nuclear micro- and nanostructure on the scattering patterns.
Stochastic first passage time accelerated with CUDA

NASA Astrophysics Data System (ADS)

Pierro, Vincenzo; Troiano, Luigi; Mejuto, Elena; Filatrella, Giovanni

2018-05-01

The time for a stochastic trajectory to pass a threshold, estimated by numerical integration, is an interesting physical quantity, for instance in Josephson junctions and atomic force microscopy, where the full trajectory is not accessible. We propose an algorithm suitable for efficient implementation on graphical processing units in the CUDA environment. For well-balanced loads, the proposed approach achieves almost perfect scaling with the number of available threads and processors, and allows an acceleration of about 400× with a GTX980 GPU with respect to a standard multicore CPU. With an off-the-shelf GPU, this method makes it possible to tackle problems that are otherwise prohibitive, such as thermal activation in slowly tilted potentials. In particular, we demonstrate that it is possible to simulate the switching current distributions of Josephson junctions on the timescale of actual experiments.
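A first passage time estimate of this kind can be sketched on the CPU with an Euler-Maruyama integrator. This is not the authors' CUDA algorithm: it assumes a simple Brownian motion with constant drift rather than a Josephson junction model, and every name and parameter value below (`first_passage_time`, `mean_first_passage_time`, drift, noise, threshold, time step) is chosen only for illustration. Each trajectory is independent, which is the property that lets one thread handle one trajectory on a GPU.

```python
import math
import random


def first_passage_time(rng, drift, noise, threshold, dt, max_steps):
    """Integrate dx = drift*dt + noise*dW from x = 0 until x >= threshold."""
    x = 0.0
    step_noise = noise * math.sqrt(dt)  # standard deviation of one noise increment
    for step in range(1, max_steps + 1):
        x += drift * dt + step_noise * rng.gauss(0.0, 1.0)
        if x >= threshold:
            return step * dt
    return None  # threshold not reached within max_steps


def mean_first_passage_time(n_traj, seed=0, drift=1.0, noise=0.5,
                            threshold=1.0, dt=0.01, max_steps=100_000):
    """Average the passage time over independent trajectories."""
    rng = random.Random(seed)
    times = [
        first_passage_time(rng, drift, noise, threshold, dt, max_steps)
        for _ in range(n_traj)
    ]
    crossed = [t for t in times if t is not None]
    return sum(crossed) / len(crossed)


# For constant drift the exact mean passage time is threshold / drift = 1.0;
# the estimate should land near that value, up to sampling and time-step error.
estimate = mean_first_passage_time(200)
print(estimate)
```

In a GPU version the loop over trajectories would be replaced by parallel threads, each with its own random stream; load balance matters because trajectories finish at different times.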
Tensor Algebra Library for NVidia Graphics Processing Units

DOE Office of Scientific and Technical Information (OSTI.GOV)

Liakh, Dmitry

This is a general purpose math library implementing basic tensor algebra operations on NVidia GPU accelerators. This software is a tensor algebra library that can perform basic tensor algebra operations, including tensor contractions, tensor products, tensor additions, etc., on NVidia GPU accelerators, asynchronously with respect to the CPU host. It supports a simultaneous use of multiple NVidia GPUs. Each asynchronous API function returns a handle which can later be used for querying the completion of the corresponding tensor algebra operation on a specific GPU. The tensors participating in a particular tensor operation are assumed to be stored in local RAM of a node or GPU RAM. The main research area where this library can be utilized is the quantum many-body theory (e.g., in electronic structure theory).
Light scattering microscopy measurements of single nuclei compared with GPU-accelerated FDTD simulations.

PubMed

Stark, Julian; Rothe, Thomas; Kieß, Steffen; Simon, Sven; Kienle, Alwin

2016-04-07

Single cell nuclei were investigated using two-dimensional angularly and spectrally resolved scattering microscopy. We show that even for a qualitative comparison of experimental and theoretical data, the standard Mie model of a homogeneous sphere proves to be insufficient. Hence, an accelerated finite-difference time-domain method using a graphics processor unit and domain decomposition was implemented to analyze the experimental scattering patterns. The measured cell nuclei were modeled as single spheres with randomly distributed spherical inclusions of different size and refractive index representing the nucleoli and clumps of chromatin. Taking into account the nuclear heterogeneity of a large number of inclusions yields a qualitative agreement between experimental and theoretical spectra and illustrates the impact of the nuclear micro- and nanostructure on the scattering patterns.
17 CFR 5.4 - Applicability of part 4 of this chapter to commodity pool operators and commodity trading advisors.

Code of Federal Regulations, 2013 CFR

2013-04-01

... this chapter to commodity pool operators and commodity trading advisors. 5.4 Section 5.4 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION OFF-EXCHANGE FOREIGN CURRENCY TRANSACTIONS § 5.4 Applicability of part 4 of this chapter to commodity pool operators and commodity trading advisors. Part 4 of...
17 CFR 5.4 - Applicability of part 4 of this chapter to commodity pool operators and commodity trading advisors.

Code of Federal Regulations, 2012 CFR

2012-04-01

... this chapter to commodity pool operators and commodity trading advisors. 5.4 Section 5.4 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION OFF-EXCHANGE FOREIGN CURRENCY TRANSACTIONS § 5.4 Applicability of part 4 of this chapter to commodity pool operators and commodity trading advisors. Part 4 of...
CTG Analyzer: A graphical user interface for cardiotocography.

PubMed

Sbrollini, Agnese; Agostinelli, Angela; Burattini, Luca; Morettini, Micaela; Di Nardo, Francesco; Fioretti, Sandro; Burattini, Laura

2017-07-01

Cardiotocography (CTG) is the most commonly used test for establishing the good health of the fetus during pregnancy and labor. CTG consists of the recording of fetal heart rate (FHR; bpm) and maternal uterine contractions (UC; mmHg). FHR is characterized by baseline, baseline variability, tachycardia, bradycardia, accelerations and decelerations. The UC signal, instead, is characterized by the presence of contractions and the contraction period. Such parameters are usually evaluated by visual inspection. However, visual analysis of CTG recordings has a well-demonstrated poor reproducibility, due to the complexity of physiological phenomena affecting fetal heart rhythm and being related to clinician's experience. Computerized tools in support of clinicians represent a possible solution for improving correctness in CTG interpretation. This paper proposes CTG Analyzer as a graphical tool for automatic and objective analysis of CTG tracings. CTG Analyzer was developed under MATLAB®; it is a very intuitive and user friendly graphical user interface. FHR time series and UC signal are represented one under the other, on a grid with reference lines, as usually done for CTG reports printed on paper. Colors help identification of FHR and UC features. Automatic analysis is based on some unchangeable features definitions provided by the FIGO guidelines, and other arbitrary settings whose default values can be changed by the user. Eventually, CTG Analyzer provides a report file listing all the quantitative results of the analysis. Thus, CTG Analyzer represents a potentially useful graphical tool for automatic and objective analysis of CTG tracings.
17 CFR 4.13 - Exemption from registration as a commodity pool operator.

Code of Federal Regulations, 2011 CFR

2011-04-01

... a commodity pool operator. 4.13 Section 4.13 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION COMMODITY POOL OPERATORS AND COMMODITY TRADING ADVISORS General Provisions, Definitions and Exemptions § 4.13 Exemption from registration as a commodity pool operator. This section is...
Utilizing GPUs to Accelerate Turbomachinery CFD Codes

NASA Technical Reports Server (NTRS)

MacCalla, Weylin; Kulkarni, Sameer

2016-01-01

GPU computing has established itself as a way to accelerate parallel codes in the high performance computing world. This work focuses on speeding up APNASA, a legacy CFD code used at NASA Glenn Research Center, while also drawing conclusions about the nature of GPU computing and the requirements to make GPGPU worthwhile on legacy codes. Rewriting and restructuring of the source code was avoided to limit the introduction of new bugs. The code was profiled and investigated for parallelization potential, then OpenACC directives were used to indicate parallel parts of the code. The use of OpenACC directives was not able to reduce the runtime of APNASA on either the NVIDIA Tesla discrete graphics card, or the AMD accelerated processing unit. Additionally, it was found that in order to justify the use of GPGPU, the amount of parallel work being done within a kernel would have to greatly exceed the work being done by any one portion of the APNASA code. It was determined that in order for an application like APNASA to be accelerated on the GPU, it should not be modular in nature, and the parallel portions of the code must contain a large portion of the code's computation time.
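The closing observation, that the parallel portions must account for a large share of the run time, can be illustrated with Amdahl's law. The abstract does not name this model; it is used here as a standard back-of-the-envelope estimate, and it ignores host-to-device transfer costs, which would make the picture worse. The function name `amdahl_speedup` and the sample fractions are assumptions for the example.

```python
def amdahl_speedup(parallel_fraction: float, kernel_speedup: float) -> float:
    """Overall speedup when only `parallel_fraction` of the run time is accelerated."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if kernel_speedup <= 0.0:
        raise ValueError("kernel_speedup must be positive")
    serial_part = 1.0 - parallel_fraction
    return 1.0 / (serial_part + parallel_fraction / kernel_speedup)


# Even with a 50x faster kernel, the overall gain is bounded by the serial part.
for fraction in (0.10, 0.50, 0.90, 0.99):
    print(f"{fraction:.2f} of run time accelerated 50x -> "
          f"{amdahl_speedup(fraction, 50.0):.2f}x overall")
```

Under this model, a code whose run time is spread over many small modules gains little from accelerating any single one of them, which is consistent with the conclusion drawn for APNASA.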

17 CFR 32.9 - Fraud in connection with commodity option transactions.

Code of Federal Regulations, 2010 CFR

2010-04-01

... 17 Commodity and Securities Exchanges 1 2010-04-01 2010-04-01 false Fraud in connection with commodity option transactions. 32.9 Section 32.9 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION REGULATION OF COMMODITY OPTION TRANSACTIONS § 32.9 Fraud in connection with commodity...
The Principles and the Specifics of Trading in Commodities

NASA Astrophysics Data System (ADS)

Baran, Dušan; Herbacsková, Anita

2012-12-01

In the present period of instability on financial markets, investments in commodities are a way to offset the consequences of inflation and secure a yield. Investing in commodities makes use of the specifics of commodities compared with other assets. Commodities can be divided into agricultural commodities, energy commodities, precious and other metals, and weather. Commodities therefore have a place in the investment portfolio, and this is why they have become one of the most popular types of investment asset. When discussing particular commodities, we refer to commodity futures, because the spot market for commodities is limited by storage facilities. The growing popularity of commodities, supported by the wide range available, means that small investors can now trade alongside institutional investors and speculators. This development is supplemented by charts and figures, which are commented on and used to generalize the findings. Finally, the article interprets the further development of the commodities market as it is expected from the results of the research.
The Design and Implementation of a Semi-Autonomous Surf-Zone Robot Using Advanced Sensors and a Common Robot Operating System

DTIC Science & Technology

2011-06-01

effective waypoint navigation algorithm that interfaced with a Java based graphical user interface (GUI), written by Uzun, for a robot named Bender [2...the angular acceleration, θ̈, or angular rate, θ̇. When considering a joint driven by an electric motor, the inertia and friction can be divided into...interactive simulations that can receive input from user controls, scripts, and other applications, such as Excel and MATLAB. One drawback is that the
Dissecting the COW

DOE Office of Scientific and Technical Information (OSTI.GOV)

Linstadt, E.

1985-04-01

The COW, or Console On Wheels, is the primary operator interface to the SLC accelerator control system. A hardware and software description of the COW, a microcomputer based system with a color graphics display output and touch-panel and knob inputs, is given. The ease of development and expandability, due to both the modular nature of the hardware and the multitasking, interrupt driven software running in the COW, are described. Integration of the COW into the SLCNET communications network and SLC Control system is detailed.
Augmented Computer Mouse Would Measure Applied Force

NASA Technical Reports Server (NTRS)

Li, Larry C. H.

1993-01-01

Proposed computer mouse measures force of contact applied by user. Adds another dimension to two-dimensional-position-measuring capability of conventional computer mouse; force measurement designated to represent any desired continuously variable function of time and position, such as control force, acceleration, velocity, or position along axis perpendicular to computer video display. Proposed mouse enhances sense of realism and intuition in interaction between operator and computer. Useful in such applications as three-dimensional computer graphics, computer games, and mathematical modeling of dynamics.
17 CFR 32.13 - Exemption from prohibition of commodity option transactions for trade options on certain...

Code of Federal Regulations, 2011 CFR

2011-04-01

... 17 Commodity and Securities Exchanges 1 2011-04-01 2011-04-01 false Exemption from prohibition of commodity option transactions for trade options on certain agricultural commodities. 32.13 Section 32.13 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION REGULATION OF COMMODITY OPTION TRANSACTIONS § 32.13 Exemption from...
17 CFR 5.4 - Applicability of part 4 of this chapter to commodity pool operators and commodity trading advisors.

Code of Federal Regulations, 2011 CFR

2011-04-01

... 17 Commodity and Securities Exchanges 1 2011-04-01 2011-04-01 false Applicability of part 4 of this chapter to commodity pool operators and commodity trading advisors. 5.4 Section 5.4 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION OFF-EXCHANGE FOREIGN CURRENCY TRANSACTIONS § 5.4...
Acceleration of fluoro-CT reconstruction for a mobile C-Arm on GPU and FPGA hardware: a simulation study

NASA Astrophysics Data System (ADS)

Xue, Xinwei; Cheryauka, Arvi; Tubbs, David

2006-03-01

CT imaging in interventional and minimally-invasive surgery requires high-performance computing solutions that meet operational room demands, healthcare business requirements, and the constraints of a mobile C-arm system. The computational requirements of clinical procedures using CT-like data are increasing rapidly, mainly due to the need for rapid access to medical imagery during critical surgical procedures. The highly parallel nature of Radon transform and CT algorithms enables embedded computing solutions utilizing a parallel processing architecture to realize a significant gain of computational intensity with comparable hardware and program coding/testing expenses. In this paper, using a sample 2D and 3D CT problem, we explore the programming challenges and the potential benefits of embedded computing using commodity hardware components. The accuracy and performance results obtained on three computational platforms (a single CPU, a single GPU, and a solution based on FPGA technology) have been analyzed. We have shown that hardware-accelerated CT image reconstruction can be achieved with similar levels of noise and clarity of feature when compared to program execution on a CPU, while running one or more orders of magnitude faster. 3D cone-beam or helical CT reconstruction and a variety of volumetric image processing applications will benefit from similar accelerations.
Advances in digital printing and quality considerations of digitally printed images

NASA Astrophysics Data System (ADS)

Waes, Walter C.

1997-02-01

The traditional 'graphic arts' market has changed very rapidly. It has been only ten years now since Aldus introduced its 'PageMaker' software for text and layout. The platform used was Apple-Mac, which became also the standard for many other graphic applications. The so-called high-end workstations disappeared. This was the start for what later was called: the desk top publishing revolution. At the same time, image scanning became also user-friendly and heavy duty scanners were reduced to desktop-size. Color reproduction became a commodity product. Since then, the pre-press industry has been going through a technical nightmare, trying to keep up with the digital explosion. One after another, tasks and crafts of pre-press were being transformed by digital technologies. New technologies in this field came almost too fast for many people to adapt. The next digital revolution will be for the commercial printers. All the reasons are explained later in this document. There is now a definite need for a different business-strategy and a new positioning in the electronic media-world. Niches have to be located for new graphic arts applications. Electronic services to-and-from originators' and executors' environments became a requirement. Data can now flow on-line between the printer and the originator of the job. It is no longer the pre-press shop who is controlling this. In many cases, electronic data goes between the print-buyer or agency and the printer. High power communication-systems with accepted standard color management are transforming the printer, and more particularly, the pre-press shop fatally. The new digital printing market, now in the beginning of its expected full expansion, has to do with growing requests coming from agencies and other print-buyers for: (1) short-run printing; (2) print-on-demand approximately in-time; (3) personalization or other forms of customization; (4) quick turnaround.
17 CFR 33.3 - Unlawful commodity option transactions.

Code of Federal Regulations, 2010 CFR

2010-04-01

... 17 Commodity and Securities Exchanges 1 2010-04-01 2010-04-01 false Unlawful commodity option... REGULATION OF DOMESTIC EXCHANGE-TRADED COMMODITY OPTION TRANSACTIONS § 33.3 Unlawful commodity option... of, or maintain a position in, any commodity option transaction subject to the provisions of this...
49 CFR 1248.100 - Commodity classification designated.

Code of Federal Regulations, 2011 CFR

2011-10-01

... 49 Transportation 9 2011-10-01 2011-10-01 false Commodity classification designated. 1248.100... STATISTICS Commodity Code § 1248.100 Commodity classification designated. Commencing with reports for the..., reports of commodity statistics required to be made to the Board, shall be based on the commodity codes...
The CUBLAS and CULA based GPU acceleration of adaptive finite element framework for bioluminescence tomography.

PubMed

Zhang, Bo; Yang, Xiang; Yang, Fei; Yang, Xin; Qin, Chenghu; Han, Dong; Ma, Xibo; Liu, Kai; Tian, Jie

2010-09-13

In molecular imaging (MI), especially the optical molecular imaging, bioluminescence tomography (BLT) emerges as an effective imaging modality for small animal imaging. The finite element methods (FEMs), especially the adaptive finite element (AFE) framework, play an important role in BLT. The processing speed of the FEMs and the AFE framework still needs to be improved, although the multi-thread CPU technology and the multi CPU technology have already been applied. In this paper, we for the first time introduce a new kind of acceleration technology to accelerate the AFE framework for BLT, using the graphics processing unit (GPU). Besides the processing speed, the GPU technology can get a balance between the cost and performance. The CUBLAS and CULA are two main important and powerful libraries for programming on NVIDIA GPUs. With the help of CUBLAS and CULA, it is easy to code on NVIDIA GPU and there is no need to worry about the details about the hardware environment of a specific GPU. The numerical experiments are designed to show the necessity, effect and application of the proposed CUBLAS and CULA based GPU acceleration. From the results of the experiments, we can reach the conclusion that the proposed CUBLAS and CULA based GPU acceleration method can improve the processing speed of the AFE framework very much while getting a balance between cost and performance.
Explicit integration with GPU acceleration for large kinetic networks

DOE PAGES

Brock, Benjamin; Belt, Andrew; Billings, Jay Jay; ...

2015-09-15

In this study, we demonstrate the first implementation of recently-developed fast explicit kinetic integration algorithms on modern graphics processing unit (GPU) accelerators. Taking as a generic test case a Type Ia supernova explosion with an extremely stiff thermonuclear network having 150 isotopic species and 1604 reactions coupled to hydrodynamics using operator splitting, we demonstrate the capability to solve of order 100 realistic kinetic networks in parallel in the same time that standard implicit methods can solve a single such network on a CPU. In addition, this orders-of-magnitude decrease in computation time for solving systems of realistic kinetic networks implies that important coupled, multiphysics problems in various scientific and technical fields that were intractable, or could be simulated only with highly schematic kinetic networks, are now computationally feasible.
Atomic orbital-based SOS-MP2 with tensor hypercontraction. I. GPU-based tensor construction and exploiting sparsity

NASA Astrophysics Data System (ADS)

Song, Chenchen; Martínez, Todd J.

2016-05-01

We present a tensor hypercontracted (THC) scaled opposite spin second order Møller-Plesset perturbation theory (SOS-MP2) method. By using THC, we reduce the formal scaling of SOS-MP2 with respect to molecular size from quartic to cubic. We achieve further efficiency by exploiting sparsity in the atomic orbitals and using graphical processing units (GPUs) to accelerate integral construction and matrix multiplication. The practical scaling of GPU-accelerated atomic orbital-based THC-SOS-MP2 calculations is found to be N^2.6 for reference data sets of water clusters and alanine polypeptides containing up to 1600 basis functions. The errors in correlation energy with respect to density-fitting-SOS-MP2 are less than 0.5 kcal/mol for all systems tested (up to 162 atoms).
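A practical scaling exponent such as the 2.6 quoted above is typically obtained as the slope of a log-log fit of run time against system size. The sketch below shows that fit on synthetic timings generated from an exact power law. The data, the prefactor, and the function name `fit_scaling_exponent` are assumptions for illustration and are not taken from the paper.

```python
import math
from typing import Sequence


def fit_scaling_exponent(sizes: Sequence[float], times: Sequence[float]) -> float:
    """Least-squares slope of log(time) versus log(size)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance


# Synthetic timings following t = c * N^2.6 (hypothetical prefactor c).
basis_sizes = [200, 400, 800, 1600]
timings = [1.0e-6 * n ** 2.6 for n in basis_sizes]
exponent = fit_scaling_exponent(basis_sizes, timings)
print(round(exponent, 3))

# Cost growth when the system size doubles, for each scaling regime.
for power in (2.6, 3.0, 4.0):
    print(f"N^{power}: doubling N multiplies the cost by {2.0 ** power:.1f}")
```

The last loop shows why the exponent matters: doubling the system size multiplies the cost by roughly 6 at N^2.6, against 8 for cubic and 16 for quartic scaling.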
Accelerating electron tomography reconstruction algorithm ICON with GPU.

PubMed

Chen, Yu; Wang, Zihao; Zhang, Jingrong; Li, Lun; Wan, Xiaohua; Sun, Fei; Zhang, Fa

2017-01-01

Electron tomography (ET) plays an important role in studying in situ cell ultrastructure in three-dimensional space. Due to limited tilt angles, ET reconstruction always suffers from the "missing wedge" problem. With a validation procedure, iterative compressed-sensing optimized NUFFT reconstruction (ICON) demonstrates its power in the restoration of validated missing information for low SNR biological ET dataset. However, the huge computational demand has become a major problem for the application of ICON. In this work, we analyzed the framework of ICON and classified the operations of major steps of ICON reconstruction into three types. Accordingly, we designed parallel strategies and implemented them on graphics processing units (GPU) to generate a parallel program ICON-GPU. With high accuracy, ICON-GPU has a great acceleration compared to its CPU version, up to 83.7×, greatly relieving ICON's dependence on computing resource.
Atomic orbital-based SOS-MP2 with tensor hypercontraction. I. GPU-based tensor construction and exploiting sparsity.

PubMed

Song, Chenchen; Martínez, Todd J

2016-05-07

We present a tensor hypercontracted (THC) scaled opposite spin second order Møller-Plesset perturbation theory (SOS-MP2) method. By using THC, we reduce the formal scaling of SOS-MP2 with respect to molecular size from quartic to cubic. We achieve further efficiency by exploiting sparsity in the atomic orbitals and using graphical processing units (GPUs) to accelerate integral construction and matrix multiplication. The practical scaling of GPU-accelerated atomic orbital-based THC-SOS-MP2 calculations is found to be N^2.6 for reference data sets of water clusters and alanine polypeptides containing up to 1600 basis functions. The errors in correlation energy with respect to density-fitting-SOS-MP2 are less than 0.5 kcal/mol for all systems tested (up to 162 atoms).
CUDA-Accelerated Geodesic Ray-Tracing for Fiber Tracking

PubMed Central

van Aart, Evert; Sepasian, Neda; Jalba, Andrei; Vilanova, Anna

2011-01-01

Diffusion Tensor Imaging (DTI) allows noninvasive measurement of the diffusion of water in fibrous tissue. By reconstructing the fibers from DTI data using a fiber-tracking algorithm, we can deduce the structure of the tissue. In this paper, we outline an approach to accelerating such a fiber-tracking algorithm using a Graphics Processing Unit (GPU). This algorithm, which is based on the calculation of geodesics, has shown promising results for both synthetic and real data, but is limited in its applicability by its high computational requirements. We present a solution which uses the parallelism offered by modern GPUs, in combination with the CUDA platform by NVIDIA, to significantly reduce the execution time of the fiber-tracking algorithm. Compared to a multithreaded CPU implementation of the same algorithm, our GPU mapping achieves a speedup factor of up to 40 times. PMID:21941525
17 CFR 33.10 - Fraud in connection with commodity option transactions.

Code of Federal Regulations, 2010 CFR

2010-04-01

... 17 Commodity and Securities Exchanges 1 2010-04-01 2010-04-01 false Fraud in connection with commodity option transactions. 33.10 Section 33.10 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION REGULATION OF DOMESTIC EXCHANGE-TRADED COMMODITY OPTION TRANSACTIONS § 33.10 Fraud in...
17 CFR 32.11 - Suspension of commodity option transactions.

Code of Federal Regulations, 2010 CFR

2010-04-01

... 17 Commodity and Securities Exchanges 1 2010-04-01 2010-04-01 false Suspension of commodity option... REGULATION OF COMMODITY OPTION TRANSACTIONS § 32.11 Suspension of commodity option transactions. (a... accept money, securities or property in connection with, the purchase or sale of any commodity option, or...
7 CFR 65.135 - Covered commodity.

Code of Federal Regulations, 2014 CFR

2014-01-01

..., PEANUTS, AND GINSENG General Provisions Definitions § 65.135 Covered commodity. (a) Covered commodity... nuts; (6) Pecans; and (7) Ginseng. (b) Covered commodities are excluded from this part if the commodity...

7 CFR 65.135 - Covered commodity.

Code of Federal Regulations, 2012 CFR

2012-01-01

..., PEANUTS, AND GINSENG General Provisions Definitions § 65.135 Covered commodity. (a) Covered commodity... nuts; (6) Pecans; and (7) Ginseng. (b) Covered commodities are excluded from this part if the commodity...
7 CFR 65.135 - Covered commodity.

Code of Federal Regulations, 2013 CFR

2013-01-01

..., PEANUTS, AND GINSENG General Provisions Definitions § 65.135 Covered commodity. (a) Covered commodity... nuts; (6) Pecans; and (7) Ginseng. (b) Covered commodities are excluded from this part if the commodity...
7 CFR 65.135 - Covered commodity.

Code of Federal Regulations, 2011 CFR

2011-01-01

..., PEANUTS, AND GINSENG General Provisions Definitions § 65.135 Covered commodity. (a) Covered commodity... nuts; (6) Pecans; and (7) Ginseng. (b) Covered commodities are excluded from this part if the commodity...
17 CFR 4.6 - Exclusion for certain otherwise regulated persons from the definition of the term “commodity...

Code of Federal Regulations, 2011 CFR

2011-04-01

... otherwise regulated persons from the definition of the term “commodity trading advisor.” 4.6 Section 4.6 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION COMMODITY POOL OPERATORS AND COMMODITY TRADING ADVISORS General Provisions, Definitions and Exemptions § 4.6 Exclusion for certain...
17 CFR 33.4 - Designation as a contract market for the trading of commodity options.

Code of Federal Regulations, 2011 CFR

2011-04-01

... market for the trading of commodity options. 33.4 Section 33.4 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION REGULATION OF DOMESTIC EXCHANGE-TRADED COMMODITY OPTION TRANSACTIONS § 33.4 Designation as a contract market for the trading of commodity options. The Commission may...
17 CFR 4.6 - Exclusion for certain otherwise regulated persons from the definition of the term “commodity...

Code of Federal Regulations, 2014 CFR

2014-04-01

... otherwise regulated persons from the definition of the term “commodity trading advisor.” 4.6 Section 4.6 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION COMMODITY POOL OPERATORS AND COMMODITY TRADING ADVISORS General Provisions, Definitions and Exemptions § 4.6 Exclusion for certain...
17 CFR 4.6 - Exclusion for certain otherwise regulated persons from the definition of the term “commodity...

Code of Federal Regulations, 2013 CFR

2013-04-01

... otherwise regulated persons from the definition of the term “commodity trading advisor.” 4.6 Section 4.6 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION COMMODITY POOL OPERATORS AND COMMODITY TRADING ADVISORS General Provisions, Definitions and Exemptions § 4.6 Exclusion for certain...
17 CFR 33.4 - Designation as a contract market for the trading of commodity options.

Code of Federal Regulations, 2012 CFR

2012-04-01

... market for the trading of commodity options. 33.4 Section 33.4 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION REGULATION OF DOMESTIC EXCHANGE-TRADED COMMODITY OPTION TRANSACTIONS § 33.4 Designation as a contract market for the trading of commodity options. The Commission may...
17 CFR 4.6 - Exclusion for certain otherwise regulated persons from the definition of the term “commodity...

Code of Federal Regulations, 2012 CFR

2012-04-01

... otherwise regulated persons from the definition of the term “commodity trading advisor.” 4.6 Section 4.6 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION COMMODITY POOL OPERATORS AND COMMODITY TRADING ADVISORS General Provisions, Definitions and Exemptions § 4.6 Exclusion for certain...
17 CFR 4.6 - Exclusion for certain otherwise regulated persons from the definition of the term “commodity...

Code of Federal Regulations, 2010 CFR

2010-04-01

... otherwise regulated persons from the definition of the term “commodity trading advisor.” 4.6 Section 4.6 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION COMMODITY POOL OPERATORS AND COMMODITY TRADING ADVISORS General Provisions, Definitions and Exemptions § 4.6 Exclusion for certain...
Monitoring of wind load and response for cable-supported bridges in Hong Kong

NASA Astrophysics Data System (ADS)

Wong, Kai-yuen; Chan, Wai-Yee K.; Man, King-Leung

2001-08-01

Structural health monitoring for the three cable-supported bridges located in the West of Hong Kong or the Tsing Ma Control Area has been carried out since the opening of these bridges to public traffic. The three cable-supported bridges are referred to as the Tsing Ma (suspension) Bridge, the Kap Shui Mun (cable-stayed) Bridge and the Ting Kau (cable-stayed) Bridge. The structural health monitoring works involved are classified as six monitoring categories, namely, wind load and response, temperature load and response, traffic load and response, geometrical configuration monitoring, strains and stresses/forces monitoring and global dynamic characteristics monitoring. As wind loads and responses had been a major concern in the design and construction stages, this paper therefore outlines the work of wind load and response monitoring on Tsing Ma, Kap Shui Mun and Ting Kau Bridges. The paper starts with a brief description of the sensory systems. The description includes the layout and performance requirements of sensory systems for wind load and responses monitoring. Typical results of wind load and response monitoring in graphical forms are then presented. These graphical forms include the plots of wind rose diagrams, wind incidences vs wind speeds, wind turbulence intensities, wind power spectra, gust wind factors, coefficient of terrain roughness, extreme wind analyses, deck deflections/rotations vs wind speeds, acceleration spectra, acceleration/displacement contours, and stress demand ratios. Finally conclusions on wind load and response monitoring on the three cable-supported bridges are drawn.
Accelerating adaptive inverse distance weighting interpolation algorithm on a graphics processing unit

PubMed Central

Xu, Liangliang; Xu, Nengxiong

2017-01-01

This paper focuses on designing and implementing parallel adaptive inverse distance weighting (AIDW) interpolation algorithms by using the graphics processing unit (GPU). The AIDW is an improved version of the standard IDW, which can adaptively determine the power parameter according to the data points’ spatial distribution pattern and achieve more accurate predictions than those predicted by IDW. In this paper, we first present two versions of the GPU-accelerated AIDW, i.e. the naive version without profiting from the shared memory and the tiled version taking advantage of the shared memory. We also implement the naive version and the tiled version using two data layouts, structure of arrays and array of aligned structures, on both single and double precision. We then evaluate the performance of parallel AIDW by comparing it with its corresponding serial algorithm on three different machines equipped with the GPUs GT730M, M5000 and K40c. The experimental results indicate that: (i) there is no significant difference in the computational efficiency when different data layouts are employed; (ii) the tiled version is always slightly faster than the naive version; and (iii) on single precision the achieved speed-up can be up to 763 (on the GPU M5000), while on double precision the obtained highest speed-up is 197 (on the GPU K40c). To benefit the community, all source code and testing data related to the presented parallel AIDW algorithm are publicly available. PMID:28989754
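As a point of reference for the abstract above, the following is a minimal serial sketch of standard IDW, the baseline that AIDW improves on. It is not the authors' code: the function name `idw_interpolate`, the 2D point layout, and the fixed `power` argument are assumptions made for illustration. AIDW, as described above, would instead choose the power parameter adaptively from the spatial distribution of the data points, which this sketch does not attempt.

```python
import math

def idw_interpolate(known_points, known_values, query, power=2.0):
    """Estimate the value at `query` by standard inverse distance weighting.

    known_points: list of (x, y) tuples with known values
    known_values: list of floats, one per known point
    power:        fixed power parameter (assumption; AIDW adapts it per query)
    """
    numerator = 0.0
    denominator = 0.0
    for (x, y), value in zip(known_points, known_values):
        distance = math.hypot(query[0] - x, query[1] - y)
        if distance == 0.0:
            # The query coincides with a data point: return its value directly
            # to avoid dividing by zero.
            return float(value)
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator

# Hypothetical data: two samples on the x-axis.
points = [(0.0, 0.0), (2.0, 0.0)]
values = [0.0, 10.0]
midpoint_estimate = idw_interpolate(points, values, (1.0, 0.0))
```

Each query point is independent of the others, which is the property a GPU version can exploit by assigning one thread per query point.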
Accelerating adaptive inverse distance weighting interpolation algorithm on a graphics processing unit.

PubMed

Mei, Gang; Xu, Liangliang; Xu, Nengxiong

2017-09-01

This paper focuses on designing and implementing parallel adaptive inverse distance weighting (AIDW) interpolation algorithms by using the graphics processing unit (GPU). The AIDW is an improved version of the standard IDW, which can adaptively determine the power parameter according to the data points' spatial distribution pattern and achieve more accurate predictions than those predicted by IDW. In this paper, we first present two versions of the GPU-accelerated AIDW, i.e. the naive version without profiting from the shared memory and the tiled version taking advantage of the shared memory. We also implement the naive version and the tiled version using two data layouts, structure of arrays and array of aligned structures, on both single and double precision. We then evaluate the performance of parallel AIDW by comparing it with its corresponding serial algorithm on three different machines equipped with the GPUs GT730M, M5000 and K40c. The experimental results indicate that: (i) there is no significant difference in the computational efficiency when different data layouts are employed; (ii) the tiled version is always slightly faster than the naive version; and (iii) on single precision the achieved speed-up can be up to 763 (on the GPU M5000), while on double precision the obtained highest speed-up is 197 (on the GPU K40c). To benefit the community, all source code and testing data related to the presented parallel AIDW algorithm are publicly available.
Efficient algorithms and implementations of entropy-based moment closures for rarefied gases

DOE Office of Scientific and Technical Information (OSTI.GOV)

Schaerer, Roman Pascal, E-mail: schaerer@mathcces.rwth-aachen.de; Bansal, Pratyuksh; Torrilhon, Manuel

We present efficient algorithms and implementations of the 35-moment system equipped with the maximum-entropy closure in the context of rarefied gases. While closures based on the principle of entropy maximization have been shown to yield very promising results for moderately rarefied gas flows, the computational cost of these closures is in general much higher than for closure theories with explicit closed-form expressions of the closing fluxes, such as Grad's classical closure. Following a similar approach as Garrett et al. (2015), we investigate efficient implementations of the computationally expensive numerical quadrature method used for the moment evaluations of the maximum-entropy distribution by exploiting its inherent fine-grained parallelism with the parallelism offered by multi-core processors and graphics cards. We show that using a single graphics card as an accelerator allows speed-ups of two orders of magnitude when compared to a serial CPU implementation. To accelerate the time-to-solution for steady-state problems, we propose a new semi-implicit time discretization scheme. The resulting nonlinear system of equations is solved with a Newton type method in the Lagrange multipliers of the dual optimization problem in order to reduce the computational cost. Additionally, fully explicit time-stepping schemes of first and second order accuracy are presented. We investigate the accuracy and efficiency of the numerical schemes for several numerical test cases, including a steady-state shock-structure problem.
Accelerating cardiac bidomain simulations using graphics processing units.

PubMed

Neic, A; Liebmann, M; Hoetzl, E; Mitchell, L; Vigmond, E J; Haase, G; Plank, G

2012-08-01

Anatomically realistic and biophysically detailed multiscale computer models of the heart are playing an increasingly important role in advancing our understanding of integrated cardiac function in health and disease. Such detailed simulations, however, are computationally vastly demanding, which is a limiting factor for a wider adoption of in-silico modeling. While current trends in high-performance computing (HPC) hardware promise to alleviate this problem, exploiting the potential of such architectures remains challenging since strongly scalable algorithms are necessitated to reduce execution times. Alternatively, acceleration technologies such as graphics processing units (GPUs) are being considered. While the potential of GPUs has been demonstrated in various applications, benefits in the context of bidomain simulations where large sparse linear systems have to be solved in parallel with advanced numerical techniques are less clear. In this study, the feasibility of multi-GPU bidomain simulations is demonstrated by running strong scalability benchmarks using a state-of-the-art model of rabbit ventricles. The model is spatially discretized using the finite element methods (FEM) on fully unstructured grids. The GPU code is directly derived from a large pre-existing code, the Cardiac Arrhythmia Research Package (CARP), with very minor perturbation of the code base. Overall, bidomain simulations were sped up by a factor of 11.8 to 16.3 in benchmarks running on 6-20 GPUs compared to the same number of CPU cores. To match the fastest GPU simulation which engaged 20 GPUs, 476 CPU cores were required on a national supercomputing facility.
Accelerating Cardiac Bidomain Simulations Using Graphics Processing Units

PubMed Central

Neic, Aurel; Liebmann, Manfred; Hoetzl, Elena; Mitchell, Lawrence; Vigmond, Edward J.; Haase, Gundolf

2013-01-01

Anatomically realistic and biophysically detailed multiscale computer models of the heart are playing an increasingly important role in advancing our understanding of integrated cardiac function in health and disease. Such detailed simulations, however, are computationally vastly demanding, which is a limiting factor for a wider adoption of in-silico modeling. While current trends in high-performance computing (HPC) hardware promise to alleviate this problem, exploiting the potential of such architectures remains challenging since strongly scalable algorithms are necessitated to reduce execution times. Alternatively, acceleration technologies such as graphics processing units (GPUs) are being considered. While the potential of GPUs has been demonstrated in various applications, benefits in the context of bidomain simulations where large sparse linear systems have to be solved in parallel with advanced numerical techniques are less clear. In this study, the feasibility of multi-GPU bidomain simulations is demonstrated by running strong scalability benchmarks using a state-of-the-art model of rabbit ventricles. The model is spatially discretized using the finite element methods (FEM) on fully unstructured grids. The GPU code is directly derived from a large pre-existing code, the Cardiac Arrhythmia Research Package (CARP), with very minor perturbation of the code base. Overall, bidomain simulations were sped up by a factor of 11.8 to 16.3 in benchmarks running on 6–20 GPUs compared to the same number of CPU cores. To match the fastest GPU simulation which engaged 20 GPUs, 476 CPU cores were required on a national supercomputing facility. PMID:22692867
GeNN: a code generation framework for accelerated brain simulations

NASA Astrophysics Data System (ADS)

Yavuz, Esin; Turner, James; Nowotny, Thomas

2016-01-01

Large-scale numerical simulations of detailed brain circuit models are important for identifying hypotheses on brain functions and testing their consistency and plausibility. An ongoing challenge for simulating realistic models is, however, computational speed. In this paper, we present the GeNN (GPU-enhanced Neuronal Networks) framework, which aims to facilitate the use of graphics accelerators for computational models of large-scale neuronal networks to address this challenge. GeNN is an open source library that generates code to accelerate the execution of network simulations on NVIDIA GPUs, through a flexible and extensible interface, which does not require in-depth technical knowledge from the users. We present performance benchmarks showing that 200-fold speedup compared to a single core of a CPU can be achieved for a network of one million conductance based Hodgkin-Huxley neurons but that for other models the speedup can differ. GeNN is available for Linux, Mac OS X and Windows platforms. The source code, user manual, tutorials, Wiki, in-depth example projects and all other related information can be found on the project website http://genn-team.github.io/genn/.
GeNN: a code generation framework for accelerated brain simulations.

PubMed

Yavuz, Esin; Turner, James; Nowotny, Thomas

2016-01-07

Large-scale numerical simulations of detailed brain circuit models are important for identifying hypotheses on brain functions and testing their consistency and plausibility. An ongoing challenge for simulating realistic models is, however, computational speed. In this paper, we present the GeNN (GPU-enhanced Neuronal Networks) framework, which aims to facilitate the use of graphics accelerators for computational models of large-scale neuronal networks to address this challenge. GeNN is an open source library that generates code to accelerate the execution of network simulations on NVIDIA GPUs, through a flexible and extensible interface, which does not require in-depth technical knowledge from the users. We present performance benchmarks showing that 200-fold speedup compared to a single core of a CPU can be achieved for a network of one million conductance based Hodgkin-Huxley neurons but that for other models the speedup can differ. GeNN is available for Linux, Mac OS X and Windows platforms. The source code, user manual, tutorials, Wiki, in-depth example projects and all other related information can be found on the project website http://genn-team.github.io/genn/.
GeNN: a code generation framework for accelerated brain simulations

PubMed Central

Yavuz, Esin; Turner, James; Nowotny, Thomas

2016-01-01

Large-scale numerical simulations of detailed brain circuit models are important for identifying hypotheses on brain functions and testing their consistency and plausibility. An ongoing challenge for simulating realistic models is, however, computational speed. In this paper, we present the GeNN (GPU-enhanced Neuronal Networks) framework, which aims to facilitate the use of graphics accelerators for computational models of large-scale neuronal networks to address this challenge. GeNN is an open source library that generates code to accelerate the execution of network simulations on NVIDIA GPUs, through a flexible and extensible interface, which does not require in-depth technical knowledge from the users. We present performance benchmarks showing that 200-fold speedup compared to a single core of a CPU can be achieved for a network of one million conductance based Hodgkin-Huxley neurons but that for other models the speedup can differ. GeNN is available for Linux, Mac OS X and Windows platforms. The source code, user manual, tutorials, Wiki, in-depth example projects and all other related information can be found on the project website http://genn-team.github.io/genn/. PMID:26740369
Work stealing for GPU-accelerated parallel programs in a global address space framework

DOE Office of Scientific and Technical Information (OSTI.GOV)

Arafat, Humayun; Dinan, James; Krishnamoorthy, Sriram

Task parallelism is an attractive approach to automatically load balance the computation in a parallel system and adapt to dynamism exhibited by parallel systems. Exploiting task parallelism through work stealing has been extensively studied in shared and distributed-memory contexts. In this paper, we study the design of a system that uses work stealing for dynamic load balancing of task-parallel programs executed on hybrid distributed-memory CPU-graphics processing unit (GPU) systems in a global-address space framework. We take into account the unique nature of the accelerator model employed by GPUs, the significant performance difference between GPU and CPU execution as a function of problem size, and the distinct CPU and GPU memory domains. We consider various alternatives in designing a distributed work stealing algorithm for CPU-GPU systems, while taking into account the impact of task distribution and data movement overheads. These strategies are evaluated using microbenchmarks that capture various execution configurations as well as the state-of-the-art CCSD(T) application module from the computational chemistry domain.
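The general idea of work stealing mentioned in the abstract above can be illustrated with a small single-threaded simulation. This is only a sketch under stated assumptions, not the system the authors describe: it models neither GPUs, distributed memory, nor data movement costs, and the function name `run_work_stealing` and the "steal from the longest queue" victim policy are choices made here for illustration. Each worker takes tasks from one end of its own queue, and an idle worker steals from the opposite end of another worker's queue.

```python
from collections import deque

def run_work_stealing(initial_queues):
    """Simulate round-robin workers that steal tasks when their queue is empty.

    initial_queues: list of task lists, one per worker
    Returns (executed, steals): the tasks each worker ran, and the steal count.
    """
    queues = [deque(tasks) for tasks in initial_queues]
    executed = [[] for _ in queues]
    steals = 0
    while any(queues):
        for worker, queue in enumerate(queues):
            if queue:
                # The owner takes the newest task from its own end.
                task = queue.pop()
            else:
                # Idle worker: pick the longest queue as victim (assumed policy).
                victim = max(range(len(queues)), key=lambda i: len(queues[i]))
                if not queues[victim]:
                    continue  # nothing left to steal anywhere
                # The thief takes the oldest task from the opposite end.
                task = queues[victim].popleft()
                steals += 1
            executed[worker].append(task)
    return executed, steals

# Hypothetical imbalance: all six tasks start on worker 0.
executed, steals = run_work_stealing([[1, 2, 3, 4, 5, 6], []])
```

Taking from opposite ends keeps owner and thief from contending for the same task in a real concurrent implementation; in this simulation it simply shows how an initially unbalanced load evens out.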

GPU Accelerated Vector Median Filter

NASA Technical Reports Server (NTRS)

Aras, Rifat; Shen, Yuzhong

2011-01-01

Noise reduction is an important step for most image processing tasks. For three channel color images, a widely used technique is the vector median filter, in which color values of pixels are treated as 3-component vectors. Vector median filters are computationally expensive; for a window size of n x n, each of the n^2 vectors has to be compared with the other n^2 - 1 vectors in distance. General purpose computation on graphics processing units (GPUs) is the paradigm of utilizing high-performance many-core GPU architectures for computation tasks that are normally handled by CPUs. In this work, NVIDIA's Compute Unified Device Architecture (CUDA) paradigm is used to accelerate vector median filtering, which has, to the best of our knowledge, never been done before. The performance of the GPU accelerated vector median filter is compared to that of the CPU and MPI-based versions for different image and window sizes. Initial findings of the study showed a 100x improvement in performance of the vector median filter implementation on GPUs over CPU implementations, and further speed-up is expected after more extensive optimizations of the GPU algorithm.
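The pairwise comparison described in the abstract above can be sketched in a few lines of serial Python. This is an illustrative CPU reference, not the authors' CUDA implementation: the function names, the use of Euclidean distance, and the clipping of the window at image borders are assumptions. For each window, the output is the pixel vector whose summed distance to all other vectors in the window is smallest.

```python
import math

def vector_median(vectors):
    """Return the vector with the smallest total distance to all the others."""
    best_vector = None
    best_cost = math.inf
    for candidate in vectors:
        # Euclidean distance is an assumption; other norms are possible.
        cost = sum(math.dist(candidate, other) for other in vectors)
        if cost < best_cost:
            best_cost = cost
            best_vector = candidate
    return best_vector

def vector_median_filter(image, window=3):
    """Apply a window x window vector median filter to a 2D grid of RGB tuples."""
    height, width = len(image), len(image[0])
    radius = window // 2
    output = []
    for row in range(height):
        out_row = []
        for col in range(width):
            # Collect the neighbourhood, clipped at the image borders.
            neighbours = [
                image[r][c]
                for r in range(max(0, row - radius), min(height, row + radius + 1))
                for c in range(max(0, col - radius), min(width, col + radius + 1))
            ]
            out_row.append(vector_median(neighbours))
        output.append(out_row)
    return output

# Hypothetical 3x3 image: uniform grey with one red outlier in the centre.
grey, red = (10, 10, 10), (255, 0, 0)
noisy = [[grey, grey, grey], [grey, red, grey], [grey, grey, grey]]
filtered = vector_median_filter(noisy)
```

Because every output pixel depends only on its own window, the per-pixel loop body is the natural unit of work to map onto one GPU thread.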
Design and development of the Macpherson Proton Preve Magneto rheological damper with PID controller

NASA Astrophysics Data System (ADS)

Amiruddin, I. M.; Pauziah, M.; Aminudin, A.; Unuh, M. H.

2017-10-01

Since the creation of the first petrol-fuelled vehicle by Karl Benz in the late nineteenth century, the car industry has grown considerably to meet the industrial demands. Luxurious looks and agreeable rides are the primary needs of drivers. The Magneto-rheological damper balanced their damping trademark progressively by applying the damping coefficient depending on the control system. In this research, the control calculations are assessed by utilizing the MR damper. The capacity and reliability of the target force for the damper speed are investigated from the control algorithm. This is imperative to defeat the damper limitation. In this study, the simulation results of the semi-dynamic MR damper with the PID controller show better performance in sprung mass acceleration, unsprung mass acceleration and suspension dislodging with permitting over the top tyre acceleration. The altered model of the MR damper is specially designed for Proton Preve specifications and semi-active PID control. The procedure for the advancement incorporates the numerical model to graphically recreate and break down the dynamic framework by utilizing Matlab.
Comparison Tools for Assessing the Microgravity Environment of Space Missions, Carriers and Conditions

NASA Technical Reports Server (NTRS)

DeLombard, Richard; Hrovat, Kenneth; Moskowitz, Milton; McPherson, Kevin M.

1998-01-01

The microgravity environments of the NASA Shuttles and Russia's Mir space station have been measured by specially designed accelerometer systems. The need for comparisons between different missions, vehicles, conditions, etc. has been addressed by the two new processes described in this paper. The Principal Component Spectral Analysis (PCSA) and Quasi-steady Three-dimensional Histogram (QTH) techniques provide the means to describe the microgravity acceleration environment of a long time span of data on a single plot. As described in this paper, the PCSA and QTH techniques allow both the range and the median of the microgravity environment to be represented graphically on a single page. A variety of operating conditions may be made evident by using PCSA or QTH plots. The PCSA plot can help to distinguish between equipment operating full time or part time, as well as show the variability of the magnitude and/or frequency of an acceleration source. A QTH plot summarizes the magnitude and orientation of the low-frequency acceleration vector. This type of plot can show the microgravity effects of attitude, altitude, venting, etc.
17 CFR 33.5 - Application for designation as a contract market for the trading of commodity options.

Code of Federal Regulations, 2011 CFR

2011-04-01

... a contract market for the trading of commodity options. 33.5 Section 33.5 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION REGULATION OF DOMESTIC EXCHANGE-TRADED COMMODITY OPTION TRANSACTIONS § 33.5 Application for designation as a contract market for the trading of commodity options. (a...
17 CFR 37.4 - Election to trade excluded and exempt commodities.

Code of Federal Regulations, 2011 CFR

2011-04-01

... 17 Commodity and Securities Exchanges 1 2011-04-01 2011-04-01 false Election to trade excluded and exempt commodities. 37.4 Section 37.4 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION DERIVATIVES TRANSACTION EXECUTION FACILITIES § 37.4 Election to trade excluded and exempt commodities. A board of trade that is or elects...
17 CFR 33.5 - Application for designation as a contract market for the trading of commodity options.

Code of Federal Regulations, 2013 CFR

2013-04-01

... a contract market for the trading of commodity options. 33.5 Section 33.5 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION REGULATION OF COMMODITY OPTION TRANSACTIONS THAT ARE OPTIONS... contract market for the trading of commodity options. (a) Any board of trade desiring to be designated as a...
17 CFR 33.5 - Application for designation as a contract market for the trading of commodity options.

Code of Federal Regulations, 2012 CFR

2012-04-01

... a contract market for the trading of commodity options. 33.5 Section 33.5 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION REGULATION OF DOMESTIC EXCHANGE-TRADED COMMODITY OPTION TRANSACTIONS § 33.5 Application for designation as a contract market for the trading of commodity options. (a...
17 CFR 33.6 - Suspension or revocation of designation as a contract market for the trading of commodity options.

Code of Federal Regulations, 2013 CFR

2013-04-01

... designation as a contract market for the trading of commodity options. 33.6 Section 33.6 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION REGULATION OF COMMODITY OPTION TRANSACTIONS THAT... designation as a contract market for the trading of commodity options. The Commission may, after notice and...
17 CFR 33.4 - Designation as a contract market for the trading of commodity options.

Code of Federal Regulations, 2013 CFR

2013-04-01

... market for the trading of commodity options. 33.4 Section 33.4 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION REGULATION OF COMMODITY OPTION TRANSACTIONS THAT ARE OPTIONS ON CONTRACTS OF SALE OF A COMMODITY FOR FUTURE DELIVERY § 33.4 Designation as a contract market for the trading...
17 CFR 33.5 - Application for designation as a contract market for the trading of commodity options.

Code of Federal Regulations, 2010 CFR

2010-04-01

... a contract market for the trading of commodity options. 33.5 Section 33.5 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION REGULATION OF DOMESTIC EXCHANGE-TRADED COMMODITY OPTION TRANSACTIONS § 33.5 Application for designation as a contract market for the trading of commodity options. (a...
17 CFR 210.4-08 - General notes to financial statements.

Code of Federal Regulations, 2010 CFR

2010-04-01

..., options, and other financial instruments with similar characteristics. (ii) Derivative commodity... futures, commodity forwards, commodity swaps, commodity options, and other commodity instruments with... policies for certain derivative instruments. Disclosures regarding accounting policies shall include...
The algorithm for duration acceleration of repetitive projects considering the learning effect

NASA Astrophysics Data System (ADS)

Chen, Hongtao; Wang, Keke; Du, Yang; Wang, Liwan

2018-03-01

The repetitive project optimization problem is common in project scheduling. The Repetitive Scheduling Method (RSM) has many irreplaceable advantages in the field of repetitive projects. As the same or similar work is repeated, the proficiency of workers rises from low to high, and workers gain experience and improve the efficiency of operations. This is the learning effect. The learning effect is one of the important factors affecting the optimization results in repetitive project scheduling. This paper analyzes the influence of the learning effect on the controlling path in RSM from two aspects: one is that the learning effect changes the controlling path, the other is that the learning effect doesn't change the controlling path. This paper proposes corresponding methods to accelerate duration for different types of critical activities and proposes an algorithm for duration acceleration based on the learning effect in RSM. The paper uses a graphical method to identify activities' types and considers the impacts of the learning effect on duration. The method meets the requirement of duration while ensuring the lowest acceleration cost. A concrete bridge construction project is given to verify the effectiveness of the method. The results of this study will help project managers understand the impacts of the learning effect on repetitive projects, and use the learning effect to optimize project scheduling.
Implementing Molecular Dynamics on Hybrid High Performance Computers - Particle-Particle Particle-Mesh

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brown, W Michael; Kohlmeyer, Axel; Plimpton, Steven J

The use of accelerators such as graphics processing units (GPUs) has become popular in scientific computing applications due to their low cost, impressive floating-point capabilities, high memory bandwidth, and low electrical power requirements. Hybrid high-performance computers, machines with nodes containing more than one type of floating-point processor (e.g. CPU and GPU), are now becoming more prevalent due to these advantages. In this paper, we present a continuation of previous work implementing algorithms for using accelerators into the LAMMPS molecular dynamics software for distributed memory parallel hybrid machines. In our previous work, we focused on acceleration for short-range models with an approach intended to harness the processing power of both the accelerator and (multi-core) CPUs. To augment the existing implementations, we present an efficient implementation of long-range electrostatic force calculation for molecular dynamics. Specifically, we present an implementation of the particle-particle particle-mesh method based on the work by Harvey and De Fabritiis. We present benchmark results on the Keeneland InfiniBand GPU cluster. We provide a performance comparison of the same kernels compiled with both CUDA and OpenCL. We discuss limitations to parallel efficiency and future directions for improving performance on hybrid or heterogeneous computers.
Responding to climate change and the global land crisis: REDD+, market transformation and low-emissions rural development

PubMed Central

Nepstad, Daniel C.; Boyd, William; Stickler, Claudia M.; Bezerra, Tathiana; Azevedo, Andrea A.

2013-01-01

Climate change and rapidly escalating global demand for food, fuel, fibre and feed present seemingly contradictory challenges to humanity. Can greenhouse gas (GHG) emissions from land-use, more than one-fourth of the global total, decline as growth in land-based production accelerates? This review examines the status of two major international initiatives that are designed to address different aspects of this challenge. REDD+ is an emerging policy framework for providing incentives to tropical nations and states that reduce their GHG emissions from deforestation and forest degradation. Market transformation, best represented by agricultural commodity roundtables, seeks to exclude unsustainable farmers from commodity markets through international social and environmental standards for farmers and processors. These global initiatives could potentially become synergistically integrated through (i) a shared approach for measuring and favouring high environmental and social performance of land use across entire jurisdictions and (ii) stronger links with the domestic policies, finance and laws in the jurisdictions where agricultural expansion is moving into forests. To achieve scale, the principles of REDD+ and sustainable farming systems must be embedded in domestic low-emission rural development models capable of garnering support across multiple constituencies. We illustrate this potential with the case of Mato Grosso State in the Brazilian Amazon. PMID:23610173
Responding to climate change and the global land crisis: REDD+, market transformation and low-emissions rural development.

PubMed

Nepstad, Daniel C; Boyd, William; Stickler, Claudia M; Bezerra, Tathiana; Azevedo, Andrea A

2013-06-05

Climate change and rapidly escalating global demand for food, fuel, fibre and feed present seemingly contradictory challenges to humanity. Can greenhouse gas (GHG) emissions from land-use, more than one-fourth of the global total, decline as growth in land-based production accelerates? This review examines the status of two major international initiatives that are designed to address different aspects of this challenge. REDD+ is an emerging policy framework for providing incentives to tropical nations and states that reduce their GHG emissions from deforestation and forest degradation. Market transformation, best represented by agricultural commodity roundtables, seeks to exclude unsustainable farmers from commodity markets through international social and environmental standards for farmers and processors. These global initiatives could potentially become synergistically integrated through (i) a shared approach for measuring and favouring high environmental and social performance of land use across entire jurisdictions and (ii) stronger links with the domestic policies, finance and laws in the jurisdictions where agricultural expansion is moving into forests. To achieve scale, the principles of REDD+ and sustainable farming systems must be embedded in domestic low-emission rural development models capable of garnering support across multiple constituencies. We illustrate this potential with the case of Mato Grosso State in the Brazilian Amazon.
Accelerating Smith-Waterman Alignment for Protein Database Search Using Frequency Distance Filtration Scheme Based on CPU-GPU Collaborative System.

PubMed

Liu, Yu; Hong, Yang; Lin, Chun-Yuan; Hung, Che-Lun

2015-01-01

The Smith-Waterman (SW) algorithm has been widely utilized for searching biological sequence databases in bioinformatics. Recently, several works have adopted graphics cards with Graphics Processing Units (GPUs) and their associated CUDA model to enhance the performance of SW computations. However, these works mainly focused on the protein database search by using the intertask parallelization technique, and only used the GPU capability to do the SW computations one by one. Hence, in this paper, we propose an efficient SW alignment method, called CUDA-SWfr, for the protein database search by using the intratask parallelization technique based on a CPU-GPU collaborative system. Before the SW computations are done on the GPU, a procedure using the frequency distance filtration scheme (FDFS) is applied on the CPU to eliminate unnecessary alignments. The experimental results indicate that CUDA-SWfr runs 9.6 times and 96 times faster than the CPU-based SW method without and with FDFS, respectively.
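The filter-then-align idea in this abstract can be illustrated with a small CPU-only sketch. It is not the CUDA-SWfr implementation: the scoring scheme (fixed match/mismatch scores and a linear gap penalty instead of a substitution matrix), the threshold rule `max_fd`, and all function names are assumptions made for illustration. The frequency distance used here is the larger of the total surplus and the total deficit in residue counts, which is a lower bound on the edit distance between two sequences.

```python
from collections import Counter


def frequency_distance(a, b):
    """Lower bound on edit distance computed from residue counts only."""
    ca, cb = Counter(a), Counter(b)
    surplus = sum(max(ca[ch] - cb[ch], 0) for ch in ca)  # residues a has in excess
    deficit = sum(max(cb[ch] - ca[ch], 0) for ch in cb)  # residues b has in excess
    return max(surplus, deficit)


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score (simplified: linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur[j] = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            best = max(best, cur[j])
        prev = cur
    return best


def filtered_search(query, database, max_fd):
    """Align only database entries that pass the cheap frequency filter.

    max_fd is a hypothetical cutoff; the paper's actual rule may differ.
    """
    hits = []
    for name, seq in database:
        if frequency_distance(query, seq) > max_fd:
            continue  # skipped without running the quadratic-time alignment
        hits.append((name, smith_waterman_score(query, seq)))
    return hits


if __name__ == "__main__":
    db = [("x", "ACGT"), ("y", "TTTTTTTT")]
    print(filtered_search("ACGT", db, max_fd=2))
```

The filter costs time linear in the sequence lengths, so discarding a candidate before alignment saves the full dynamic-programming pass for that pair.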
High-Throughput Characterization of Porous Materials Using Graphics Processing Units

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kim, Jihan; Martin, Richard L.; Rübel, Oliver

We have developed a high-throughput graphics processing units (GPU) code that can characterize a large database of crystalline porous materials. In our algorithm, the GPU is utilized to accelerate energy grid calculations where the grid values represent interactions (i.e., Lennard-Jones + Coulomb potentials) between gas molecules (i.e., CH₄ and CO₂) and material's framework atoms. Using a parallel flood fill CPU algorithm, inaccessible regions inside the framework structures are identified and blocked based on their energy profiles. Finally, we compute the Henry coefficients and heats of adsorption through statistical Widom insertion Monte Carlo moves in the domain restricted to the accessible space. The code offers significant speedup over a single core CPU code and allows us to characterize a set of porous materials at least an order of magnitude larger than ones considered in earlier studies. For structures selected from such a prescreening algorithm, full adsorption isotherms can be calculated by conducting multiple grand canonical Monte Carlo simulations concurrently within the GPU.
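The flood-fill step described above can be sketched on a toy grid. This is a serial, two-dimensional illustration under stated assumptions, not the authors' parallel code: the energy cutoff, the choice of seeding from the left edge of the grid, and the name `accessible_mask` are all hypothetical, and periodic boundaries are not modeled.

```python
from collections import deque


def accessible_mask(energy, cutoff):
    """Mark low-energy cells reachable from the left edge of a 2D grid.

    Cells with energy >= cutoff act as walls. Low-energy pockets that are
    not connected to a seed cell stay False, i.e. they are blocked.
    """
    rows, cols = len(energy), len(energy[0])
    reachable = [[False] * cols for _ in range(rows)]
    queue = deque()
    for r in range(rows):  # assumed seeding rule: low-energy cells in column 0
        if energy[r][0] < cutoff:
            reachable[r][0] = True
            queue.append((r, 0))
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and not reachable[nr][nc] and energy[nr][nc] < cutoff:
                reachable[nr][nc] = True
                queue.append((nr, nc))
    return reachable


if __name__ == "__main__":
    grid = [
        [0, 0, 9, 0],
        [9, 0, 9, 0],
        [9, 9, 9, 9],
    ]
    for row in accessible_mask(grid, cutoff=5):
        print(row)
```

In the example grid the two low-energy cells in the last column form an isolated pocket, so they are left unmarked; Monte Carlo insertions would then be restricted to the marked cells.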
Communication: A reduced scaling J-engine based reformulation of SOS-MP2 using graphics processing units.

PubMed

Maurer, S A; Kussmann, J; Ochsenfeld, C

2014-08-07

We present a low-prefactor, cubically scaling scaled-opposite-spin second-order Møller-Plesset perturbation theory (SOS-MP2) method which is highly suitable for massively parallel architectures like graphics processing units (GPUs). The scaling is reduced from O(N⁵) to O(N³) by a reformulation of the MP2 expression in the atomic orbital basis via Laplace transformation and the resolution-of-the-identity (RI) approximation of the integrals in combination with efficient sparse algebra for the 3-center integral transformation. In contrast to previous works that employ GPUs for post-Hartree-Fock calculations, we do not simply employ GPU-based linear algebra libraries to accelerate the conventional algorithm. Instead, our reformulation allows us to replace the rate-determining contraction step with a modified J-engine algorithm that has been proven to be highly efficient on GPUs. Thus, our SOS-MP2 scheme enables us to treat large molecular systems in an accurate and efficient manner on a single GPU server.
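The Laplace transformation mentioned in this abstract rests on a standard identity for a positive energy denominator. As a sketch in generic notation (the symbols below, including the quadrature points and weights, are assumptions for illustration and are not taken from the paper):

```latex
\frac{1}{\Delta_{ij}^{ab}}
  = \int_{0}^{\infty} e^{-\Delta_{ij}^{ab}\, t}\, dt
  \approx \sum_{\alpha=1}^{\tau} w_{\alpha}\, e^{-\Delta_{ij}^{ab}\, t_{\alpha}},
\qquad
\Delta_{ij}^{ab} = \varepsilon_a + \varepsilon_b - \varepsilon_i - \varepsilon_j > 0,
\\[4pt]
e^{-\Delta_{ij}^{ab}\, t_{\alpha}}
  = e^{\varepsilon_i t_{\alpha}}\, e^{\varepsilon_j t_{\alpha}}\,
    e^{-\varepsilon_a t_{\alpha}}\, e^{-\varepsilon_b t_{\alpha}} .
```

Because the exponential factorizes into one term per orbital index, the coupled denominator is replaced by a short sum over quadrature points in which each index can be contracted separately, which is what makes a reformulation in the atomic orbital basis possible.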
FPGA Implementation of the Coupled Filtering Method and the Affine Warping Method.

PubMed

Zhang, Chen; Liang, Tianzhu; Mok, Philip K T; Yu, Weichuan

2017-07-01

In ultrasound image analysis, speckle tracking methods are widely applied to study the elasticity of body tissue. However, "feature-motion decorrelation" remains a challenge for speckle tracking methods. Recently, a coupled filtering method and an affine warping method were proposed to accurately estimate strain values when the tissue deformation is large. The major drawback of these methods is their high computational complexity. Even the graphics processing unit (GPU)-based program requires a long time to finish the analysis. In this paper, we propose field-programmable gate array (FPGA)-based implementations of both methods for further acceleration. The capability of FPGAs in handling the different image processing components of these methods is discussed. A fast and memory-saving image warping approach is proposed. The algorithms are reformulated to build a highly efficient pipeline on the FPGA. The final implementations on a Xilinx Virtex-7 FPGA are at least 13 times faster than the GPU implementation on an NVIDIA graphics card (GeForce GTX 580).
[Research progress on standards of commodity classes of Chinese materia medica and discussion on several key problems].

PubMed

Yang, Guang; Zeng, Yan; Guo, Lan-Ping; Huang, Lu-Qi; Jin, Yan; Zheng, Yu-Guang; Wang, Yong-Yan

2014-05-01

Standards of commodity classes of Chinese materia medica are an important way to solve the "Lemons Problem" of the traditional Chinese medicine market. Such standards also help to rebuild market mechanisms of "high price for good quality". The previous edition of the commodity class standards of Chinese materia medica was made 30 years ago and is no longer adapted to market demand. This article reviews research progress on standards of commodity classes of Chinese materia medica. It considers biological activity a better choice than chemical constituents as a basis for these standards, and it considers the key point in setting them to be finding the factors that distinguish "good quality" from "bad quality". The article also discusses the range of commodity classes of Chinese materia medica and how to coordinate pharmacopoeia standards with commodity class standards. Depending on the demand, diverse standards can be used for commodity classes of Chinese materia medica, but efficacy is considered the most important index of a commodity standard. Decoction pieces can be included in standards of commodity classes of Chinese materia medica. The authors also formulated the standards of commodity classes of Notoginseng Radix as an example, and hope this study can have a positive, promoting effect on research related to the traditional Chinese medicine market.

Studying Upper-Limb Kinematics Using Inertial Sensors Embedded in Mobile Phones

PubMed Central

Bennett, Paul

2015-01-01

Background: In recent years, there has been great interest in analyzing upper-limb kinematics. Inertial measurement with mobile phones is a convenient and portable analysis method for studying humerus kinematics in terms of angular mobility and linear acceleration. Objective: The aim of this analysis was to study upper-limb kinematics via mobile phones through six physical properties that correspond to angular mobility and acceleration in the three axes of space. Methods: This cross-sectional study recruited healthy young adult subjects. Humerus kinematics was studied in 10 young adults with the iPhone4. They performed flexion and abduction analytical tasks. Mobility angle and linear acceleration in each of its axes (yaw, pitch, and roll) were obtained with the iPhone4. This device was placed on the right half of the body of each subject, in the middle third of the humerus, slightly posterior. Descriptive statistics were calculated. Results: Descriptive graphics of the analytical tasks performed were obtained. The biggest range of motion was found in pitch angle, and the biggest acceleration was found in the y-axis in both analytical tasks. Focusing on tridimensional kinematics, bigger range of motion and acceleration were found in abduction (209.69 degrees and 23.31 degrees per second respectively). Also, very strong correlation was found between angular mobility and linear acceleration in abduction (r=.845) and flexion (r=.860). Conclusions: The use of an iPhone for humerus tridimensional kinematics is feasible. This supports use of the mobile phone as a device to analyze upper-limb kinematics and to facilitate the evaluation of the patient. PMID:28582241
GPU-accelerated automatic identification of robust beam setups for proton and carbon-ion radiotherapy

NASA Astrophysics Data System (ADS)

Ammazzalorso, F.; Bednarz, T.; Jelen, U.

2014-03-01

We demonstrate acceleration on graphic processing units (GPU) of automatic identification of robust particle therapy beam setups, minimizing negative dosimetric effects of Bragg peak displacement caused by treatment-time patient positioning errors. Our particle therapy research toolkit, RobuR, was extended with OpenCL support and used to implement calculation on GPU of the Port Homogeneity Index, a metric scoring irradiation port robustness through analysis of tissue density patterns prior to dose optimization and computation. Results were benchmarked against an independent native CPU implementation. Numerical results were in agreement between the GPU implementation and native CPU implementation. For 10 skull base cases, the GPU-accelerated implementation was employed to select beam setups for proton and carbon ion treatment plans, which proved to be dosimetrically robust, when recomputed in presence of various simulated positioning errors. From the point of view of performance, average running time on the GPU decreased by at least one order of magnitude compared to the CPU, rendering the GPU-accelerated analysis a feasible step in a clinical treatment planning interactive session. In conclusion, selection of robust particle therapy beam setups can be effectively accelerated on a GPU and become an unintrusive part of the particle therapy treatment planning workflow. Additionally, the speed gain opens new usage scenarios, like interactive analysis manipulation (e.g. constraining of some setup) and re-execution. Finally, through OpenCL portable parallelism, the new implementation is suitable also for CPU-only use, taking advantage of multiple cores, and can potentially exploit types of accelerators other than GPUs.
17 CFR 32.5 - Disclosure.

Code of Federal Regulations, 2010 CFR

2010-04-01

... effect of any foreign currency fluctuations with respect to commodity option transactions which are to be... 17 Commodity and Securities Exchanges 1 2010-04-01 2010-04-01 false Disclosure. 32.5 Section 32.5 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION REGULATION OF COMMODITY OPTION...
17 CFR 31.6 - Registration of leverage commodities.

Code of Federal Regulations, 2010 CFR

2010-04-01

... commodities. 31.6 Section 31.6 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION... applied to the National Futures Association for registration as a leverage transaction merchant; (2... the spot, forward, and futures markets for the generic commodity; (3) Specify a commercial or retail...
7 CFR 250.57 - Commodity schools.

Code of Federal Regulations, 2011 CFR

2011-01-01

... 7 Agriculture 4 2011-01-01 2011-01-01 false Commodity schools. 250.57 Section 250.57 Agriculture... TERRITORIES AND POSSESSIONS AND AREAS UNDER ITS JURISDICTION National School Lunch Program (NSLP) and Other Child Nutrition Programs § 250.57 Commodity schools. (a) Categorization of commodity schools. Commodity...
7 CFR 250.57 - Commodity schools.

Code of Federal Regulations, 2010 CFR

2010-01-01

... 7 Agriculture 4 2010-01-01 2010-01-01 false Commodity schools. 250.57 Section 250.57 Agriculture... TERRITORIES AND POSSESSIONS AND AREAS UNDER ITS JURISDICTION National School Lunch Program (NSLP) and Other Child Nutrition Programs § 250.57 Commodity schools. (a) Categorization of commodity schools. Commodity...
NASA general aviation crashworthiness seat development

NASA Technical Reports Server (NTRS)

Fasanella, E. L.; Alfaro-Bou, E.

1979-01-01

Three load limiting seat concepts for general aviation aircraft designed to lower the deceleration of the occupant in the event of a crash were sled tested and evaluated with reference to a standard seat. Dummy pelvis accelerations were reduced up to 50 percent with one of the concepts. Computer program MSOMLA (Modified Seat Occupant Model for Light Aircraft) was used to simulate the behavior of a dummy passenger in a NASA full-scale crash test of a twin engine light aircraft. A computer graphics package MANPLOT was developed to pictorially represent the occupant and seat motion.
Optimization of Selected Remote Sensing Algorithms for Embedded NVIDIA Kepler GPU Architecture

NASA Technical Reports Server (NTRS)

Riha, Lubomir; Le Moigne, Jacqueline; El-Ghazawi, Tarek

2015-01-01

This paper evaluates the potential of embedded Graphics Processing Units in NVIDIA's Tegra K1 for onboard processing. The performance is compared to a general-purpose multi-core CPU and a full-fledged GPU accelerator. This study uses two algorithms: Wavelet Spectral Dimension Reduction of Hyperspectral Imagery and the Automated Cloud-Cover Assessment (ACCA) Algorithm. Tegra K1 achieved 51 for ACCA algorithm and 20 for the dimension reduction algorithm, as compared to the performance of the high-end 8-core server Intel Xeon CPU with 13.5 times higher power consumption.
GASPRNG: GPU accelerated scalable parallel random number generator library

NASA Astrophysics Data System (ADS)

Gao, Shuang; Peterson, Gregory D.

2013-04-01

Graphics processors represent a promising technology for accelerating computational science applications. Many computational science applications require fast and scalable random number generation with good statistical properties, so they use the Scalable Parallel Random Number Generators library (SPRNG). We present the GPU Accelerated SPRNG library (GASPRNG) to accelerate SPRNG in GPU-based high performance computing systems. GASPRNG includes code for a host CPU and CUDA code for execution on NVIDIA graphics processing units (GPUs) along with a programming interface to support various usage models for pseudorandom numbers and computational science applications executing on the CPU, GPU, or both. This paper describes the implementation approach used to produce high performance and also describes how to use the programming interface. The programming interface allows a user to be able to use GASPRNG the same way as SPRNG on traditional serial or parallel computers as well as to develop tightly coupled programs executing primarily on the GPU. We also describe how to install GASPRNG and use it. To help illustrate linking with GASPRNG, various demonstration codes are included for the different usage models. GASPRNG on a single GPU shows up to 280x speedup over SPRNG on a single CPU core and is able to scale for larger systems in the same manner as SPRNG. Because GASPRNG generates identical streams of pseudorandom numbers as SPRNG, users can be confident about the quality of GASPRNG for scalable computational science applications. Catalogue identifier: AEOI_v1_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AEOI_v1_0.html Program obtainable from: CPC Program Library, Queen’s University, Belfast, N. Ireland Licensing provisions: UTK license. No. of lines in distributed program, including test data, etc.: 167900 No. of bytes in distributed program, including test data, etc.: 1422058 Distribution format: tar.gz Programming language: C and CUDA. 
Computer: Any PC or workstation with NVIDIA GPU (Tested on Fermi GTX480, Tesla C1060, Tesla M2070). Operating system: Linux with CUDA version 4.0 or later. Should also run on MacOS, Windows, or UNIX. Has the code been vectorized or parallelized?: Yes. Parallelized using MPI directives. RAM: 512 MB to 732 MB (main memory on host CPU, depending on the data type of random numbers.) / 512 MB (GPU global memory) Classification: 4.13, 6.5. Nature of problem: Many computational science applications are able to consume large numbers of random numbers. For example, Monte Carlo simulations are able to consume limitless random numbers for the computation as long as resources for the computing are supported. Moreover, parallel computational science applications require independent streams of random numbers to attain statistically significant results. The SPRNG library provides this capability, but at a significant computational cost. The GASPRNG library presented here accelerates the generators of independent streams of random numbers using graphical processing units (GPUs). Solution method: Multiple copies of random number generators in GPUs allow a computational science application to consume large numbers of random numbers from independent, parallel streams. GASPRNG is a random number generators library to allow a computational science application to employ multiple copies of random number generators to boost performance. Users can interface GASPRNG with software code executing on microprocessors and/or GPUs. Running time: The tests provided take a few minutes to run.
Gpufit: An open-source toolkit for GPU-accelerated curve fitting.

PubMed

Przybylski, Adrian; Thiel, Björn; Keller-Findeisen, Jan; Stock, Bernd; Bates, Mark

2017-11-16

We present a general purpose, open-source software library for estimation of non-linear parameters by the Levenberg-Marquardt algorithm. The software, Gpufit, runs on a Graphics Processing Unit (GPU) and executes computations in parallel, resulting in a significant gain in performance. We measured a speed increase of up to 42 times when comparing Gpufit with an identical CPU-based algorithm, with no loss of precision or accuracy. Gpufit is designed such that it is easily incorporated into existing applications or adapted for new ones. Multiple software interfaces, including to C, Python, and Matlab, ensure that Gpufit is accessible from most programming environments. The full source code is published as an open source software repository, making its function transparent to the user and facilitating future improvements and extensions. As a demonstration, we used Gpufit to accelerate an existing scientific image analysis package, yielding significantly improved processing times for super-resolution fluorescence microscopy datasets.
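For readers unfamiliar with the Levenberg-Marquardt algorithm named above, the following is a minimal serial sketch for one two-parameter model, y = a·exp(-b·x). It does not use or imitate the Gpufit API: the model, the function name `lm_fit_exponential`, the damping schedule (multiply or divide by 10), and the iteration count are assumptions chosen for illustration. A GPU library would run many such small fits in parallel; this sketch shows only the per-fit update.

```python
import math


def lm_fit_exponential(xs, ys, a, b, iterations=200, lam=1e-3):
    """Fit y = a * exp(-b * x) by a basic Levenberg-Marquardt loop."""

    def sse(pa, pb):
        return sum((y - pa * math.exp(-pb * x)) ** 2 for x, y in zip(xs, ys))

    err = sse(a, b)
    for _ in range(iterations):
        # Accumulate J^T J (jaa, jab, jbb) and J^T r (ga, gb).
        jaa = jab = jbb = ga = gb = 0.0
        for x, y in zip(xs, ys):
            e = math.exp(-b * x)
            r = y - a * e          # residual
            da = e                 # d(model)/da
            db = -a * x * e        # d(model)/db
            jaa += da * da
            jab += da * db
            jbb += db * db
            ga += da * r
            gb += db * r
        # Damped normal equations: (J^T J + lam * diag(J^T J)) step = J^T r.
        maa = jaa * (1.0 + lam)
        mbb = jbb * (1.0 + lam)
        det = maa * mbb - jab * jab
        if det == 0.0:
            break
        step_a = (ga * mbb - gb * jab) / det
        step_b = (gb * maa - ga * jab) / det
        new_err = sse(a + step_a, b + step_b)
        if new_err < err:          # accept the step, trust the model more
            a, b, err = a + step_a, b + step_b, new_err
            lam /= 10.0
        else:                      # reject the step, damp more strongly
            lam *= 10.0
    return a, b


if __name__ == "__main__":
    xs = [0.5 * k for k in range(10)]
    ys = [2.0 * math.exp(-0.5 * x) for x in xs]  # noise-free synthetic data
    print(lm_fit_exponential(xs, ys, a=1.5, b=0.8))
```

Small damping makes the update behave like Gauss-Newton, while large damping shortens the step toward a scaled gradient-descent direction, which is why rejected steps increase `lam`.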
Understanding projectile acceleration.

PubMed

Hecht, H; Bertamini, M

2000-04-01

Throwing and catching balls or other objects is a generally highly practiced skill; however, conceptual as well as perceptual understanding of the mechanics that underlie this skill is surprisingly poor. In 5 experiments, we investigated conceptual and perceptual understanding of simple ballistic motion. Paper-and-pencil tests revealed that up to half of all participants mistakenly believed that a ball would continue to accelerate after it left the thrower's hand. Observers also showed a remarkable tolerance for anomalous trajectory shapes. Perceptual judgments based on graphics animations replicated these erroneous beliefs for shallow release angles. Observers' tolerance for anomalies tended to decrease with their distance from the actor. The findings are at odds with claims of the naive physics literature that liken intuitive understanding to Aristotelian or medieval physics theories. Instead, observers seem to project their intentions to the ball itself (externalization) or even feel that they have power over the ball when it is still close.
Aerodynamic optimization of supersonic compressor cascade using differential evolution on GPU

NASA Astrophysics Data System (ADS)

Aissa, Mohamed Hasanine; Verstraete, Tom; Vuik, Cornelis

2016-06-01

Differential Evolution (DE) is a powerful stochastic optimization method. Compared to gradient-based algorithms, DE is able to avoid local minima but at the same time requires more function evaluations. In turbomachinery applications, function evaluations are performed with time-consuming CFD simulations, which results in a long, unaffordable design cycle. Modern High Performance Computing systems, especially Graphics Processing Units (GPUs), are able to alleviate this inconvenience by accelerating the design evaluation itself. In this work we present a validated CFD solver running on GPUs, able to accelerate the design evaluation and thus the entire design process. An achieved speedup of 20x to 30x enabled the DE algorithm to run on a high-end computer instead of a costly large cluster. The GPU-enhanced DE was used to optimize the aerodynamics of a supersonic compressor cascade, achieving an aerodynamic loss minimization of 20%.
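A minimal sketch of a common DE variant (often written DE/rand/1/bin) shows why the method needs many function evaluations: every trial vector in every generation costs one call to the objective. The variant, the control parameters `f` and `cr`, the bound clipping, and the analytic test function are assumptions for illustration; in the work summarized above each evaluation would be a CFD simulation rather than a cheap formula.

```python
import random


def differential_evolution(objective, bounds, pop_size=20, generations=200,
                           f=0.6, cr=0.9, seed=0):
    """Minimize objective over box bounds with DE/rand/1/bin."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    cost = [objective(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct population members, all different from i.
            a, b, c = rng.sample([k for k in range(pop_size) if k != i], 3)
            forced = rng.randrange(dim)  # at least one gene comes from the mutant
            trial = []
            for d in range(dim):
                if d == forced or rng.random() < cr:
                    value = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    lo, hi = bounds[d]
                    value = min(max(value, lo), hi)  # clip to the search box
                else:
                    value = pop[i][d]
                trial.append(value)
            trial_cost = objective(trial)  # the expensive step in practice
            if trial_cost <= cost[i]:      # greedy one-to-one selection
                pop[i], cost[i] = trial, trial_cost
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]


def sphere(x):
    """Cheap stand-in objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


if __name__ == "__main__":
    point, value = differential_evolution(sphere, [(-5.0, 5.0)] * 3)
    print(point, value)
```

With the defaults above the loop performs 20 initial plus 4000 trial evaluations, which is negligible for an analytic function but would dominate the run time if each one were a flow simulation.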
CUDAEASY - a GPU accelerated cosmological lattice program

NASA Astrophysics Data System (ADS)

Sainio, J.

2010-05-01

This paper presents, to the author's knowledge, the first graphics processing unit (GPU) accelerated program that solves the evolution of interacting scalar fields in an expanding universe. We present the implementation in NVIDIA's Compute Unified Device Architecture (CUDA) and compare the performance to other similar programs in chaotic inflation models. We report speedups between one and two orders of magnitude depending on the used hardware and software while achieving small errors in single precision. Simulations that used to last roughly one day to compute can now be done in hours and this difference is expected to increase in the future. The program has been written in the spirit of LATTICEEASY and users of the aforementioned program should find it relatively easy to start using CUDAEASY in lattice simulations. The program is available at http://www.physics.utu.fi/theory/particlecosmology/cudaeasy/ under the GNU General Public License.
Semiempirical Quantum Chemical Calculations Accelerated on a Hybrid Multicore CPU-GPU Computing Platform.

PubMed

Wu, Xin; Koslowski, Axel; Thiel, Walter

2012-07-10

In this work, we demonstrate that semiempirical quantum chemical calculations can be accelerated significantly by leveraging the graphics processing unit (GPU) as a coprocessor on a hybrid multicore CPU-GPU computing platform. Semiempirical calculations using the MNDO, AM1, PM3, OM1, OM2, and OM3 model Hamiltonians were systematically profiled for three types of test systems (fullerenes, water clusters, and solvated crambin) to identify the most time-consuming sections of the code. The corresponding routines were ported to the GPU and optimized employing both existing library functions and a GPU kernel that carries out a sequence of noniterative Jacobi transformations during pseudodiagonalization. The overall computation times for single-point energy calculations and geometry optimizations of large molecules were reduced by one order of magnitude for all methods, as compared to runs on a single CPU core.
GPU-Acceleration of Sequence Homology Searches with Database Subsequence Clustering.

PubMed

Suzuki, Shuji; Kakuta, Masanori; Ishida, Takashi; Akiyama, Yutaka

2016-01-01

Sequence homology searches are used in various fields and require large amounts of computation time, especially for metagenomic analysis, owing to the large number of queries and the database size. To accelerate computing analyses, graphics processing units (GPUs) are widely used as a low-cost, high-performance computing platform. Therefore, we mapped the time-consuming steps involved in GHOSTZ, which is a state-of-the-art homology search algorithm for protein sequences, onto a GPU and implemented it as GHOSTZ-GPU. In addition, we optimized memory access for GPU calculations and for communication between the CPU and GPU. As per results of the evaluation test involving metagenomic data, GHOSTZ-GPU with 12 CPU threads and 1 GPU was approximately 3.0- to 4.1-fold faster than GHOSTZ with 12 CPU threads. Moreover, GHOSTZ-GPU with 12 CPU threads and 3 GPUs was approximately 5.8- to 7.7-fold faster than GHOSTZ with 12 CPU threads.
Effects of variable electrical conductivity and thermal conductivity on unsteady MHD free convection flow past an exponential accelerated inclined plate

NASA Astrophysics Data System (ADS)

Rana, B. M. Jewel; Ahmed, Rubel; Ahmmed, S. F.

2017-06-01

An analysis is carried out to investigate the effects of variable viscosity, thermal radiation, absorption of radiation and cross diffusion past an inclined exponentially accelerated plate under the influence of variable heat and mass transfer. A set of suitable transformations has been used to obtain the non-dimensional coupled governing equations. An explicit finite difference technique has been used to obtain numerical solutions of the present problem. The stability and convergence of the finite difference scheme have been examined for this problem. Compaq Visual Fortran 6.6a has been used to calculate the numerical results. The effects of various physical parameters on the fluid velocity, temperature, concentration, coefficient of skin friction, rate of heat transfer, rate of mass transfer, streamlines and isotherms of the flow field have been presented graphically and discussed in detail.
Atomic orbital-based SOS-MP2 with tensor hypercontraction. I. GPU-based tensor construction and exploiting sparsity

DOE Office of Scientific and Technical Information (OSTI.GOV)

Song, Chenchen; Martínez, Todd J.; SLAC National Accelerator Laboratory, Menlo Park, California 94025

We present a tensor hypercontracted (THC) scaled opposite spin second order Møller-Plesset perturbation theory (SOS-MP2) method. By using THC, we reduce the formal scaling of SOS-MP2 with respect to molecular size from quartic to cubic. We achieve further efficiency by exploiting sparsity in the atomic orbitals and using graphical processing units (GPUs) to accelerate integral construction and matrix multiplication. The practical scaling of GPU-accelerated atomic orbital-based THC-SOS-MP2 calculations is found to be N^2.6 for reference data sets of water clusters and alanine polypeptides containing up to 1600 basis functions. The errors in correlation energy with respect to density-fitting-SOS-MP2 are less than 0.5 kcal/mol for all systems tested (up to 162 atoms).
Scalar field cosmology in f(R,T) gravity via Noether symmetry

NASA Astrophysics Data System (ADS)

Sharif, M.; Nawazish, Iqra

2018-04-01

This paper investigates the existence of Noether symmetries of an isotropic universe model in f(R,T) gravity admitting minimal coupling of matter and scalar fields. The scalar field incorporates two dark energy models, namely quintessence and phantom models. We determine symmetry generators and corresponding conserved quantities for two particular f(R,T) models. We also evaluate exact solutions and investigate their physical behavior via different cosmological parameters. For the first model, the graphical behavior of these parameters indicates consistency with recent observations representing accelerated expansion of the universe. For the second model, these parameters identify a transition from accelerated to decelerated expansion of the universe. The potential function is found to be constant for the first model while it becomes V(φ) ≈ φ² for the second model. We conclude that the Noether symmetry generators and corresponding conserved quantities appear in all cases.
Rapid automated classification of anesthetic depth levels using GPU based parallelization of neural networks.

PubMed

Peker, Musa; Şen, Baha; Gürüler, Hüseyin

2015-02-01

The effect of anesthesia on the patient is referred to as depth of anesthesia. Rapid classification of the appropriate depth level of anesthesia is of great importance in surgical operations. Similarly, accelerating classification algorithms is important for the rapid solution of problems in the field of biomedical signal processing. However, numerous time-consuming mathematical operations are required in the training and testing stages of classification algorithms, especially in neural networks. In this study, to accelerate the process, we utilized Nvidia CUDA, a parallel programming and computing platform that facilitates dramatic increases in computing performance by harnessing the power of the graphics processing unit (GPU). The system was employed to detect anesthetic depth level on a related electroencephalogram (EEG) data set. This data set is rather complex and large. Moreover, achieving more anesthetic levels with rapid response is critical in anesthesia. The proposed parallelization method yielded highly accurate classification results in less time.
Accelerating gravitational microlensing simulations using the Xeon Phi coprocessor

NASA Astrophysics Data System (ADS)

Chen, B.; Kantowski, R.; Dai, X.; Baron, E.; Van der Mark, P.

2017-04-01

Recently Graphics Processing Units (GPUs) have been used to speed up very CPU-intensive gravitational microlensing simulations. In this work, we use the Xeon Phi coprocessor to accelerate such simulations and compare its performance on a microlensing code with that of NVIDIA's GPUs. For the selected set of parameters evaluated in our experiment, we find that the speedup by Intel's Knights Corner coprocessor is comparable to that by NVIDIA's Fermi family of GPUs with compute capability 2.0, but less significant than GPUs with higher compute capabilities such as the Kepler. However, the very recently released second generation Xeon Phi, Knights Landing, is about 5.8 times faster than the Knights Corner, and about 2.9 times faster than the Kepler GPU used in our simulations. We conclude that the Xeon Phi is a very promising alternative to GPUs for modern high performance microlensing simulations.

GPU accelerated implementation of NCI calculations using promolecular density.

PubMed

Rubez, Gaëtan; Etancelin, Jean-Matthieu; Vigouroux, Xavier; Krajecki, Michael; Boisson, Jean-Charles; Hénon, Eric

2017-05-30

The NCI approach is a modern tool to reveal chemical noncovalent interactions. It is particularly attractive to describe ligand-protein binding. A custom implementation for NCI using promolecular density is presented. It is designed to leverage the computational power of NVIDIA graphics processing unit (GPU) accelerators through the CUDA programming model. The code performances of three versions are examined on a test set of 144 systems. NCI calculations are particularly well suited to the GPU architecture, which reduces drastically the computational time. On a single compute node, the dual-GPU version leads to a 39-fold improvement for the biggest instance compared to the optimal OpenMP parallel run (C code, icc compiler) with 16 CPU cores. Energy consumption measurements carried out on both CPU and GPU NCI tests show that the GPU approach provides substantial energy savings. © 2017 Wiley Periodicals, Inc.
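Note: the abstract does not state the quantities being evaluated. As a hedged reminder, the NCI literature usually works with the reduced density gradient s computed from the electron density, and a promolecular density is commonly taken as a sum of free-atom densities. The symbols below (s, ρ, ρ_A^free, R_A) are the customary ones and are assumptions here, not notation taken from this record.

```latex
% Reduced density gradient as commonly defined in the NCI literature
s(\mathbf{r}) = \frac{\lvert \nabla \rho(\mathbf{r}) \rvert}
                     {2\,(3\pi^{2})^{1/3}\,\rho(\mathbf{r})^{4/3}}

% Promolecular density: superposition of free-atom densities centered at R_A
\rho^{\mathrm{pro}}(\mathbf{r}) = \sum_{A} \rho_{A}^{\mathrm{free}}\!\left(\lvert \mathbf{r} - \mathbf{R}_{A} \rvert\right)
```

Because s is evaluated independently at every grid point, the calculation maps naturally onto one GPU thread per point, which is consistent with the suitability claim in the abstract.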
Aerodynamic optimization of supersonic compressor cascade using differential evolution on GPU

DOE Office of Scientific and Technical Information (OSTI.GOV)

Aissa, Mohamed Hasanine; Verstraete, Tom; Vuik, Cornelis

Differential Evolution (DE) is a powerful stochastic optimization method. Compared to gradient-based algorithms, DE is able to avoid local minima but requires at the same time more function evaluations. In turbomachinery applications, function evaluations are performed with time-consuming CFD simulation, which results in a long, non affordable, design cycle. Modern High Performance Computing systems, especially Graphic Processing Units (GPUs), are able to alleviate this inconvenience by accelerating the design evaluation itself. In this work we present a validated CFD Solver running on GPUs, able to accelerate the design evaluation and thus the entire design process. An achieved speedup of 20x to 30x enabled the DE algorithm to run on a high-end computer instead of a costly large cluster. The GPU-enhanced DE was used to optimize the aerodynamics of a supersonic compressor cascade, achieving an aerodynamic loss minimization of 20%.
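Note: for readers unfamiliar with DE, the following is a minimal sketch of the classic DE/rand/1/bin scheme in plain Python. It is not the authors' GPU implementation. The function name, the parameter defaults (F, CR, population size) and the sphere test function are illustrative assumptions, and the cheap objective stands in for the expensive CFD evaluation described in the abstract.

```python
import random


def differential_evolution(f, bounds, pop_size=20, F=0.8, CR=0.9,
                           generations=300, seed=0):
    """Minimize f over box bounds with DE/rand/1/bin (illustrative sketch).

    pop_size must be at least 4 so that three distinct partners exist.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial population inside the bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    cost = [f(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals, all different from i.
            a, b, c = rng.sample([k for k in range(pop_size) if k != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = []
            for j in range(dim):
                if j == j_rand or rng.random() < CR:
                    v = pop[a][j] + F * (pop[b][j] - pop[c][j])
                    lo, hi = bounds[j]
                    v = min(max(v, lo), hi)  # clip to the box
                else:
                    v = pop[i][j]
                trial.append(v)
            trial_cost = f(trial)  # the expensive step in a CFD setting
            if trial_cost <= cost[i]:  # greedy selection
                pop[i], cost[i] = trial, trial_cost
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]


def sphere(x):
    """Toy objective standing in for a CFD loss evaluation."""
    return sum(v * v for v in x)


best_x, best_cost = differential_evolution(sphere, [(-5.0, 5.0)] * 3)
```

Each generation calls the objective once per population member, which is why accelerating a single evaluation shortens the whole design cycle.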
Accelerated Path-following Iterative Shrinkage Thresholding Algorithm with Application to Semiparametric Graph Estimation

PubMed Central

Zhao, Tuo; Liu, Han

2016-01-01

We propose an accelerated path-following iterative shrinkage thresholding algorithm (APISTA) for solving high dimensional sparse nonconvex learning problems. The main difference between APISTA and the path-following iterative shrinkage thresholding algorithm (PISTA) is that APISTA exploits an additional coordinate descent subroutine to boost the computational performance. Such a modification, though simple, has profound impact: APISTA not only enjoys the same theoretical guarantee as that of PISTA, i.e., APISTA attains a linear rate of convergence to a unique sparse local optimum with good statistical properties, but also significantly outperforms PISTA in empirical benchmarks. As an application, we apply APISTA to solve a family of nonconvex optimization problems motivated by estimating sparse semiparametric graphical models. APISTA allows us to obtain new statistical recovery results which do not exist in the existing literature. Thorough numerical results are provided to back up our theory. PMID:28133430
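Note: the sketch below shows only the basic iterative shrinkage thresholding (ISTA) step that gives this family of methods its name, applied to the convex L1-penalized least-squares problem 0.5*||Ax - y||^2 + lam*||x||_1. It is a simplification: it does not implement APISTA, which according to the abstract targets nonconvex problems and adds path-following and a coordinate descent subroutine. All names and the tiny test problem are assumptions for illustration.

```python
def soft_threshold(v, t):
    """Proximal operator of t*|v|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def ista(A, y, lam, step, iters=500):
    """Plain ISTA for 0.5*||Ax - y||^2 + lam*||x||_1.

    A is a list of rows; step should not exceed 1/L, where L is the
    largest eigenvalue of A^T A.
    """
    n_rows, n_cols = len(A), len(A[0])
    x = [0.0] * n_cols
    for _ in range(iters):
        # Residual r = A x - y.
        r = [sum(a * xj for a, xj in zip(row, x)) - yi
             for row, yi in zip(A, y)]
        # Gradient of the smooth part: g = A^T r.
        g = [sum(A[i][j] * r[i] for i in range(n_rows))
             for j in range(n_cols)]
        # Gradient step followed by shrinkage.
        x = [soft_threshold(xj - step * gj, step * lam)
             for xj, gj in zip(x, g)]
    return x


# With an identity design the solution is the soft-thresholded observation.
identity = [[1.0, 0.0], [0.0, 1.0]]
x_hat = ista(identity, [3.0, 0.5], lam=1.0, step=1.0, iters=10)
```

The small second coefficient is set exactly to zero by the shrinkage step, which is the mechanism that produces sparse estimates.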
17 CFR 14.4 - Violation of Commodity Exchange Act.

Code of Federal Regulations, 2010 CFR

2010-04-01

... 17 Commodity and Securities Exchanges 1 2010-04-01 2010-04-01 false Violation of Commodity Exchange Act. 14.4 Section 14.4 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION... Exchange Act. The Commission may deny, temporarily or permanently, the privilege of appearing or practicing...
17 CFR 3.10 - Registration of futures commission merchants, retail foreign exchange dealers, introducing...

Code of Federal Regulations, 2011 CFR

2011-04-01

..., commodity pool operators and leverage transaction merchants. 3.10 Section 3.10 Commodity and Securities..., commodity pool operators and leverage transaction merchants. (a) Application for registration. (1)(i) Except... merchant, retail foreign exchange dealers, introducing broker, commodity trading advisor, commodity pool...
Eye gaze correction with stereovision for video-teleconferencing.

PubMed

Yang, Ruigang; Zhang, Zhengyou

2004-07-01

The lack of eye contact in desktop video teleconferencing substantially reduces the effectiveness of video contents. While expensive and bulky hardware is available on the market to correct eye gaze, researchers have been trying to provide a practical software-based solution to bring video-teleconferencing one step closer to the mass market. This paper presents a novel approach: Based on stereo analysis combined with rich domain knowledge (a personalized face model), we synthesize, using graphics hardware, a virtual video that maintains eye contact. A 3D stereo head tracker with a personalized face model is used to compute initial correspondences across two views. More correspondences are then added through template and feature matching. Finally, all the correspondence information is fused together for view synthesis using view morphing techniques. The combined methods greatly enhance the accuracy and robustness of the synthesized views. Our current system is able to generate an eye-gaze corrected video stream at five frames per second on a commodity 1 GHz PC.
29 CFR 780.114 - Wild commodities.

Code of Federal Regulations, 2013 CFR

2013-07-01

... Agricultural Or Horticultural Commodities § 780.114 Wild commodities. Employees engaged in the gathering or harvesting of wild commodities such as mosses, wild rice, burls and laurel plants, the trapping of wild... 29 Labor 3 2013-07-01 2013-07-01 false Wild commodities. 780.114 Section 780.114 Labor Regulations...
29 CFR 780.114 - Wild commodities.

Code of Federal Regulations, 2011 CFR

2011-07-01

... Agricultural Or Horticultural Commodities § 780.114 Wild commodities. Employees engaged in the gathering or harvesting of wild commodities such as mosses, wild rice, burls and laurel plants, the trapping of wild... 29 Labor 3 2011-07-01 2011-07-01 false Wild commodities. 780.114 Section 780.114 Labor Regulations...
29 CFR 780.114 - Wild commodities.

Code of Federal Regulations, 2014 CFR

2014-07-01

... Agricultural Or Horticultural Commodities § 780.114 Wild commodities. Employees engaged in the gathering or harvesting of wild commodities such as mosses, wild rice, burls and laurel plants, the trapping of wild... 29 Labor 3 2014-07-01 2014-07-01 false Wild commodities. 780.114 Section 780.114 Labor Regulations...
29 CFR 780.114 - Wild commodities.

Code of Federal Regulations, 2010 CFR

2010-07-01

... Agricultural Or Horticultural Commodities § 780.114 Wild commodities. Employees engaged in the gathering or harvesting of wild commodities such as mosses, wild rice, burls and laurel plants, the trapping of wild... 29 Labor 3 2010-07-01 2010-07-01 false Wild commodities. 780.114 Section 780.114 Labor Regulations...
29 CFR 780.114 - Wild commodities.

Code of Federal Regulations, 2012 CFR

2012-07-01

... Agricultural Or Horticultural Commodities § 780.114 Wild commodities. Employees engaged in the gathering or harvesting of wild commodities such as mosses, wild rice, burls and laurel plants, the trapping of wild... 29 Labor 3 2012-07-01 2012-07-01 false Wild commodities. 780.114 Section 780.114 Labor Regulations...
17 CFR 37.3 - Requirements for underlying commodities.

Code of Federal Regulations, 2010 CFR

2010-04-01

... 5a(b)(3) of the Act, may trade any contract of sale of a commodity for future delivery (or option on... that are a security futures product, and the registered derivatives transaction execution facility is a... commodities. 37.3 Section 37.3 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION...
17 CFR 4.32 - Trading on a Registered Derivatives Transaction Execution Facility for Non-Institutional Customers.

Code of Federal Regulations, 2011 CFR

2011-04-01

... 17 Commodity and Securities Exchanges 1 2011-04-01 2011-04-01 false Trading on a Registered... Securities Exchanges COMMODITY FUTURES TRADING COMMISSION COMMODITY POOL OPERATORS AND COMMODITY TRADING ADVISORS Commodity Trading Advisors § 4.32 Trading on a Registered Derivatives Transaction Execution...
17 CFR 32.3 - Unlawful commodity option transactions.

Code of Federal Regulations, 2010 CFR

2010-04-01

... 17 Commodity and Securities Exchanges 1 2010-04-01 2010-04-01 false Unlawful commodity option... REGULATION OF COMMODITY OPTION TRANSACTIONS § 32.3 Unlawful commodity option transactions. (a) On and after... extend credit in lieu thereof) from an option customer as payment of the purchase price in connection...
17 CFR 37.4 - Election to trade excluded and exempt commodities.

Code of Federal Regulations, 2010 CFR

2010-04-01

... 17 Commodity and Securities Exchanges 1 2010-04-01 2010-04-01 false Election to trade excluded and exempt commodities. 37.4 Section 37.4 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION DERIVATIVES TRANSACTION EXECUTION FACILITIES § 37.4 Election to trade excluded and exempt...
17 CFR 4.32 - Trading on a Registered Derivatives Transaction Execution Facility for Non-Institutional Customers.

Code of Federal Regulations, 2012 CFR

2012-04-01

... 17 Commodity and Securities Exchanges 1 2012-04-01 2012-04-01 false Trading on a Registered... Securities Exchanges COMMODITY FUTURES TRADING COMMISSION COMMODITY POOL OPERATORS AND COMMODITY TRADING ADVISORS Commodity Trading Advisors § 4.32 Trading on a Registered Derivatives Transaction Execution...
49 CFR 1248.100 - Commodity classification designated.

Code of Federal Regulations, 2010 CFR

2010-10-01

... STATISTICS Commodity Code § 1248.100 Commodity classification designated. Commencing with reports for the..., reports of commodity statistics required to be made to the Board, shall be based on the commodity codes... Statistics, 1963, issued by the Bureau of the Budget, and on additional codes 411 through 462 shown in § 1248...
Multiple commodities in statistical microeconomics: Model and market

NASA Astrophysics Data System (ADS)

Baaquie, Belal E.; Yu, Miao; Du, Xin

2016-11-01

A statistical generalization of microeconomics has been made in Baaquie (2013). In Baaquie et al. (2015), the market behavior of single commodities was analyzed and it was shown that market data provides strong support for the statistical microeconomic description of commodity prices. The case of multiple commodities is studied and a parsimonious generalization of the single commodity model is made for the multiple commodities case. Market data shows that the generalization can accurately model the simultaneous correlation functions of up to four commodities. To accurately model five or more commodities, further terms have to be included in the model. This study shows that the statistical microeconomics approach is a comprehensive and complete formulation of microeconomics, and one that is independent of the mainstream formulation of microeconomics.
The Influence Of Highway Transportation Infrastructure Condition Toward Commodity Production Generation for The Resilience Needs at Regional Internal Zone

NASA Astrophysics Data System (ADS)

Akbardin, Juang; Parikesit, Danang; Riyanto, Bambang; Mulyono, Agus Taufik

2018-02-01

Poultry is one of the main commodities whose consumption requirement must be met in a region to maintain the availability of poultry meat. Poultry commodity production is one of the production sectors that have a clean environment resistance. Increasing poultry commodity production requires smooth distribution to the processing sites. Livestock production sites are located at a considerable distance from residential and market locations. Zones with poultry commodity production have surplus potential to supply other zones whose production falls short of their consumption of these commodities. The condition of highway transportation infrastructure, which varies widely in the level of damage within a zone, influences the supply and demand of the poultry commodity within the regional internal zone of Central Java province. To determine the effect of highway transportation infrastructure condition on poultry commodity movement, demographic factors and the availability of freight vehicles are reviewed to estimate the amount of poultry commodity movement generated by production. The poultry commodity consumption requirement of the internal regional zone of Central Java province can thus be met from within the zone, which minimizes the negative environmental impacts in the zone in terms of the comparison of movement attraction and production generation for the poultry commodity in Central Java.
Commodes: inconvenient conveniences.

PubMed

Naylor, J R; Mulley, G P

1993-11-13

To investigate use of commodes and attitudes of users and carers to them. Interview with semi-structured questionnaire of subjects supplied with commodes from Leeds community appliance centre. 140 users of a commode and 105 of their carers. Main reasons for being supplied with a commode were impaired mobility (130 subjects), difficulty in climbing stairs (128), and urinary incontinence (127). Main concerns of users and carers were lack of privacy (120 subjects felt embarrassed about using their commode, and 96 would not use it if someone was present); unpleasant smells (especially for 20 subjects who were confined to one room); physical appearance of commode chair (101 users said it had an unfavourable appearance, and 44 had tried to disguise it); and lack of follow up after commode was supplied (only 15 users and carers knew who to contact if there were problems). Users generally either had very positive or very negative attitudes to their commodes but most carers viewed them very negatively, especially with regard to cleaning them. Health professionals should be aware of people's need for privacy when advising them where to keep their commode. A standard commode is inappropriate for people confined to one room, and alternatives such as a chemical toilet should be considered. Regular follow up is needed to identify any problems such as uncomfortable or unsafe chairs. More thought should be given to the appearance of commodes in their design.

75 FR 67794 - Self-Regulatory Organizations; Chicago Board Options Exchange, Incorporated; Order Granting...

Federal Register 2010, 2011, 2012, 2013, 2014

2010-11-03

... commodities or commodity futures, options on commodities, or other commodity derivatives or Commodity-Based... options or other derivatives on any of the foregoing; or (b) interest rate futures or options or... derivatives on any of the foregoing; or (b) interest rate futures or options or derivatives on the foregoing...
17 CFR 15.00 - Definitions of terms used in parts 15 to 21 of this chapter.

Code of Federal Regulations, 2010 CFR

2010-04-01

... commodity, means the actual commodity as distinguished from a futures or options contract in such commodity... for future delivery or commodity option transactions, or for effecting settlements of contracts for future delivery or commodity option transactions, for and between members of any designated contract...
75 FR 71762 - Self-Regulatory Organizations; The NASDAQ Stock Market LLC; Notice of Filing and Immediate...

Federal Register 2010, 2011, 2012, 2013, 2014

2010-11-24

... commodities or commodity futures, options on commodities, or other commodity derivatives or Commodity-Based...) interest rate futures or options or derivatives on the foregoing in this subparagraph (b) (``Futures... options or other derivatives on any of the foregoing; or (b) interest rate futures or options or...
17 CFR 4.14 - Exemption from registration as a commodity trading advisor.

Code of Federal Regulations, 2011 CFR

2011-04-01

... TRADING COMMISSION COMMODITY POOL OPERATORS AND COMMODITY TRADING ADVISORS General Provisions, Definitions... commodity pool operator and the person's commodity trading advice is directed solely to, and for the sole use of, the pool or pools for which it is so registered; (5) It is exempt from registration as a...
17 CFR Appendix B to Part 43 - Enumerated Physical Commodity Contracts and Other Contracts

Code of Federal Regulations, 2012 CFR

2012-04-01

... 17 Commodity and Securities Exchanges 1 2012-04-01 2012-04-01 false Enumerated Physical Commodity... TRADING COMMISSION REAL-TIME PUBLIC REPORTING Pt. 43, App. B Appendix B to Part 43—Enumerated Physical Commodity Contracts and Other Contracts Enumerated Physical Commodity Contracts Agriculture ICE Futures U.S...
17 CFR 32.13 - Exemption from prohibition of commodity option transactions for trade options on certain...

Code of Federal Regulations, 2012 CFR

2012-04-01

... 17 Commodity and Securities Exchanges 1 2012-04-01 2012-04-01 false Exemption from prohibition of... Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION REGULATION OF COMMODITY OPTION... are met at the time of the solicitation or acceptance: (1) That person is registered with the...
17 CFR 4.32 - Trading on a Registered Derivatives Transaction Execution Facility for Non-Institutional Customers.

Code of Federal Regulations, 2010 CFR

2010-04-01

... 17 Commodity and Securities Exchanges 1 2010-04-01 2010-04-01 false Trading on a Registered Derivatives Transaction Execution Facility for Non-Institutional Customers. 4.32 Section 4.32 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION COMMODITY POOL OPERATORS AND COMMODITY TRADING...
Kinetic market models with single commodity having price fluctuations

NASA Astrophysics Data System (ADS)

Chatterjee, A.; Chakrabarti, B. K.

2006-12-01

We study here numerically the behavior of an ideal gas like model of markets having only one non-consumable commodity. We investigate the behavior of the steady-state distributions of money, commodity and total wealth, as the dynamics of trading or exchange of money and commodity proceeds, with local (in time) fluctuations in the price of the commodity. These distributions are studied in markets with agents having uniform and random saving factors. The self-organizing features in money distribution are similar to the cases without any commodity (or with consumable commodities), while the commodity distribution shows an exponential decay. The wealth distribution shows interesting behavior: gamma like distribution for uniform saving propensity and has the same power-law tail, as that of the money distribution, for a market with agents having random saving propensity.
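Note: as a rough illustration of the ideal-gas-like trading dynamics, the sketch below simulates a simplified money-only exchange with a uniform saving factor. It deliberately omits the commodity and its fluctuating price, which are the subject of this record, so it corresponds only to the "without any commodity" case that the abstract mentions for comparison. The function name, parameters and update rule as written are assumptions for illustration.

```python
import random


def simulate_money_exchange(n_agents=100, steps=10000, saving=0.5, seed=0):
    """Pairwise random money exchange with a uniform saving factor.

    In each trade two agents keep a fraction `saving` of their money and
    randomly split the remainder of their combined money between them.
    """
    rng = random.Random(seed)
    money = [1.0] * n_agents  # everyone starts with one unit
    for _ in range(steps):
        i, j = rng.sample(range(n_agents), 2)  # two distinct traders
        eps = rng.random()                     # random split fraction
        pool = (1.0 - saving) * (money[i] + money[j])
        new_i = saving * money[i] + eps * pool
        new_j = saving * money[j] + (1.0 - eps) * pool
        money[i], money[j] = new_i, new_j
    return money


final_money = simulate_money_exchange()
```

Total money is conserved in every trade, so the steady-state distribution can be read off a histogram of `final_money` after enough steps.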
Fast analytical scatter estimation using graphics processing units.

PubMed

Ingleby, Harry; Lippuner, Jonas; Rickey, Daniel W; Li, Yue; Elbakri, Idris

2015-01-01

To develop a fast patient-specific analytical estimator of first-order Compton and Rayleigh scatter in cone-beam computed tomography, implemented using graphics processing units. The authors developed an analytical estimator for first-order Compton and Rayleigh scatter in a cone-beam computed tomography geometry. The estimator was coded using NVIDIA's CUDA environment for execution on an NVIDIA graphics processing unit. Performance of the analytical estimator was validated by comparison with high-count Monte Carlo simulations for two different numerical phantoms. Monoenergetic analytical simulations were compared with monoenergetic and polyenergetic Monte Carlo simulations. Analytical and Monte Carlo scatter estimates were compared both qualitatively, from visual inspection of images and profiles, and quantitatively, using a scaled root-mean-square difference metric. Reconstruction of simulated cone-beam projection data of an anthropomorphic breast phantom illustrated the potential of this method as a component of a scatter correction algorithm. The monoenergetic analytical and Monte Carlo scatter estimates showed very good agreement. The monoenergetic analytical estimates showed good agreement for Compton single scatter and reasonable agreement for Rayleigh single scatter when compared with polyenergetic Monte Carlo estimates. For a voxelized phantom with dimensions 128 × 128 × 128 voxels and a detector with 256 × 256 pixels, the analytical estimator required 669 seconds for a single projection, using a single NVIDIA 9800 GX2 video card. Accounting for first order scatter in cone-beam image reconstruction improves the contrast to noise ratio of the reconstructed images. The analytical scatter estimator, implemented using graphics processing units, provides rapid and accurate estimates of single scatter and with further acceleration and a method to account for multiple scatter may be useful for practical scatter correction schemes.
A Crosswalk of Mineral Commodity End Uses and North American Industry Classification System (NAICS) codes

USGS Publications Warehouse

Barry, James J.; Matos, Grecia R.; Menzie, W. David

2015-09-14

The links between the end uses of mineral commodities and the NAICS codes provide an instrument for analyzing the use of mineral commodities in the economy. The crosswalk is also a guide, highlighting those industrial sectors in the economy that rely heavily on mineral commodities. The distribution of mineral commodities across the economy is dynamic and does differ from year to year. This report reflects a snapshot of the state of the economy and mineral commodities in 2010.
Synthesis and Verification of Biobased Terephthalic Acid from Furfural

NASA Astrophysics Data System (ADS)

Tachibana, Yuya; Kimura, Saori; Kasuya, Ken-Ichi

2015-02-01

Exploiting biomass as an alternative to petrochemicals for the production of commodity plastics is vitally important if we are to become a more sustainable society. Here, we report a synthetic route for the production of terephthalic acid (TPA), the monomer of the widely used thermoplastic polymer poly(ethylene terephthalate) (PET), from the biomass-derived starting material furfural. Biobased furfural was oxidised and dehydrated to give maleic anhydride, which was further reacted with biobased furan to give its Diels-Alder (DA) adduct. The dehydration of the DA adduct gave phthalic anhydride, which was converted via phthalic acid and dipotassium phthalate to TPA. The biobased carbon content of the TPA was measured by accelerator mass spectroscopy and the TPA was found to be made of 100% biobased carbon.
17 CFR 1.19 - Prohibited trading in certain “puts” and “calls”.

Code of Federal Regulations, 2011 CFR

2011-04-01

... 17 Commodity and Securities Exchanges 1 2011-04-01 2011-04-01 false Prohibited trading in certain “puts” and “calls”. 1.19 Section 1.19 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION GENERAL REGULATIONS UNDER THE COMMODITY EXCHANGE ACT Prohibited Trading in Commodity Options § 1...
17 CFR 33.6 - Suspension or revocation of designation as a contract market for the trading of commodity options.

Code of Federal Regulations, 2011 CFR

2011-04-01

... designation as a contract market for the trading of commodity options. 33.6 Section 33.6 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION REGULATION OF DOMESTIC EXCHANGE-TRADED COMMODITY OPTION TRANSACTIONS § 33.6 Suspension or revocation of designation as a contract market for the trading...
17 CFR 1.19 - Prohibited trading in certain “puts” and “calls”.

Code of Federal Regulations, 2014 CFR

2014-04-01

... 17 Commodity and Securities Exchanges 1 2014-04-01 2014-04-01 false Prohibited trading in certain “puts” and “calls”. 1.19 Section 1.19 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION GENERAL REGULATIONS UNDER THE COMMODITY EXCHANGE ACT Prohibited Trading in Commodity Options § 1...
17 CFR 33.6 - Suspension or revocation of designation as a contract market for the trading of commodity options.

Code of Federal Regulations, 2012 CFR

2012-04-01

... designation as a contract market for the trading of commodity options. 33.6 Section 33.6 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION REGULATION OF DOMESTIC EXCHANGE-TRADED COMMODITY OPTION TRANSACTIONS § 33.6 Suspension or revocation of designation as a contract market for the trading...
17 CFR 1.19 - Prohibited trading in certain “puts” and “calls”.

Code of Federal Regulations, 2012 CFR

2012-04-01

... 17 Commodity and Securities Exchanges 1 2012-04-01 2012-04-01 false Prohibited trading in certain “puts” and “calls”. 1.19 Section 1.19 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION GENERAL REGULATIONS UNDER THE COMMODITY EXCHANGE ACT Prohibited Trading in Commodity Options § 1...
17 CFR 1.19 - Prohibited trading in certain “puts” and “calls”.

Code of Federal Regulations, 2013 CFR

2013-04-01

... 17 Commodity and Securities Exchanges 1 2013-04-01 2013-04-01 false Prohibited trading in certain “puts” and “calls”. 1.19 Section 1.19 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION GENERAL REGULATIONS UNDER THE COMMODITY EXCHANGE ACT Prohibited Trading in Commodity Options § 1...
17 CFR 1.19 - Prohibited trading in certain “puts” and “calls”.

Code of Federal Regulations, 2010 CFR

2010-04-01

... 17 Commodity and Securities Exchanges 1 2010-04-01 2010-04-01 false Prohibited trading in certain “puts” and “calls”. 1.19 Section 1.19 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION GENERAL REGULATIONS UNDER THE COMMODITY EXCHANGE ACT Prohibited Trading in Commodity Options § 1...
7 CFR 1421.5 - Eligible commodities.

Code of Federal Regulations, 2010 CFR

2010-01-01

...)(1) To be an eligible commodity, the commodity must be merchantable for food, feed, or other uses... poisonous to humans or animals. A commodity containing vomitoxin, aflatoxin, or Aspergillus mold may not be...
7 CFR 1421.5 - Eligible commodities.

Code of Federal Regulations, 2011 CFR

2011-01-01

...)(1) To be an eligible commodity, the commodity must be merchantable for food, feed, or other uses... poisonous to humans or animals. A commodity containing vomitoxin, aflatoxin, or Aspergillus mold may not be...

Commonly Consumed Food Commodities

EPA Pesticide Factsheets

Commonly consumed foods are those ingested for their nutrient properties. Food commodities can be either raw agricultural commodities or processed commodities, provided that they are the forms that are sold or distributed for human consumption. Learn more.
76 FR 28641 - Commodity Pool Operators: Relief From Compliance With Certain Disclosure, Reporting and...

Federal Register 2010, 2011, 2012, 2013, 2014

2011-05-18

... are subject to certain operational and advertising requirements under Part 4, to all other provisions... 4 Advertising, Brokers, Commodity futures, Commodity pool operators, Commodity trading advisors...
Commodes: inconvenient conveniences.

PubMed Central

Naylor, J R; Mulley, G P

1993-01-01

OBJECTIVES--To investigate use of commodes and attitudes of users and carers to them. DESIGN--Interview with semi-structured questionnaire of subjects supplied with commodes from Leeds community appliance centre. SUBJECTS--140 users of a commode and 105 of their carers. RESULTS--Main reasons for being supplied with a commode were impaired mobility (130 subjects), difficulty in climbing stairs (128), and urinary incontinence (127). Main concerns of users and carers were lack of privacy (120 subjects felt embarrassed about using their commode, and 96 would not use it if someone was present); unpleasant smells (especially for 20 subjects who were confined to one room); physical appearance of commode chair (101 users said it had an unfavourable appearance, and 44 had tried to disguise it); and lack of follow up after commode was supplied (only 15 users and carers knew who to contact if there were problems). Users generally either had very positive or very negative attitudes to their commodes but most carers viewed them very negatively, especially with regard to cleaning them. CONCLUSIONS--Health professionals should be aware of people's need for privacy when advising them where to keep their commode. A standard commode is inappropriate for people confined to one room, and alternatives such as a chemical toilet should be considered. Regular follow up is needed to identify any problems such as uncomfortable or unsafe chairs. More thought should be given to the appearance of commodes in their design. PMID:8281060
Shorebird Migration Patterns in Response to Climate Change: A Modeling Approach

NASA Technical Reports Server (NTRS)

Smith, James A.

2010-01-01

The availability of satellite remote sensing observations at multiple spatial and temporal scales, coupled with advances in climate modeling and information technologies offer new opportunities for the application of mechanistic models to predict how continental scale bird migration patterns may change in response to environmental change. In earlier studies, we explored the phenotypic plasticity of a migratory population of Pectoral sandpipers by simulating the movement patterns of an ensemble of 10,000 individual birds in response to changes in stopover locations as an indicator of the impacts of wetland loss and inter-annual variability on the fitness of migratory shorebirds. We used an individual based, biophysical migration model, driven by remotely sensed land surface data, climate data, and biological field data. Mean stop-over durations and stop-over frequency with latitude predicted from our model for nominal cases were consistent with results reported in the literature and available field data. In this study, we take advantage of new computing capabilities enabled by recent GP-GPU (general purpose computing on graphics processing units) computing paradigms and commodity hardware. Several aspects of our individual based (agent modeling) approach lend themselves well to GP-GPU computing. We have been able to allocate compute-intensive tasks to the graphics processing units, and now simulate ensembles of 400,000 birds at varying spatial resolutions along the central North American flyway. We are incorporating additional, species specific, mechanistic processes to better reflect the processes underlying bird phenotypic plasticity responses to different climate change scenarios in the central U.S.
75 FR 27338 - NASDAQ OMX Commodities Clearing-Contract Merchant LLC; NASDAQ OMX Commodities Clearing-Delivery...

Federal Register 2010, 2011, 2012, 2013, 2014

2010-05-14

... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Docket Nos. ER10-912-000; ER10-913-000; ER10-914-000] NASDAQ OMX Commodities Clearing--Contract Merchant LLC; NASDAQ OMX Commodities Clearing--Delivery LLC; NASDAQ OMX Commodities Clearing--Finance LLC; Notice of Filing May 6, 2010. Take notice that, on May 3, 2010, NASDAQ OMX Commoditie...
17 CFR Appendix C to Part 4 - Form CTA-PR

Code of Federal Regulations, 2014 CFR

2014-04-01

... 17 Commodity and Securities Exchanges 1 2014-04-01 2014-04-01 false Form CTA-PR C Appendix C to Part 4 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION COMMODITY POOL OPERATORS AND COMMODITY TRADING ADVISORS Pt. 4, App. C Appendix C to Part 4—Form CTA-PR ER24FE12.052 ER24FE12...
17 CFR 41.43 - Definitions.

Code of Federal Regulations, 2010 CFR

2010-04-01

... options with persons other than brokers, dealers, futures commission merchants, floor brokers, or floor... securities, commodity futures, or commodity options with persons other than brokers, dealers, persons....43 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION SECURITY FUTURES PRODUCTS...
75 FR 54794 - Commodity Pool Operators: Relief From Compliance With Certain Disclosure, Reporting and...

Federal Register 2010, 2011, 2012, 2013, 2014

2010-09-09

... are subject to certain operational \\7\\ and advertising requirements \\8\\ under Part 4, to all other... in 17 CFR Part 4 Advertising, Brokers, Commodity futures, Commodity pool operators, Commodity trading...
78 FR 41384 - Agricultural Advisory Committee Meeting

Federal Register 2010, 2011, 2012, 2013, 2014

2013-07-10

... COMMODITY FUTURES TRADING COMMISSION Agricultural Advisory Committee Meeting AGENCY: Commodity Futures Trading Commission. ACTION: Notice of Meeting. SUMMARY: The Commodity Futures Trading Commission's... Lachenmayr, Commodity Futures Trading Commission, Three Lafayette Centre, 1155 21st Street NW., Washington...
Vestibular coriolis effect differences modeled with three-dimensional linear-angular interactions.

PubMed

Holly, Jan E

2004-01-01

The vestibular coriolis (or "cross-coupling") effect is traditionally explained by cross-coupled angular vectors, which, however, do not explain the differences in perceptual disturbance under different acceleration conditions. For example, during head roll tilt in a rotating chair, the magnitude of perceptual disturbance is affected by a number of factors, including acceleration or deceleration of the chair rotation or a zero-g environment. Therefore, it has been suggested that linear-angular interactions play a role. The present research investigated whether these perceptual differences and others involving linear coriolis accelerations could be explained under one common framework: the laws of motion in three dimensions, which include all linear-angular interactions among all six components of motion (three angular and three linear). The results show that the three-dimensional laws of motion predict the differences in perceptual disturbance. No special properties of the vestibular system or nervous system are required. In addition, simulations were performed with angular, linear, and tilt time constants inserted into the model, giving the same predictions. Three-dimensional graphics were used to highlight the manner in which linear-angular interaction causes perceptual disturbance, and a crucial component is the Stretch Factor, which measures the "unexpected" linear component.
Accelerated event-by-event Monte Carlo microdosimetric calculations of electrons and protons tracks on a multi-core CPU and a CUDA-enabled GPU.

PubMed

Kalantzis, Georgios; Tachibana, Hidenobu

2014-01-01

For microdosimetric calculations, event-by-event Monte Carlo (MC) methods are considered the most accurate. The main shortcoming of those methods is their extensive requirement for computational time. In this work we present an event-by-event MC code of low projectile energy electron and proton tracks for accelerated microdosimetric MC simulations on a graphics processing unit (GPU). Additionally, a hybrid implementation scheme was realized by employing OpenMP and CUDA in such a way that both GPU and multi-core CPU were utilized simultaneously. The two implementation schemes have been tested and compared with the sequential single-threaded MC code on the CPU. Performance comparison was established on the speed-up for a set of benchmarking cases of electron and proton tracks. A maximum speedup of 67.2 was achieved for the GPU-based MC code, while a further speedup improvement of up to 20% was achieved for the hybrid approach. The results indicate the capability of our CPU-GPU implementation for accelerated MC microdosimetric calculations of both electron and proton tracks without loss of accuracy. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Acceleration of FDTD mode solver by high-performance computing techniques.

PubMed

Han, Lin; Xi, Yanping; Huang, Wei-Ping

2010-06-21

A two-dimensional (2D) compact finite-difference time-domain (FDTD) mode solver is developed based on wave equation formalism in combination with the matrix pencil method (MPM). The method is validated for calculation of both real guided and complex leaky modes of typical optical waveguides against the benchmark finite-difference (FD) eigen mode solver. By taking advantage of the inherent parallel nature of the FDTD algorithm, the mode solver is implemented on graphics processing units (GPUs) using the compute unified device architecture (CUDA). It is demonstrated that the high-performance computing technique leads to significant acceleration of the FDTD mode solver, with more than 30 times improvement in computational efficiency in comparison with the conventional FDTD mode solver running on the CPU of a standard desktop computer. The computational efficiency of the accelerated FDTD method is of the same order of magnitude as that of the standard finite-difference eigen mode solver, and yet it requires much less memory (e.g., less than 10%). Therefore, the new method may serve as an efficient, accurate and robust tool for mode calculation of optical waveguides even when the conventional eigen value mode solvers are no longer applicable due to memory limitation.
Studying Upper-Limb Kinematics Using Inertial Sensors Embedded in Mobile Phones.

PubMed

Roldan-Jimenez, Cristina; Cuesta-Vargas, Antonio; Bennett, Paul

2015-05-20

In recent years, there has been a great interest in analyzing upper-limb kinematics. Inertial measurement with mobile phones is a convenient and portable analysis method for studying humerus kinematics in terms of angular mobility and linear acceleration. The aim of this analysis was to study upper-limb kinematics via mobile phones through six physical properties that correspond to angular mobility and acceleration in the three axes of space. This cross-sectional study recruited healthy young adult subjects. Humerus kinematics was studied in 10 young adults with the iPhone4. They performed flexion and abduction analytical tasks. Mobility angle and linear acceleration in each of its axes (yaw, pitch, and roll) were obtained with the iPhone4. This device was placed on the right half of the body of each subject, in the middle third of the humerus, slightly posterior. Descriptive statistics were calculated. Descriptive graphics of analytical tasks performed were obtained. The biggest range of motion was found in pitch angle, and the biggest acceleration was found in the y-axis in both analytical tasks. Focusing on tridimensional kinematics, bigger range of motion and acceleration was found in abduction (209.69 degrees and 23.31 degrees per second respectively). Also, very strong correlation was found between angular mobility and linear acceleration in abduction (r=.845) and flexion (r=.860). The use of an iPhone for humerus tridimensional kinematics is feasible. This supports use of the mobile phone as a device to analyze upper-limb kinematics and to facilitate the evaluation of the patient. ©Cristina Roldan-Jimenez, Antonio Cuesta-Vargas, Paul Bennett. Originally published in JMIR Rehabilitation and Assistive Technology (http://rehab.jmir.org), 20.05.2015.
SU-E-T-36: A GPU-Accelerated Monte-Carlo Dose Calculation Platform and Its Application Toward Validating a ViewRay Beam Model

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wang, Y; Mazur, T; Green, O

Purpose: To build a fast, accurate and easily-deployable research platform for Monte-Carlo dose calculations. We port the dose calculation engine PENELOPE to C++, and accelerate calculations using GPU acceleration. Simulations of a Co-60 beam model provided by ViewRay demonstrate the capabilities of the platform. Methods: We built software that incorporates a beam model interface, CT-phantom model, GPU-accelerated PENELOPE engine, and GUI front-end. We rewrote the PENELOPE kernel in C++ (from Fortran) and accelerated the code on a GPU. We seamlessly integrated a Co-60 beam model (obtained from ViewRay) into our platform. Simulations of various field sizes and SSDs using a homogeneous water phantom generated PDDs, dose profiles, and output factors that were compared to experiment data. Results: With GPU acceleration using a dated graphics card (Nvidia Tesla C2050), a highly accurate simulation – including 100*100*100 grid, 3×3×3 mm3 voxels, <1% uncertainty, and 4.2×4.2 cm2 field size – runs 24 times faster (20 minutes versus 8 hours) than when parallelizing on 8 threads across a new CPU (Intel i7-4770). Simulated PDDs, profiles and output ratios for the commercial system agree well with experiment data measured using radiographic film or ionization chamber. Based on our analysis, this beam model is precise enough for general applications. Conclusions: Using a beam model for a Co-60 system provided by ViewRay, we evaluate a dose calculation platform that we developed. Comparison to measurements demonstrates the promise of our software for use as a research platform for dose calculations, with applications including quality assurance and treatment plan verification.
Acceleration of color computer-generated hologram from three-dimensional scenes with texture and depth information

NASA Astrophysics Data System (ADS)

Shimobaba, Tomoyoshi; Kakue, Takashi; Ito, Tomoyoshi

2014-06-01

We propose acceleration of color computer-generated holograms (CGHs) from three-dimensional (3D) scenes that are expressed as texture (RGB) and depth (D) images. These images are obtained by 3D graphics libraries and RGB-D cameras: for example, OpenGL and Kinect, respectively. We can regard them as two-dimensional (2D) cross-sectional images along the depth direction. The generation of CGHs from the 2D cross-sectional images requires multiple diffraction calculations. If we use convolution-based diffraction such as the angular spectrum method, the diffraction calculation takes a long time and requires large memory usage because the convolution diffraction calculation requires the expansion of the 2D cross-sectional images to avoid the wraparound noise. In this paper, we first describe the acceleration of the diffraction calculation using "Band-limited double-step Fresnel diffraction," which does not require the expansion. Next, we describe color CGH acceleration using color space conversion. In general, color CGHs are generated in RGB color space; however, we need to repeat the same calculation for each color component, so that the computational burden of color CGH generation increases three-fold compared with monochrome CGH generation. We can reduce the computational burden by using YCbCr color space because the 2D cross-sectional images in YCbCr color space can be down-sampled without impairing the image quality.
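
As a rough illustration of the color-space step described in this abstract, the sketch below converts an RGB pixel to YCbCr and halves the resolution of a chroma plane. It assumes the full-range BT.601 (JPEG-style) coefficients and plain 2×2 block averaging; the abstract does not say which YCbCr variant or down-sampling filter the authors used, and the function names `rgb_to_ycbcr` and `downsample_2x2` are hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range BT.601 YCbCr (assumed variant)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def downsample_2x2(plane):
    """Average non-overlapping 2x2 blocks of a 2D list (even height and width assumed)."""
    rows, cols = len(plane), len(plane[0])
    return [
        [
            (plane[i][j] + plane[i][j + 1]
             + plane[i + 1][j] + plane[i + 1][j + 1]) / 4.0
            for j in range(0, cols, 2)
        ]
        for i in range(0, rows, 2)
    ]
```

In a pipeline of this kind, the luminance plane would typically keep its full resolution while only the Cb and Cr planes pass through `downsample_2x2`, which is where the reduction in per-component work would come from.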
A Study on Market Efficiency of Selected Commodity Derivatives Traded on NCDEX During 2011

NASA Astrophysics Data System (ADS)

Sajipriya, N.

2012-10-01

The study tests the weak form of the Efficient Market Hypothesis in the context of an emerging commodity market, the National Commodity Derivatives Exchange (NCDEX), which is considered the prime commodity derivatives market in India. The study considered daily spot and futures prices of five selected commodities traded on NCDEX over a 12-month period (futures contracts originating and expiring between January 2011 and December 2011). The five commodities chosen are pepper, crude palm oil, steel, silver, and chana, as they account for almost two-thirds of the value of agricultural commodity derivatives traded on NCDEX. The results of the runs test indicate that both spot and futures prices are weak-form efficient.
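
For readers unfamiliar with the runs test mentioned in this abstract, the following is a minimal sketch of a Wald-Wolfowitz runs test applied to the signs of daily price changes. The abstract does not specify the exact variant used (sign-based or median-based), so the sign-based form, the dropping of zero changes, and the name `runs_test` are assumptions made here for illustration.

```python
import math


def runs_test(changes):
    """Wald-Wolfowitz runs test on the signs of a series of price changes.

    Returns (observed_runs, expected_runs, z_score).
    Zero changes are dropped, which is a simplifying assumption.
    """
    signs = [c > 0 for c in changes if c != 0]
    n_pos = sum(signs)
    n_neg = len(signs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need both positive and negative changes")

    # A new run starts whenever the sign flips.
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)

    n = n_pos + n_neg
    expected = 2.0 * n_pos * n_neg / n + 1.0
    variance = (2.0 * n_pos * n_neg * (2.0 * n_pos * n_neg - n)
                / (n * n * (n - 1.0)))
    if variance <= 0.0:
        raise ValueError("series too short for a meaningful test")

    z = (runs - expected) / math.sqrt(variance)
    return runs, expected, z
```

Under the usual normal approximation, a z-score with absolute value below about 1.96 would fail to reject randomness at the 5% level, which is the sense in which a price series is called weak-form efficient.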
17 CFR 242.401 - Definitions.

Code of Federal Regulations, 2010 CFR

2010-04-01

... of whose business consists of transactions in securities, commodity futures, or commodity options... securities, commodity futures, or commodity options with persons other than brokers, dealers, persons... M, SHO, ATS, AC, AND NMS AND CUSTOMER MARGIN REQUIREMENTS FOR SECURITY FUTURES Customer Margin...
40 CFR 180.108 - Acephate; tolerances for residues.

Code of Federal Regulations, 2012 CFR

2012-07-01

... phosphoramidothioate, in or on the commodity. Commodity 1 Parts per million Bean, dry, seed 3.0 Bean, succulent 3.0... phosphoramidothioate, in or on the commodity. Commodity Parts per million Bean, dry, seed 1 Bean, succulent 1 Brussels...
40 CFR 180.108 - Acephate; tolerances for residues.

Code of Federal Regulations, 2011 CFR

2011-07-01

... phosphoramidothioate, in or on the commodity. Commodity 1 Parts per million Bean, dry, seed 3.0 Bean, succulent 3.0... phosphoramidothioate, in or on the commodity. Commodity Parts per million Bean, dry, seed 1 Bean, succulent 1 Brussels...
40 CFR 180.108 - Acephate; tolerances for residues.

Code of Federal Regulations, 2014 CFR

2014-07-01

... phosphoramidothioate, in or on the commodity. Commodity 1 Parts per million Bean, dry, seed 3.0 Bean, succulent 3.0... phosphoramidothioate, in or on the commodity. Commodity Parts per million Bean, dry, seed 1 Bean, succulent 1 Brussels...

40 CFR 180.108 - Acephate; tolerances for residues.

Code of Federal Regulations, 2013 CFR

2013-07-01

... phosphoramidothioate, in or on the commodity. Commodity 1 Parts per million Bean, dry, seed 3.0 Bean, succulent 3.0... phosphoramidothioate, in or on the commodity. Commodity Parts per million Bean, dry, seed 1 Bean, succulent 1 Brussels...
17 CFR 162.2 - Definitions.

Code of Federal Regulations, 2013 CFR

2013-04-01

... control with a covered affiliate. (b) Clear and conspicuous. The term “clear and conspicuous” means... exchange dealer, commodity trading advisor, commodity pool operator, introducing broker, major swap..., commodity trading advisor, commodity pool operator, introducing broker, major swap participant or swap...
17 CFR 162.2 - Definitions.

Code of Federal Regulations, 2014 CFR

2014-04-01

... corporate control with a covered affiliate. (b) Clear and conspicuous. The term “clear and conspicuous... exchange dealer, commodity trading advisor, commodity pool operator, introducing broker, major swap..., commodity trading advisor, commodity pool operator, introducing broker, major swap participant or swap...
17 CFR 162.2 - Definitions.

Code of Federal Regulations, 2012 CFR

2012-04-01

... control with a covered affiliate. (b) Clear and conspicuous. The term “clear and conspicuous” means... exchange dealer, commodity trading advisor, commodity pool operator, introducing broker, major swap..., commodity trading advisor, commodity pool operator, introducing broker, major swap participant or swap...
Cross-commodity delay discounting of alcohol and money in alcohol users

PubMed Central

Moody, Lara N.; Tegge, Allison N.; Bickel, Warren K.

2017-01-01

Despite real-world implications, the pattern of delay discounting in alcohol users when the commodities now and later differ has not been well characterized. In this study, 60 participants on Amazon's Mechanical Turk completed the Alcohol Use Disorder Identification Test (AUDIT) to assess severity of use and completed four delay discounting tasks between hypothetical, equivalent amounts of alcohol and money available at five delays. The tasks included two cross-commodity (alcohol now-money later and money now-alcohol later) and two same-commodity (money now-money later and alcohol now-alcohol later) conditions. Delay discounting was significantly associated with clinical cutoffs of the AUDIT for both of the cross-commodity conditions but not for either of the same-commodity delay discounting tasks. The cross-commodity discounting conditions were related to severity of use wherein heavy users discounted future alcohol less and future money more. The change in direction of the discounting effect was dependent on the commodity that was distally available suggesting a distinctive pattern of discounting across commodities when comparing light and heavy alcohol users. PMID:29056767
Cross-commodity delay discounting of alcohol and money in alcohol users.

PubMed

Moody, Lara N; Tegge, Allison N; Bickel, Warren K

2017-06-01

Despite real-world implications, the pattern of delay discounting in alcohol users when the commodities now and later differ has not been well characterized. In this study, 60 participants on Amazon's Mechanical Turk completed the Alcohol Use Disorder Identification Test (AUDIT) to assess severity of use and completed four delay discounting tasks between hypothetical, equivalent amounts of alcohol and money available at five delays. The tasks included two cross-commodity (alcohol now-money later and money now-alcohol later) and two same-commodity (money now-money later and alcohol now-alcohol later) conditions. Delay discounting was significantly associated with clinical cutoffs of the AUDIT for both of the cross-commodity conditions but not for either of the same-commodity delay discounting tasks. The cross-commodity discounting conditions were related to severity of use wherein heavy users discounted future alcohol less and future money more. The change in direction of the discounting effect was dependent on the commodity that was distally available suggesting a distinctive pattern of discounting across commodities when comparing light and heavy alcohol users.
17 CFR Appendix C to Part 1 - [Reserved]

Code of Federal Regulations, 2014 CFR

2014-04-01

... 17 Commodity and Securities Exchanges 1 2014-04-01 2014-04-01 false [Reserved] C Appendix C to Part 1 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION GENERAL REGULATIONS UNDER THE COMMODITY EXCHANGE ACT Appendix C to Part 1 [Reserved] ...
Massively parallel GPU-accelerated minimization of classical density functional theory

NASA Astrophysics Data System (ADS)

Stopper, Daniel; Roth, Roland

2017-08-01

In this paper, we discuss the ability to numerically minimize the grand potential of hard disks in two-dimensional and of hard spheres in three-dimensional space within the framework of classical density functional and fundamental measure theory on modern graphics cards. Our main finding is that a massively parallel minimization leads to an enormous performance gain in comparison to standard sequential minimization schemes. Furthermore, the results indicate that in complex multi-dimensional situations, a heavy parallel minimization of the grand potential seems to be mandatory in order to reach a reasonable balance between accuracy and computational cost.
Accelerating Pseudo-Random Number Generator for MCNP on GPU

NASA Astrophysics Data System (ADS)

Gong, Chunye; Liu, Jie; Chi, Lihua; Hu, Qingfeng; Deng, Li; Gong, Zhenghu

2010-09-01

Pseudo-random number generators (PRNGs) are used intensively in many stochastic algorithms in particle simulations, artificial neural networks and other scientific computation. The PRNG in the Monte Carlo N-Particle Transport Code (MCNP) must have a long period, high quality, and a flexible jump capability, and must be fast enough. In this paper, we implement such a PRNG for MCNP on NVIDIA's GTX200 graphics processing units (GPUs) using the CUDA programming model. Results show that speedups of 3.80 to 8.10 times are achieved compared with 4- to 6-core CPUs, and that more than 679.18 million double-precision random numbers can be generated per second on the GPU.
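
The "jump" requirement in this abstract can be illustrated with a linear congruential generator (LCG), for which skipping n steps ahead costs only O(log n) multiplications. The sketch below is a generic skip-ahead routine, not the authors' implementation: the abstract does not give the generator's form or constants, so the LCG form, the function names, and any parameters passed in are assumptions.

```python
def lcg_next(x, a, c, m):
    """One step of a linear congruential generator: x -> (a*x + c) mod m."""
    return (a * x + c) % m


def lcg_skip(x, n, a, c, m):
    """Jump n steps ahead in O(log n) by repeatedly squaring the affine map."""
    big_a, big_c = 1, 0      # accumulated map x -> big_a*x + big_c (identity)
    h, f = a % m, c % m      # current power-of-two map x -> h*x + f
    while n > 0:
        if n & 1:
            # Compose the accumulated map with the current power-of-two map.
            big_a = (big_a * h) % m
            big_c = (big_c * h + f) % m
        # Square the current map: h*(h*x + f) + f = h^2*x + f*(h + 1).
        f = (f * (h + 1)) % m
        h = (h * h) % m
        n >>= 1
    return (big_a * x + big_c) % m
```

On a GPU, a routine like this would let thread i start its own subsequence at `lcg_skip(seed, i * stride, a, c, m)` for some chosen `stride`, so that threads draw from non-overlapping parts of one long stream without stepping through the intermediate values.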
Droplet flow along the wall of rectangular channel with gradient of wettability

NASA Astrophysics Data System (ADS)

Kupershtokh, A. L.

2018-03-01

The lattice Boltzmann equation (LBE) method (LBM) is applicable to simulating multiphysics problems of fluid flows with free boundaries, taking into account the viscosity, surface tension, evaporation and wetting degree of a solid surface. The nonstationary motion of a liquid drop along a solid surface with a variable level of wettability is modeled. For the computer simulation of this problem, the three-dimensional lattice Boltzmann equation method D3Q19 is used. The LBE method allows the calculations to be parallelized on multiprocessor graphics accelerators using the CUDA programming technology.
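
To make the D3Q19 label in this abstract concrete, the sketch below builds the 19 lattice velocities with their standard weights and evaluates the usual second-order equilibrium distribution. It covers only the textbook single-phase ingredients in lattice units (sound speed squared equal to 1/3); the phase-transition and wetting terms of the author's model are not shown, and the names `VELOCITIES`, `WEIGHTS` and `equilibrium` are hypothetical.

```python
from itertools import product

# D3Q19: one rest vector, 6 axis neighbours, 12 face-diagonal neighbours.
# The 8 cube corners (three non-zero components) are excluded.
VELOCITIES = [v for v in product((-1, 0, 1), repeat=3)
              if sum(abs(k) for k in v) <= 2]

# Standard weights, selected by the number of non-zero components.
WEIGHTS = [{0: 1 / 3, 1: 1 / 18, 2: 1 / 36}[sum(abs(k) for k in v)]
           for v in VELOCITIES]


def equilibrium(rho, u):
    """Second-order equilibrium distributions for density rho and velocity u."""
    uu = sum(k * k for k in u)
    feq = []
    for w, c in zip(WEIGHTS, VELOCITIES):
        cu = sum(ck * uk for ck, uk in zip(c, u))
        feq.append(w * rho * (1.0 + 3.0 * cu + 4.5 * cu * cu - 1.5 * uu))
    return feq
```

Because each lattice node updates its 19 distributions from local and nearest-neighbour data only, one GPU thread per node is a natural mapping, which is consistent with the parallelization remark in the abstract.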
A faster technique for rendering meshes in multiple display systems

NASA Astrophysics Data System (ADS)

Hand, Randall E.; Moorhead, Robert J., II

2003-05-01

Level-of-detail algorithms have been widely implemented in architectural VR walkthroughs and video games, but have not had widespread use in VR terrain visualization systems. This thesis explains a set of optimizations that allow most current level-of-detail algorithms to run in the types of multiple display systems used in VR. It improves both the visual quality of the system, through use of graphics hardware acceleration, and the framerate and running time, through modifications to the computations that drive the algorithms. Using ROAM as a testbed, results show improvements between 10% and 100% on varying machines.
Demand-driven energy requirement of world economy 2007: A multi-region input-output network simulation

NASA Astrophysics Data System (ADS)

Chen, Zhan-Ming; Chen, G. Q.

2013-07-01

This study presents a network simulation of the global embodied energy flows in 2007 based on a multi-region input-output model. The world economy is portrayed as a 6384-node network and the energy interactions between any two nodes are calculated and analyzed. According to the results, about 70% of the world's direct energy input is invested in resource, heavy manufacture, and transportation sectors, which provide only 30% of the embodied energy to satisfy final demand. By contrast, non-transportation services sectors contribute 24% of the world's demand-driven energy requirement with only 6% of the direct energy input. Commodity trade is shown to be an important alternative to fuel trade in redistributing energy, as international commodity flows embody 1.74E + 20 J of energy, in magnitude up to 89% of the traded fuels. China is the largest embodied energy exporter with a net export of 3.26E + 19 J, in contrast to the United States as the largest importer with a net import of 2.50E + 19 J. The recent economic fluctuations following the financial crisis accelerate the relative expansion of energy requirement by developing countries; as a consequence, China will take over the place of the United States as the world's top demand-driven energy consumer in 2022 and India will become the third largest in 2015.
17 CFR 1.1 - [Reserved]

Code of Federal Regulations, 2011 CFR

2011-04-01

... 17 Commodity and Securities Exchanges 1 2011-04-01 2011-04-01 false [Reserved] 1.1 Section 1.1 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION GENERAL REGULATIONS UNDER THE COMMODITY EXCHANGE ACT Definitions § 1.1 [Reserved] [66 FR 42269, Aug. 10, 2001] ...
75 FR 77576 - General Regulations and Derivatives Clearing Organizations

Federal Register 2010, 2011, 2012, 2013, 2014

2010-12-13

... Derivatives Clearing Organizations AGENCY: Commodity Futures Trading Commission. ACTION: Notice of proposed... clearing transactions in commodities for future delivery or commodity option transactions, or for effecting settlements of contracts for future delivery or commodity option transactions, for and between members of any...
40 CFR 414.60 - Applicability; description of the commodity organic chemicals subcategory.

Code of Federal Regulations, 2011 CFR

2011-07-01

... commodity organic chemicals subcategory. 414.60 Section 414.60 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) EFFLUENT GUIDELINES AND STANDARDS ORGANIC CHEMICALS, PLASTICS, AND SYNTHETIC FIBERS Commodity Organic Chemicals § 414.60 Applicability; description of the commodity organic chemicals...
40 CFR 414.60 - Applicability; description of the commodity organic chemicals subcategory.

Code of Federal Regulations, 2012 CFR

2012-07-01

... commodity organic chemicals subcategory. 414.60 Section 414.60 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) EFFLUENT GUIDELINES AND STANDARDS ORGANIC CHEMICALS, PLASTICS, AND SYNTHETIC FIBERS Commodity Organic Chemicals § 414.60 Applicability; description of the commodity organic chemicals...
40 CFR 414.60 - Applicability; description of the commodity organic chemicals subcategory.

Code of Federal Regulations, 2013 CFR

2013-07-01

... commodity organic chemicals subcategory. 414.60 Section 414.60 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) EFFLUENT GUIDELINES AND STANDARDS ORGANIC CHEMICALS, PLASTICS, AND SYNTHETIC FIBERS Commodity Organic Chemicals § 414.60 Applicability; description of the commodity organic chemicals...
40 CFR 414.60 - Applicability; description of the commodity organic chemicals subcategory.

Code of Federal Regulations, 2014 CFR

2014-07-01

... commodity organic chemicals subcategory. 414.60 Section 414.60 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) EFFLUENT GUIDELINES AND STANDARDS ORGANIC CHEMICALS, PLASTICS, AND SYNTHETIC FIBERS Commodity Organic Chemicals § 414.60 Applicability; description of the commodity organic chemicals...
40 CFR 414.60 - Applicability; description of the commodity organic chemicals subcategory.

Code of Federal Regulations, 2010 CFR

2010-07-01

... commodity organic chemicals subcategory. 414.60 Section 414.60 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) EFFLUENT GUIDELINES AND STANDARDS ORGANIC CHEMICALS, PLASTICS, AND SYNTHETIC FIBERS Commodity Organic Chemicals § 414.60 Applicability; description of the commodity organic chemicals...
40 CFR 180.473 - Glufosinate ammonium; tolerances for residues.

Code of Federal Regulations, 2011 CFR

2011-07-01

... 2-amino-4-(hydroxymethylphosphinyl)butanoic acid, in or on the commodity. Commodity Parts per... measuring only the sum of glufosinate ammonium, 2-amino-4-(hydroxymethylphosphinyl)butanoic acid... stoichiometric equivalent of 2-amino-4-(hydroxymethylphosphinyl)butanoic acid, in or on the commodity. Commodity...

26 CFR 4.954-0 - Introduction.

Code of Federal Regulations, 2010 CFR

2010-04-01

... corporation beginning after December 31, 1986. Consequently, any gain or loss (including foreign currency gain.... [Reserved] (f) Commodities transactions. (1) In general. (2) Definitions. (i) Commodity. (ii) Commodities transaction. (3) Definition of the term “qualified active sales”. (i) In general. (ii) Sale of commodities...
26 CFR 4.954-0 - Introduction.

Code of Federal Regulations, 2011 CFR

2011-04-01

... corporation beginning after December 31, 1986. Consequently, any gain or loss (including foreign currency gain.... [Reserved] (f) Commodities transactions. (1) In general. (2) Definitions. (i) Commodity. (ii) Commodities transaction. (3) Definition of the term “qualified active sales”. (i) In general. (ii) Sale of commodities...
7 CFR 17.1 - General.

Code of Federal Regulations, 2010 CFR

2010-01-01

... Secretary of Agriculture SALES OF AGRICULTURAL COMMODITIES MADE AVAILABLE UNDER TITLE I OF THE AGRICULTURAL... commodities by the Commodity Credit Corporation (CCC), through private trade channels to the maximum extent..., as amended (hereinafter called “the Act”). (b) Agricultural commodities agreements. (1) Under the Act...
32 CFR 275.3 - Definitions.

Code of Federal Regulations, 2010 CFR

2010-07-01

... dealer in securities or commodities. (9) An investment banker or investment company. (10) A currency... matters. (26) Any futures commission merchant, commodity trading advisor, or commodity pool operator registered, or required to register, under the Commodity Exchange Act that is located inside any State or...
7 CFR 17.1 - General.

Code of Federal Regulations, 2011 CFR

2011-01-01

... Secretary of Agriculture SALES OF AGRICULTURAL COMMODITIES MADE AVAILABLE UNDER TITLE I OF THE AGRICULTURAL... commodities by the Commodity Credit Corporation (CCC), through private trade channels to the maximum extent..., as amended (hereinafter called “the Act”). (b) Agricultural commodities agreements. (1) Under the Act...
77 FR 27444 - Joint CFTC-SEC Advisory Committee on Emerging Regulatory Issues

Federal Register 2010, 2011, 2012, 2013, 2014

2012-05-10

... SECURITIES AND EXCHANGE COMMISSION COMMODITY FUTURES TRADING COMMISSION [Release Nos. 34-66932... and Exchange Commission (``SEC'') and Commodity Futures Trading Commission (``CFTC'') (each, an.... Commodity Futures Trading Commission Written comments may be mailed to the Commodity Futures Trading...
17 CFR 1.1 - [Reserved]

Code of Federal Regulations, 2012 CFR

2012-04-01

... 17 Commodity and Securities Exchanges 1 2012-04-01 2012-04-01 false [Reserved] 1.1 Section 1.1 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION GENERAL REGULATIONS UNDER THE COMMODITY EXCHANGE ACT Definitions § 1.1 [Reserved] [66 FR 42269, Aug. 10, 2001] ...
17 CFR 1.1 - [Reserved]

Code of Federal Regulations, 2013 CFR

2013-04-01

... 17 Commodity and Securities Exchanges 1 2013-04-01 2013-04-01 false [Reserved] 1.1 Section 1.1 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION GENERAL REGULATIONS UNDER THE COMMODITY EXCHANGE ACT Definitions § 1.1 [Reserved] [66 FR 42269, Aug. 10, 2001] ...
17 CFR 1.1 - [Reserved]

Code of Federal Regulations, 2014 CFR

2014-04-01

... 17 Commodity and Securities Exchanges 1 2014-04-01 2014-04-01 false [Reserved] 1.1 Section 1.1 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION GENERAL REGULATIONS UNDER THE COMMODITY EXCHANGE ACT Definitions § 1.1 [Reserved] [66 FR 42269, Aug. 10, 2001] ...
A world of minerals in your mobile device

USGS Publications Warehouse

Jenness, Jane E.; Ober, Joyce A.; Wilkins, Aleeza M.; Gambogi, Joseph

2016-09-15

Mobile phones and other high-technology communications devices could not exist without mineral commodities. More than one-half of all components in a mobile device—including its electronics, display, battery, speakers, and more—are made from mined and semiprocessed materials (mineral commodities). Some mineral commodities can be recovered as byproducts during the production and processing of other commodities. As an example, bauxite is mined for its aluminum content, but gallium is recovered during the aluminum production process. The images show the ore minerals (sources) of some mineral commodities that are used to make components of a mobile device. On the reverse side, the map and table depict the major source countries producing these mineral commodities along with how these commodities are used in mobile devices. For more information on minerals, visit http://minerals.usgs.gov.
A GPU-accelerated immersive audio-visual framework for interaction with molecular dynamics using consumer depth sensors.

PubMed

Glowacki, David R; O'Connor, Michael; Calabró, Gaetano; Price, James; Tew, Philip; Mitchell, Thomas; Hyde, Joseph; Tew, David P; Coughtrie, David J; McIntosh-Smith, Simon

2014-01-01

With advances in computational power, the rapidly growing role of computational/simulation methodologies in the physical sciences, and the development of new human-computer interaction technologies, the field of interactive molecular dynamics seems destined to expand. In this paper, we describe and benchmark the software algorithms and hardware setup for carrying out interactive molecular dynamics utilizing an array of consumer depth sensors. The system works by interpreting the human form as an energy landscape, and superimposing this landscape on a molecular dynamics simulation to chaperone the motion of the simulated atoms, affecting both graphics and sonified simulation data. GPU acceleration has been key to achieving our target of 60 frames per second (FPS), giving an extremely fluid interactive experience. GPU acceleration has also allowed us to scale the system for use in immersive 360° spaces with an array of up to ten depth sensors, allowing several users to simultaneously chaperone the dynamics. The flexibility of our platform for carrying out molecular dynamics simulations has been considerably enhanced by wrappers that facilitate fast communication with a portable selection of GPU-accelerated molecular force evaluation routines. In this paper, we describe a 360° atmospheric molecular dynamics simulation we have run in a chemistry/physics education context. We also describe initial tests in which users have been able to chaperone the dynamics of 10-alanine peptide embedded in an explicit water solvent. Using this system, both expert and novice users have been able to accelerate peptide rare event dynamics by 3-4 orders of magnitude.
Preliminary Metallogenic Map of North America; a listing of deposits by commodity

USGS Publications Warehouse

Lee, Michael P.; Guild, Philip White; Schruben, Paul G.

1987-01-01

The 4,215 ore deposits shown on the Preliminary Metallogenic Map of North America and contained in the Metallogenic Map file have been sorted by their principal (first-listed) commodities and grouped into metallic and nonmetallic categories. Deposit listings for 56 individual metals and minerals have been assembled using the data base and are arranged alphabetically by country, political subdivision (for the larger countries), and deposit name. Map numbers, major and minor constituents, geographic coordinates, and a geologic code are given for each deposit; additionally, the relative size and deposit class have been derived from the code and are listed separately. The frequencies of individual commodities and commodity groups by type, geographic distribution, and geologic occurrence are summarized in tables, and the relationships of associated commodities to principal commodities in the data base are emphasized in both tables and brief texts. In all, 49 metals and minerals are listed as principal (first or only) commodities and 7 more are shown as 'major' but not principal commodities. (Commodities listed as 'minor' in the data base were not sorted or tabulated separately.) Metals, divided into six subgroups, predominate over nonmetallic minerals by a ratio of about 7 to 1, although in terms of quantities and value the disparity is not so great. Within the metals group, the ranking according to frequency is as follows: base, precious, iron and alloying, other (antimony, beryllium, and others), nuclear-fuel, and light metals. The most frequently occurring commodity in the Metallogenic Map file is gold. Copper is ranked second, both in number of occurrences and as the principal commodity in deposits. Silver is ranked third in frequency of occurrence; lead and zinc are ranked fourth and fifth, respectively. 
Iron, ranked sixth in frequency of occurrence as a major commodity, is the third most reported principal commodity in the data base, ahead of silver (ranked fourth), lead (ranked fifth), and zinc (ranked sixth).
7 CFR 1427.22 - Commodity certificate exchanges.

Code of Federal Regulations, 2010 CFR

2010-01-01

... 7 Agriculture 10 2010-01-01 2010-01-01 false Commodity certificate exchanges. 1427.22 Section 1427... Deficiency Payments § 1427.22 Commodity certificate exchanges. (a) For any outstanding marketing assistance loan provided for upland cotton, a producer may purchase a commodity certificate and exchange that...
44 CFR 206.151 - Food commodities.

Code of Federal Regulations, 2013 CFR

2013-10-01

... 44 Emergency Management and Assistance 1 2013-10-01 2013-10-01 false Food commodities. 206.151... Food commodities. (a) The Administrator will assure that adequate stocks of food will be ready and... section, the Administrator may direct the Secretary of Agriculture to purchase food commodities in...
44 CFR 206.151 - Food commodities.

Code of Federal Regulations, 2014 CFR

2014-10-01

... 44 Emergency Management and Assistance 1 2014-10-01 2014-10-01 false Food commodities. 206.151... Food commodities. (a) The Administrator will assure that adequate stocks of food will be ready and... section, the Administrator may direct the Secretary of Agriculture to purchase food commodities in...
44 CFR 206.151 - Food commodities.

Code of Federal Regulations, 2011 CFR

2011-10-01

... 44 Emergency Management and Assistance 1 2011-10-01 2011-10-01 false Food commodities. 206.151... Food commodities. (a) The Administrator will assure that adequate stocks of food will be ready and... section, the Administrator may direct the Secretary of Agriculture to purchase food commodities in...
44 CFR 206.151 - Food commodities.

Code of Federal Regulations, 2012 CFR

2012-10-01

... 44 Emergency Management and Assistance 1 2012-10-01 2011-10-01 true Food commodities. 206.151... Food commodities. (a) The Administrator will assure that adequate stocks of food will be ready and... section, the Administrator may direct the Secretary of Agriculture to purchase food commodities in...
Commodity-based Approach for Evaluating the Value of Freight Moving on Texas’ Roadway Network

DOT National Transportation Integrated Search

2017-12-10

The researchers took a commodity-based approach to evaluate the value of a list of selected commodities moved on the Texas freight network. This approach takes advantage of commodity-specific data sources and modeling processes. It provides a unique ...
22 CFR 228.51 - Commodities.

Code of Federal Regulations, 2010 CFR

2010-04-01

... 22 Foreign Relations 1 2010-04-01 2010-04-01 false Commodities. 228.51 Section 228.51 Foreign Relations AGENCY FOR INTERNATIONAL DEVELOPMENT RULES ON SOURCE, ORIGIN AND NATIONALITY FOR COMMODITIES AND... agricultural commodities, motor vehicles, or pharmaceuticals (see § 228.13, “Special source rules requiring...
17 CFR 190.10 - General.

Code of Federal Regulations, 2012 CFR

2012-04-01

... STATEMENT IS FURNISHED TO YOU BECAUSE RULE 190.10 (c) OF THE COMMODITY FUTURES TRADING COMMISSION REQUIRES... any combination of the following: futures commission merchant, commodity option dealer, foreign... “from or for the commodity futures account” or “from or for the commodity options account” of such...

17 CFR 32.6 - Segregation.

Code of Federal Regulations, 2011 CFR

2011-04-01

... 17 Commodity and Securities Exchanges 1 2011-04-01 2011-04-01 false Segregation. 32.6 Section 32.6 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION REGULATION OF COMMODITY OPTION TRANSACTIONS § 32.6 Segregation. (a) Any person which accepts money, securities, or property from an option...
17 CFR 32.6 - Segregation.

Code of Federal Regulations, 2010 CFR

2010-04-01

... 17 Commodity and Securities Exchanges 1 2010-04-01 2010-04-01 false Segregation. 32.6 Section 32.6 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION REGULATION OF COMMODITY OPTION TRANSACTIONS § 32.6 Segregation. (a) Any person which accepts money, securities, or property from an option...
7 CFR 1421.110 - Commodity certificate exchanges.

Code of Federal Regulations, 2010 CFR

2010-01-01

... COMMODITIES-MARKETING ASSISTANCE LOANS AND LOAN DEFICIENCY PAYMENTS FOR 2008 THROUGH 2012 Marketing Assistance Loans § 1421.110 Commodity certificate exchanges. (a) For any outstanding marketing assistance loan for... commodity certificate for the marketing assistance loan collateral. (b) The exchange rate is the lesser of...
7 CFR 1488.8 - Documents required after delivery.

Code of Federal Regulations, 2011 CFR

2011-01-01

... AGRICULTURAL COMMODITIES Financing of Export Sales of Agricultural Commodities From Private Stocks Under CCC... delivery. (a) CCC will purchase an exporter's account receivable only if the Treasurer, Commodity Credit... or Assistant Treasurer, CCC, after date of delivery of commodities exported or to be exported under...
7 CFR 1488.8 - Documents required after delivery.

Code of Federal Regulations, 2010 CFR

2010-01-01

... AGRICULTURAL COMMODITIES Financing of Export Sales of Agricultural Commodities From Private Stocks Under CCC... delivery. (a) CCC will purchase an exporter's account receivable only if the Treasurer, Commodity Credit... or Assistant Treasurer, CCC, after date of delivery of commodities exported or to be exported under...
Experiences with Transitioning Science Data Production from a Symmetric Multiprocessor Platform to a Linux Cluster Environment

NASA Astrophysics Data System (ADS)

Walter, R. J.; Protack, S. P.; Harris, C. J.; Caruthers, C.; Kusterer, J. M.

2008-12-01

NASA's Atmospheric Science Data Center at the NASA Langley Research Center performs all of the science data processing for the Multi-angle Imaging SpectroRadiometer (MISR) instrument. MISR is one of the five remote sensing instruments flying aboard NASA's Terra spacecraft. From the time of Terra launch in December 1999 until February 2008, all MISR science data processing was performed on a Silicon Graphics, Inc. (SGI) platform. However, dramatic improvements in commodity computing technology coupled with steadily declining project budgets during that period eventually made transitioning MISR processing to a commodity computing environment both feasible and necessary. The Atmospheric Science Data Center has successfully ported the MISR science data processing environment from the SGI platform to a Linux cluster environment. There were a multitude of technical challenges associated with this transition. Even though the core architecture of the production system did not change, the manner in which it interacted with underlying hardware was fundamentally different. In addition, there are more potential throughput bottlenecks in a cluster environment than there are in a symmetric multiprocessor environment like the SGI platform and each of these had to be addressed. Once all the technical issues associated with the transition were resolved, the Atmospheric Science Data Center had a MISR science data processing system with significantly higher throughput than the SGI platform at a fraction of the cost. In addition to the commodity hardware, free and open source software such as S4PM, Sun Grid Engine, PostgreSQL and Ganglia play a significant role in the new system. Details of the technical challenges and resolutions, software systems, performance improvements, and cost savings associated with the transition will be discussed. 
The Atmospheric Science Data Center in Langley's Science Directorate leads NASA's program for the processing, archival and distribution of Earth science data in the areas of radiation budget, clouds, aerosols, and tropospheric chemistry. The Data Center was established in 1991 to support NASA's Earth Observing System and the U.S. Global Change Research Program. It is unique among NASA data centers in the size of its archive, cutting edge computing technology, and full range of data services. For more information regarding ASDC data holdings, documentation, tools and services, visit http://eosweb.larc.nasa.gov
A graphical approach to electric sail mission design with radial thrust

NASA Astrophysics Data System (ADS)

Mengali, Giovanni; Quarta, Alessandro A.; Aliasi, Generoso

2013-02-01

This paper describes a semi-analytical approach to electric sail mission analysis under the assumption that the spacecraft experiences a purely radial, outward, propulsive acceleration. The problem is tackled by means of the potential well concept, a very effective idea that was originally introduced by Prussing and Coverstone in 1998. Unlike a classical procedure that requires the numerical integration of the equations of motion, the proposed method provides an estimate of the main spacecraft trajectory parameters, such as its maximum and minimum attainable distances from the Sun, with the simple use of analytical relationships and elementary graphs. A number of mission scenarios clearly show the effectiveness of the proposed approach. In particular, when the spacecraft parking orbit is either circular or elliptic it is possible to find the optimal performance required to reach an escape condition or a given distance from the Sun. Another example is given by the optimal strategy required to reach a heliocentric Keplerian orbit of prescribed orbital period. Finally, the graphical approach is applied to the preliminary design of a nodal mission towards a Near Earth Asteroid.
Communication: A reduced scaling J-engine based reformulation of SOS-MP2 using graphics processing units

DOE Office of Scientific and Technical Information (OSTI.GOV)

Maurer, S. A.; Kussmann, J.; Ochsenfeld, C.

2014-08-07

We present a low-prefactor, cubically scaling scaled-opposite-spin second-order Møller-Plesset perturbation theory (SOS-MP2) method which is highly suitable for massively parallel architectures like graphics processing units (GPU). The scaling is reduced from O(N^5) to O(N^3) by a reformulation of the MP2-expression in the atomic orbital basis via Laplace transformation and the resolution-of-the-identity (RI) approximation of the integrals in combination with efficient sparse algebra for the 3-center integral transformation. In contrast to previous works that employ GPUs for post Hartree-Fock calculations, we do not simply employ GPU-based linear algebra libraries to accelerate the conventional algorithm. Instead, our reformulation allows us to replace the rate-determining contraction step with a modified J-engine algorithm that has been proven to be highly efficient on GPUs. Thus, our SOS-MP2 scheme enables us to treat large molecular systems in an accurate and efficient manner on a single GPU-server.
A fast mass spring model solver for high-resolution elastic objects

NASA Astrophysics Data System (ADS)

Zheng, Mianlun; Yuan, Zhiyong; Zhu, Weixu; Zhang, Guian

2017-03-01

Real-time simulation of elastic objects is of great importance for computer graphics and virtual reality applications. The fast mass spring model solver can achieve visually realistic simulation in an efficient way. Unfortunately, this method suffers from resolution limitations and lack of mechanical realism for a surface geometry model, which greatly restricts its application. To tackle these problems, in this paper we propose a fast mass spring model solver for high-resolution elastic objects. First, we project the complex surface geometry model into a set of uniform grid cells as cages through *cages mean value coordinate method to reflect its internal structure and mechanics properties. Then, we replace the original Cholesky decomposition method in the fast mass spring model solver with a conjugate gradient method, which can make the fast mass spring model solver more efficient for detailed surface geometry models. Finally, we propose a graphics processing unit accelerated parallel algorithm for the conjugate gradient method. Experimental results show that our method can realize efficient deformation simulation of 3D elastic objects with visual reality and physical fidelity, which has a great potential for applications in computer animation.
22 CFR 228.52 - Suppliers of commodities.

Code of Federal Regulations, 2010 CFR

2010-04-01

... 22 Foreign Relations 1 2010-04-01 2010-04-01 false Suppliers of commodities. 228.52 Section 228.52 Foreign Relations AGENCY FOR INTERNATIONAL DEVELOPMENT RULES ON SOURCE, ORIGIN AND NATIONALITY FOR COMMODITIES AND SERVICES FINANCED BY USAID Waivers § 228.52 Suppliers of commodities. Geographic code changes...
7 CFR 17.2 - Definition of terms.

Code of Federal Regulations, 2010 CFR

2010-01-01

... Office of the Secretary of Agriculture SALES OF AGRICULTURAL COMMODITIES MADE AVAILABLE UNDER TITLE I OF... in the second legal entity. CCC. The Commodity Credit Corporation, USDA. Commodity. An agricultural commodity produced in the United States, or product thereof produced in the United States, as specified in...
17 CFR 1.32 - Segregated account; daily computation and record.

Code of Federal Regulations, 2011 CFR

2011-04-01

... 17 Commodity and Securities Exchanges 1 2011-04-01 2011-04-01 false Segregated account; daily computation and record. 1.32 Section 1.32 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION GENERAL REGULATIONS UNDER THE COMMODITY EXCHANGE ACT Recordkeeping § 1.32 Segregated account...
31 CFR 560.533 - Brokering sales of agricultural commodities, medicine, and medical devices.

Code of Federal Regulations, 2011 CFR

2011-07-01

... commodities, medicine, and medical devices. 560.533 Section 560.533 Money and Finance: Treasury Regulations... Brokering sales of agricultural commodities, medicine, and medical devices. (a) General license for... agricultural commodities, medicine, and medical devices, provided that the sale and exportation or...
31 CFR 560.533 - Brokering sales of agricultural commodities, medicine, and medical devices.

Code of Federal Regulations, 2010 CFR

2010-07-01

... commodities, medicine, and medical devices. 560.533 Section 560.533 Money and Finance: Treasury Regulations... Brokering sales of agricultural commodities, medicine, and medical devices. (a) General license for... agricultural commodities, medicine, and medical devices, provided that the sale and exportation or...
Capital dissipation minimization for a class of complex irreversible resource exchange processes

NASA Astrophysics Data System (ADS)

Xia, Shaojun; Chen, Lingen

2017-05-01

A model of a class of irreversible resource exchange processes (REPes) between a firm and a producer with commodity flow leakage from the producer to a competitive market is established in this paper. The REPes are assumed to obey the linear commodity transfer law (LCTL). Optimal price paths for capital dissipation minimization (CDM) (it can measure economic process irreversibility) are obtained. The averaged optimal control theory is used. The optimal REP strategy is also compared with other strategies, such as constant-firm-price operation and constant-commodity-flow operation, and effects of the amount of commodity transferred and the commodity flow leakage on the optimal REP strategy are also analyzed. The commodity prices of both the producer and the firm for the CDM of the REPes with commodity flow leakage change with the time exponentially.
17 CFR 2.2 - Authority to affix seal.

Code of Federal Regulations, 2010 CFR

2010-04-01

... 17 Commodity and Securities Exchanges 1 2010-04-01 2010-04-01 false Authority to affix seal. 2.2 Section 2.2 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION OFFICIAL SEAL § 2.2 Authority to affix seal. (a) The following officials of the Commodity Futures Trading Commission are...
17 CFR 2.2 - Authority to affix seal.

Code of Federal Regulations, 2011 CFR

2011-04-01

... 17 Commodity and Securities Exchanges 1 2011-04-01 2011-04-01 false Authority to affix seal. 2.2 Section 2.2 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION OFFICIAL SEAL § 2.2 Authority to affix seal. (a) The following officials of the Commodity Futures Trading Commission are...
29 CFR 780.112 - General meaning of “agriculture or horticultural commodities.”

Code of Federal Regulations, 2011 CFR

2011-07-01

... EXEMPTIONS APPLICABLE TO AGRICULTURE, PROCESSING OF AGRICULTURAL COMMODITIES, AND RELATED SUBJECTS UNDER THE FAIR LABOR STANDARDS ACT General Scope of Agriculture Agricultural Or Horticultural Commodities § 780.112 General meaning of “agriculture or horticultural commodities.” Section 3(f) of the Act defines as...
29 CFR 780.112 - General meaning of “agriculture or horticultural commodities.”

Code of Federal Regulations, 2010 CFR

2010-07-01

... EXEMPTIONS APPLICABLE TO AGRICULTURE, PROCESSING OF AGRICULTURAL COMMODITIES, AND RELATED SUBJECTS UNDER THE FAIR LABOR STANDARDS ACT General Scope of Agriculture Agricultural Or Horticultural Commodities § 780.112 General meaning of “agriculture or horticultural commodities.” Section 3(f) of the Act defines as...
29 CFR 780.112 - General meaning of “agriculture or horticultural commodities.”

Code of Federal Regulations, 2013 CFR

2013-07-01

... EXEMPTIONS APPLICABLE TO AGRICULTURE, PROCESSING OF AGRICULTURAL COMMODITIES, AND RELATED SUBJECTS UNDER THE FAIR LABOR STANDARDS ACT General Scope of Agriculture Agricultural Or Horticultural Commodities § 780.112 General meaning of “agriculture or horticultural commodities.” Section 3(f) of the Act defines as...

29 CFR 780.112 - General meaning of “agriculture or horticultural commodities.”

Code of Federal Regulations, 2014 CFR

2014-07-01

... EXEMPTIONS APPLICABLE TO AGRICULTURE, PROCESSING OF AGRICULTURAL COMMODITIES, AND RELATED SUBJECTS UNDER THE FAIR LABOR STANDARDS ACT General Scope of Agriculture Agricultural Or Horticultural Commodities § 780.112 General meaning of “agriculture or horticultural commodities.” Section 3(f) of the Act defines as...
29 CFR 780.112 - General meaning of “agriculture or horticultural commodities.”

Code of Federal Regulations, 2012 CFR

2012-07-01

... EXEMPTIONS APPLICABLE TO AGRICULTURE, PROCESSING OF AGRICULTURAL COMMODITIES, AND RELATED SUBJECTS UNDER THE FAIR LABOR STANDARDS ACT General Scope of Agriculture Agricultural Or Horticultural Commodities § 780.112 General meaning of “agriculture or horticultural commodities.” Section 3(f) of the Act defines as...
22 CFR 228.11 - Source and origin of commodities.

Code of Federal Regulations, 2010 CFR

2010-04-01

... 22 Foreign Relations 1 2010-04-01 2010-04-01 false Source and origin of commodities. 228.11 Section 228.11 Foreign Relations AGENCY FOR INTERNATIONAL DEVELOPMENT RULES ON SOURCE, ORIGIN AND... Commodity Procurement Transactions for USAID Financing § 228.11 Source and origin of commodities. (a) The...
17 CFR 1.49 - Denomination of customer funds and location of depositories.

Code of Federal Regulations, 2011 CFR

2011-04-01

... 17 Commodity and Securities Exchanges 1 2011-04-01 2011-04-01 false Denomination of customer funds and location of depositories. 1.49 Section 1.49 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION GENERAL REGULATIONS UNDER THE COMMODITY EXCHANGE ACT Miscellaneous § 1.49 Denomination...
17 CFR Appendix B to Part 190 - Special Bankruptcy Distributions

Code of Federal Regulations, 2011 CFR

2011-04-01

... 17 Commodity and Securities Exchanges 1 2011-04-01 2011-04-01 false Special Bankruptcy Distributions B Appendix B to Part 190 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION... purposes of this distributional rule, XM accounts will be deemed to be commodity interest accounts and...
31 CFR 560.533 - Brokering sales of agricultural commodities, medicine, and medical devices.

Code of Federal Regulations, 2013 CFR

2013-07-01

... commodities, medicine, and medical devices. 560.533 Section 560.533 Money and Finance: Treasury Regulations... Policy § 560.533 Brokering sales of agricultural commodities, medicine, and medical devices. (a) General... of agricultural commodities, medicine, and medical devices, provided that the sale and exportation or...
31 CFR 538.526 - Brokering sales of agricultural commodities, medicine, and medical devices.

Code of Federal Regulations, 2011 CFR

2011-07-01

... commodities, medicine, and medical devices. 538.526 Section 538.526 Money and Finance: Treasury Regulations... Brokering sales of agricultural commodities, medicine, and medical devices. (a) General license for... agricultural commodities, medicine, and medical devices to the Government of Sudan, to any individual or entity...
31 CFR 560.533 - Brokering sales of agricultural commodities, medicine, and medical devices.

Code of Federal Regulations, 2014 CFR

2014-07-01

... commodities, medicine, and medical devices. 560.533 Section 560.533 Money and Finance: Treasury Regulations... Policy § 560.533 Brokering sales of agricultural commodities, medicine, and medical devices. (a) General... agricultural commodities, medicine, and medical devices, provided that the sale and exportation or...
31 CFR 538.526 - Brokering sales of agricultural commodities, medicine, and medical devices.

Code of Federal Regulations, 2010 CFR

2010-07-01

... commodities, medicine, and medical devices. 538.526 Section 538.526 Money and Finance: Treasury Regulations... Brokering sales of agricultural commodities, medicine, and medical devices. (a) General license for... agricultural commodities, medicine, and medical devices to the Government of Sudan, to any individual or entity...
31 CFR 560.530 - Commercial sales, exportation, and reexportation of agricultural commodities, medicine, and...

Code of Federal Regulations, 2014 CFR

2014-07-01

... reexportation of agricultural commodities, medicine, and medical devices. 560.530 Section 560.530 Money and... commodities, medicine, and medical devices. (a)(1) One-year license requirement. (i) The exportation or reexportation of agricultural commodities, medicine, and medical devices that are not covered by the general...
17 CFR 4.12 - Exemption from provisions of part 4.

Code of Federal Regulations, 2010 CFR

2010-04-01

... part 4. 4.12 Section 4.12 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION... said Act; (B) Will generally and routinely engage in the buying and selling of securities and securities derived instruments; (C) Will not enter into commodity futures and commodity options contracts for...
31 CFR 560.526 - Commodities trading and related transactions.

Code of Federal Regulations, 2012 CFR

2012-07-01

... 31 Money and Finance: Treasury 3 2012-07-01 2012-07-01 false Commodities trading and related... Licenses, Authorizations and Statements of Licensing Policy § 560.526 Commodities trading and related transactions. (a) Trading in Iranian-origin commodities. With respect to § 560.206, specific licenses may be...
75 FR 44890 - Operation, in the Ordinary Course, of a Commodity Broker in Bankruptcy

Federal Register 2010, 2011, 2012, 2013, 2014

2010-07-30

... of the customers of such commodity broker, under appropriate circumstances, as determined by the... contracts on behalf of the customers of such commodity broker (the “Notice”). The proposed rule stated... customer accounts are handled, under appropriate circumstances, in a commodity broker bankruptcy, which may...
7 CFR 17.5 - Contracts between commodity suppliers and importers.

Code of Federal Regulations, 2010 CFR

2010-01-01

... officers and a description of the firm's experience as an exporter of U.S. agricultural commodities. Copies... under this part and the purchase authorization. (3) If, at the time the commodity supplier reports the... requirements unless otherwise specified in the purchase authorization. (1) Commodity contracts between...
17 CFR 31.6 - Registration of leverage commodities.

Code of Federal Regulations, 2011 CFR

2011-04-01

... commodity's economic value and how such amendments might affect the ability of leverage customers making or... a change in the economic value of such commodities and, if so, quantify the extent of such changes... the ability of leverage customers electing to make or take delivery of the commodity at an economic...
31 CFR 560.526 - Commodities trading and related transactions.

Code of Federal Regulations, 2010 CFR

2010-07-01

... 31 Money and Finance: Treasury 3 2010-07-01 2010-07-01 false Commodities trading and related... Licenses, Authorizations and Statements of Licensing Policy § 560.526 Commodities trading and related transactions. (a) Trading in Iranian-origin commodities. With respect to § 560.206, specific licenses may be...
7 CFR 1488.9a - Evidence of export for commodities delivered before export.

Code of Federal Regulations, 2013 CFR

2013-01-01

... COMMODITIES Financing of Export Sales of Agricultural Commodities From Private Stocks Under CCC Export Credit... financial period is 12 months or less, the exporter shall furnish a certification to the Treasurer, CCC... Assistant Treasurer, CCC, certifying that the commodities have been exported. The certification must include...
7 CFR 1488.9a - Evidence of export for commodities delivered before export.

Code of Federal Regulations, 2012 CFR

2012-01-01

... COMMODITIES Financing of Export Sales of Agricultural Commodities From Private Stocks Under CCC Export Credit... financial period is 12 months or less, the exporter shall furnish a certification to the Treasurer, CCC... Assistant Treasurer, CCC, certifying that the commodities have been exported. The certification must include...
7 CFR 1488.9a - Evidence of export for commodities delivered before export.

Code of Federal Regulations, 2014 CFR

2014-01-01

... COMMODITIES Financing of Export Sales of Agricultural Commodities From Private Stocks Under CCC Export Credit... financial period is 12 months or less, the exporter shall furnish a certification to the Treasurer, CCC... Assistant Treasurer, CCC, certifying that the commodities have been exported. The certification must include...
Synthesis and Verification of Biobased Terephthalic Acid from Furfural

PubMed Central

Tachibana, Yuya; Kimura, Saori; Kasuya, Ken-ichi

2015-01-01

Exploiting biomass as an alternative to petrochemicals for the production of commodity plastics is vitally important if we are to become a more sustainable society. Here, we report a synthetic route for the production of terephthalic acid (TPA), the monomer of the widely used thermoplastic polymer poly(ethylene terephthalate) (PET), from the biomass-derived starting material furfural. Biobased furfural was oxidised and dehydrated to give maleic anhydride, which was further reacted with biobased furan to give its Diels-Alder (DA) adduct. The dehydration of the DA adduct gave phthalic anhydride, which was converted via phthalic acid and dipotassium phthalate to TPA. The biobased carbon content of the TPA was measured by accelerator mass spectroscopy and the TPA was found to be made of 100% biobased carbon. PMID:25648201
