Science.gov

Sample records for mobile graphics processing

  1. Evaluating Mobile Graphics Processing Units (GPUs) for Real-Time Resource Constrained Applications

    SciTech Connect

    Meredith, J; Conger, J; Liu, Y; Johnson, J

    2005-11-11

    Modern graphics processing units (GPUs) can provide tremendous performance boosts for some applications beyond what a single CPU can accomplish, and their performance is growing faster than that of CPUs as well. Mobile GPUs available for laptops have the small form factor and low power requirements suitable for use in embedded processing. We evaluated several desktop and mobile GPUs and CPUs on traditional and non-traditional graphics tasks, as well as on the most time-consuming pieces of a full hyperspectral imaging application. Accuracy remained high despite small differences in arithmetic operations like rounding. Performance improvements are summarized here relative to a desktop Pentium 4 CPU.

  2. Oklahoma's Mobile Computer Graphics Laboratory.

    ERIC Educational Resources Information Center

    McClain, Gerald R.

    This Computer Graphics Laboratory houses an IBM 1130 computer, U.C.C. plotter, printer, card reader, two key punch machines, and seminar-type classroom furniture. A "General Drafting Graphics System" (GDGS) is used, based on repetitive use of basic coordinate and plot generating commands. The system is used by 12 institutions of higher education…

  3. Graphical Language for Data Processing

    NASA Technical Reports Server (NTRS)

    Alphonso, Keith

    2011-01-01

    A graphical language for processing data allows processing elements to be connected with virtual wires that represent data flows between processing modules. The processing of complex data, such as lidar data, requires many different algorithms to be applied. The purpose of this innovation is to automate the processing of such complex data without the need for complex scripting and programming languages. The system consists of a set of user-interface components that allow the user to drag and drop various algorithmic and processing components onto a process graph. By working graphically, the user can completely visualize the process flow and create complex diagrams. This innovation supports the nesting of graphs, such that a graph can be included in another graph as a single step for processing. In addition to the user-interface components, the system includes a set of .NET classes that represent the graph internally. These classes provide the internal system representation of the graphical user interface. The system includes a graph execution component that reads the internal representation of the graph and executes it. The execution of the graph follows the interpreted model of execution in that each node is traversed and executed from the original internal representation. In addition, there are components that allow external code elements, such as algorithms, to be easily integrated into the system, making the system readily extensible.

  4. Toward energy-aware balancing of mobile graphics

    NASA Astrophysics Data System (ADS)

    Stavrakis, Efstathios; Polychronis, Marios; Pelekanos, Nectarios; Artusi, Alessandro; Hadjichristodoulou, Panayiotis; Chrysanthou, Yiorgos

    2015-03-01

    In the area of computer graphics, the design of hardware and software has primarily been driven by the need to achieve maximum performance. Energy efficiency was usually neglected, on the assumption that a stable, always-on power source was available. However, the advent of the mobile era has brought these ideas and designs into question, since mobile devices are limited both in their computational capabilities and in their energy sources. Aligned with this emerging need for energy-efficiency analysis in computer graphics, we have set up a software framework to obtain power measurements from 3D scenes using off-the-shelf hardware that allows sampling the energy consumption over the power rails of the CPU and GPU. Our experiments vary geometric complexity, texture resolution, and common CPU and GPU workloads. The goal of this work is to combine the knowledge obtained from these measurements into a prototype energy-aware balancer of processing resources. The balancer dynamically selects the rendering parameters and uses a simple framerate-based dynamic frequency scaling strategy. Our experimental results demonstrate that our power-saving framework can achieve savings of approximately 40%.

  5. Cockpit weather graphics using mobile satellite communications

    NASA Technical Reports Server (NTRS)

    Seth, Shashi

    1993-01-01

    Many new companies are pushing state-of-the-art technology to bring a revolution to the cockpits of General Aviation (GA) aircraft. The vision, according to Dr. Bruce Holmes, the Assistant Director for Aeronautics at the National Aeronautics and Space Administration's (NASA) Langley Research Center, is to provide a flight control system so advanced that the motor and cognitive skills used to drive a car would be very similar to those used to fly an airplane. We at ViGYAN, Inc., are currently developing a system called the Pilot Weather Advisor (PWxA), which would be part of such an advanced-technology flight management system. The PWxA provides graphical depictions of weather information in the cockpit of aircraft in near real-time, through the use of broadcast satellite communications. The purpose of this system is to improve the safety and utility of GA aircraft operations. Considerable effort is being expended on research in the design of graphical weather systems, notably the works of Scanlon and Dash. The concept of providing pilots with graphical depictions of weather conditions, overlaid on geographical and navigational maps, is extremely powerful.

  6. Graphics hardware accelerated panorama builder for mobile phones

    NASA Astrophysics Data System (ADS)

    Bordallo López, Miguel; Hannuksela, Jari; Silvén, Olli; Vehviläinen, Markku

    2009-02-01

    Modern mobile communication devices frequently contain built-in cameras allowing users to capture high-resolution still images, but at the same time the imaging applications are facing both usability and throughput bottlenecks. The difficulties in taking ad hoc pictures of printed paper documents with multi-megapixel cellular phone cameras, a common business use case, illustrate these problems for anyone. The result can be examined only after several seconds and is often blurry, so a new picture is needed, although the viewfinder image had looked good. The process can be frustrating, with waits and no way for the user to predict the quality beforehand. The problems can be traced to the mismatch between processor speed and camera resolution, and to the application's interactivity demands. In this context we analyze building mosaic images of printed documents from frames selected from VGA-resolution (640x480 pixel) video. High interactivity is achieved by providing real-time feedback on the quality, while simultaneously guiding the user's actions. The graphics processing unit of the mobile device can be used to speed up the reconstruction computations. To demonstrate the viability of the concept, we present an interactive document scanning application implemented on a Nokia N95 mobile phone.

  7. Graphic Design in Libraries: A Conceptual Process

    ERIC Educational Resources Information Center

    Ruiz, Miguel

    2014-01-01

    Providing successful library services requires efficient and effective communication with users; therefore, it is important that content creators who develop visual materials understand key components of design and, specifically, develop a holistic graphic design process. Graphic design, as a form of visual communication, is the process of…

  8. Hyperspectral processing in graphical processing units

    NASA Astrophysics Data System (ADS)

    Winter, Michael E.; Winter, Edwin M.

    2011-06-01

    With the advent of the commercial 3D video card in the mid-1990s, we have seen an order-of-magnitude performance increase with each generation of new video cards. While these cards were designed primarily for visualization and video games, it became apparent after a short while that they could be used for scientific purposes. These Graphical Processing Units (GPUs) are rapidly being incorporated into data processing tasks usually reserved for general purpose computers. It has been found that many image processing problems scale well to modern GPU systems. We have implemented four popular hyperspectral processing algorithms (N-FINDR, linear unmixing, Principal Components, and the RX anomaly detection algorithm). These algorithms show an across-the-board speedup of at least a factor of 10, with some special cases showing extreme speedups of a hundred times or more.
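
    Of the four algorithms, linear unmixing maps most directly onto a GPU: each pixel's abundance estimate is an independent small matrix-vector product. Below is a minimal CUDA sketch of that pattern, assuming the pseudo-inverse P of the endmember matrix has been precomputed on the host; names, sizes, and layouts are illustrative, not the authors' code.

      // One thread per pixel: a = P * x, where P (m x bands) is the
      // precomputed pseudo-inverse of the endmember matrix and x is the
      // pixel's spectrum. Compile with nvcc; layouts are row-major.
      __global__ void unmix(const float* spectra,  // n_pixels x bands
                            const float* P,        // m x bands
                            float* abund,          // n_pixels x m
                            int n_pixels, int bands, int m)
      {
          int p = blockIdx.x * blockDim.x + threadIdx.x;
          if (p >= n_pixels) return;
          const float* x = spectra + (size_t)p * bands;
          for (int e = 0; e < m; ++e) {
              float acc = 0.0f;
              for (int b = 0; b < bands; ++b)
                  acc += P[e * bands + b] * x[b];
              abund[(size_t)p * m + e] = acc;
          }
      }
      // Typical launch: unmix<<<(n_pixels + 255) / 256, 256>>>(...);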

  9. Optimization Techniques for 3D Graphics Deployment on Mobile Devices

    NASA Astrophysics Data System (ADS)

    Koskela, Timo; Vatjus-Anttila, Jarkko

    2015-03-01

    3D Internet technologies are becoming essential enablers in many application areas including games, education, collaboration, navigation and social networking. The use of 3D Internet applications with mobile devices provides location-independent access and richer use context, but also performance issues. Therefore, one of the important challenges facing 3D Internet applications is the deployment of 3D graphics on mobile devices. In this article, we present an extensive survey on optimization techniques for 3D graphics deployment on mobile devices and qualitatively analyze the applicability of each technique from the standpoints of visual quality, performance and energy consumption. The analysis focuses on optimization techniques related to data-driven 3D graphics deployment, because it supports off-line use, multi-user interaction, user-created 3D graphics and creation of arbitrary 3D graphics. The outcome of the analysis facilitates the development and deployment of 3D Internet applications on mobile devices and provides guidelines for future research.

  10. Process and representation in graphical displays

    NASA Technical Reports Server (NTRS)

    Gillan, Douglas J.; Lewis, Robert; Rudisill, Marianne

    1990-01-01

    How people comprehend graphics is examined. Graphical comprehension involves the cognitive representation of information from a graphic display and the processing strategies that people apply to answer questions about graphics. Research on representation has examined both the features present in a graphic display and the cognitive representation of the graphic. The key features include the physical components of a graph, the relation between the figure and its axes, and the information in the graph. Tests of people's memory for graphs indicate that both the physical and informational aspects of a graph are important in its cognitive representation. However, the physical (or perceptual) features overshadow the information to a large degree. Processing strategies also involve a perception-information distinction. In order to answer simple questions (e.g., determining the value of a variable, comparing several variables, and determining the mean of a set of variables), people switch between two information processing strategies: (1) an arithmetic, look-up strategy in which they use a graph much like a table, looking up values and performing arithmetic calculations; and (2) a perceptual strategy in which they use the spatial characteristics of the graph to make comparisons and estimations. The user's choice of strategies depends on the task and the characteristics of the graph. A theory of graphic comprehension is presented.

  11. Process and representation in graphical displays

    NASA Technical Reports Server (NTRS)

    Gillan, Douglas J.; Lewis, Robert; Rudisill, Marianne

    1993-01-01

    Our initial model of graphic comprehension has focused on statistical graphs. Like other models of human-computer interaction, models of graphical comprehension can be used by human-computer interface designers and developers to create interfaces that present information in an efficient and usable manner. Our investigation of graph comprehension addresses two primary questions: how do people represent the information contained in a data graph, and how do they process information from the graph? The topics of focus for graphic representation concern the features into which people decompose a graph and the representations of the graph in memory. The issue of processing can be further analyzed as two questions: what overall processing strategies do people use, and what are the specific processing skills required?

  12. Graphics processing unit-assisted lossless decompression

    DOEpatents

    Loughry, Thomas A.

    2016-04-12

    Systems and methods for decompressing compressed data that has been compressed by way of a lossless compression algorithm are described herein. In a general embodiment, a graphics processing unit (GPU) is programmed to receive compressed data packets and decompress such packets in parallel. The compressed data packets are compressed representations of an image, and the lossless compression algorithm is a Rice compression algorithm.
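
    As a rough illustration of the patent's parallel pattern, the sketch below decodes one packet per thread. The Rice convention shown (a unary quotient of 1-bits terminated by a 0, followed by k remainder bits, MSB-first) and the packet layout are assumptions for illustration; the patent text above does not specify them.

      // One thread per compressed packet; packets are independent, so
      // they decode in parallel. Offsets index into a shared bitstream.
      __device__ int get_bit(const unsigned char* buf, long long bitpos)
      {
          return (buf[bitpos >> 3] >> (7 - (bitpos & 7))) & 1;
      }

      __global__ void rice_decode(const unsigned char* comp,
                                  const long long* packet_bit_offset,
                                  unsigned int* out, int per_packet,
                                  int n_packets, int k)
      {
          int p = blockIdx.x * blockDim.x + threadIdx.x;
          if (p >= n_packets) return;
          long long pos = packet_bit_offset[p];
          unsigned int* dst = out + (size_t)p * per_packet;
          for (int i = 0; i < per_packet; ++i) {
              unsigned int q = 0;
              while (get_bit(comp, pos++)) ++q;   // unary quotient
              unsigned int r = 0;
              for (int b = 0; b < k; ++b)         // k-bit remainder
                  r = (r << 1) | get_bit(comp, pos++);
              dst[i] = (q << k) | r;              // v = q * 2^k + r
          }
      }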

  13. Graphical analysis of power systems for mobile robotics

    NASA Astrophysics Data System (ADS)

    Raade, Justin William

    The field of mobile robotics places stringent demands on the power system. Energetic autonomy, or the ability to function for a useful operation time independent of any tether, refueling, or recharging, is a driving force in a robot designed for a field application. The focus of this dissertation is the development of two graphical analysis tools, namely Ragone plots and optimal hybridization plots, for the design of human-scale mobile robotic power systems. These tools contribute to an intuitive understanding of the performance of a power system and expand the toolbox of the design engineer. Ragone plots are useful for graphically comparing the merits of different power systems over a wide range of operation times. They plot the specific power versus the specific energy of a system on logarithmic scales. The driving equations in the creation of a Ragone plot are derived in terms of several important system parameters, and trends at extreme operation times (both very short and very long) are examined. Ragone plot analysis is applied to the design of several power systems for high-power human exoskeletons, including a monopropellant-powered free-piston hydraulic pump, a gasoline-powered internal combustion engine with hydraulic actuators, and a fuel cell with electric actuators. Hybrid power systems consist of two or more distinct energy sources that are used together to meet a single load. They can often outperform non-hybrid power systems in low duty-cycle applications or those with widely varying load profiles and long operation times. Two types of energy sources are defined: engine-like and capacitive. The hybridization rules for different combinations of energy sources are derived using graphical plots of hybrid power system mass versus primary system power. Optimal hybridization analysis is applied to several power systems for low-power human exoskeletons. Hybrid power systems examined include a fuel cell and a solar panel coupled with…
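
    For reference, the relations a Ragone plot encodes can be stated compactly; the notation below is mine, not the dissertation's. A source of mass m with specific energy ê = E/m and specific power p̂ = P/m has a mass-independent operation time, so constant-time contours are unit-slope lines on the log-log axes, and sizing a source for a load (E_req, P_req) reduces to a maximum:

      \[
        t = \frac{E}{P} = \frac{\hat{e}}{\hat{p}}, \qquad
        \log \hat{p} = \log \hat{e} - \log t, \qquad
        m_{\min} = \max\!\left( \frac{E_{\mathrm{req}}}{\hat{e}},
                                \frac{P_{\mathrm{req}}}{\hat{p}} \right)
      \]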

  14. Graphics Processing Unit Assisted Thermographic Compositing

    NASA Technical Reports Server (NTRS)

    Ragasa, Scott; McDougal, Matthew; Russell, Sam

    2013-01-01

    Objective: To develop a software application utilizing general purpose graphics processing units (GPUs) for the analysis of large sets of thermographic data. Background: Over the past few years, an increasing effort among scientists and engineers to utilize the GPU in a more general purpose fashion is allowing for supercomputer-level results at individual workstations. As data sets grow, the methods to work with them must grow at an equal, and often greater, pace. Certain common computations can take advantage of the massively parallel and optimized hardware constructs of the GPU to allow for throughput that was previously reserved for compute clusters. These common computations have high degrees of data parallelism; that is, they are the same computation applied to a large set of data where the result does not depend on other data elements. Signal (image) processing is one area where GPUs are being used to greatly increase the performance of certain algorithms and analysis techniques.
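
    For illustration, the kind of data-parallel computation described here can be as simple as a per-pixel temporal statistic over a stack of frames: one thread per pixel, with no dependence between threads. A minimal CUDA sketch follows; the frame-major layout is an assumption.

      // Each thread reduces one pixel's time history to its mean over
      // n_frames; frames are stored frame-major (frame f, then pixel p).
      __global__ void pixel_mean(const float* frames, float* mean,
                                 int n_pixels, int n_frames)
      {
          int p = blockIdx.x * blockDim.x + threadIdx.x;
          if (p >= n_pixels) return;
          float acc = 0.0f;
          for (int f = 0; f < n_frames; ++f)
              acc += frames[(size_t)f * n_pixels + p];
          mean[p] = acc / n_frames;
      }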

  15. Process control graphics for petrochemical plants

    SciTech Connect

    Lieber, R.E.

    1982-12-01

    Describes many specialized features of a computer control system, schematic/graphics in particular, which are vital to effectively run today's complex refineries and chemical plants. Illustrates such control systems as a full-graphic control house panel of the 60s, a European refinery control house of the early 70s, and the Ingolstadt refinery control house. Presents a diagram showing a shape library. Implementation of state-of-the-art control theory, distributed control, dual hi-way digital instrument systems, and many other person-machine interface developments have been prime factors in process control. Further developments in person-machine interfaces are in progress, including voice input/output, touch screens, and other entry devices. Color usage, angle of projection, control house lighting, and pattern recognition are all being studied by vendors, users, and academics. These studies involve psychologists concerned with "quality of life" factors, employee relations personnel concerned with labor contracts or restrictions, as well as operations personnel concerned with just getting the plant to run better.

  16. Partial wave analysis using graphics processing units

    NASA Astrophysics Data System (ADS)

    Berger, Niklaus; Beijiang, Liu; Jike, Wang

    2010-04-01

    Partial wave analysis is an important tool for determining resonance properties in hadron spectroscopy. For large data samples, however, the unbinned likelihood fits employed are computationally very expensive. At the Beijing Spectrometer (BES) III experiment, an increase in statistics of up to two orders of magnitude compared to earlier experiments is expected. In order to allow for a timely analysis of these datasets, additional computing power with short turnaround times has to be made available. It turns out that graphics processing units (GPUs), originally developed for 3D computer games, have an architecture of massively parallel single-instruction multiple-data floating point units that is almost ideally suited to the algorithms employed in partial wave analysis. We have implemented a framework for tensor manipulation and partial wave fits called GPUPWA. The user writes a program in pure C++ whilst the GPUPWA classes handle computations on the GPU, memory transfers, caching, and other technical details. In conjunction with a recent graphics processor, the framework provides a speed-up of the partial wave fit by more than two orders of magnitude compared to legacy FORTRAN code.

  17. Kernel density estimation using graphical processing unit

    NASA Astrophysics Data System (ADS)

    Sunarko; Su'ud, Zaki

    2015-09-01

    Kernel density estimation for particles distributed over a 2-dimensional space is calculated using a single graphical processing unit (GTX 660Ti GPU) and the CUDA-C language. Parallel calculations are done for particles having a bivariate normal distribution by assigning calculations for equally-spaced node points to each scalar processor in the GPU. The number of particles, blocks, and threads are varied to identify a favorable configuration. Comparisons are obtained by performing the same calculation using 1, 2 and 4 processors on a 3.0 GHz CPU using MPICH 2.0 routines. Speedups attained with the GPU are in the range of 88 to 349 times compared to the multiprocessor CPU. Blocks of 128 threads are found to be the optimum configuration for this case.
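
    A minimal CUDA-C sketch of the computation described, with one thread (scalar processor) per node point summing a bivariate Gaussian kernel over all particles. The isotropic bandwidth h and the normalization are assumptions; the paper's exact kernel and grid layout may differ.

      // density[i] = sum_j exp(-|q_i - x_j|^2 / (2 h^2)) / (n * 2*pi*h^2)
      __global__ void kde2d(const float2* parts, int n_parts,
                            const float2* nodes, float* density,
                            int n_nodes, float h)
      {
          int i = blockIdx.x * blockDim.x + threadIdx.x;
          if (i >= n_nodes) return;
          float2 q = nodes[i];
          float inv2h2 = 1.0f / (2.0f * h * h);
          float acc = 0.0f;
          for (int j = 0; j < n_parts; ++j) {
              float dx = parts[j].x - q.x;
              float dy = parts[j].y - q.y;
              acc += expf(-(dx * dx + dy * dy) * inv2h2);
          }
          density[i] = acc / (n_parts * 2.0f * 3.14159265f * h * h);
      }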

  18. Graphics Processing Units for HEP trigger systems

    NASA Astrophysics Data System (ADS)

    Ammendola, R.; Bauce, M.; Biagioni, A.; Chiozzi, S.; Cotta Ramusino, A.; Fantechi, R.; Fiorini, M.; Giagu, S.; Gianoli, A.; Lamanna, G.; Lonardo, A.; Messina, A.; Neri, I.; Paolucci, P. S.; Piandani, R.; Pontisso, L.; Rescigno, M.; Simula, F.; Sozzi, M.; Vicini, P.

    2016-07-01

    General-purpose computing on GPUs (Graphics Processing Units) is emerging as a new paradigm in several fields of science, although so far applications have been tailored to the specific strengths of such devices as accelerators in offline computation. With the steady reduction of GPU latencies and the increase in link and memory throughput, the use of such devices for real-time applications in high-energy physics data acquisition and trigger systems is becoming ripe. We discuss the use of online parallel computing on GPUs for a synchronous low-level trigger, focusing on the CERN NA62 experiment's trigger system. The use of GPUs in higher-level trigger systems is also briefly considered.

  19. Graphics Processing Unit Assisted Thermographic Compositing

    NASA Technical Reports Server (NTRS)

    Ragasa, Scott; McDougal, Matthew; Russell, Sam

    2012-01-01

    Objective: To develop a software application utilizing general purpose graphics processing units (GPUs) for the analysis of large sets of thermographic data. Background: Over the past few years, an increasing effort among scientists and engineers to utilize the GPU in a more general purpose fashion is allowing for supercomputer-level results at individual workstations. As data sets grow, the methods to work with them must grow at an equal, and often greater, pace. Certain common computations can take advantage of the massively parallel and optimized hardware constructs of the GPU to allow for throughput that was previously reserved for compute clusters. These common computations have high degrees of data parallelism; that is, they are the same computation applied to a large set of data where the result does not depend on other data elements. Signal (image) processing is one area where GPUs are being used to greatly increase the performance of certain algorithms and analysis techniques. Technical Methodology/Approach: Apply massively parallel algorithms and data structures to the specific analysis requirements presented when working with thermographic data sets.

  20. Animation and Learning: Selective Processing of Information in Dynamic Graphics.

    ERIC Educational Resources Information Center

    Lowe, R. K.

    2003-01-01

    Studied the selective processing of information in dynamic graphics by 12 undergraduates who received training aided by animation and 12 who did not. Results indicate selective processing of the animation that involved perceptually driven dynamic effects and raise questions about the assumed superiority of animations over static graphics. (SLD)

  1. Text and Illustration Processing System (TIPS) User's Manual. Volume 2: Graphics Processing System.

    ERIC Educational Resources Information Center

    Cox, Ray; Braby, Richard

    This manual contains the procedures to teach the relatively inexperienced author how to enter and process graphic information on a graphics processing system developed by the Training Analysis and Evaluation Group. It describes the illustration processing routines, including scanning graphics into computer memory, displaying graphics, enhancing…

  2. Graphics

    ERIC Educational Resources Information Center

    Post, Susan

    1975-01-01

    An art teacher described an elective course in graphics which was designed to enlarge a student's knowledge of value, color, shape within a shape, transparency, line and texture. This course utilized the technique of working a multi-colored print from a single block that was first introduced by Picasso. (Author/RK)

  3. Reading the Graphics: What Is the Relationship between Graphical Reading Processes and Student Comprehension?

    ERIC Educational Resources Information Center

    Norman, Rebecca R.

    2012-01-01

    Research on comprehension of written text and reading processes suggests a greater use of reading processes is associated with higher scores on comprehension measures of those same texts. Although researchers have suggested that the graphics in text convey important meaning, little research exists on the relationship between children's processes…

  4. Reading the Graphics: Reading Processes Prompted by the Graphics as Second Graders Read Informational Text

    ERIC Educational Resources Information Center

    Norman, Rebecca R.

    2010-01-01

    This dissertation is comprised of two manuscripts that resulted from a single study using verbal protocols to examine the reading processes prompted by the graphics as second graders read informational text. Verbal protocols have provided researchers with an understanding of the processes readers use as they read. Little is known, however, about…

  5. Picture This: Processes Prompted by Graphics in Informational Text

    ERIC Educational Resources Information Center

    Norman, Rebecca R.

    2010-01-01

    Verbal protocols have provided literacy researchers with a strong understanding of what processes readers (both adults and children) use as they read narrative and informational text. Little is known, however, about the comprehension processes that are prompted by the graphics in these texts. This study of nine second graders used verbal protocol…

  6. Graphic Arts: Book Two. Process Camera, Stripping, and Platemaking.

    ERIC Educational Resources Information Center

    Farajollahi, Karim; And Others

    The second of a three-volume set of instructional materials for a course in graphic arts, this manual consists of 10 instructional units dealing with the process camera, stripping, and platemaking. Covered in the individual units are the process camera and darkroom photography, line photography, half-tone photography, other darkroom techniques,…

  7. Graphic Arts: Process Camera, Stripping, and Platemaking. Third Edition.

    ERIC Educational Resources Information Center

    Crummett, Dan

    This document contains teacher and student materials for a course in graphic arts concentrating on camera work, stripping, and plate making in the printing process. Eight units of instruction cover the following topics: (1) the process camera and darkroom equipment; (2) line photography; (3) halftone photography; (4) other darkroom techniques; (5)…

  8. Graphic Arts: The Press and Finishing Processes. Third Edition.

    ERIC Educational Resources Information Center

    Crummett, Dan

    This document contains teacher and student materials for a course in graphic arts concentrating on printing presses and the finishing process for publications. Seven units of instruction cover the following topics: (1) offset press systems; (2) offset inks and dampening chemistry; (3) offset press operating procedures; (4) preventive maintenance…

  9. Graphic Arts: Book Three. The Press and Related Processes.

    ERIC Educational Resources Information Center

    Farajollahi, Karim; And Others

    The third of a three-volume set of instructional materials for a graphic arts course, this manual consists of nine instructional units dealing with presses and related processes. Covered in the units are basic press fundamentals, offset press systems, offset press operating procedures, offset inks and dampening chemistry, preventive maintenance…

  10. Digital-Computer Processing of Graphical Data. Final Report.

    ERIC Educational Resources Information Center

    Freeman, Herbert

    The final report of a two-year study concerned with the digital-computer processing of graphical data. Five separate investigations carried out under this study are described briefly, and a detailed bibliography, complete with abstracts, is included in which are listed the technical papers and reports published during the period of this program.…

  11. Engineering graphics and image processing at Langley Research Center

    NASA Technical Reports Server (NTRS)

    Voigt, Susan J.

    1985-01-01

    The objective of making raster graphics and image processing techniques readily available for the analysis and display of engineering and scientific data is stated. The approach is to develop and acquire tools and skills which are applied to support research activities in such disciplines as aeronautics and structures. A listing of grants and key personnel are given.

  12. Obliterable of graphics and correction of skew using Hough transform for mobile captured documents

    NASA Astrophysics Data System (ADS)

    Chethan, H. K.; Kumar, G. Hemantha

    2011-10-01

    CBDA (camera-based document analysis) is an emerging field in computer vision and pattern recognition. Cameras are now incorporated into many kinds of electronic equipment and play a vital role, as handheld imaging devices such as digital cameras, mobile phones, and camera-equipped gaming devices increasingly replace scanners. The goal of this work is to remove graphics from the document, a step that plays a vital role in recognizing characters in mobile-captured documents. In this paper we propose a novel method for separating and removing non-text graphics, such as logos and animations, from the document, together with a method to reduce noise; finally, the skew of the textual content is estimated and corrected using the Hough transform. The experimental results show the method's efficacy compared with well-known existing methods.
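
    The skew step works by Hough voting: each foreground pixel votes for the (angle, offset) lines passing through it, and the angle of the accumulator maximum gives the skew estimate. Below is a hedged sketch of that accumulation, written in CUDA for consistency with the other examples in this collection; the paper does not describe a GPU implementation, and the angle range and discretization are assumptions.

      // One thread per foreground pixel; each votes over candidate
      // angles with atomicAdd into an (angle x rho) accumulator.
      __global__ void hough_votes(const int2* fg, int n_pix, int* acc,
                                  int n_angles, int n_rho,
                                  float theta0, float dtheta, float rho_max)
      {
          int i = blockIdx.x * blockDim.x + threadIdx.x;
          if (i >= n_pix) return;
          int2 p = fg[i];
          for (int a = 0; a < n_angles; ++a) {
              float th = theta0 + a * dtheta;
              float rho = p.x * cosf(th) + p.y * sinf(th);
              int r = (int)((rho + rho_max) * (n_rho - 1) / (2.0f * rho_max));
              if (r >= 0 && r < n_rho)
                  atomicAdd(&acc[a * n_rho + r], 1);
          }
      }
      // The skew estimate is the theta of the accumulator maximum; the
      // page is then rotated by its negative to deskew the text.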

  13. Novel graphical environment for virtual and real-world operations of tracked mobile manipulators

    NASA Astrophysics Data System (ADS)

    Chen, ChuXin; Trivedi, Mohan M.; Azam, Mir; Lassiter, Nils T.

    1993-08-01

    A simulation, animation, visualization and interactive control (SAVIC) environment has been developed for the design and operation of an integrated mobile manipulator system. This unique system possesses the abilities for (1) multi-sensor simulation, (2) kinematics and locomotion animation, (3) dynamic motion and manipulation animation, (4) transformation between real and virtual modes within the same graphics system, (5) ease in exchanging software modules and hardware devices between real and virtual world operations, and (6) interfacing with a real robotic system. This paper describes a working system and illustrates the concepts by presenting the simulation, animation and control methodologies for a unique mobile robot with articulated tracks, a manipulator, and sensory modules.

  14. Beam line error analysis, position correction, and graphic processing

    NASA Astrophysics Data System (ADS)

    Wang, Fuhua; Mao, Naifeng

    1993-12-01

    A beam transport line error analysis and beam position correction code called "EAC" has been developed, together with a graphics and data post-processing package for TRANSPORT. Based on the linear optics design from TRANSPORT or other general optics codes, EAC independently analyzes the effects of magnet misalignments and of systematic and statistical errors of magnetic fields, as well as the effects of initial beam positions, on the central trajectory and on transverse beam-emittance dilution. EAC also provides an efficient way to develop beam line trajectory correction schemes. The post-processing package generates various types of graphics, such as the beam line geometrical layout, plots of the Twiss parameters, beam envelopes, etc. It also generates an EAC input file, thus connecting EAC with general optics codes. EAC and the post-processing package are small codes that are easy to access and use. They have become useful tools for the design of transport lines at SSCL.

  15. Graphical representation of the process of solving problems in statics

    NASA Astrophysics Data System (ADS)

    Lopez, Carlos

    2011-03-01

    A method is presented for constructing a graphical knowledge-representation technique called Conceptual Chains. In particular, this tool is focused on the representation of processes and has been applied to solving problems in physics, mathematics, and engineering. The method is described in ten steps and is illustrated with its development on a particular topic in statics. Various possible didactic applications of this technique are shown.

  16. Graphics processing unit acceleration of computational electromagnetic methods

    NASA Astrophysics Data System (ADS)

    Inman, Matthew

    The use of Graphical Processing Units (GPUs) for scientific applications has been evolving and expanding for over a decade. GPUs provide an alternative to the CPU for creating and executing the numerical codes that are often relied upon to perform simulations in computational electromagnetics. While originally designed purely to display graphics on the user's monitor, GPUs today are essentially powerful floating point co-processors that can be programmed not only to render complex graphics, but also to perform the complex mathematical calculations often encountered in scientific computing. The GPUs currently being produced often contain hundreds of separate cores able to access large amounts of high-speed dedicated memory. By utilizing the power offered by such a specialized processor, it is possible to drastically speed up the calculations required in computational electromagnetics. This increase in speed allows GPU-based simulations to be used in a variety of situations in which computational time has heretofore been a limiting factor, such as in educational courses. Teaching electromagnetics often relies on simple example problems because of the simulation times needed to analyze more complex ones. The use of GPU-based simulations will be shown to allow demonstrations of more advanced problems than previously possible by adapting the methods for use on the GPU. Modules will be developed for a wide variety of teaching situations, utilizing the speed of the GPU to demonstrate various techniques and ideas previously unrealizable.

  17. Adaptive-optics optical coherence tomography processing using a graphics processing unit.

    PubMed

    Shafer, Brandon A; Kriske, Jeffery E; Kocaoglu, Omer P; Turner, Timothy L; Liu, Zhuolin; Lee, John Jaehwan; Miller, Donald T

    2014-01-01

    Graphics processing units are increasingly being used for scientific computing for their powerful parallel processing abilities and moderate price compared to supercomputers and computing grids. In this paper we have used a general purpose graphics processing unit to process adaptive-optics optical coherence tomography (AOOCT) images in real time. Increasing the processing speed of AOOCT is an essential step in moving this super-high-resolution technology closer to clinical viability. PMID:25570838

  18. Porting a Hall MHD Code to a Graphic Processing Unit

    NASA Technical Reports Server (NTRS)

    Dorelli, John C.

    2011-01-01

    We present our experience porting a Hall MHD code to a Graphics Processing Unit (GPU). The code is a 2nd order accurate MUSCL-Hancock scheme which makes use of an HLL Riemann solver to compute numerical fluxes and second-order finite differences to compute the Hall contribution to the electric field. The divergence of the magnetic field is controlled with Dedner's hyperbolic divergence cleaning method. Preliminary benchmark tests indicate a speedup (relative to a single Nehalem core) of 58x for a double precision calculation. We discuss scaling issues which arise when distributing work across multiple GPUs in a CPU-GPU cluster.

  19. Off-line graphics processing: a case study

    SciTech Connect

    Harris, D.D.

    1983-09-01

    The Drafting Systems organization at Bendix, Kansas City Division, is responsible for the creation of computer-readable media used for producing photoplots, phototools, and production traveler illustrations. From 1977, when the organization acquired its first Applicon system, until 1982, when the off-line graphics processing system was added, the production of Gerber photoplotter tapes and APPLE files presented an ever-increasing load on the Applicon systems. This paper describes how the organization is now using a VAX to offload this work from the Applicon systems and presents the techniques used to automate the flow of data from the Applicon sources to the final users.

  20. Line-by-line spectroscopic simulations on graphics processing units

    NASA Astrophysics Data System (ADS)

    Collange, Sylvain; Daumas, Marc; Defour, David

    2008-01-01

    We report here on software that performs line-by-line spectroscopic simulations on gases. Elaborate models (such as narrow-band and correlated-K) are accurate and efficient for bands where various components are not simultaneously and significantly active. Line-by-line is probably the most accurate model in the infrared for blends of gases that contain high proportions of H2O and CO2, as was the case for our prototype simulation. Our implementation on graphics processing units sustains a speedup close to 330 on computation-intensive tasks and 12 on memory-intensive tasks compared to implementations on one core of high-end processors. This speedup is due to data parallelism, efficient memory access for specific patterns, and some dedicated hardware operators only available in graphics processing units. It is obtained leaving most processor resources available and it would scale linearly with the number of graphics processing units in parallel machines. Line-by-line simulation coupled with simulation of fluid dynamics was long believed to be economically intractable, but our work shows that it could be done with affordable additional resources compared to what is necessary to perform simulations of fluid dynamics alone. Program summary: Program title: GPU4RE. Catalogue identifier: ADZY_v1_0. Program summary URL: http://cpc.cs.qub.ac.uk/summaries/ADZY_v1_0.html. Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland. Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html. No. of lines in distributed program, including test data, etc.: 62 776. No. of bytes in distributed program, including test data, etc.: 1 513 247. Distribution format: tar.gz. Programming language: C++. Computer: x86 PC. Operating system: Linux, Microsoft Windows. Compilation requires either gcc/g++ under Linux or Visual C++ 2003/2005 and Cygwin under Windows. It has been tested using gcc 4.1.2 under Ubuntu Linux 7.04 and using Visual C…
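
    The core of a line-by-line code is an independent sum over spectral lines at every point of the wavenumber grid, which is why it maps so well to GPUs. A minimal CUDA sketch using Lorentzian profiles follows; the actual line shapes, units, and data layout of GPU4RE are not given in the abstract and are assumed here.

      // One thread per spectral grid point: absorption at wavenumber nu
      // is a sum of Lorentzians, S * (g/pi) / ((nu - nu0)^2 + g^2).
      __global__ void lbl_absorption(const float* nu_grid, float* k_abs,
                                     int n_nu,
                                     const float* line_nu0,   // centers
                                     const float* line_S,     // intensities
                                     const float* line_gamma, // half-widths
                                     int n_lines)
      {
          int i = blockIdx.x * blockDim.x + threadIdx.x;
          if (i >= n_nu) return;
          float nu = nu_grid[i];
          float acc = 0.0f;
          for (int l = 0; l < n_lines; ++l) {
              float d = nu - line_nu0[l];
              float g = line_gamma[l];
              acc += line_S[l] * g / (3.14159265f * (d * d + g * g));
          }
          k_abs[i] = acc;
      }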

  1. Optimized Laplacian image sharpening algorithm based on graphic processing unit

    NASA Astrophysics Data System (ADS)

    Ma, Tinghuai; Li, Lu; Ji, Sai; Wang, Xin; Tian, Yuan; Al-Dhelaan, Abdullah; Al-Rodhaan, Mznah

    2014-12-01

    In classical Laplacian image sharpening, all pixels are processed one by one, which leads to a large amount of computation. Traditional Laplacian sharpening on the CPU is considerably time-consuming, especially for large pictures. In this paper, we propose a parallel implementation of Laplacian sharpening based on the Compute Unified Device Architecture (CUDA), a computing platform for Graphic Processing Units (GPUs), and analyze the impact of picture size on performance as well as the relationship between data-transfer time and parallel-computing time. Further, exploiting the different characteristics of the GPU memory hierarchy, we develop an improved scheme that uses shared memory instead of global memory, which further increases efficiency. Experimental results show that the two novel algorithms outperform the traditional sequential method based on OpenCV in computing speed.
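
    A minimal CUDA sketch of the baseline kernel described: one thread per interior pixel applying a 3x3 Laplacian sharpening stencil from global memory. The 5-point stencil, clamping, and border handling are standard choices rather than the paper's exact code; the improved scheme would additionally stage image tiles in shared memory before applying the same stencil.

      // Sharpen with the stencil [0 -1 0; -1 5 -1; 0 -1 0]: boost the
      // center, subtract the 4-neighborhood, clamp to the 8-bit range.
      __global__ void laplacian_sharpen(const unsigned char* in,
                                        unsigned char* out, int w, int h)
      {
          int x = blockIdx.x * blockDim.x + threadIdx.x;
          int y = blockIdx.y * blockDim.y + threadIdx.y;
          if (x < 1 || y < 1 || x >= w - 1 || y >= h - 1) return;
          int i = y * w + x;
          int v = 5 * in[i] - in[i - 1] - in[i + 1] - in[i - w] - in[i + w];
          out[i] = (unsigned char)min(max(v, 0), 255);
      }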

  2. Exploiting graphics processing units for computational biology and bioinformatics.

    PubMed

    Payne, Joshua L; Sinnott-Armstrong, Nicholas A; Moore, Jason H

    2010-09-01

    Advances in the video gaming industry have led to the production of low-cost, high-performance graphics processing units (GPUs) that possess more memory bandwidth and computational capability than central processing units (CPUs), the standard workhorses of scientific computing. With the recent release of general-purpose GPUs and NVIDIA's GPU programming language, CUDA, graphics engines are being adopted widely in scientific computing applications, particularly in the fields of computational biology and bioinformatics. The goal of this article is to concisely present an introduction to GPU hardware and programming, aimed at the computational biologist or bioinformaticist. To this end, we discuss the primary differences between GPU and CPU architecture, introduce the basics of the CUDA programming language, and discuss important CUDA programming practices, such as the proper use of coalesced reads, data types, and memory hierarchies. We highlight each of these topics in the context of computing the all-pairs distance between instances in a dataset, a common procedure in numerous disciplines of scientific computing. We conclude with a runtime analysis of the GPU and CPU implementations of the all-pairs distance calculation. We show our final GPU implementation to outperform the CPU implementation by a factor of 1700. PMID:20658333
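
    The article's running example, the all-pairs distance, has a particularly clean CUDA expression. In the sketch below the data are stored feature-major (d x n) so that threads with consecutive j read consecutive addresses, which illustrates the coalesced-read practice the article highlights; names and sizes are illustrative, not the article's code.

      // One thread per (i, j) pair of instances; dist is n x n.
      __global__ void all_pairs(const float* data, // d x n, feature-major
                                float* dist, int n, int d)
      {
          int i = blockIdx.y * blockDim.y + threadIdx.y;
          int j = blockIdx.x * blockDim.x + threadIdx.x;
          if (i >= n || j >= n) return;
          float acc = 0.0f;
          for (int k = 0; k < d; ++k) {
              float diff = data[k * n + i] - data[k * n + j];
              acc += diff * diff; // consecutive j -> coalesced loads
          }
          dist[i * n + j] = sqrtf(acc);
      }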

  3. Graphic and Phonological Processing in Chinese Character Identification.

    ERIC Educational Resources Information Center

    Ju, Daushen; Jackson, Nancy Ewald

    1995-01-01

    Examines the effect of graphic, phonological, and graphic-and-phonological information on Chinese character identification by 22 Mandarin-speaking Taiwanese graduate students. Finds that graphic information plays an essential role in Chinese character identification, while phonological information does not enhance the accuracy of identification.…

  4. Solar physics applications of computer graphics and image processing

    NASA Technical Reports Server (NTRS)

    Altschuler, M. D.

    1985-01-01

    Computer graphics devices coupled with computers and carefully developed software provide new opportunities to achieve insight into the geometry and time evolution of scalar, vector, and tensor fields and to extract more information quickly and cheaply from the same image data. Two or more different fields which overlay in space can be calculated from the data (and the physics), then displayed from any perspective, and compared visually. The maximum regions of one field can be compared with the gradients of another. Time changing fields can also be compared. Images can be added, subtracted, transformed, noise filtered, frequency filtered, contrast enhanced, color coded, enlarged, compressed, parameterized, and histogrammed, in whole or section by section. Today it is possible to process multiple digital images to reveal spatial and temporal correlations and cross correlations. Data from different observatories taken at different times can be processed, interpolated, and transformed to a common coordinate system.

  5. Graphics processing units accelerated semiclassical initial value representation molecular dynamics.

    PubMed

    Tamascelli, Dario; Dambrosio, Francesco Saverio; Conte, Riccardo; Ceotto, Michele

    2014-05-01

    This paper presents a Graphics Processing Units (GPUs) implementation of the Semiclassical Initial Value Representation (SC-IVR) propagator for vibrational molecular spectroscopy calculations. The time-averaging formulation of the SC-IVR for power spectrum calculations is employed. Details about the GPU implementation of the semiclassical code are provided. Four molecules with an increasing number of atoms are considered and the GPU-calculated vibrational frequencies perfectly match the benchmark values. The computational time scaling of two GPUs (NVIDIA Tesla C2075 and Kepler K20), respectively, versus two CPUs (Intel Core i5 and Intel Xeon E5-2687W) and the critical issues related to the GPU implementation are discussed. The resulting reduction in computational time and power consumption is significant and semiclassical GPU calculations are shown to be environment friendly. PMID:24811627

  6. Graphics Processing Unit Enhanced Parallel Document Flocking Clustering

    SciTech Connect

    Cui, Xiaohui; Potok, Thomas E; ST Charles, Jesse Lee

    2010-01-01

    Analyzing and clustering documents is a complex problem. One explored method of solving this problem borrows from nature, imitating the flocking behavior of birds. One limitation of this method of document clustering is its complexity, O(n²). As the number of documents grows, it becomes increasingly difficult to generate results in a reasonable amount of time. In the last few years, the graphics processing unit (GPU) has received attention for its ability to solve highly-parallel and semi-parallel problems much faster than the traditional sequential processor. In this paper, we have conducted research to exploit this architecture and apply its strengths to the flocking-based document clustering problem. Using the CUDA platform from NVIDIA, we developed a document flocking implementation to be run on the NVIDIA GeForce GPU. Performance gains ranged from thirty-six to nearly sixty times improvement of the GPU over the CPU implementation.

  7. Fast Docking on Graphics Processing Units via Ray-Casting

    PubMed Central

    Khar, Karen R.; Goldschmidt, Lukasz; Karanicolas, John

    2013-01-01

    Docking Approach using Ray Casting (DARC) is a structure-based computational method for carrying out virtual screening by docking small molecules into protein surface pockets. In a complementary study we find that DARC can be used to identify known inhibitors from large sets of decoy compounds, and can identify new compounds that are active in biochemical assays. Here, we describe our adaptation of DARC for use on Graphics Processing Units (GPUs), leading to a speedup of approximately 27-fold in typical-use cases over the corresponding calculations carried out using a CPU alone. This dramatic speedup of DARC will enable screening larger compound libraries, screening with more conformations of each compound, and including multiple receptor conformations when screening. We anticipate that all three of these enhanced approaches, which now become tractable, will lead to improved screening results. PMID:23976948

  8. Graphics processing units accelerated semiclassical initial value representation molecular dynamics

    NASA Astrophysics Data System (ADS)

    Tamascelli, Dario; Dambrosio, Francesco Saverio; Conte, Riccardo; Ceotto, Michele

    2014-05-01

    This paper presents a Graphics Processing Units (GPUs) implementation of the Semiclassical Initial Value Representation (SC-IVR) propagator for vibrational molecular spectroscopy calculations. The time-averaging formulation of the SC-IVR for power spectrum calculations is employed. Details about the GPU implementation of the semiclassical code are provided. Four molecules with an increasing number of atoms are considered and the GPU-calculated vibrational frequencies perfectly match the benchmark values. The computational time scaling of two GPUs (NVIDIA Tesla C2075 and Kepler K20), respectively, versus two CPUs (Intel Core i5 and Intel Xeon E5-2687W) and the critical issues related to the GPU implementation are discussed. The resulting reduction in computational time and power consumption is significant and semiclassical GPU calculations are shown to be environment friendly.

  9. Implementing wide baseline matching algorithms on a graphics processing unit.

    SciTech Connect

    Rothganger, Fredrick H.; Larson, Kurt W.; Gonzales, Antonio Ignacio; Myers, Daniel S.

    2007-10-01

    Wide baseline matching is the state of the art for object recognition and image registration problems in computer vision. Though effective, the computational expense of these algorithms limits their application to many real-world problems. The performance of wide baseline matching algorithms may be improved by using a graphical processing unit as a fast multithreaded co-processor. In this paper, we present an implementation of the difference of Gaussian feature extractor, based on the CUDA system of GPU programming developed by NVIDIA, and implemented on their hardware. For a 2000x2000 pixel image, the GPU-based method executes nearly thirteen times faster than a comparable CPU-based method, with no significant loss of accuracy.

  10. Graphics processing units accelerated semiclassical initial value representation molecular dynamics

    SciTech Connect

    Tamascelli, Dario; Dambrosio, Francesco Saverio; Conte, Riccardo; Ceotto, Michele

    2014-05-07

    This paper presents a Graphics Processing Units (GPUs) implementation of the Semiclassical Initial Value Representation (SC-IVR) propagator for vibrational molecular spectroscopy calculations. The time-averaging formulation of the SC-IVR for power spectrum calculations is employed. Details about the GPU implementation of the semiclassical code are provided. Four molecules with an increasing number of atoms are considered and the GPU-calculated vibrational frequencies perfectly match the benchmark values. The computational time scaling of two GPUs (NVIDIA Tesla C2075 and Kepler K20), respectively, versus two CPUs (Intel Core i5 and Intel Xeon E5-2687W) and the critical issues related to the GPU implementation are discussed. The resulting reduction in computational time and power consumption is significant and semiclassical GPU calculations are shown to be environment friendly.

  11. Graphics Processing Units and High-Dimensional Optimization

    PubMed Central

    Zhou, Hua; Lange, Kenneth; Suchard, Marc A.

    2011-01-01

    This paper discusses the potential of graphics processing units (GPUs) in high-dimensional optimization problems. A single GPU card with hundreds of arithmetic cores can be inserted in a personal computer and dramatically accelerates many statistical algorithms. To exploit these devices fully, optimization algorithms should reduce to multiple parallel tasks, each accessing a limited amount of data. These criteria favor EM and MM algorithms that separate parameters and data. To a lesser extent block relaxation and coordinate descent and ascent also qualify. We demonstrate the utility of GPUs in nonnegative matrix factorization, PET image reconstruction, and multidimensional scaling. Speedups of 100-fold can easily be attained. Over the next decade, GPUs will fundamentally alter the landscape of computational statistics. It is time for more statisticians to get on board. PMID:21847315

  12. Graphics Processing Unit Acceleration of Gyrokinetic Turbulence Simulations

    NASA Astrophysics Data System (ADS)

    Hause, Benjamin; Parker, Scott

    2012-10-01

    We find a substantial increase in on-node performance using Graphics Processing Unit (GPU) acceleration in gyrokinetic delta-f particle-in-cell simulation. Optimization is performed on a two-dimensional slab gyrokinetic particle simulation using the Portland Group Fortran compiler with the GPU accelerator compiler directives. We have implemented the GPU acceleration on a Core i7 gaming PC with a NVIDIA GTX 580 GPU. We find comparable, or better, acceleration relative to the NERSC DIRAC cluster with the NVIDIA Tesla C2050 computing processor. The Tesla C2050 is about 2.6 times more expensive than the GTX 580 gaming GPU. Optimization strategies and comparisons between DIRAC and the gaming PC will be presented. We will also discuss progress on optimizing the comprehensive three dimensional general geometry GEM code.

  13. Graphic Arts: Process Camera, Stripping, and Platemaking. Teacher Guide.

    ERIC Educational Resources Information Center

    Feasley, Sue C., Ed.

    This curriculum guide is the second in a three-volume series of instructional materials for competency-based graphic arts instruction. Each publication is designed to include the technical content and tasks necessary for a student to be employed in an entry-level graphic arts occupation. Introductory materials include an instructional/task…

  14. Role of Graphics Tools in the Learning Design Process

    ERIC Educational Resources Information Center

    Laisney, Patrice; Brandt-Pomares, Pascale

    2015-01-01

    This paper discusses the design activities of students in secondary school in France. Graphics tools are now part of the capacity of design professionals. It is therefore apt to reflect on their integration into the technological education. Has the use of intermediate graphical tools changed students' performance, and if so in what direction,…

  15. Graphic Arts: The Press and Finishing Processes. Teacher Guide.

    ERIC Educational Resources Information Center

    Feasley, Sue C., Ed.

    This curriculum guide is the third in a three-volume series of instructional materials for competency-based graphic arts instruction. Each publication is designed to include the technical content and tasks necessary for a student to be employed in an entry-level graphic arts occupation. Introductory materials include an instructional/task analysis…

  16. Matrix decomposition graphics processing unit solver for Poisson image editing

    NASA Astrophysics Data System (ADS)

    Lei, Zhao; Wei, Li

    2012-10-01

    In recent years, gradient-domain methods have been widely discussed in the image processing field, including seamless cloning and image stitching. These algorithms are commonly carried out by solving a large sparse linear system: the Poisson equation. However, solving the Poisson equation is a computationally and memory intensive task, which makes it unsuitable for real-time image editing. A new matrix decomposition graphics processing unit (GPU) solver (MDGS) is proposed to settle the problem. A matrix decomposition method is used to distribute the work among GPU threads, so that MDGS takes full advantage of the computing power of current GPUs. Additionally, MDGS is a hybrid solver (combining direct and iterative techniques) with a two-level architecture. These properties enable MDGS to generate solutions identical to those of the common Poisson methods and to achieve a high convergence rate in most cases. The approach is advantageous in terms of parallelizability, enabling real-time image processing, low memory consumption, and broad applicability.
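
    For orientation, the simplest GPU building block for such a solver is a Jacobi sweep of the discrete Poisson equation over the cloned region. MDGS itself is a more elaborate hybrid direct/iterative scheme, so the sketch below (sign convention and layout assumed) shows only the underlying iteration.

      // One Jacobi sweep for the 5-point discretization of lap(x) = b
      // with unit grid spacing: x_new = (sum of 4 neighbors - b) / 4.
      // Here b holds the guidance-field divergence and boundary terms.
      __global__ void jacobi_sweep(const float* x, float* x_new,
                                   const float* b, int w, int h)
      {
          int cx = blockIdx.x * blockDim.x + threadIdx.x;
          int cy = blockIdx.y * blockDim.y + threadIdx.y;
          if (cx < 1 || cy < 1 || cx >= w - 1 || cy >= h - 1) return;
          int i = cy * w + cx;
          x_new[i] = 0.25f * (x[i - 1] + x[i + 1]
                            + x[i - w] + x[i + w] - b[i]);
      }
      // The host ping-pongs x and x_new until the residual is small.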

  17. Accelerating sparse linear algebra using graphics processing units

    NASA Astrophysics Data System (ADS)

    Spagnoli, Kyle E.; Humphrey, John R.; Price, Daniel K.; Kelmelis, Eric J.

    2011-06-01

    The modern graphics processing unit (GPU) found in many standard personal computers is a highly parallel math processor capable of over 1 TFLOPS of peak computational throughput at a cost similar to a high-end CPU, with an excellent FLOPS-to-watt ratio. High-level sparse linear algebra operations are computationally intense, often requiring large amounts of parallel operations, and would seem a natural fit for the processing power of the GPU. Our work is a GPU-accelerated implementation of sparse linear algebra routines. We present results from both direct and iterative sparse system solvers. The GPU execution model featured by NVIDIA GPUs based on CUDA demands very strong parallelism, requiring between hundreds and thousands of simultaneous operations to achieve high performance. Some constructs from linear algebra map extremely well to the GPU and others map poorly. CPUs, on the other hand, do well at smaller-order parallelism and perform acceptably during low-parallelism code segments. Our work addresses this via a hybrid processing model, in which the CPU and GPU work simultaneously to produce results. In many cases, this is accomplished by allowing each platform to do the work it performs most naturally. For example, the CPU is responsible for the graph-theory portion of the direct solvers while the GPU simultaneously performs the low-level linear algebra routines.
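
    The kernel at the heart of such iterative solvers is the sparse matrix-vector product. Below is a minimal CUDA sketch in compressed sparse row (CSR) format with one thread per row; production kernels, and the hybrid CPU/GPU scheme described, are considerably more sophisticated, e.g. assigning a warp per row when rows are long.

      // y = A * x for A stored in CSR format.
      __global__ void spmv_csr(int n_rows, const int* row_ptr,
                               const int* col_idx, const float* vals,
                               const float* x, float* y)
      {
          int row = blockIdx.x * blockDim.x + threadIdx.x;
          if (row >= n_rows) return;
          float acc = 0.0f;
          for (int j = row_ptr[row]; j < row_ptr[row + 1]; ++j)
              acc += vals[j] * x[col_idx[j]];
          y[row] = acc;
      }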

  18. Parallel Latent Semantic Analysis using a Graphics Processing Unit

    SciTech Connect

    Cui, Xiaohui; Potok, Thomas E; Cavanagh, Joseph M

    2009-01-01

    Latent Semantic Analysis (LSA) can be used to reduce the dimensions of large Term-Document datasets using Singular Value Decomposition. However, with the ever expanding size of data sets, current implementations are not fast enough to quickly and easily compute the results on a standard PC. The Graphics Processing Unit (GPU) can solve some highly parallel problems much faster than the traditional sequential processor (CPU). Thus, a deployable system using a GPU to speed up large-scale LSA processes would be a much more effective choice (in terms of cost/performance ratio) than using a computer cluster. In this paper, we present a parallel LSA implementation on the GPU, using NVIDIA's Compute Unified Device Architecture (CUDA) and Compute Unified Basic Linear Algebra Subprograms (CUBLAS). The performance of this implementation is compared to a traditional LSA implementation on the CPU using an optimized Basic Linear Algebra Subprograms library. For large matrices that have dimensions divisible by 16, the GPU algorithm ran five to six times faster than the CPU version.

  19. Accelerating sino-atrium computer simulations with graphic processing units.

    PubMed

    Zhang, Hong; Xiao, Zheng; Lin, Shien-fong

    2015-01-01

    Sino-atrial node cells (SANCs) play a significant role in rhythmic firing. To investigate their role in arrhythmia and their interactions with the atrium, computer simulations based on cellular dynamic mathematical models are generally used. However, the large-scale computation usually makes research difficult, given the limited computational power of Central Processing Units (CPUs). In this paper, an accelerating approach with Graphic Processing Units (GPUs) is proposed for a simulation consisting of the SAN tissue and the adjoining atrium. The computational task was parallelized using the operator splitting method. Three parallelization strategies were then put forward, and the strategy with the shortest running time was further optimized by considering block size, data transfer and partitioning. The results showed that for a simulation with 500 SANCs and 30 atrial cells, the execution time of the non-optimized GPU program was 62% lower than that of a serial program running on a CPU, and decreased by 80% after optimization. The larger the tissue, the more significant the acceleration became. The results demonstrate the effectiveness of the proposed GPU-accelerating methods and their promising application to more complicated biological simulations. PMID:26406070
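
    Operator splitting of this kind separates the stiff per-cell membrane dynamics (embarrassingly parallel) from the diffusive coupling between neighbouring cells (a stencil update). The toy sketch below uses FitzHugh-Nagumo-style two-variable dynamics on a 1-D cable purely to illustrate the split; the kernel names, constants and layout are assumptions, and the actual SAN and atrial cell models in the paper are far more detailed.

        // split_step.cu -- one operator-splitting step: reaction then diffusion.
        // Toy FitzHugh-Nagumo dynamics stand in for the detailed SAN cell models.
        #include <cuda_runtime.h>

        __global__ void reactionStep(float* v, float* w, int n, float dt)
        {
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            if (i >= n) return;
            float vi = v[i], wi = w[i];
            // Each cell integrates its own ODEs independently (forward Euler).
            v[i] = vi + dt * (vi - vi * vi * vi / 3.0f - wi);
            w[i] = wi + dt * 0.08f * (vi + 0.7f - 0.8f * wi);
        }

        __global__ void diffusionStep(const float* vIn, float* vOut,
                                      int n, float dt, float d)
        {
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            if (i <= 0 || i >= n - 1) return;   // fixed ends for simplicity
            // Explicit 1-D diffusion couples neighbouring cells.
            vOut[i] = vIn[i] + dt * d * (vIn[i - 1] - 2.0f * vIn[i] + vIn[i + 1]);
        }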

  20. Accelerating radio astronomy cross-correlation with graphics processing units

    NASA Astrophysics Data System (ADS)

    Clark, M. A.; LaPlante, P. C.; Greenhill, L. J.

    2013-05-01

    We present a highly parallel implementation of the cross-correlation of time-series data using graphics processing units (GPUs), which is scalable to hundreds of independent inputs and suitable for the processing of signals from 'large-N' arrays of many radio antennas. The computational part of the algorithm, the X-engine, is implemented efficiently on NVIDIA's Fermi architecture, sustaining up to 79% of the peak single-precision floating-point throughput. We compare performance obtained for hardware- and software-managed caches, observing significantly better performance for the latter. The high performance reported involves use of a multi-level data tiling strategy in memory and use of a pipelined algorithm with simultaneous computation and transfer of data from host to device memory. The speed of code development, flexibility, and low cost of the GPU implementations compared with application-specific integrated circuit (ASIC) and field programmable gate array (FPGA) implementations have the potential to greatly shorten the cycle of correlator development and deployment, for cases where some power-consumption penalty can be tolerated.
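
    The X-engine computation itself is a conjugate multiply-accumulate over time for every antenna pair. A deliberately naive version, one thread per baseline with no tiling (names and layout are illustrative), is sketched below; the reported 79% of peak comes precisely from the multi-level tiling and software-managed caching that this sketch omits.

        // xengine_naive.cu -- visibilities V(i,j) = sum_t x_i(t) * conj(x_j(t)).
        // One thread per baseline (i <= j); no tiling, so far from the paper's
        // optimized throughput, but the arithmetic is the same.
        #include <cuComplex.h>
        #include <cuda_runtime.h>

        __global__ void xEngine(const cuFloatComplex* x,  // [nTime][nAnt]
                                cuFloatComplex* vis,      // [nAnt][nAnt]
                                int nAnt, int nTime)
        {
            int i = blockIdx.y * blockDim.y + threadIdx.y;
            int j = blockIdx.x * blockDim.x + threadIdx.x;
            if (i >= nAnt || j >= nAnt || j < i) return;  // upper triangle only
            cuFloatComplex acc = make_cuFloatComplex(0.0f, 0.0f);
            for (int t = 0; t < nTime; ++t)
                acc = cuCaddf(acc, cuCmulf(x[t * nAnt + i],
                                           cuConjf(x[t * nAnt + j])));
            vis[i * nAnt + j] = acc;
        }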

  1. Handling geophysical flows: Numerical modelling using Graphical Processing Units

    NASA Astrophysics Data System (ADS)

    Garcia-Navarro, Pilar; Lacasta, Asier; Juez, Carmelo; Morales-Hernandez, Mario

    2016-04-01

    Computational tools may help engineers in the assessment of sediment transport during decision-making processes. The main requirements are that the numerical results be accurate and that the simulation models be fast. The present work is based on the 2D shallow water equations in combination with the 2D Exner equation [1]. The accuracy of the resulting numerical model was discussed in previous work. Regarding computational speed, the Exner equation slows down the already costly 2D shallow water model, as the number of variables to solve increases and the numerical stability becomes more restrictive. On the other hand, the movement of poorly sorted material over steep areas constitutes a hazardous environmental problem, and computational tools help in the prediction of such landslides [2]. To overcome this problem, this work proposes the use of Graphical Processing Units (GPUs) to significantly decrease the simulation time [3, 4]. The numerical scheme implemented on the GPU is based on a finite volume scheme. The mathematical model and the numerical implementation are compared against experimental and field data. In addition, the computational times obtained with the graphical hardware technology are compared against single-core (sequential) and multi-core (parallel) CPU implementations. References [Juez et al.(2014)] Juez, C., Murillo, J., & García-Navarro, P. (2014) A 2D weakly-coupled and efficient numerical model for transient shallow flow and movable bed. Advances in Water Resources. 71 93-109. [Juez et al.(2013)] Juez, C., Murillo, J., & García-Navarro, P. (2013) 2D simulation of granular flow over irregular steep slopes using global and local coordinates. Journal of Computational Physics. 225 166-204. [Lacasta et al.(2014)] Lacasta, A., Morales-Hernández, M., Murillo, J., & García-Navarro, P. (2014) An optimized GPU implementation of a 2D free surface simulation model on unstructured meshes. Advances in Engineering Software. 78 1-15. [Lacasta

  2. Retrospective Study on Mathematical Modeling Based on Computer Graphic Processing

    NASA Astrophysics Data System (ADS)

    Zhang, Kai Li

    Graphics and image making is an important field of computer application in which visualization software is widely used because it is convenient and quick. However, modeling designers have found such software limited in function and flexibility because it lacks a mathematical modeling platform. Non-visualization graphics software has since emerged, giving graphics and image design a solid mathematical modeling platform. In this paper, a polished pyramid is constructed with a multivariate spline function algorithm, validating that non-visualization software is well suited to mathematical modeling.

  3. Accelerating compartmental modeling on a graphical processing unit.

    PubMed

    Ben-Shalom, Roy; Liberman, Gilad; Korngreen, Alon

    2013-01-01

    Compartmental modeling is a widely used tool in neurophysiology, but the detail and scope of such models are frequently limited by a lack of computational resources. Here we implement compartmental modeling on low-cost Graphical Processing Units (GPUs), which significantly increases simulation speed compared to NEURON. Testing two methods for solving the current diffusion equation system revealed which method is more useful for specific neuron morphologies. Regions of applicability were investigated using a range of simulations, from a single membrane potential trace simulated in a simple fork morphology to multiple traces on multiple realistic cells. A peak speedup of 150-fold over the CPU was achieved. This application can be used for statistical analysis and data-fitting optimizations of compartmental models and may be used for simultaneously simulating large populations of neurons. Since GPUs are forging ahead and proving to be more cost-effective than CPUs, this may significantly decrease the cost of computation power and open new computational possibilities for laboratories with limited budgets. PMID:23508232

  4. Multilevel Summation of Electrostatic Potentials Using Graphics Processing Units*

    PubMed Central

    Hardy, David J.; Stone, John E.; Schulten, Klaus

    2009-01-01

    Physical and engineering practicalities involved in microprocessor design have resulted in flat performance growth for traditional single-core microprocessors. The urgent need for continuing increases in the performance of scientific applications requires the use of many-core processors and accelerators such as graphics processing units (GPUs). This paper discusses GPU acceleration of the multilevel summation method for computing electrostatic potentials and forces for a system of charged atoms, which is a problem of paramount importance in biomolecular modeling applications. We present and test a new GPU algorithm for the long-range part of the potentials that computes a cutoff pair potential between lattice points, essentially convolving a fixed 3-D lattice of “weights” over all sub-cubes of a much larger lattice. The implementation exploits the different memory subsystems provided on the GPU to stream optimally sized data sets through the multiprocessors. We demonstrate for the full multilevel summation calculation speedups of up to 26 using a single GPU and 46 using multiple GPUs, enabling the computation of a high-resolution map of the electrostatic potential for a system of 1.5 million atoms in under 12 seconds. PMID:20161132
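
    The long-range kernel described here is, at its core, a fixed-stencil weighted sum over lattice charges. The sketch below shows that convolution-like structure with one thread per output lattice point; the names, the stencil radius parameter R (in lattice points), and the flat array layout are assumptions. The paper's contribution lies in streaming optimally sized sub-cubes through the GPU memory hierarchy, which this plain version does not attempt.

        // lattice_cutoff.cu -- phi[p] = sum over stencil of w * q, i.e. the
        // cutoff pair potential between lattice points (long-range part,
        // simplified). One thread per output point; no shared-memory streaming.
        #include <cuda_runtime.h>

        __global__ void latticeCutoff(const float* q, float* phi, const float* w,
                                      int nx, int ny, int nz, int R)
        {
            int x = blockIdx.x * blockDim.x + threadIdx.x;
            int y = blockIdx.y * blockDim.y + threadIdx.y;
            int z = blockIdx.z;                      // one grid layer per z slab
            if (x >= nx || y >= ny || z >= nz) return;
            int S = 2 * R + 1;                       // stencil width
            float acc = 0.0f;
            for (int dz = -R; dz <= R; ++dz)
              for (int dy = -R; dy <= R; ++dy)
                for (int dx = -R; dx <= R; ++dx) {
                    int xx = x + dx, yy = y + dy, zz = z + dz;
                    if (xx < 0 || xx >= nx || yy < 0 || yy >= ny ||
                        zz < 0 || zz >= nz) continue;   // skip outside lattice
                    float wt = w[((dz + R) * S + (dy + R)) * S + (dx + R)];
                    acc += wt * q[(zz * ny + yy) * nx + xx];
                }
            phi[(z * ny + y) * nx + x] = acc;
        }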

  5. Graphics processing unit-based alignment of protein interaction networks.

    PubMed

    Xie, Jiang; Zhou, Zhonghua; Ma, Jin; Xiang, Chaojuan; Nie, Qing; Zhang, Wu

    2015-08-01

    Network alignment is an important bridge to understanding human protein-protein interactions (PPIs) and functions through model organisms. However, the underlying subgraph isomorphism problem complicates and increases the time required to align protein interaction networks (PINs). Parallel computing technology is an effective solution to the challenge of aligning large-scale networks via sequential computing. In this study, the typical Hungarian-Greedy Algorithm (HGA) is used as an example for PIN alignment. The authors propose a HGA with 2-nearest neighbours (HGA-2N) and implement its graphics processing unit (GPU) acceleration. Numerical experiments demonstrate that HGA-2N can find alignments that are close to those found by HGA while dramatically reducing computing time. The GPU implementation of HGA-2N optimises the parallel pattern, computing mode and storage mode and it improves the computing time ratio between the CPU and GPU compared with HGA when large-scale networks are considered. By using HGA-2N in GPUs, conserved PPIs can be observed, and potential PPIs can be predicted. Among the predictions based on 25 common Gene Ontology terms, 42.8% can be found in the Human Protein Reference Database. Furthermore, a new method of reconstructing phylogenetic trees is introduced, which shows the same relationships among five herpes viruses that are obtained using other methods. PMID:26243827

  6. Kinematic modelling of disc galaxies using graphics processing units

    NASA Astrophysics Data System (ADS)

    Bekiaris, G.; Glazebrook, K.; Fluke, C. J.; Abraham, R.

    2016-01-01

    With large-scale integral field spectroscopy (IFS) surveys of thousands of galaxies currently underway or planned, the astronomical community is in need of methods, techniques and tools that will allow the analysis of huge amounts of data. We focus on the kinematic modelling of disc galaxies and investigate the potential use of massively parallel architectures, such as the graphics processing unit (GPU), as an accelerator for the computationally expensive model-fitting procedure. We review the algorithms involved in model-fitting and evaluate their suitability for GPU implementation. We employ different optimization techniques, including the Levenberg-Marquardt and nested sampling algorithms, but also a naive brute-force approach based on nested grids. We find that the GPU can accelerate the model-fitting procedure up to a factor of ˜100 when compared to a single-threaded CPU, and up to a factor of ˜10 when compared to a multithreaded dual CPU configuration. Our method's accuracy, precision and robustness are assessed by successfully recovering the kinematic properties of simulated data, and also by verifying the kinematic modelling results of galaxies from the GHASP and DYNAMO surveys as found in the literature. The resulting GBKFIT code is available for download from: http://supercomputing.swin.edu.au/gbkfit.

  7. Graphics Processing Unit Acceleration of Gyrokinetic Turbulence Simulations

    NASA Astrophysics Data System (ADS)

    Hause, Benjamin; Parker, Scott; Chen, Yang

    2013-10-01

    We find a substantial increase in on-node performance using Graphics Processing Unit (GPU) acceleration in gyrokinetic delta-f particle-in-cell simulation. Optimization is performed on a two-dimensional slab gyrokinetic particle simulation using the Portland Group Fortran compiler with OpenACC compiler directives and CUDA Fortran. A mixed implementation of both OpenACC and CUDA is demonstrated; CUDA is required for optimizing the particle deposition algorithm. We have implemented the GPU acceleration on a third-generation Core i7 gaming PC with two NVIDIA GTX 680 GPUs. We find comparable, or better, acceleration relative to the NERSC DIRAC cluster with the NVIDIA Tesla C2050 computing processor. The Tesla C2050 is about 2.6 times more expensive than the GTX 580 gaming GPU. We also see enormous speedups (10 or more) on the Titan supercomputer at Oak Ridge with Kepler K20 GPUs. Results show speed-ups comparable to or better than those of OpenMP models utilizing multiple cores. The use of hybrid OpenACC, CUDA Fortran, and MPI models across many nodes will also be discussed. Optimization strategies will be presented. We will discuss progress on optimizing the comprehensive three-dimensional general geometry GEM code.

  8. Use of general purpose graphics processing units with MODFLOW

    USGS Publications Warehouse

    Hughes, Joseph D.; White, Jeremy T.

    2013-01-01

    To evaluate the use of general-purpose graphics processing units (GPGPUs) to improve the performance of MODFLOW, an unstructured preconditioned conjugate gradient (UPCG) solver has been developed. The UPCG solver uses a compressed sparse row storage scheme and includes Jacobi, zero fill-in incomplete, and modified-incomplete lower-upper (LU) factorization, and generalized least-squares polynomial preconditioners. The UPCG solver also includes options for sequential and parallel solution on the central processing unit (CPU) using OpenMP. For simulations utilizing the GPGPU, all basic linear algebra operations are performed on the GPGPU; memory copies between the CPU and GPGPU occur prior to the first iteration of the UPCG solver and after satisfying head and flow criteria or exceeding a maximum number of iterations. The efficiency of the UPCG solver for GPGPU and CPU solutions is benchmarked using simulations of a synthetic, heterogeneous unconfined aquifer with tens of thousands to millions of active grid cells. Testing indicates GPGPU speedups on the order of 2 to 8, relative to the standard MODFLOW preconditioned conjugate gradient (PCG) solver, can be achieved when (1) memory copies between the CPU and GPGPU are optimized, (2) the percentage of time performing memory copies between the CPU and GPGPU is small relative to the calculation time, (3) high-performance GPGPU cards are utilized, and (4) CPU-GPGPU combinations are used to execute sequential operations that are difficult to parallelize. Furthermore, UPCG solver testing indicates GPGPU speedups exceed parallel CPU speedups achieved using OpenMP on multicore CPUs for preconditioners that can be easily parallelized.

  9. Use of general purpose graphics processing units with MODFLOW.

    PubMed

    Hughes, Joseph D; White, Jeremy T

    2013-01-01

    To evaluate the use of general-purpose graphics processing units (GPGPUs) to improve the performance of MODFLOW, an unstructured preconditioned conjugate gradient (UPCG) solver has been developed. The UPCG solver uses a compressed sparse row storage scheme and includes Jacobi, zero fill-in incomplete, and modified-incomplete lower-upper (LU) factorization, and generalized least-squares polynomial preconditioners. The UPCG solver also includes options for sequential and parallel solution on the central processing unit (CPU) using OpenMP. For simulations utilizing the GPGPU, all basic linear algebra operations are performed on the GPGPU; memory copies between the CPU and GPGPU occur prior to the first iteration of the UPCG solver and after satisfying head and flow criteria or exceeding a maximum number of iterations. The efficiency of the UPCG solver for GPGPU and CPU solutions is benchmarked using simulations of a synthetic, heterogeneous unconfined aquifer with tens of thousands to millions of active grid cells. Testing indicates GPGPU speedups on the order of 2 to 8, relative to the standard MODFLOW preconditioned conjugate gradient (PCG) solver, can be achieved when (1) memory copies between the CPU and GPGPU are optimized, (2) the percentage of time performing memory copies between the CPU and GPGPU is small relative to the calculation time, (3) high-performance GPGPU cards are utilized, and (4) CPU-GPGPU combinations are used to execute sequential operations that are difficult to parallelize. Furthermore, UPCG solver testing indicates GPGPU speedups exceed parallel CPU speedups achieved using OpenMP on multicore CPUs for preconditioners that can be easily parallelized. PMID:23281733

  10. MASSIVELY PARALLEL LATENT SEMANTIC ANALYSES USING A GRAPHICS PROCESSING UNIT

    SciTech Connect

    Cavanagh, J.; Cui, S.

    2009-01-01

    Latent Semantic Analysis (LSA) aims to reduce the dimensions of large term-document datasets using Singular Value Decomposition. However, with the ever-expanding size of datasets, current implementations are not fast enough to quickly and easily compute the results on a standard PC. A graphics processing unit (GPU) can solve some highly parallel problems much faster than a traditional sequential processor or central processing unit (CPU). Thus, a deployable system using a GPU to speed up large-scale LSA processes would be a much more effective choice (in terms of cost/performance ratio) than using a PC cluster. Due to the GPU's application-specific architecture, harnessing the GPU's computational prowess for LSA is a great challenge. We presented a parallel LSA implementation on the GPU, using NVIDIA® Compute Unified Device Architecture and Compute Unified Basic Linear Algebra Subprograms software. The performance of this implementation is compared to a traditional LSA implementation on a CPU using an optimized Basic Linear Algebra Subprograms library. After implementation, we discovered that the GPU version of the algorithm was twice as fast for large matrices (1000×1000 and above) that had dimensions not divisible by 16. For large matrices that did have dimensions divisible by 16, the GPU algorithm ran five to six times faster than the CPU version. The large variation is due to architectural benefits of the GPU for matrices divisible by 16. It should be noted that the overall speeds for the CPU version did not vary from normal when the matrix dimensions were divisible by 16. Further research is needed in order to produce a fully implementable version of LSA. With that in mind, the research we presented shows that the GPU is a viable option for increasing the speed of LSA, in terms of cost/performance ratio.

  11. Accelerating chemical database searching using graphics processing units.

    PubMed

    Liu, Pu; Agrafiotis, Dimitris K; Rassokhin, Dmitrii N; Yang, Eric

    2011-08-22

    The utility of chemoinformatics systems depends on the accurate computer representation and efficient manipulation of chemical compounds. In such systems, a small molecule is often digitized as a large fingerprint vector, where each element indicates the presence/absence or the number of occurrences of a particular structural feature. Since in theory the number of unique features can be exceedingly large, these fingerprint vectors are usually folded into much shorter ones using hashing and modulo operations, allowing fast "in-memory" manipulation and comparison of molecules. There is increasing evidence that lossless fingerprints can substantially improve retrieval performance in chemical database searching (substructure or similarity), which has led to the development of several lossless fingerprint compression algorithms. However, any gains in storage and retrieval afforded by compression need to be weighed against the extra computational burden required for decompression before these fingerprints can be compared. Here we demonstrate that graphics processing units (GPU) can greatly alleviate this problem, enabling the practical application of lossless fingerprints on large databases. More specifically, we show that, with the help of a ~$500 ordinary video card, the entire PubChem database of ~32 million compounds can be searched in ~0.2-2 s on average, which is two orders of magnitude faster than a conventional CPU. If multiple query patterns are processed in batch, the speedup is even more dramatic (less than 0.02-0.2 s/query for 1000 queries). In the present study, we use the Elias gamma compression algorithm, which results in a compression ratio as high as 0.097. PMID:21696144
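
    Once fingerprints are resident (and decompressed) on the GPU, similarity scoring is a bit-parallel popcount problem. The sketch below computes Tanimoto similarity for fixed-width binary fingerprints, one thread per database molecule; names and layout are illustrative, and it deliberately skips the Elias gamma decompression stage that is the paper's actual focus.

        // tanimoto.cu -- Tanimoto similarity of one query fingerprint against a
        // database of fixed-width binary fingerprints ('words' 32-bit words each).
        // Decompression of lossless fingerprints (the paper's focus) is omitted.
        #include <cuda_runtime.h>

        __global__ void tanimoto(const unsigned int* db, const unsigned int* query,
                                 float* score, int nMols, int words)
        {
            int m = blockIdx.x * blockDim.x + threadIdx.x;
            if (m >= nMols) return;
            int both = 0, a = 0, b = 0;
            for (int w = 0; w < words; ++w) {
                unsigned int x = db[m * words + w];
                unsigned int y = query[w];
                both += __popc(x & y);   // bits set in both fingerprints
                a    += __popc(x);
                b    += __popc(y);
            }
            int uni = a + b - both;      // size of the union of set bits
            score[m] = (uni > 0) ? (float)both / (float)uni : 0.0f;
        }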

  12. Area-delay trade-offs of texture decompressors for a graphics processing unit

    NASA Astrophysics Data System (ADS)

    Novoa Súñer, Emilio; Ituero, Pablo; López-Vallejo, Marisa

    2011-05-01

    Graphics Processing Units have become a booster for the microelectronics industry. However, due to intellectual property issues, there is a serious lack of information on the implementation details of the hardware architecture behind GPUs. For instance, the way texture is handled and decompressed in a GPU to reduce bandwidth usage has never been dealt with in depth from a hardware point of view. This work presents a comparative study of the hardware implementation of different texture decompression algorithms for both conventional (PCs and video game consoles) and mobile platforms. Circuit synthesis is performed targeting both a reconfigurable hardware platform and a 90 nm standard cell library. Area-delay trade-offs have been extensively analyzed, which allows us to compare the complexity of the decompressors and thus determine the suitability of the algorithms for systems with limited hardware resources.

  13. Accelerating Cardiac Bidomain Simulations Using Graphics Processing Units

    PubMed Central

    Neic, Aurel; Liebmann, Manfred; Hoetzl, Elena; Mitchell, Lawrence; Vigmond, Edward J.; Haase, Gundolf

    2013-01-01

    Anatomically realistic and biophysically detailed multiscale computer models of the heart are playing an increasingly important role in advancing our understanding of integrated cardiac function in health and disease. Such detailed simulations, however, are computationally vastly demanding, which is a limiting factor for a wider adoption of in-silico modeling. While current trends in high-performance computing (HPC) hardware promise to alleviate this problem, exploiting the potential of such architectures remains challenging since strongly scalable algorithms are necessitated to reduce execution times. Alternatively, acceleration technologies such as graphics processing units (GPUs) are being considered. While the potential of GPUs has been demonstrated in various applications, benefits in the context of bidomain simulations where large sparse linear systems have to be solved in parallel with advanced numerical techniques are less clear. In this study, the feasibility of multi-GPU bidomain simulations is demonstrated by running strong scalability benchmarks using a state-of-the-art model of rabbit ventricles. The model is spatially discretized using the finite element method (FEM) on fully unstructured grids. The GPU code is directly derived from a large pre-existing code, the Cardiac Arrhythmia Research Package (CARP), with very minor perturbation of the code base. Overall, bidomain simulations were sped up by a factor of 11.8 to 16.3 in benchmarks running on 6–20 GPUs compared to the same number of CPU cores. To match the fastest GPU simulation, which engaged 20 GPUs, 476 CPU cores were required on a national supercomputing facility. PMID:22692867

  14. Flocking-based Document Clustering on the Graphics Processing Unit

    SciTech Connect

    Cui, Xiaohui; Potok, Thomas E; Patton, Robert M; ST Charles, Jesse Lee

    2008-01-01

    Analyzing and grouping documents by content is a complex problem. One explored method of solving this problem borrows from nature, imitating the flocking behavior of birds. Each bird represents a single document and flies toward other documents that are similar to it. One limitation of this method of document clustering is its complexity, O(n²). As the number of documents grows, it becomes increasingly difficult to receive results in a reasonable amount of time. However, flocking behavior, like most naturally inspired algorithms such as ant colony optimization and particle swarm optimization, is highly parallel and has found increased performance on expensive cluster computers. In the last few years, the graphics processing unit (GPU) has received attention for its ability to solve highly-parallel and semi-parallel problems much faster than the traditional sequential processor. Some applications see a huge increase in performance on this new platform. The cost of these high-performance devices is also marginal when compared with the price of cluster machines. In this paper, we have conducted research to exploit this architecture and apply its strengths to the document flocking problem. Our results highlight the potential benefit the GPU brings to all naturally inspired algorithms. Using the CUDA platform from NVIDIA, we developed a document flocking implementation to be run on the NVIDIA GeForce 8800. Additionally, we developed a similar but sequential implementation of the same algorithm to be run on a desktop CPU. We tested the performance of each on groups of news articles ranging in size from 200 to 3000 documents. The results of these tests were very significant. Performance gains ranged from three to nearly five times improvement of the GPU over the CPU implementation. This dramatic improvement in runtime makes the GPU a potentially revolutionary platform for document clustering algorithms.

  15. Viscoelastic Finite Difference Modeling Using Graphics Processing Units

    NASA Astrophysics Data System (ADS)

    Fabien-Ouellet, G.; Gloaguen, E.; Giroux, B.

    2014-12-01

    Full waveform seismic modeling requires a huge amount of computing power that still challenges today's technology. This limits the applicability of powerful processing approaches in seismic exploration like full-waveform inversion. This paper explores the use of Graphics Processing Units (GPUs) to compute a time-based finite-difference solution to the viscoelastic wave equation. The aim is to investigate whether adopting GPU technology can significantly reduce the computing time of simulations. The code presented herein is based on the freely accessible 2D software of Bohlen (2002), provided under the GNU General Public License (GPL). This implementation is based on a second-order centred difference scheme to approximate time differences and staggered grid schemes with centred differences of order 2, 4, 6, 8, and 12 for spatial derivatives. The code is fully parallel and is written using the Message Passing Interface (MPI), and it thus supports simulations of vast seismic models on a cluster of CPUs. To port the code from Bohlen (2002) to GPUs, the OpenCL framework was chosen for its ability to work on both CPUs and GPUs and its adoption by most GPU manufacturers. In our implementation, OpenCL works in conjunction with MPI, which allows computations on a cluster of GPUs for large-scale model simulations. We tested our code for model sizes between 100² and 6000² elements. Comparison shows a decrease in computation time of more than two orders of magnitude between the GPU implementation run on an AMD Radeon HD 7950 and the CPU implementation run on a 2.26 GHz Intel Xeon Quad-Core. The speed-up varies depending on the order of the finite difference approximation and generally increases for higher orders. Increasing speed-ups are also obtained for increasing model size, which can be explained by kernel overheads and delays introduced by memory transfers to and from the GPU through the PCI-E bus. Those tests indicate that the GPU memory size

  16. Megahertz processing rate for Fourier domain optical coherence tomography using a graphics processing unit

    NASA Astrophysics Data System (ADS)

    Watanabe, Yuuki; Kamiyama, Dai

    2012-01-01

    We developed ultrahigh-speed processing of FD-OCT images using a low-cost graphics processing unit (GPU) with many stream processors to realize highly parallel processing. The processing line rates of half-range and full-range FD-OCT were 1.34 MHz and 0.70 MHz, respectively, for a spectral interference image of 1024 points (FFT size) × 2048 lateral A-scans. A display rate of 22.5 frames per second for processed full-range images was achieved in our OCT system using an InGaAs line scan camera operated at 47 kHz.
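
    The dominant cost in FD-OCT processing is the per-A-scan FFT, which maps directly onto a batched cuFFT plan. A minimal sketch for a 1024-point transform over 2048 A-scans follows; the function name and buffer layout are assumptions, and complex-to-complex is used for brevity where real spectra would normally use an R2C plan.

        // oct_fft.cu -- batched 1024-point FFTs over 2048 A-scans with cuFFT.
        // C2C is used for simplicity; a real spectrometer line would use R2C.
        #include <cufft.h>
        #include <cuda_runtime.h>

        void processFrame(cufftComplex* dLines)   // 2048 x 1024 device buffer
        {
            const int fftSize = 1024, nLines = 2048;
            cufftHandle plan;
            cufftPlan1d(&plan, fftSize, CUFFT_C2C, nLines);     // one plan, whole batch
            cufftExecC2C(plan, dLines, dLines, CUFFT_FORWARD);  // in-place transform
            cudaDeviceSynchronize();
            cufftDestroy(plan);
        }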

  17. Graphic Arts: Process Camera, Stripping, and Platemaking. Fourth Edition. Teacher Edition [and] Student Edition.

    ERIC Educational Resources Information Center

    Multistate Academic and Vocational Curriculum Consortium, Stillwater, OK.

    This publication contains both a teacher edition and a student edition of materials for a course in graphic arts that covers the process camera, stripping, and platemaking. The course introduces basic concepts and skills necessary for entry-level employment in a graphic communication occupation. The contents of the materials are tied to measurable…

  18. Software Graphics Processing Unit (sGPU) for Deep Space Applications

    NASA Technical Reports Server (NTRS)

    McCabe, Mary; Salazar, George; Steele, Glen

    2015-01-01

    A graphics processing capability will be required for deep space missions and must include a range of applications, from safety-critical vehicle health status to telemedicine for crew health. However, preliminary radiation testing of commercial graphics processing cards suggests they cannot operate in the deep space radiation environment. Investigation into a Software Graphics Processing Unit (sGPU) comprised of commercial-equivalent radiation-hardened/tolerant single-board computers, field programmable gate arrays, and safety-critical display software shows promising results. Preliminary performance of approximately 30 frames per second (FPS) has been achieved. Use of multi-core processors may provide a significant increase in performance.

  19. Processing and distribution of graphical information in offshore engineering

    SciTech Connect

    Rodriguez, M.V.R.; Simao, N.C.; Lorenzoni, C.; Ferrante, A.J.

    1994-12-31

    This paper briefly reports the experience of the Production Department of Petrobras in transforming its centralized, mainframe-based computational environment into an open, distributed client/server computational environment through the application of downsizing concepts. The paper then focuses on the problem of handling technical graphics information for its more than 70 fixed offshore platforms, in water depths ranging from 10 to 180 m. The solution adopted, with emphasis on the local network corresponding to a typical production region and on the connection of the offshore platforms to that network, is discussed first. The experiences collected during the implementation and operation of this solution are then reported, and practical conclusions are presented with regard to the main issues involved.

  20. The Merging Of Computer Graphics And Image Processing Technologies And Applications

    NASA Astrophysics Data System (ADS)

    Brammer, Robert F.; Stephenson, Thomas P.

    1990-01-01

    Historically, computer graphics and image processing technologies and applications have been distinct, both in their research communities and in their hardware and software product suppliers. Computer graphics deals with synthesized visual depictions of outputs from computer models*, whereas image processing (and analysis) deals with computational operations on input data from "imaging sensors"**. Furthermore, the fundamental storage and computational aspects of these two fields are different from one another. For example, many computer graphics applications store data using vector formats whereas image processing applications generally use raster formats. Computer graphics applications may involve polygonal representations, floating point operations, and mathematical models of physical phenomena such as lighting conditions, surface reflecting properties, etc. Image processing applications may involve pixel operations, fixed point representations, global operations (e.g. image rotations), and nonlinear signal processing algorithms.

  1. Grace: A cross-platform micromagnetic simulator on graphics processing units

    NASA Astrophysics Data System (ADS)

    Zhu, Ru

    2015-12-01

    A micromagnetic simulator running on graphics processing units (GPUs) is presented. Unlike the GPU implementations of other research groups, which predominantly run on NVidia's CUDA platform, this simulator is developed with C++ Accelerated Massive Parallelism (C++ AMP) and is hardware-platform independent. It runs on GPUs from vendors including NVidia, AMD and Intel, and achieves a significant performance boost compared to previous central processing unit (CPU) simulators, up to two orders of magnitude. The simulator paves the way for running large micromagnetic simulations on both high-end workstations with dedicated graphics cards and low-end personal computers with integrated graphics cards, and is freely available to download.

  2. Mobile Devices and GPU Parallelism in Ionospheric Data Processing

    NASA Astrophysics Data System (ADS)

    Mascharka, D.; Pankratius, V.

    2015-12-01

    Scientific data acquisition in the field is often constrained by data transfer backchannels to analysis environments. Geoscientists are therefore facing practical bottlenecks with increasing sensor density and variety. Mobile devices, such as smartphones and tablets, offer promising solutions to key problems in scientific data acquisition, pre-processing, and validation by providing advanced capabilities in the field. This is due to affordable network connectivity options and the increasing mobile computational power. This contribution exemplifies a scenario faced by scientists in the field and presents the "Mahali TEC Processing App" developed in the context of the NSF-funded Mahali project. Aimed at atmospheric science and the study of ionospheric Total Electron Content (TEC), this app is able to gather data from various dual-frequency GPS receivers. It demonstrates parsing of full-day RINEX files on mobile devices and on-the-fly computation of vertical TEC values based on satellite ephemeris models that are obtained from NASA. Our experiments show how parallel computing on the mobile device GPU enables fast processing and visualization of up to 2 million datapoints in real-time using OpenGL. GPS receiver bias is estimated through minimum TEC approximations that can be interactively adjusted by scientists in the graphical user interface. Scientists can also perform approximate computations for "quickviews" to reduce CPU processing time and memory consumption. In the final stage of our mobile processing pipeline, scientists can upload data to the cloud for further processing. Acknowledgements: The Mahali project (http://mahali.mit.edu) is funded by the NSF INSPIRE grant no. AGS-1343967 (PI: V. Pankratius). We would like to acknowledge our collaborators at Boston College, Virginia Tech, Johns Hopkins University, Colorado State University, as well as the support of UNAVCO for loans of dual-frequency GPS receivers for use in this project, and Intel for loans of

  3. Graphical user interface for image acquisition and processing

    DOEpatents

    Goldberg, Kenneth A.

    2002-01-01

    An event-driven, GUI-based image acquisition interface for the IDL programming environment, designed for CCD camera control and image acquisition directly into the IDL environment where image manipulation and data analysis can be performed, together with a toolbox of real-time analysis applications. Running the image acquisition hardware directly from IDL removes the necessity of first saving images in one program and then importing the data into IDL for analysis in a second step. Bringing the data directly into IDL creates an opportunity for the implementation of IDL image processing and display functions in real time. The program allows control over the available charge coupled device (CCD) detector parameters, data acquisition, file saving and loading, and image manipulation and processing, all from within IDL. The program is built using IDL's widget libraries to control the on-screen display and user interface.

  4. Calculation of HELAS amplitudes for QCD processes using graphics processing unit (GPU)

    NASA Astrophysics Data System (ADS)

    Hagiwara, K.; Kanzaki, J.; Okamura, N.; Rainwater, D.; Stelzer, T.

    2010-11-01

    We use a graphics processing unit (GPU) for fast calculations of helicity amplitudes of quark and gluon scattering processes in massless QCD. New HEGET (HELAS Evaluation with GPU Enhanced Technology) codes for gluon self-interactions are introduced, and a C++ program to convert the MadGraph generated FORTRAN codes into HEGET codes in CUDA (a C-platform for general purpose computing on GPU) is created. Because of the proliferation of the number of Feynman diagrams and the number of independent color amplitudes, the maximum number of final state jets we can evaluate on a GPU is limited to 4 for pure gluon processes (gg → 4g), or 5 for processes with one or more quark lines such as q q̄ → 5g and qq → qq + 3g. Compared with the usual CPU-based programs, we obtain 60-100 times better performance on the GPU, except for 5-jet production processes and the gg → 4g processes for which the GPU gain over the CPU is about 20.

  5. Mobile processing in open systems

    SciTech Connect

    Sapaty, P.S.

    1996-12-31

    A universal spatial automaton, called WAVE, for highly parallel processing in arbitrary distributed systems is described. The automaton is based on a virus principle where recursive programs, or waves, self-navigate in networks of data or processes in multiple cooperative parts while controlling and modifying the environment they exist in and move through. The layered general organization of the automaton as well as its distributed implementation in computer networks have been discussed. As the automaton dynamically creates, modifies, activates and processes any knowledge networks arbitrarily distributed in computer networks, it can easily model any other paradigms for parallel and distributed computing. Comparison of WAVE with some known programming models and languages, and ideas of their possible integration have also been given.

  6. Student Thinking Processes While Constructing Graphic Representations of Textbook Content: What Insights Do Think-Alouds Provide?

    ERIC Educational Resources Information Center

    Scott, D. Beth; Dreher, Mariam Jean

    2016-01-01

    This study examined the thinking processes students engage in while constructing graphic representations of textbook content. Twenty-eight students who either used graphic representations in a routine manner during social studies instruction or learned to construct graphic representations based on the rhetorical patterns used to organize textbook…

  7. A graphically oriented specification language for automatic code generation. GRASP/Ada: A Graphical Representation of Algorithms, Structure, and Processes for Ada, phase 1

    NASA Technical Reports Server (NTRS)

    Cross, James H., II; Morrison, Kelly I.; May, Charles H., Jr.; Waddel, Kathryn C.

    1989-01-01

    The first phase of a three-phase effort to develop a new graphically oriented specification language which will facilitate the reverse engineering of Ada source code into graphical representations (GRs) as well as the automatic generation of Ada source code is described. A simplified view of the three phases of Graphical Representations for Algorithms, Structure, and Processes for Ada (GRASP/Ada) with respect to three basic classes of GRs is presented. Phase 1 concentrated on the derivation of an algorithmic diagram, the control structure diagram (CSD) (CRO88a) from Ada source code or Ada PDL. Phase 2 includes the generation of architectural and system level diagrams such as structure charts and data flow diagrams and should result in a requirements specification for a graphically oriented language able to support automatic code generation. Phase 3 will concentrate on the development of a prototype to demonstrate the feasibility of this new specification language.

  8. Multiparallel decompression simultaneously using multicore central processing unit and graphic processing unit

    NASA Astrophysics Data System (ADS)

    Petta, Andrea; Serra, Luigi; De Nino, Maurizio

    2013-01-01

    The discrete wavelet transform (DWT)-based compression algorithm is widely used in many image compression systems. The time-consuming computation of the 9/7 discrete wavelet decomposition and the bit-plane decoding is usually the bottleneck of these systems. In order to perform real-time decompression on a massive bit stream of compressed images continuously down-linked from the satellite, we propose a graphic processing unit (GPU)-accelerated decoding system in which the GPU and multiple central processing unit (CPU) threads run in parallel. To obtain maximum throughput with a pipelined structure for processing continuous satellite images, an additional workload-balancing algorithm has been implemented to distribute jobs between the CPU and GPU parts so that they run at approximately the same speed. Through this pipelined CPU and GPU heterogeneous computing, the entire decoding system approaches a speedup of 15× compared to its single-threaded CPU counterpart. The proposed channel and source decoding system is able to decompress 1024×1024 satellite images at a speed of 20 frames/s.
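
    The overlap between CPU threads, host-device transfers, and GPU kernels that such a pipeline depends on is typically expressed with pinned host memory and CUDA streams. A stripped-down two-stream version of the pattern is sketched below; decodeChunk is a placeholder standing in for the real bit-plane decoding kernel, and all names are illustrative.

        // pipeline.cu -- double-buffered copy/compute overlap with two streams.
        // 'decodeChunk' is a placeholder for the real bit-plane decoding kernel.
        // hIn/hOut must be pinned (cudaMallocHost) for the copies to overlap.
        #include <cuda_runtime.h>

        __global__ void decodeChunk(const unsigned char* in, unsigned char* out, int n)
        {
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            if (i < n) out[i] = in[i];   // stand-in for actual decoding work
        }

        void runPipeline(const unsigned char* hIn, unsigned char* hOut,
                         int nChunks, int chunkBytes)
        {
            cudaStream_t s[2];
            unsigned char *dIn[2], *dOut[2];
            for (int k = 0; k < 2; ++k) {
                cudaStreamCreate(&s[k]);
                cudaMalloc(&dIn[k], chunkBytes);
                cudaMalloc(&dOut[k], chunkBytes);
            }
            for (int c = 0; c < nChunks; ++c) {
                int k = c & 1;                          // alternate buffers/streams
                size_t off = (size_t)c * chunkBytes;
                cudaMemcpyAsync(dIn[k], hIn + off, chunkBytes,
                                cudaMemcpyHostToDevice, s[k]);
                decodeChunk<<<(chunkBytes + 255) / 256, 256, 0, s[k]>>>(
                    dIn[k], dOut[k], chunkBytes);
                cudaMemcpyAsync(hOut + off, dOut[k], chunkBytes,
                                cudaMemcpyDeviceToHost, s[k]);
            }
            cudaDeviceSynchronize();   // drain both streams
            for (int k = 0; k < 2; ++k) {
                cudaFree(dIn[k]); cudaFree(dOut[k]); cudaStreamDestroy(s[k]);
            }
        }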

  9. Parallelized CCHE2D flow model with CUDA Fortran on Graphics Process Units

    Technology Transfer Automated Retrieval System (TEKTRAN)

    This paper presents the CCHE2D implicit flow model parallelized using the CUDA Fortran programming technique on Graphics Processing Units (GPUs). A parallelized implicit Alternating Direction Implicit (ADI) solver using the Parallel Cyclic Reduction (PCR) algorithm on the GPU is developed and tested. This solve...

  10. A DDC Bibliography on Optical or Graphic Information Processing (Information Sciences Series). Volume I.

    ERIC Educational Resources Information Center

    Defense Documentation Center, Alexandria, VA.

    This unclassified-unlimited bibliography contains 183 references, with abstracts, dealing specifically with optical or graphic information processing. Citations are grouped under three headings: display devices and theory, character recognition, and pattern recognition. Within each group, they are arranged in accession number (AD-number) sequence.…

  11. Graphics Processing Unit-Based Bioheat Simulation to Facilitate Rapid Decision Making Associated with Cryosurgery Training.

    PubMed

    Keelan, Robert; Zhang, Hong; Shimada, Kenji; Rabin, Yoed

    2016-04-01

    This study focuses on the implementation of an efficient numerical technique for cryosurgery simulations on a graphics processing unit as an alternative means to accelerate runtime. This study is part of an ongoing effort to develop computerized training tools for cryosurgery, with prostate cryosurgery as a developmental model. The ability to perform rapid simulations of various test cases is critical to facilitate sound decision making associated with medical training. Consistent with clinical practice, the training tool aims at correlating the frozen region contour and the corresponding temperature field with the target region shape. The current study focuses on the feasibility of graphics processing unit-based computation using C++ accelerated massive parallelism, as one possible implementation. Benchmark results on a variety of computation platforms display between 3-fold acceleration (laptop) and 13-fold acceleration (gaming computer) of cryosurgery simulation, in comparison with the more common implementation on a multicore central processing unit. While the general concept of graphics processing unit-based simulations is not new, its application to phase-change problems, combined with the unique requirements for cryosurgery optimization, represents the core contribution of the current study. PMID:25941162

  12. iMOSFLM: a new graphical interface for diffraction-image processing with MOSFLM

    PubMed Central

    Battye, T. Geoff G.; Kontogiannis, Luke; Johnson, Owen; Powell, Harold R.; Leslie, Andrew G. W.

    2011-01-01

    iMOSFLM is a graphical user interface to the diffraction data-integration program MOSFLM. It is designed to simplify data processing by dividing the process into a series of steps, which are normally carried out sequentially. Each step has its own display pane, allowing control over parameters that influence that step and providing graphical feedback to the user. Suitable values for integration parameters are set automatically, but additional menus provide a detailed level of control for experienced users. The image display and the interfaces to the different tasks (indexing, strategy calculation, cell refinement, integration and history) are described. The most important parameters for each step and the best way of assessing success or failure are discussed. PMID:21460445

  13. TRIIG - Time-lapse reproduction of images through interactive graphics. [digital processing of quality hard copy

    NASA Technical Reports Server (NTRS)

    Buckner, J. D.; Council, H. W.; Edwards, T. R.

    1974-01-01

    Description of the hardware and software implementing the system of time-lapse reproduction of images through interactive graphics (TRIIG). The system produces a quality hard copy of processed images in a fast and inexpensive manner. This capability allows for optimal development of processing software through the rapid viewing of many image frames in an interactive mode. Three critical optical devices are used to reproduce an image: an Optronics photo reader/writer, the Adage Graphics Terminal, and Polaroid Type 57 high speed film. Typical sources of digitized images are observation satellites, such as ERTS or Mariner, computer coupled electron microscopes for high-magnification studies, or computer coupled X-ray devices for medical research.

  14. Particle-in-cell Simulations with Charge-Conserving Current Deposition on Graphic Processing Units

    NASA Astrophysics Data System (ADS)

    Kong, Xianglong; Huang, Michael; Ren, Chuang; Decyk, Viktor

    2010-11-01

    We present an implementation of a fully relativistic, electromagnetic PIC code, with charge-conserving current deposition, on graphics processing units (GPUs) with NVIDIA's massively multithreaded computing architecture CUDA. A particle-based computation thread assignment was used in the current deposition scheme and write conflicts among the threads were resolved by a thread racing technique. A parallel particle sorting scheme was also developed and used. The implementation took advantage of fast on-chip shared memory. The 2D implementation achieved a one particle-step process time of 2.28 ns for cold plasma runs and 8.53 ns for extremely relativistic plasma runs on a GTX 280 graphics card, which were respectively 90 and 29 times faster than a single threaded state-of-art CPU code. A comparable speedup was also achieved for the 3D implementation.
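
    Write conflicts in deposition arise because nearby particles scatter their contributions to the same grid nodes. The paper resolves this with a thread racing technique; the simpler (and usually slower) alternative sketched below uses atomicAdd with cloud-in-cell weights for plain charge deposition, just to make the conflict structure concrete. Names, the float2 position layout in grid units, and the in-bounds assumption are all illustrative.

        // deposit_atomic.cu -- charge deposition with linear (CIC) weights,
        // conflicts resolved by atomics rather than the paper's thread racing.
        // Particles are assumed to lie inside [0, nx-2] x [0, ny-2] in grid units.
        #include <cuda_runtime.h>

        __global__ void deposit(const float2* pos, const float* q, float* rho,
                                int np, int nx, int ny)
        {
            int p = blockIdx.x * blockDim.x + threadIdx.x;
            if (p >= np) return;
            // Cell containing the particle and fractional offsets inside it.
            int   i = (int)pos[p].x,  j = (int)pos[p].y;
            float fx = pos[p].x - i,  fy = pos[p].y - j;
            // Scatter to the four surrounding nodes; atomics serialize collisions.
            atomicAdd(&rho[ j      * nx + i    ], q[p] * (1 - fx) * (1 - fy));
            atomicAdd(&rho[ j      * nx + i + 1], q[p] *      fx  * (1 - fy));
            atomicAdd(&rho[(j + 1) * nx + i    ], q[p] * (1 - fx) *      fy );
            atomicAdd(&rho[(j + 1) * nx + i + 1], q[p] *      fx  *      fy );
        }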

  15. Signal processing for ION mobility spectrometers

    NASA Technical Reports Server (NTRS)

    Taylor, S.; Hinton, M.; Turner, R.

    1995-01-01

    Signal processing techniques for systems based upon Ion Mobility Spectrometry will be discussed in the light of 10 years of experience in the design of real-time IMS. Among the topics to be covered are compensation techniques for variations in the number density of the gas - the use of an internal standard (a reference peak) or pressure and temperature sensors. Sources of noise and methods for noise reduction will be discussed together with resolution limitations and the ability of deconvolution techniques to improve resolving power. The use of neural networks (either by themselves or as a component part of a processing system) will be reviewed.

  16. Process analysis using ion mobility spectrometry.

    PubMed

    Baumbach, J I

    2006-03-01

    Ion mobility spectrometry, originally used to detect chemical warfare agents, explosives and illegal drugs, is now frequently applied in the field of process analytics. The method combines both high sensitivity (detection limits down to the ng to pg per liter and ppb(v)/ppt(v) ranges) and relatively low technical expenditure with a high-speed data acquisition. In this paper, the working principles of IMS are summarized with respect to the advantages and disadvantages of the technique. Different ionization techniques, sample introduction methods and preseparation methods are considered. Proven applications of different types of ion mobility spectrometer (IMS) used at ISAS will be discussed in detail: monitoring of gas insulated substations, contamination in water, odoration of natural gas, human breath composition and metabolites of bacteria. The example applications discussed relate to purity (gas insulated substations), ecology (contamination of water resources), plants and person safety (odoration of natural gas), food quality control (molds and bacteria) and human health (breath analysis). PMID:16132133

  17. Advanced colour processing for mobile devices

    NASA Astrophysics Data System (ADS)

    Gillich, Eugen; Dörksen, Helene; Lohweg, Volker

    2015-02-01

    Mobile devices such as smartphones are going to play an important role in professional image processing tasks. However, mobile systems were not designed for such applications, especially in terms of image processing requirements like stability and robustness. One major drawback is the automatic white balance that comes with the devices. It is necessary for many applications, but of no use when applied to shiny surfaces. Such an issue appears when image acquisition takes place under differently coloured illumination caused by different environments. This results in inhomogeneous appearances of the same subject. In our paper we show a new approach for handling the complex task of generating a low-noise and sharp image without spatial filtering. Our method is based on analyzing the spectral and saturation distribution of the channels. Furthermore, the RGB space is transformed into a more convenient space, a particular HSI space. We generate the greyscale image by a control procedure that takes the colour channels into account. This results in an adaptive colour mixing model with reduced noise. The optimized images are used to show how, e.g., image classification benefits from our colour adaptation approach.
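
    The RGB-to-HSI step mentioned above can be made concrete with the standard transform, one thread per pixel. The paper adapts "a particular HSI space", so treat the sketch below as the textbook formulas only; the kernel name and float3 layout are assumptions.

        // rgb2hsi.cu -- standard RGB -> HSI conversion, one thread per pixel.
        // The paper uses a particular HSI space; this is the textbook transform.
        #include <cuda_runtime.h>
        #include <math.h>

        __global__ void rgbToHsi(const float3* rgb, float3* hsi, int n)
        {
            int k = blockIdx.x * blockDim.x + threadIdx.x;
            if (k >= n) return;
            float r = rgb[k].x, g = rgb[k].y, b = rgb[k].z;   // channels in [0,1]
            float inten = (r + g + b) / 3.0f;
            float mn = fminf(r, fminf(g, b));
            float sat = (inten > 0.0f) ? 1.0f - mn / inten : 0.0f;
            float num = 0.5f * ((r - g) + (r - b));
            float den = sqrtf((r - g) * (r - g) + (r - b) * (g - b)) + 1e-6f;
            float hue = acosf(fminf(fmaxf(num / den, -1.0f), 1.0f));
            if (b > g) hue = 2.0f * 3.14159265f - hue;        // reflect for B > G
            hsi[k] = make_float3(hue, sat, inten);
        }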

  18. Systems Biology Graphical Notation: Process Description language Level 1 Version 1.3.

    PubMed

    Moodie, Stuart; Le Novère, Nicolas; Demir, Emek; Mi, Huaiyu; Villéger, Alice

    2015-01-01

    The Systems Biology Graphical Notation (SBGN) is an international community effort for standardized graphical representations of biological pathways and networks. The goal of SBGN is to provide unambiguous pathway and network maps for readers with different scientific backgrounds as well as to support efficient and accurate exchange of biological knowledge between different research communities, industry, and other players in systems biology. Three SBGN languages, Process Description (PD), Entity Relationship (ER) and Activity Flow (AF), allow for the representation of different aspects of biological and biochemical systems at different levels of detail. The SBGN Process Description language represents biological entities and processes between these entities within a network. SBGN PD focuses on the mechanistic description and temporal dependencies of biological interactions and transformations. The nodes (elements) are split into entity nodes describing, e.g., metabolites, proteins, genes and complexes, and process nodes describing, e.g., reactions and associations. The edges (connections) provide descriptions of relationships (or influences) between the nodes, such as consumption, production, stimulation and inhibition. Among all three languages of SBGN, PD is the closest to metabolic and regulatory pathways in biological literature and textbooks, but its well-defined semantics offer a superior precision in expressing biological knowledge. PMID:26528561

  19. Creating Interactive Graphical Overlays in the Advanced Weather Interactive Processing System Using Shapefiles and DGM Files

    NASA Technical Reports Server (NTRS)

    Barrett, Joe H., III; Lafosse, Richard; Hood, Doris; Hoeth, Brian

    2007-01-01

    Graphical overlays can be created in real-time in the Advanced Weather Interactive Processing System (AWIPS) using shapefiles or Denver AWIPS Risk Reduction and Requirements Evaluation (DARE) Graphics Metafile (DGM) files. This presentation describes how to create graphical overlays on-the-fly for AWIPS, by using two examples of AWIPS applications that were created by the Applied Meteorology Unit (AMU) located at Cape Canaveral Air Force Station (CCAFS), Florida. The first example is the Anvil Threat Corridor Forecast Tool, which produces a shapefile that depicts a graphical threat corridor of the forecast movement of thunderstorm anvil clouds, based on the observed or forecast upper-level winds. This tool is used by the Spaceflight Meteorology Group (SMG) at Johnson Space Center, Texas and 45th Weather Squadron (45 WS) at CCAFS to analyze the threat of natural or space vehicle-triggered lightning over a location. The second example is a launch and landing trajectory tool that produces a DGM file that plots the ground track of space vehicles during launch or landing. The trajectory tool can be used by SMG and the 45 WS forecasters to analyze weather radar imagery along a launch or landing trajectory. The presentation will list the advantages and disadvantages of both file types for creating interactive graphical overlays in future AWIPS applications. Shapefiles are a popular format used extensively in Geographical Information Systems. They are usually used in AWIPS to depict static map backgrounds. A shapefile stores the geometry and attribute information of spatial features in a dataset (ESRI 1998). Shapefiles can contain point, line, and polygon features. Each shapefile contains a main file, index file, and a dBASE table. The main file contains a record for each spatial feature, which describes the feature with a list of its vertices. The index file contains the offset of each record from the beginning of the main file. The dBASE table contains records for each

  20. Efficient neighbor list calculation for molecular simulation of colloidal systems using graphics processing units

    NASA Astrophysics Data System (ADS)

    Howard, Michael P.; Anderson, Joshua A.; Nikoubashman, Arash; Glotzer, Sharon C.; Panagiotopoulos, Athanassios Z.

    2016-06-01

    We present an algorithm based on linear bounding volume hierarchies (LBVHs) for computing neighbor (Verlet) lists using graphics processing units (GPUs) for colloidal systems characterized by large size disparities. We compare this to a GPU implementation of the current state-of-the-art CPU algorithm based on stenciled cell lists. We report benchmarks for both neighbor list algorithms in a Lennard-Jones binary mixture with synthetic interaction range disparity and a realistic colloid solution. LBVHs outperformed the stenciled cell lists for systems with moderate or large size disparity and dilute or semidilute fractions of large particles, conditions typical of colloidal systems.
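
    For scale, the baseline that both algorithms improve on can be written as a brute-force O(N²) Verlet-list build, one thread per particle, with no periodic boundary conditions. Anything beyond toy sizes needs the stenciled cell lists or LBVH traversal the paper benchmarks; the names and the fixed per-particle capacity maxN below are assumptions.

        // verlet_bruteforce.cu -- O(N^2) neighbor (Verlet) list build, the naive
        // baseline that cell lists and LBVHs accelerate. 'maxN' caps list length;
        // no periodic boundary conditions are applied.
        #include <cuda_runtime.h>

        __global__ void buildList(const float3* pos, int n, float rcut2,
                                  int* nlist, int* ncount, int maxN)
        {
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            if (i >= n) return;
            int count = 0;
            for (int j = 0; j < n; ++j) {
                if (j == i) continue;
                float dx = pos[i].x - pos[j].x;
                float dy = pos[i].y - pos[j].y;
                float dz = pos[i].z - pos[j].z;
                if (dx * dx + dy * dy + dz * dz < rcut2 && count < maxN)
                    nlist[i * maxN + count++] = j;   // record neighbor index
            }
            ncount[i] = count;
        }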

  1. Solution of relativistic quantum optics problems using clusters of graphical processing units

    SciTech Connect

    Gordon, D.F.; Hafizi, B.; Helle, M.H.

    2014-06-15

    Numerical solution of relativistic quantum optics problems requires high performance computing due to the rapid oscillations in a relativistic wavefunction. Clusters of graphical processing units are used to accelerate the computation of a time dependent relativistic wavefunction in an arbitrary external potential. The stationary states in a Coulomb potential and uniform magnetic field are determined analytically and numerically, so that they can be used as initial conditions in fully time dependent calculations. Relativistic energy levels in extreme magnetic fields are recovered as a means of validation. The relativistic ionization rate is computed for an ion illuminated by a laser field near the usual barrier suppression threshold, and the ionizing wavefunction is displayed.

  2. Graphics processing unit-accelerated double random phase encoding for fast image encryption

    NASA Astrophysics Data System (ADS)

    Lee, Jieun; Yi, Faliu; Saifullah, Rao; Moon, Inkyu

    2014-11-01

    We propose a fast double random phase encoding (DRPE) algorithm using a graphics processing unit (GPU)-based stream-processing model. A performance analysis of the accelerated DRPE implementation, which employs the Compute Unified Device Architecture programming environment, is presented. We show that the proposed methodology executed on a GPU can dramatically increase encryption speed compared with sequential computing on a central processing unit. Our experimental results demonstrate that when encrypting an image of 1000×1000 pixels, where each pixel has a 32-bit depth, our GPU version of the DRPE scheme can be approximately two times faster than the advanced encryption standard algorithm implemented on a GPU. In addition, the quality of parallel processing in the presented DRPE acceleration method is evaluated with performance parameters such as speedup, efficiency, and redundancy.
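
    The DRPE pipeline itself is compact: modulate the image with one random phase mask, Fourier transform, modulate with a second mask, and inverse transform. The sketch below shows that structure in CUDA using cuFFT; it is our illustration with hypothetical names, not the authors' implementation:

```cuda
#include <cufft.h>
#include <cuComplex.h>

// Pointwise modulation by exp(i*2*pi*mask[i]); mask holds uniform
// random values in [0,1).
__global__ void applyPhaseMask(cufftComplex* img, const float* mask, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    float phase = 2.0f * 3.14159265f * mask[i];
    cufftComplex m = make_cuComplex(cosf(phase), sinf(phase));
    img[i] = cuCmulf(img[i], m);
}

// Encrypt an nx-by-ny complex image in place with two phase masks.
void drpeEncrypt(cufftComplex* d_img, const float* d_mask1,
                 const float* d_mask2, int nx, int ny)
{
    int n = nx * ny, threads = 256, blocks = (n + threads - 1) / threads;
    cufftHandle plan;
    cufftPlan2d(&plan, ny, nx, CUFFT_C2C);                    // rows, cols

    applyPhaseMask<<<blocks, threads>>>(d_img, d_mask1, n);   // spatial mask
    cufftExecC2C(plan, d_img, d_img, CUFFT_FORWARD);
    applyPhaseMask<<<blocks, threads>>>(d_img, d_mask2, n);   // Fourier mask
    cufftExecC2C(plan, d_img, d_img, CUFFT_INVERSE);          // unnormalized

    cufftDestroy(plan);
}
```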

  3. General Purpose Graphics Processing Unit Based High-Rate Rice Decompression and Reed-Solomon Decoding.

    SciTech Connect

    Loughry, Thomas A.

    2015-02-01

    As the volume of data acquired by space-based sensors increases, mission data compression/decompression and forward error correction code processing performance must likewise scale. This competency development effort was explored using the General Purpose Graphics Processing Unit (GPGPU) to accomplish high-rate Rice Decompression and high-rate Reed-Solomon (RS) decoding at the satellite mission ground station. Each algorithm was implemented and benchmarked on a single GPGPU. Distributed processing across one to four GPGPUs was also investigated. The results show that the GPGPU has considerable potential for performing satellite communication Data Signal Processing, with three times or better performance improvements and up to ten times reduction in cost over custom hardware, at least in the case of Rice Decompression and Reed-Solomon Decoding.

  4. Pre and post processing using the IBM 3277 display station graphics attachment (RPQ7H0284)

    NASA Technical Reports Server (NTRS)

    Burroughs, S. H.; Lawlor, M. B.; Miller, I. M.

    1978-01-01

    A graphical interactive procedure operating under TSO and utilizing two CRT display terminals is shown to be an effective means of accomplishing mesh generation, establishing boundary conditions, and reviewing graphic output for finite element analysis activity.

  5. Modified graphical autocatalytic set model of combustion process in circulating fluidized bed boiler

    NASA Astrophysics Data System (ADS)

    Yusof, Nurul Syazwani; Bakar, Sumarni Abu; Ismail, Razidah

    2014-07-01

    Circulating Fluidized Bed Boiler (CFB) is a device for generating steam by burning fossil fuels in a furnace operating under a special hydrodynamic condition. Autocatalytic set theory has provided a graphical model of the chemical reactions that occur during the combustion process in a CFB. Eight important chemical substances, known as species, are represented as nodes, and catalytic relationships between nodes are represented by the edges of the graph. In this paper, the model is extended and modified by considering other relevant chemical reactions that also exist during the process. Catalytic relationships among the species in the model are discussed. The result reveals that the modified model is able to give a fuller explanation of the relationships among the species during the process at initial time t.

  6. Computation of Large Covariance Matrices by SAMMY on Graphical Processing Units and Multicore CPUs

    SciTech Connect

    Arbanas, Goran; Dunn, Michael E; Wiarda, Dorothea

    2011-01-01

    Computational power of Graphical Processing Units and multicore CPUs was harnessed by the nuclear data evaluation code SAMMY to speed up computations of large Resonance Parameter Covariance Matrices (RPCMs). This was accomplished by linking SAMMY to vendor-optimized implementations of the matrix-matrix multiplication subroutine of the Basic Linear Algebra Subprograms (BLAS) library to compute the most time-consuming step. The U-235 RPCM, computed previously using a triple-nested loop, was re-computed using the NVIDIA implementation of the subroutine on a single Tesla Fermi Graphical Processing Unit, and also using Intel's Math Kernel Library implementation on two different multicore CPU systems. A multiplication of two matrices of dimensions 16,000 x 20,000 that had previously taken days took approximately one minute on the GPU. Similar performance was achieved on a dual six-core CPU system. The magnitude of the speed-up suggests that these, or similar, combinations of hardware and libraries may be useful for large matrix operations in SAMMY. Uniform interfaces of standard linear algebra libraries make them a promising candidate for a programming framework of a new generation of SAMMY for the emerging heterogeneous computing platforms.
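
    The pattern described here is straightforward to reproduce: hand the dominant matrix-matrix product to a vendor BLAS. A minimal cuBLAS sketch is shown below; this is our illustration, not SAMMY code, and it assumes column-major matrices already resident on the device:

```cuda
#include <cublas_v2.h>

// Compute C = A * B on the GPU, where A is m x k, B is k x n, C is m x n,
// all column-major with leading dimension equal to their row count.
void gemmOnGpu(const double* d_A, const double* d_B, double* d_C,
               int m, int n, int k)
{
    cublasHandle_t handle;
    cublasCreate(&handle);
    const double alpha = 1.0, beta = 0.0;
    cublasDgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N,
                m, n, k, &alpha, d_A, m, d_B, k, &beta, d_C, m);
    cublasDestroy(handle);
}
```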

  7. BarraCUDA - a fast short read sequence aligner using graphics processing units

    PubMed Central

    2012-01-01

    Background With the maturation of next-generation DNA sequencing (NGS) technologies, the throughput of DNA sequencing reads has soared to over 600 gigabases from a single instrument run. General-purpose computing on graphics processing units (GPGPU) extracts the computing power from hundreds of parallel stream processors within graphics processing cores and provides a cost-effective and energy-efficient alternative to traditional high-performance computing (HPC) clusters. In this article, we describe the implementation of BarraCUDA, a GPGPU sequence alignment software based on BWA, to accelerate the alignment of sequencing reads generated by these instruments to a reference DNA sequence. Findings Using the NVIDIA Compute Unified Device Architecture (CUDA) software development environment, we ported the most computationally intensive alignment component of BWA to the GPU to take advantage of its massive parallelism. As a result, BarraCUDA offers an order-of-magnitude boost in alignment throughput compared to a CPU core while delivering the same level of alignment fidelity. The software is also capable of supporting multiple CUDA devices in parallel to further accelerate the alignment throughput. Conclusions BarraCUDA is designed to take advantage of the parallelism of GPUs to accelerate the alignment of millions of sequencing reads generated by NGS instruments. By doing this, we could, at least in part, streamline the current bioinformatics pipeline such that the wider scientific community could benefit from the sequencing technology. BarraCUDA is currently available from http://seqbarracuda.sf.net PMID:22244497

  8. Data processing and presentation for a personalised, image-driven medical graphical avatar.

    PubMed

    de Ridder, Michael; Bi, Lei; Constantinescu, Liviu; Kim, Jinman; Feng, David Dagan

    2013-01-01

    With the continuing digital revolution in the healthcare industry, patients are being confronted with the difficult task of managing their digital medical data. Current personal health record (PHR) systems are able to store and consolidate this data, but they are limited in providing tools to facilitate patients' understanding and management of the data. One reason for this stems from the limited use of contextual information, especially in presenting spatial details such as in volumetric images and videos, as well as time-based temporal data. Further, PHRs lack meaningful visualisation techniques for representing the data they store. In this paper we propose a medical graphical avatar (MGA) constructed from whole-body patient images, together with a navigable timeline of the patient's medical records. A data mapping framework is presented that extracts information from medical multimedia data such as images, video and text to populate our PHR timeline, while also embedding spatial and textual annotations such as regions of interest (ROIs) that are automatically derived from image processing algorithms. We developed a prototype to process the various forms of PHR data and present the data in a graphical avatar. We analysed the usefulness of our system under various scenarios of patient data use and present preliminary results indicating that our system performs well on standard consumer hardware. PMID:24110654

  9. Using Graphical Processing Units to Accelerate Orthorectification, Atmospheric Correction and Transformations for Big Data

    NASA Astrophysics Data System (ADS)

    O'Connor, A. S.; Justice, B.; Harris, A. T.

    2013-12-01

    Graphics Processing Units (GPUs) are high-performance multiple-core processors capable of very high computational speeds and large data throughput. Modern GPUs are inexpensive and widely available commercially. These are general-purpose parallel processors with support for a variety of programming interfaces, including industry-standard languages such as C. GPU implementations of algorithms that are well suited for parallel processing can often achieve speedups of several orders of magnitude over optimized CPU codes. Significant improvements in speed for imagery orthorectification, atmospheric correction, target detection and image transformations like Independent Component Analysis (ICA) have been achieved using GPU-based implementations. Additional optimizations, when factored in with GPU processing capabilities, can provide a 50x - 100x reduction in the time required to process large imagery. Exelis Visual Information Solutions (VIS) has implemented a CUDA-based GPU processing framework for accelerating ENVI and IDL processes that can best take advantage of parallelization. Testing performed by Exelis VIS shows that orthorectification of a 35,000 x 35,000 pixel WorldView-1 image can take as long as two hours on a CPU; with GPU orthorectification, the same process takes three minutes. By speeding up image processing, imagery can be put to use by first responders and by scientists making rapid discoveries with near real-time data, and data centers gain an operational means to quickly process and disseminate data.

  10. Modified Anderson Method for Accelerating 3D-RISM Calculations Using Graphics Processing Unit.

    PubMed

    Maruyama, Yutaka; Hirata, Fumio

    2012-09-11

    A fast algorithm is proposed to solve the three-dimensional reference interaction site model (3D-RISM) theory on a graphics processing unit (GPU). 3D-RISM theory is a powerful tool for investigating biomolecular processes in solution; however, such calculations are often both memory-intensive and time-consuming. We sought to accelerate these calculations using GPUs, but to work around the problem of limited memory size in GPUs, we modified the less memory-intensive "Anderson method" to give faster convergence of 3D-RISM calculations. Using this method on a Tesla C2070 GPU, we reduced the total computational time by a factor of 8 (1.4-fold from the modified Anderson method and 5.7-fold from the GPU) compared to calculations with the conventional method on an Intel Xeon machine (eight cores, 3.33 GHz). PMID:26605714
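
    For background, the plain (unmodified) Anderson scheme that such solvers build on can be written compactly. In our notation (not taken from the paper), the simplest two-level update for a fixed-point problem x = G(x) is:

```latex
% Two-level Anderson mixing, with residual g_n = G(x_n) - x_n
% and damping parameter \beta:
\theta_n = \arg\min_{\theta}\,
           \bigl\| (1-\theta)\, g_n + \theta\, g_{n-1} \bigr\|^2 ,
\qquad
x_{n+1} = (1-\theta_n)\bigl[x_n + \beta g_n\bigr]
        + \theta_n\bigl[x_{n-1} + \beta g_{n-1}\bigr] .
```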

  11. SeqTrace: A Graphical Tool for Rapidly Processing DNA Sequencing Chromatograms

    PubMed Central

    Stucky, Brian J.

    2012-01-01

    Modern applications of Sanger DNA sequencing often require converting a large number of chromatogram trace files into high-quality DNA sequences for downstream analyses. Relatively few nonproprietary software tools are available to assist with this process. SeqTrace is a new, free, and open-source software application that is designed to automate the entire workflow by facilitating easy batch processing of large numbers of trace files. SeqTrace can identify, align, and compute consensus sequences from matching forward and reverse traces, filter low-quality base calls, and end-trim finished sequences. The software features a graphical interface that includes a full-featured chromatogram viewer and sequence editor. SeqTrace runs on most popular operating systems and is freely available, along with supporting documentation, at http://seqtrace.googlecode.com/. PMID:22942788

  12. Acceleration of Early-Photon Fluorescence Molecular Tomography with Graphics Processing Units

    PubMed Central

    Wang, Xin; Zhang, Bin; Cao, Xu; Liu, Fei; Luo, Jianwen; Bai, Jing

    2013-01-01

    Fluorescence molecular tomography (FMT) with early photons can improve the spatial resolution and fidelity of the reconstructed results. However, its computational scale is always large, which limits its applications. In this paper, we introduce an acceleration strategy for early-photon FMT with graphics processing units (GPUs). The whole solution of FMT was divided into several modules and the time consumption of each module was studied. In this strategy, the two most time-consuming modules (the Gd and W modules) were accelerated on the GPU, while the other modules remained coded in MATLAB. Several simulation studies with a heterogeneous digital mouse atlas were performed to confirm the performance of the acceleration strategy. The results confirmed the feasibility of the strategy and showed that the processing speed was improved significantly. PMID:23606899

  13. Stereo system based on a graphics processing unit for pedestrian detection and tracking

    NASA Astrophysics Data System (ADS)

    Nam, Bodam; Kang, Sungil; Hong, Hyunki; Eem, Changkyoung

    2010-12-01

    This paper presents a novel stereo system, based on a graphics processing unit (GPU), for pedestrian detection in real images. The process of obtaining a dense disparity map and the edge properties of the scene to extract a region of interest (ROI) is designed on a GPU for real-time applications. After extracting the histograms of the oriented gradients on the ROIs, a support vector machine classifies them as pedestrian and nonpedestrian types. The system employs the recognition-by-components method, which compensates for the pose and articulation changes of pedestrians. In order to effectively track spatial pedestrian estimates over sequences, subwindows at distinctive parts of human beings are used as measurements for the Kalman filter.

  14. High-speed nonlinear finite element analysis for surgical simulation using graphics processing units.

    PubMed

    Taylor, Z A; Cheng, M; Ourselin, S

    2008-05-01

    The use of biomechanical modelling, especially in conjunction with finite element analysis, has become common in many areas of medical image analysis and surgical simulation. Clinical employment of such techniques is hindered by conflicting requirements for high fidelity in the modelling approach, and fast solution speeds. We report the development of techniques for high-speed nonlinear finite element analysis for surgical simulation. We use a fully nonlinear total Lagrangian explicit finite element formulation which offers significant computational advantages for soft tissue simulation. However, the key contribution of the work is the presentation of a fast graphics processing unit (GPU) solution scheme for the finite element equations. To the best of our knowledge, this represents the first GPU implementation of a nonlinear finite element solver. We show that the present explicit finite element scheme is well suited to solution via highly parallel graphics hardware, and that even a midrange GPU allows significant solution speed gains (up to 16.8x) compared with equivalent CPU implementations. For the models tested the scheme allows real-time solution of models with up to 16,000 tetrahedral elements. The use of GPUs for such purposes offers a cost-effective high-performance alternative to expensive multi-CPU machines, and may have important applications in medical image analysis and surgical simulation. PMID:18450538
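
    Explicit schemes of this kind map well onto GPUs because each degree of freedom updates independently once the internal nodal forces are gathered. The sketch below is a hedged illustration of the per-node central-difference update typical of total Lagrangian explicit solvers, not the authors' code; names are hypothetical, and the coefficients a, b, c are assumed precomputed from the lumped mass and damping:

```cuda
// One explicit time step per degree of freedom:
//   u(t+dt) = a*(R - F) + b*u(t) + c*u(t-dt)
// where R is the external load and F the internal force gathered
// from the elements in a preceding kernel.
__global__ void explicitStep(const float* R, const float* F,
                             const float* a, const float* b, const float* c,
                             const float* uNow, const float* uPrev,
                             float* uNext, int nDof)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= nDof) return;
    uNext[i] = a[i] * (R[i] - F[i]) + b[i] * uNow[i] + c[i] * uPrev[i];
}
```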

  15. Real-time nonlinear finite element analysis for surgical simulation using graphics processing units.

    PubMed

    Taylor, Zeike A; Cheng, Mario; Ourselin, Sébastien

    2007-01-01

    Clinical employment of biomechanical modelling techniques in areas of medical image analysis and surgical simulation is often hindered by conflicting requirements for high fidelity in the modelling approach and high solution speeds. We report the development of techniques for high-speed nonlinear finite element (FE) analysis for surgical simulation. We employ a previously developed nonlinear total Lagrangian explicit FE formulation which offers significant computational advantages for soft tissue simulation. However, the key contribution of the work is the presentation of a fast graphics processing unit (GPU) solution scheme for the FE equations. To the best of our knowledge this represents the first GPU implementation of a nonlinear FE solver. We show that the present explicit FE scheme is well-suited to solution via highly parallel graphics hardware, and that even a midrange GPU allows significant solution speed gains (up to 16.4x) compared with equivalent CPU implementations. For the models tested the scheme allows real-time solution of models with up to 16000 tetrahedral elements. The use of GPUs for such purposes offers a cost-effective high-performance alternative to expensive multi-CPU machines, and may have important applications in medical image analysis and surgical simulation. PMID:18051120

  16. A software architecture for multi-cellular system simulations on graphics processing units.

    PubMed

    Jeannin-Girardon, Anne; Ballet, Pascal; Rodin, Vincent

    2013-09-01

    The first aim of simulation in a virtual environment is to help biologists gain a better understanding of the simulated system. The cost of such simulation is significantly reduced compared to that of in vivo experimentation. However, the inherent complexity of biological systems makes them hard to simulate on non-parallel architectures: models might be made of sub-models and take several scales into account, and the number of simulated entities may be quite large. Today, graphics cards are used for general-purpose computing, which has been made easier thanks to frameworks like CUDA or OpenCL. Parallelization of models may however not be easy: parallel computer programming skills are often required, and several hardware architectures may be used to execute models. In this paper, we present the software architecture we built in order to implement various models able to simulate multi-cellular systems. This architecture is modular and implements data structures adapted to graphics processing unit architectures. It allows efficient simulation of biological mechanisms. PMID:23900760

  17. Open-source graphics processing unit-accelerated ray tracer for optical simulation

    NASA Astrophysics Data System (ADS)

    Mauch, Florian; Gronle, Marc; Lyda, Wolfram; Osten, Wolfgang

    2013-05-01

    Ray tracing is still the workhorse in optical design and simulation. Its basic principle, propagating light as a set of mutually independent rays, implies a linear dependency of the computational effort on the number of rays involved in the problem. At the same time, the mutual independence of the light rays bears a huge potential for parallelization of the computational load. This potential has recently been recognized in the visualization community, where graphics processing unit (GPU)-accelerated ray tracing is used to render photorealistic images. However, precision requirements in optical simulation are substantially higher than in visualization, and therefore performance results known from visualization cannot be expected to transfer to optical simulation one-to-one. In this contribution, we present an open-source implementation of a GPU-accelerated ray tracer, based on NVIDIA's acceleration engine OptiX, that traces in double precision and exploits the massively parallel architecture of modern graphics cards. We compare its performance to a CPU-based tracer that has been developed in parallel.

  18. Lossy hyperspectral image compression tuned for spectral mixture analysis applications on NVidia graphics processing units

    NASA Astrophysics Data System (ADS)

    Plaza, Antonio; Plaza, Javier; Sánchez, Sergio; Paz, Abel

    2009-08-01

    In this paper, we develop a computationally efficient approach for lossy compression of remotely sensed hyperspectral images which has been specifically tuned to preserve the relevant information required in spectral mixture analysis (SMA) applications. The proposed method is based on two steps: 1) endmember extraction, and 2) linear spectral unmixing. Two endmember extraction algorithms, the pixel purity index (PPI) and automatic morphological endmember extraction (AMEE), and a fully constrained linear spectral unmixing (FCLSU) algorithm have been considered in this work to devise the proposed lossy compression strategy. The proposed methodology has been implemented on NVIDIA graphics processing units (GPUs). Our experiments demonstrate that it can achieve very high compression ratios when applied to standard hyperspectral data sets, and can also retain the relevant information required for spectral unmixing in a computationally efficient way, achieving speedups on the order of 26x on an NVIDIA GeForce 8800 GTX graphics card when compared to an optimized implementation of the same code on a dual-core CPU.

  19. SU-E-P-59: A Graphical Interface for XCAT Phantom Configuration, Generation and Processing

    SciTech Connect

    Myronakis, M; Cai, W; Dhou, S; Cifter, F; Lewis, J; Hurwitz, M

    2015-06-15

    Purpose: To design a comprehensive open-source, publicly available, graphical user interface (GUI) to facilitate the configuration, generation, processing and use of the 4D Extended Cardiac-Torso (XCAT) phantom. Methods: The XCAT phantom includes over 9000 anatomical objects as well as respiratory, cardiac and tumor motion. It is widely used for research studies in medical imaging and radiotherapy. The phantom generation process involves the configuration of a text script to parameterize the geometry, motion, and composition of the whole body and objects within it, and to generate simulated PET or CT images. To avoid the need for manual editing or script writing, our MATLAB-based GUI uses slider controls, drop-down lists, buttons and graphical text input to parameterize and process the phantom. Results: Our GUI can be used to: a) generate parameter files; b) generate the voxelized phantom; c) combine the phantom with a lesion; d) display the phantom; e) produce average and maximum intensity images from the phantom output files; f) incorporate irregular patient breathing patterns; and g) generate DICOM files containing phantom images. The GUI provides local help information using tool-tip strings on the currently selected phantom, minimizing the need for external documentation. The DICOM generation feature is intended to simplify the process of importing the phantom images into radiotherapy treatment planning systems or other clinical software. Conclusion: The GUI simplifies and automates the use of the XCAT phantom for imaging-based research projects in medical imaging or radiotherapy. This has the potential to accelerate research conducted with the XCAT phantom, or to ease the learning curve for new users. This tool does not include the XCAT phantom software itself. We would like to acknowledge funding from MRA, Varian Medical Systems Inc.

  20. Real-time display on Fourier domain optical coherence tomography system using a graphics processing unit.

    PubMed

    Watanabe, Yuuki; Itagaki, Toshiki

    2009-01-01

    Fourier domain optical coherence tomography (FD-OCT) requires resampling of spectrally resolved depth information from wavelength to wave number, and the subsequent application of the inverse Fourier transform. The display rates of OCT images are much slower than the image acquisition rates due to processing speed limitations on most computers. We demonstrate a real-time display of processed OCT images using a linear-in-wave-number (linear-k) spectrometer and a graphics processing unit (GPU). We use the linear-k spectrometer with the combination of a diffractive grating with 1200 lines/mm and a F2 equilateral prism in the 840-nm spectral region to avoid calculating the resampling process. The calculations of the fast Fourier transform (FFT) are accelerated by the GPU with many stream processors, which realizes highly parallel processing. A display rate of 27.9 frames/sec for processed images (2048 FFT size x 1000 lateral A-scans) is achieved in our OCT system using a line scan CCD camera operated at 27.9 kHz. PMID:20059237
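
    The processing chain above maps naturally onto a batched cuFFT call plus a trivial per-pixel kernel; with a linear-k spectrometer no resampling step is needed first. A hedged sketch follows (our illustration with hypothetical names; display-stage scaling details are omitted):

```cuda
#include <cufft.h>
#include <cuComplex.h>

// Convert complex FFT output to log magnitude (dB) for B-scan display.
__global__ void logMagnitude(const cufftComplex* spec, float* img, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) img[i] = 20.0f * log10f(cuCabsf(spec[i]) + 1e-12f);
}

// Transform a whole frame of A-scans with one batched 1D FFT plan.
void processFrame(cufftComplex* d_spectra, float* d_image,
                  int fftSize /*e.g. 2048*/, int nAScans /*e.g. 1000*/)
{
    cufftHandle plan;
    cufftPlan1d(&plan, fftSize, CUFFT_C2C, nAScans);   // one plan, batched
    cufftExecC2C(plan, d_spectra, d_spectra, CUFFT_FORWARD);

    int n = fftSize * nAScans, threads = 256;
    logMagnitude<<<(n + threads - 1) / threads, threads>>>(d_spectra,
                                                           d_image, n);
    cufftDestroy(plan);
}
```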

  1. Real-time display on Fourier domain optical coherence tomography system using a graphics processing unit

    NASA Astrophysics Data System (ADS)

    Watanabe, Yuuki; Itagaki, Toshiki

    2009-11-01

    Fourier domain optical coherence tomography (FD-OCT) requires resampling of spectrally resolved depth information from wavelength to wave number, and the subsequent application of the inverse Fourier transform. The display rates of OCT images are much slower than the image acquisition rates due to processing speed limitations on most computers. We demonstrate a real-time display of processed OCT images using a linear-in-wave-number (linear-k) spectrometer and a graphics processing unit (GPU). We use the linear-k spectrometer with the combination of a diffractive grating with 1200 lines/mm and a F2 equilateral prism in the 840-nm spectral region to avoid calculating the resampling process. The calculations of the fast Fourier transform (FFT) are accelerated by the GPU with many stream processors, which realizes highly parallel processing. A display rate of 27.9 frames/sec for processed images (2048 FFT size×1000 lateral A-scans) is achieved in our OCT system using a line scan CCD camera operated at 27.9 kHz.

  2. Implementation and optimization of ultrasound signal processing algorithms on mobile GPU

    NASA Astrophysics Data System (ADS)

    Kong, Woo Kyu; Lee, Wooyoul; Kim, Kyu Cheol; Yoo, Yangmo; Song, Tai-Kyong

    2014-03-01

    A general-purpose graphics processing unit (GPGPU) has been used for improving computing power in medical ultrasound imaging systems. Recently, mobile GPUs have become powerful enough to handle 3D games and videos at high frame rates on Full HD or HD displays. This paper proposes a method to implement ultrasound signal processing on a mobile GPU available in a high-end smartphone (Galaxy S4, Samsung Electronics, Seoul, Korea) with programmable shaders on the OpenGL ES 2.0 platform. To maximize the performance of the mobile GPU, the shader design was optimized and the load was shared between the vertex and fragment shaders. The beamformed data were captured from a tissue-mimicking phantom (Model 539 Multipurpose Phantom, ATS Laboratories, Inc., Bridgeport, CT, USA) using a commercial ultrasound imaging system equipped with a research package (Ultrasonix Touch, Ultrasonix, Richmond, BC, Canada). The real-time performance was evaluated by frame rates while varying the range of signal processing blocks. The implementation of ultrasound signal processing on OpenGL ES 2.0 was verified by analyzing the PSNR against a MATLAB gold standard with the same signal path; the CNR was also analyzed to verify the method. From these evaluations, the proposed mobile GPU-based processing method showed no significant difference from the MATLAB processing (i.e., PSNR<52.51 dB), and comparable CNR results (i.e., 11.31) were obtained from both methods. With the mobile GPU implementation, a frame rate of 57.6 Hz was achieved; the total execution time was 17.4 ms, which was faster than the acquisition time (i.e., 34.4 ms). These results indicate that the mobile GPU-based processing method can support real-time ultrasound B-mode processing on a smartphone.
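
    The per-sample work in such a B-mode chain is simple and embarrassingly parallel. The paper implements it in OpenGL ES 2.0 shaders; purely for illustration, the same envelope-detection and log-compression stage can be sketched as a CUDA kernel (hypothetical names, not the authors' shader code):

```cuda
// Envelope detection on I/Q samples followed by log compression,
// mapping [-dynRangeDb, 0] dB to an 8-bit display value.
__global__ void bmodeStage(const float2* iq, unsigned char* img,
                           int n, float dynRangeDb, float gainDb)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    float env = sqrtf(iq[i].x * iq[i].x + iq[i].y * iq[i].y);  // envelope
    float db  = 20.0f * log10f(env + 1e-12f) + gainDb;         // log compress
    float v   = (db + dynRangeDb) / dynRangeDb;                // map to [0,1]
    img[i] = (unsigned char)(255.0f * fminf(fmaxf(v, 0.0f), 1.0f));
}
```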

  3. Parallel multigrid solver of radiative transfer equation for photon transport via graphics processing unit.

    PubMed

    Gao, Hao; Phan, Lan; Lin, Yuting

    2012-09-01

    A graphics processing unit-based parallel multigrid solver for a radiative transfer equation with vacuum boundary condition or reflection boundary condition is presented for heterogeneous media with complex geometry based on two-dimensional triangular meshes or three-dimensional tetrahedral meshes. The computational complexity of this parallel solver is linearly proportional to the degrees of freedom in both angular and spatial variables, while the full multigrid method is utilized to minimize the number of iterations. The overall speed gain is roughly 30- to 300-fold with respect to our prior multigrid solver, depending on the underlying regime and the parallelization. The numerical validations are presented with the MATLAB codes at https://sites.google.com/site/rtefastsolver/. PMID:23085905

  4. Particle-In-Cell simulations of high pressure plasmas using graphics processing units

    NASA Astrophysics Data System (ADS)

    Gebhardt, Markus; Atteln, Frank; Brinkmann, Ralf Peter; Mussenbrock, Thomas; Mertmann, Philipp; Awakowicz, Peter

    2009-10-01

    Particle-In-Cell (PIC) simulations are widely used to understand the fundamental phenomena in low-temperature plasmas. Particularly plasmas at very low gas pressures are studied using PIC methods. The inherent drawback of these methods is that they are very time consuming, since certain stability conditions have to be satisfied. This holds even more for the PIC simulation of high pressure plasmas due to the very high collision rates. The simulations take a very long time to run on standard computers and require the help of computer clusters or supercomputers. Recent advances in the field of graphics processing units (GPUs) provide every personal computer with a highly parallel multiprocessor architecture for very little money. This architecture is freely programmable and can be used to implement a wide class of problems. In this paper we present the concepts of a fully parallel PIC simulation of high pressure plasmas using the benefits of GPU programming.

  5. On the use of graphics processing units (GPUs) for molecular dynamics simulation of spherical particles

    NASA Astrophysics Data System (ADS)

    Hidalgo, R. C.; Kanzaki, T.; Alonso-Marroquin, F.; Luding, S.

    2013-06-01

    General-purpose computation on Graphics Processing Units (GPUs) on personal computers has recently become an attractive alternative to parallel computing on clusters and supercomputers. We present the GPU implementation of an accurate molecular dynamics algorithm for a system of spheres. The new hybrid CPU-GPU implementation takes into account all the degrees of freedom, including the quaternion representation of 3D rotations. For additional versatility, the contact interaction between particles is defined using a force law of enhanced generality, which accounts for the elastic and dissipative interactions, and the hard-sphere interaction parameters are translated to the soft-sphere parameter set. We prove that the algorithm complies with statistical mechanical laws by examining the homogeneous cooling of a granular gas with rotation. The results are in excellent agreement with well-established mean-field theories for low-density hard-sphere systems. This GPU technique dramatically reduces user waiting time compared with a traditional CPU implementation.

  6. Acceleration of the GAMESS-UK electronic structure package on graphical processing units.

    PubMed

    Wilkinson, Karl A; Sherwood, Paul; Guest, Martyn F; Naidoo, Kevin J

    2011-07-30

    The approach used to calculate the two-electron integrals by many electronic structure packages, including the Generalized Atomic and Molecular Electronic Structure System-UK (GAMESS-UK), has been designed for CPU-based compute units. We redesigned the two-electron compute algorithm for acceleration on a graphical processing unit (GPU). We report the acceleration strategy and illustrate it on the (ss|ss) type integrals. This strategy is general for Fortran-based codes and uses the Accelerator compiler from Portland Group International and GPU-based accelerators from NVIDIA. The evaluation of (ss|ss) type integrals within calculations using Hartree-Fock ab initio methods and density functional theory is accelerated by single and quad GPU hardware systems by factors of 43 and 153, respectively. Overall, a single self-consistent field cycle is at least eight times faster on a single GPU than on a single CPU. PMID:21541963

  7. Efficient implementation of effective core potential integrals and gradients on graphical processing units

    NASA Astrophysics Data System (ADS)

    Song, Chenchen; Wang, Lee-Ping; Sachse, Torsten; Preiß, Julia; Presselt, Martin; Martínez, Todd J.

    2015-07-01

    Effective core potential integral and gradient evaluations are accelerated via implementation on graphical processing units (GPUs). Two simple formulas are proposed to estimate the upper bounds of the integrals, and these are used for screening. A sorting strategy is designed to balance the workload between GPU threads properly. Significant improvements in performance and reduced scaling with system size are observed when combining the screening and sorting methods, and the calculations are highly efficient for systems containing up to 10 000 basis functions. The GPU implementation preserves the precision of the calculation; the ground state Hartree-Fock energy achieves good accuracy for CdSe and ZnTe nanocrystals, and energy is well conserved in ab initio molecular dynamics simulations.

  8. Efficient implementation of effective core potential integrals and gradients on graphical processing units.

    PubMed

    Song, Chenchen; Wang, Lee-Ping; Sachse, Torsten; Preiss, Julia; Presselt, Martin; Martínez, Todd J

    2015-07-01

    Effective core potential integral and gradient evaluations are accelerated via implementation on graphical processing units (GPUs). Two simple formulas are proposed to estimate the upper bounds of the integrals, and these are used for screening. A sorting strategy is designed to balance the workload between GPU threads properly. Significant improvements in performance and reduced scaling with system size are observed when combining the screening and sorting methods, and the calculations are highly efficient for systems containing up to 10 000 basis functions. The GPU implementation preserves the precision of the calculation; the ground state Hartree-Fock energy achieves good accuracy for CdSe and ZnTe nanocrystals, and energy is well conserved in ab initio molecular dynamics simulations. PMID:26156472

  9. A graphical method to evaluate predominant geochemical processes occurring in groundwater systems for radiocarbon dating

    USGS Publications Warehouse

    Han, Liang-Feng; Plummer, L. Niel; Aggarwal, Pradeep

    2012-01-01

    A graphical method is described for identifying geochemical reactions needed in the interpretation of radiocarbon age in groundwater systems. Graphs are constructed by plotting the measured 14C, δ13C, and concentration of dissolved inorganic carbon and are interpreted according to specific criteria to recognize water samples that are consistent with a wide range of processes, including geochemical reactions, carbon isotopic exchange, 14C decay, and mixing of waters. The graphs are used to provide a qualitative estimate of radiocarbon age, to deduce the hydrochemical complexity of a groundwater system, and to compare samples from different groundwater systems. Graphs of chemical and isotopic data from a series of previously-published groundwater studies are used to demonstrate the utility of the approach. Ultimately, the information derived from the graphs is used to improve geochemical models for adjustment of radiocarbon ages in groundwater systems.

  10. Acceleration of Electron Repulsion Integral Evaluation on Graphics Processing Units via Use of Recurrence Relations.

    PubMed

    Miao, Yipu; Merz, Kenneth M

    2013-02-12

    Electron repulsion integral (ERI) calculation on graphical processing units (GPUs) can significantly accelerate quantum chemical calculations. Herein, the ab initio self-consistent-field (SCF) calculation is implemented on GPUs using recurrence relations, which is one of the fastest ERI evaluation algorithms currently available. A direct-SCF scheme to assemble the Fock matrix efficiently is presented, wherein ERIs are evaluated on-the-fly to avoid CPU-GPU data transfer, a well-known architectural bottleneck in GPU specific computation. Realized speedups on GPUs reach 10-100 times relative to traditional CPU nodes, with accuracies of better than 1 × 10^-7 for systems with more than 4000 basis functions. PMID:26588740

  11. Real-time resampling in Fourier domain optical coherence tomography using a graphics processing unit.

    PubMed

    Van der Jeught, Sam; Bradu, Adrian; Podoleanu, Adrian Gh

    2010-01-01

    Fourier domain optical coherence tomography (FD-OCT) requires either a linear-in-wavenumber spectrometer or a computationally heavy software algorithm to recalibrate the acquired optical signal from wavelength to wavenumber. The first method is sensitive to the position of the prism in the spectrometer, while the second method drastically slows down the system speed when it is implemented on a serially oriented central processing unit. We implement the full resampling process on a commercial graphics processing unit (GPU), distributing the necessary calculations to many stream processors that operate in parallel. A comparison between several recalibration methods is made in terms of performance and image quality. The GPU is also used to accelerate the fast Fourier transform (FFT) and to remove the background noise, thereby achieving full GPU-based signal processing without the need for extra resampling hardware. A display rate of 25 frames/sec is achieved for processed images (1,024 x 1,024 pixels) using a line-scan charge-coupled device (CCD) camera operating at 25.6 kHz. PMID:20614994
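
    The recalibration step itself reduces to a per-sample interpolation once the wavelength-to-wavenumber mapping has been tabulated from the spectrometer calibration. A hedged sketch of such a kernel is given below (our illustration with hypothetical names; the paper compares several interpolation variants):

```cuda
// Resample each spectral line from the wavelength grid to a uniform
// wavenumber (k) grid by linear interpolation. idx[s] and w[s] are the
// left-neighbor index and fractional weight for output sample s,
// precomputed on the host from the calibration curve.
__global__ void resampleToK(const float* specLambda, float* specK,
                            const int* idx, const float* w,
                            int samplesPerLine, int nLines)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int total = samplesPerLine * nLines;
    if (i >= total) return;

    int line = i / samplesPerLine;          // which A-scan
    int s    = i % samplesPerLine;          // which output sample
    const float* src = specLambda + line * samplesPerLine;
    specK[i] = (1.0f - w[s]) * src[idx[s]] + w[s] * src[idx[s] + 1];
}
```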

  12. Real-time resampling in Fourier domain optical coherence tomography using a graphics processing unit

    NASA Astrophysics Data System (ADS)

    van der Jeught, Sam; Bradu, Adrian; Podoleanu, Adrian Gh.

    2010-05-01

    Fourier domain optical coherence tomography (FD-OCT) requires either a linear-in-wavenumber spectrometer or a computationally heavy software algorithm to recalibrate the acquired optical signal from wavelength to wavenumber. The first method is sensitive to the position of the prism in the spectrometer, while the second method drastically slows down the system speed when it is implemented on a serially oriented central processing unit. We implement the full resampling process on a commercial graphics processing unit (GPU), distributing the necessary calculations to many stream processors that operate in parallel. A comparison between several recalibration methods is made in terms of performance and image quality. The GPU is also used to accelerate the fast Fourier transform (FFT) and to remove the background noise, thereby achieving full GPU-based signal processing without the need for extra resampling hardware. A display rate of 25 frames/sec is achieved for processed images (1024×1024 pixels) using a line-scan charge-coupled device (CCD) camera operating at 25.6 kHz.

  13. Process industries - graphic arts, paint, plastics, and textiles: all cousins under the skin

    NASA Astrophysics Data System (ADS)

    Simon, Frederick T.

    2002-06-01

    The origin and selection of colors in the process industries differ depending upon how the creative process is applied and what the capabilities of the manufacturing process are. The fashion industry (clothing), with its supplier of textiles, is the leader of color innovation. Color may be introduced into textile products at several stages in the manufacturing process, from fiber through yarn and finally into fabric. The paint industry is divided into two major applications: automotive and trade sales. Automotive colors are selected by stylists who are in the employ of the automobile manufacturers. Trade sales paint, on the other hand, can be decided by paint manufacturers or by individuals who patronize custom mixing facilities. Plastics colors are for the most part decided by the industrial designers who include color as part of the design. Graphic arts (printing) is a burgeoning industry that uses color in image reproduction and package design. Except for text, printed material in color has today become the norm rather than the exception.

  14. Real-time speckle variance swept-source optical coherence tomography using a graphics processing unit

    PubMed Central

    Lee, Kenneth K. C.; Mariampillai, Adrian; Yu, Joe X. Z.; Cadotte, David W.; Wilson, Brian C.; Standish, Beau A.; Yang, Victor X. D.

    2012-01-01

    Advances in swept-source laser technology continue to increase the imaging speed of swept-source optical coherence tomography (SS-OCT) systems. These fast imaging speeds are ideal for microvascular detection schemes, such as speckle variance (SV), where interframe motion can cause severe imaging artifacts and loss of vascular contrast. However, full utilization of the laser scan speed has been hindered by the computationally intensive signal processing required by SS-OCT and SV calculations. Using a commercial graphics processing unit that has been optimized for parallel data processing, we report a complete high-speed SS-OCT platform capable of real-time data acquisition, processing, display, and saving at 108,000 lines per second. Subpixel image registration of structural images was performed in real time prior to SV calculations in order to reduce decorrelation from stationary structures induced by bulk tissue motion. The viability of the system was successfully demonstrated in a high bulk tissue motion scenario of human fingernail root imaging, where SV images (512 × 512 pixels, n = 4) were displayed at 54 frames per second. PMID:22808428

  15. Real-time blood flow visualization using the graphics processing unit.

    PubMed

    Yang, Owen; Cuccia, David; Choi, Bernard

    2011-01-01

    Laser speckle imaging (LSI) is a technique in which coherent light incident on a surface produces a reflected speckle pattern that is related to the underlying movement of optical scatterers, such as red blood cells, indicating blood flow. Image-processing algorithms can be applied to produce speckle flow index (SFI) maps of relative blood flow. We present a novel algorithm that employs the NVIDIA Compute Unified Device Architecture (CUDA) platform to perform laser speckle image processing on the graphics processing unit. Software written in C was combined with CUDA and integrated into a LabVIEW Virtual Instrument (VI) that is interfaced with a monochrome CCD camera able to acquire high-resolution raw speckle images at nearly 10 fps. With the CUDA code integrated into the LabVIEW VI, the processing and display of SFI images were also performed at ∼10 fps. We present three video examples depicting real-time flow imaging during a reactive hyperemia maneuver, fluid flow through an in vitro phantom, and real-time LSI during laser surgery of a port wine stain birthmark. PMID:21280915
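
    At the heart of LSI is a sliding-window statistic: the local speckle contrast K = σ/⟨I⟩ of the raw image, from which a flow index is derived (SFI ∝ 1/K² is one common convention; the paper's exact definition may differ). A hedged CUDA sketch with hypothetical names:

```cuda
// Compute a simple speckle flow index over a (2r+1)x(2r+1) window
// centered on each interior pixel of the raw speckle image.
__global__ void speckleFlowIndex(const float* raw, float* sfi,
                                 int width, int height, int r)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < r || y < r || x >= width - r || y >= height - r) return;

    float sum = 0.0f, sumSq = 0.0f;
    int n = (2 * r + 1) * (2 * r + 1);
    for (int dy = -r; dy <= r; ++dy)
        for (int dx = -r; dx <= r; ++dx) {
            float v = raw[(y + dy) * width + (x + dx)];
            sum += v; sumSq += v * v;
        }
    float mean = sum / n;
    float var  = sumSq / n - mean * mean;             // window variance
    float K2   = fmaxf(var, 0.0f) / (mean * mean + 1e-12f);
    sfi[y * width + x] = 1.0f / (K2 + 1e-12f);        // SFI = 1/K^2
}
```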

  16. Atmospheric process evaluation of mobile source emissions

    SciTech Connect

    1995-07-01

    During the past two decades there has been a considerable effort in the US to develop and introduce an alternative to the use of gasoline and conventional diesel fuel for transportation. The primary motives for this effort have been twofold: energy security and improvement in air quality, most notably ozone, or smog. The anticipated improvement in air quality is associated with a decrease in the atmospheric reactivity, and sometimes a decrease in the mass emission rate, of the organic gas and NOx emissions from alternative fuels when compared to conventional transportation fuels. Quantification of these air quality impacts is a prerequisite to decisions on adopting alternative fuels. The purpose of this report is to present a critical review of the procedures and data base used to assess the impact on ambient air quality of mobile source emissions from alternative and conventional transportation fuels and to make recommendations as to how this process can be improved. Alternative transportation fuels are defined as methanol, ethanol, CNG, LPG, and reformulated gasoline. Most of the discussion centers on light-duty AFVs operating on these fuels. Other advanced transportation technologies and fuels such as hydrogen, electric vehicles, and fuel cells, will not be discussed. However, the issues raised herein can also be applied to these technologies and other classes of vehicles, such as heavy-duty diesels (HDDs). An evaluation of the overall impact of AFVs on society requires consideration of a number of complex issues. It involves the development of new vehicle technology associated with engines, fuel systems, and emission control technology; the implementation of the necessary fuel infrastructure; and an appropriate understanding of the economic, health, safety, and environmental impacts associated with the use of these fuels. This report addresses the steps necessary to properly evaluate the impact of AFVs on ozone air quality.

  17. Optical diagnostics of a single evaporating droplet using fast parallel computing on graphics processing units

    NASA Astrophysics Data System (ADS)

    Jakubczyk, D.; Migacz, S.; Derkachov, G.; Woźniak, M.; Archer, J.; Kolwas, K.

    2016-09-01

    We report on the first application of graphics processing unit (GPU)-accelerated computing technology to improve the performance of numerical methods used for the optical characterization of evaporating microdroplets. Single microdroplets of various liquids with different volatility and molecular weight (glycerine, glycols, water, etc.), as well as mixtures of liquids and diverse suspensions, evaporate inside the electrodynamic trap under a chosen temperature and composition of atmosphere. The series of scattering patterns recorded from the evaporating microdroplets are processed by fitting complete Mie theory predictions with a gradientless lookup table method. We showed that computations on GPUs can be effectively applied to inverse scattering problems. In particular, our technique accelerated calculations of Mie scattering theory over 800 times relative to a single-core processor in a MATLAB environment, and almost 100 times relative to the corresponding code in C. Additionally, we overcame the problem of time-consuming data post-processing when some of the parameters (particularly the refractive index) of an investigated liquid are uncertain. Our program allows us to track the parameters characterizing the evaporating droplet nearly simultaneously with the progress of evaporation.

  18. Developing extensible lattice-Boltzmann simulators for general-purpose graphics-processing units

    SciTech Connect

    Walsh, S C; Saar, M O

    2011-12-21

    Lattice-Boltzmann methods are versatile numerical modeling techniques capable of reproducing a wide variety of fluid-mechanical behavior. These methods are well suited to parallel implementation, particularly on the single-instruction multiple-data (SIMD) parallel processing environments found in computer graphics processing units (GPUs). Although more recent programming tools dramatically improve the ease with which GPU programs can be written, the programming environment still lacks the flexibility available to more traditional CPU programs. In particular, it may be difficult to develop modular and extensible programs that require variable on-device functionality with current GPU architectures. This paper describes a process of automatic code generation that overcomes these difficulties for lattice-Boltzmann simulations. It details the development of GPU-based modules for an extensible lattice-Boltzmann simulation package - LBHydra. The performance of the automatically generated code is compared to that of equivalent purpose-written codes for single-phase, multiple-phase, and multiple-component flows. The flexibility of the new method is demonstrated by simulating a rising, dissolving droplet in a porous medium with user-generated lattice-Boltzmann models and subroutines.
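
    The per-node kernels that such generated modules implement follow a standard pattern; below is an illustrative single-relaxation-time (BGK) collision step for a D2Q9 lattice, written by us with hypothetical names, not generated LBHydra code. Weights and lattice velocities follow the standard D2Q9 model, and the equilibrium is the usual second-order expansion:

```cuda
__constant__ float w9[9]  = {4.f/9, 1.f/9, 1.f/9, 1.f/9, 1.f/9,
                             1.f/36, 1.f/36, 1.f/36, 1.f/36};
__constant__ int   cx9[9] = {0, 1, 0, -1, 0, 1, -1, -1, 1};
__constant__ int   cy9[9] = {0, 0, 1, 0, -1, 1, 1, -1, -1};

// BGK collision: relax each node's distributions toward equilibrium.
// f is stored structure-of-arrays: f[q * nNodes + n].
__global__ void bgkCollide(float* f, int nNodes, float omega)
{
    int n = blockIdx.x * blockDim.x + threadIdx.x;
    if (n >= nNodes) return;

    // gather distributions and macroscopic moments
    float fi[9], rho = 0.f, ux = 0.f, uy = 0.f;
    for (int q = 0; q < 9; ++q) {
        fi[q] = f[q * nNodes + n];
        rho += fi[q]; ux += cx9[q] * fi[q]; uy += cy9[q] * fi[q];
    }
    ux /= rho; uy /= rho;

    // relax toward the D2Q9 equilibrium distribution
    float usq = ux * ux + uy * uy;
    for (int q = 0; q < 9; ++q) {
        float cu  = cx9[q] * ux + cy9[q] * uy;
        float feq = w9[q] * rho * (1.f + 3.f*cu + 4.5f*cu*cu - 1.5f*usq);
        f[q * nNodes + n] = fi[q] + omega * (feq - fi[q]);
    }
}
```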

  19. Graphical Technique to Support the Teaching/Learning Process of Software Process Reference Models

    NASA Astrophysics Data System (ADS)

    Espinosa-Curiel, Ismael Edrein; Rodríguez-Jacobo, Josefina; Fernández-Zepeda, José Alberto

    In this paper, we propose a set of diagrams to visualize software process reference models (PRMs). The diagrams, called dimods, are a combination of visual and process modeling techniques such as rich pictures, mind maps, IDEF and RAD diagrams. We show the use of this technique by designing a set of dimods for the Mexican Software Industry Process Model (MoProSoft). Additionally, we perform an evaluation of the usefulness of dimods. The results of the evaluation show that dimods may be a support tool that facilitates the understanding, memorization, and learning of software PRMs in both software development organizations and universities. The results also show that dimods may have advantages over the traditional description methods for these types of models.

  20. A low-cost microcomputer system for interactive graphical processing of geophysical data on magnetic tape

    NASA Astrophysics Data System (ADS)

    Ulbrich, Carlton W.; Holden, Daniel N.

    In a recent Eos article (“Applications of Personal Computers in Geophysics,” Eos, November 18, 1986, p. 1321), W.H.K. Lee, J.C. Lahr, and R.E. Haberman described the uses of microcomputers in scientific work, with emphasis on applications in geophysics. One of the conclusions of their article was that common microcomputers are not convenient for processing of geophysical data in a high-level language such as Fortran because of the long times required to compile executable programs of only moderate size. They also indicate that common personal computers (PCs) are usually not equipped with tape drives and are not powerful enough to do heavy input/output (I/O) or “number crunching.” The purpose of this note is to supplement the material that Lee et al. presented by describing a relatively low-cost microcomputer system that is capable of performing interactive graphical processing of meteorological data on magnetic tape in a variety of computer languages, including Fortran.

  1. Real-time lossy compression of hyperspectral images using iterative error analysis on graphics processing units

    NASA Astrophysics Data System (ADS)

    Sánchez, Sergio; Plaza, Antonio

    2012-06-01

    Hyperspectral image compression is an important task in remotely sensed Earth observation as the dimensionality of this kind of image data is ever increasing. This requires on-board compression in order to optimize the downlink connection when sending the data to Earth. A successful algorithm for lossy compression of remotely sensed hyperspectral data is the iterative error analysis (IEA) algorithm, which applies an iterative process that allows the amount of information loss and the compression ratio to be controlled through the number of iterations. This algorithm, which is based on spectral unmixing concepts, can be computationally expensive for hyperspectral images with high dimensionality. In this paper, we develop a new parallel implementation of the IEA algorithm for hyperspectral image compression on graphics processing units (GPUs). The proposed implementation is tested on several different GPUs from NVIDIA, and is shown to exhibit real-time performance in the analysis of Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) data sets collected over different locations. The proposed algorithm and its parallel GPU implementation represent a significant advance towards real-time onboard (lossy) compression of hyperspectral data, where the quality of the compression can also be adjusted in real time.

  2. Spatial resolution recovery utilizing multi-ray tracing and graphic processing unit in PET image reconstruction

    NASA Astrophysics Data System (ADS)

    Liang, Yicheng; Peng, Hao

    2015-02-01

    Depth-of-interaction (DOI) poses a major challenge for a PET system to achieve uniform spatial resolution across the field-of-view, particularly for small animal and organ-dedicated PET systems. In this work, we implemented an analytical method to model system matrix for resolution recovery, which was then incorporated in PET image reconstruction on a graphical processing unit platform, due to its parallel processing capacity. The method utilizes the concepts of virtual DOI layers and multi-ray tracing to calculate the coincidence detection response function for a given line-of-response. The accuracy of the proposed method was validated for a small-bore PET insert to be used for simultaneous PET/MR breast imaging. In addition, the performance comparisons were studied among the following three cases: 1) no physical DOI and no resolution modeling; 2) two physical DOI layers and no resolution modeling; and 3) no physical DOI design but with a different number of virtual DOI layers. The image quality was quantitatively evaluated in terms of spatial resolution (full-width-half-maximum and position offset), contrast recovery coefficient and noise. The results indicate that the proposed method has the potential to be used as an alternative to other physical DOI designs and achieve comparable imaging performances, while reducing detector/system design cost and complexity.

  3. Particle-in-cell simulations with charge-conserving current deposition on graphic processing units

    NASA Astrophysics Data System (ADS)

    Ren, Chuang; Kong, Xianglong; Huang, Michael; Decyk, Viktor; Mori, Warren

    2011-10-01

    Recently, using CUDA, we have developed an electromagnetic Particle-in-Cell (PIC) code with charge-conserving current deposition for NVIDIA graphics processing units (GPUs) (Kong et al., Journal of Computational Physics 230, 1676 (2011)). On a Tesla M2050 (Fermi) card, the GPU PIC code can achieve a one-particle-step process time of 1.2 - 3.2 ns in 2D and 2.3 - 7.2 ns in 3D, depending on plasma temperatures. In this talk we will discuss novel algorithms for GPU PIC, including a charge-conserving current deposition scheme with little branching and parallel particle sorting. These algorithms make efficient use of the GPU shared memory. We will also discuss how to replace the computation kernels of existing parallel CPU codes while keeping their parallel structures. This work was supported by the U.S. Department of Energy under Grant Nos. DE-FG02-06ER54879 and DE-FC02-04ER54789 and by NSF under Grant Nos. PHY-0903797 and CCF-0747324.
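
    To see why deposition is the hard step on a GPU, consider the simplest possible variant. The sketch below is not the authors' charge-conserving scheme; it is a minimal 1D cloud-in-cell charge deposition, written by us with hypothetical names, illustrating the scatter pattern in which many particles update the same grid cells (serialized here with atomicAdd, contention the paper's deposition scheme and particle sorting are designed to mitigate):

```cuda
// Deposit each particle's charge*weight qw onto the two nearest grid
// points with linear (CIC) weighting.
__global__ void depositCharge(const float* xPart, float qw,
                              float* rho, int nPart, float dx, int nCells)
{
    int p = blockIdx.x * blockDim.x + threadIdx.x;
    if (p >= nPart) return;

    float xg = xPart[p] / dx;          // particle position in grid units
    int   i  = (int)floorf(xg);
    float s  = xg - i;                 // fractional offset within the cell
    if (i < 0 || i + 1 >= nCells) return;

    atomicAdd(&rho[i],     qw * (1.0f - s));
    atomicAdd(&rho[i + 1], qw * s);
}
```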

  4. Fast ray-tracing of human eye optics on Graphics Processing Units.

    PubMed

    Wei, Qi; Patkar, Saket; Pai, Dinesh K

    2014-05-01

    We present a new technique for simulating retinal image formation by tracing a large number of rays from objects in three dimensions as they pass through the optic apparatus of the eye to the retina. Simulating human optics is useful for understanding basic questions of vision science and for studying vision defects and their corrections. Because of the complexity of computing such simulations accurately, most previous efforts used simplified analytical models of the normal eye. This makes them less effective in modeling vision disorders associated with abnormal shapes of the ocular structures, which are hard to represent precisely with analytical surfaces. We have developed a computer simulator that can simulate ocular structures of arbitrary shapes, for instance represented by polygon meshes. Topographic and geometric measurements of the cornea, lens, and retina from keratometer or medical imaging data can be integrated for individualized examination. We utilize parallel processing on modern Graphics Processing Units (GPUs) to efficiently compute retinal images by tracing millions of rays. A stable retinal image can be generated within minutes. We simulated depth of field, accommodation, and chromatic aberrations, as well as astigmatism and its correction. We also show an application of the technique in patient-specific vision correction by incorporating geometric models of the orbit reconstructed from clinical medical images. PMID:24713524

  5. Graphics Processing Unit (GPU) Acceleration of the Goddard Earth Observing System Atmospheric Model

    NASA Technical Reports Server (NTRS)

    Putnam, William

    2011-01-01

    The Goddard Earth Observing System 5 (GEOS-5) is the atmospheric model used by the Global Modeling and Assimilation Office (GMAO) for a variety of applications, from long-term climate prediction at relatively coarse resolution, to data assimilation and numerical weather prediction, to very high-resolution cloud-resolving simulations. GEOS-5 is being ported to a graphics processing unit (GPU) cluster at the NASA Center for Climate Simulation (NCCS). By utilizing GPU co-processor technology, we expect to increase the throughput of GEOS-5 by at least an order of magnitude and accelerate the process of scientific exploration across all scales of global modeling, including: the large-scale, high-end application of non-hydrostatic, global, cloud-resolving modeling at 10- to 1-kilometer (km) global resolutions; intermediate-resolution seasonal climate and weather prediction at 50- to 25-km on small clusters of GPUs; and long-range, coarse-resolution climate modeling, enabled on a small box of GPUs for the individual researcher. After being ported to the GPU cluster, the primary physics components and the dynamical core of GEOS-5 have demonstrated a potential speedup of 15-40 times over conventional processor cores. Performance improvements of this magnitude reduce the required scalability of 1-km, global, cloud-resolving models from an unfathomable 6 million cores to an attainable 200,000 GPU-enabled cores.

  6. Accelerated rescaling of single Monte Carlo simulation runs with the Graphics Processing Unit (GPU).

    PubMed

    Yang, Owen; Choi, Bernard

    2013-01-01

    To interpret fiber-based and camera-based measurements of remitted light from biological tissues, researchers typically use analytical models, such as the diffusion approximation to light transport theory, or stochastic models, such as Monte Carlo modeling. To achieve rapid (ideally real-time) measurement of tissue optical properties, especially in clinical situations, there is a critical need to accelerate Monte Carlo simulation runs. In this manuscript, we report on our approach using the Graphics Processing Unit (GPU) to accelerate rescaling of single Monte Carlo runs to rapidly calculate diffuse reflectance values for different sets of tissue optical properties. We selected MATLAB to enable non-specialists in C and CUDA-based programming to use the generated open-source code. We developed a software package with four abstraction layers. To calculate a set of diffuse reflectance values from a simulated tissue with homogeneous optical properties, our rescaling GPU-based approach achieves a reduction in computation time of several orders of magnitude as compared to other GPU-based approaches. Specifically, our GPU-based approach generated a diffuse reflectance value in 0.08 ms. The transfer time from CPU to GPU memory currently is a limiting factor with GPU-based calculations. However, for calculation of multiple diffuse reflectance values, our GPU-based approach still can lead to processing that is ~3400 times faster than other GPU-based approaches. PMID:24298424
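
    The rescaling idea can be sketched compactly: a single baseline simulation stores each detected photon's total path length, and a new absorption coefficient is applied afterwards through Beer-Lambert weighting. The CUDA kernel below is a minimal illustration of that step only (the actual package is MATLAB-driven and also rescales for scattering); all names are assumptions.

      // One thread per stored photon from a single baseline run: absorption for
      // a new coefficient mua is applied through Beer-Lambert weighting of the
      // recorded path length, and a block reduction (launched with 256 threads
      // per block) produces partial sums that the host divides by the photon
      // count to obtain diffuse reflectance.
      __global__ void rescale_reflectance(const float *pathlen, // cm, per photon
                                          float mua,            // 1/cm, new value
                                          float *partial, int n)
      {
          __shared__ float sh[256];
          int tid = threadIdx.x;
          int i   = blockIdx.x * blockDim.x + tid;
          sh[tid] = (i < n) ? expf(-mua * pathlen[i]) : 0.0f;
          __syncthreads();
          for (int s = blockDim.x / 2; s > 0; s >>= 1) {  // tree reduction
              if (tid < s) sh[tid] += sh[tid + s];
              __syncthreads();
          }
          if (tid == 0) partial[blockIdx.x] = sh[0];
      }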

  7. Lossy hyperspectral image compression on a graphics processing unit: parallelization strategy and performance evaluation

    NASA Astrophysics Data System (ADS)

    Santos, Lucana; Magli, Enrico; Vitulli, Raffaele; Núñez, Antonio; López, José F.; Sarmiento, Roberto

    2013-01-01

    There is a pressing need for new hardware architectures for the implementation of algorithms for hyperspectral image compression on board satellites. Graphics processing units (GPUs) represent a very attractive option, offering the possibility to dramatically increase the computation speed in applications that are data and task parallel. An algorithm for the lossy compression of hyperspectral images is implemented on a GPU using the Nvidia compute unified device architecture (CUDA) parallel computing architecture. The parallelization strategy is explained, with emphasis on the entropy coding and bit packing phases, for which a more sophisticated strategy is necessary due to the existing data dependencies. Experimental results are obtained by comparing the performance of the GPU implementation with a single-threaded CPU implementation, showing speedups of up to 15.41. A profiling of the algorithm is provided, demonstrating the high performance of the designed parallel entropy coding phase. The accuracy of the GPU implementation is presented, as well as the effect of the configuration parameters on performance. The convenience of using GPUs for on-board processing is demonstrated, and solutions to the potential difficulties encountered when accelerating hyperspectral compression algorithms are proposed, should space-qualified GPUs become a reality in the near future.

  8. Spatial resolution recovery utilizing multi-ray tracing and graphic processing unit in PET image reconstruction.

    PubMed

    Liang, Yicheng; Peng, Hao

    2015-02-01

    Depth-of-interaction (DOI) poses a major challenge for a PET system to achieve uniform spatial resolution across the field-of-view, particularly for small animal and organ-dedicated PET systems. In this work, we implemented an analytical method to model the system matrix for resolution recovery, which was then incorporated into PET image reconstruction on a graphics processing unit platform, owing to its parallel processing capacity. The method utilizes the concepts of virtual DOI layers and multi-ray tracing to calculate the coincidence detection response function for a given line-of-response. The accuracy of the proposed method was validated for a small-bore PET insert to be used for simultaneous PET/MR breast imaging. In addition, performance was compared among the following three cases: 1) no physical DOI and no resolution modeling; 2) two physical DOI layers and no resolution modeling; and 3) no physical DOI design but with different numbers of virtual DOI layers. The image quality was quantitatively evaluated in terms of spatial resolution (full-width-half-maximum and position offset), contrast recovery coefficient, and noise. The results indicate that the proposed method has the potential to serve as an alternative to physical DOI designs and achieve comparable imaging performance, while reducing detector/system design cost and complexity. PMID:25591118

  9. Ultra-fast displaying Spectral Domain Optical Doppler Tomography system using a Graphics Processing Unit.

    PubMed

    Jeong, Hyosang; Cho, Nam Hyun; Jung, Unsang; Lee, Changho; Kim, Jeong-Yeon; Kim, Jeehyun

    2012-01-01

    We demonstrate an ultrafast-display Spectral Domain Optical Doppler Tomography system using Graphics Processing Unit (GPU) computing. The calculation of the FFT and the Doppler frequency shift is accelerated by the GPU. Our system can display processed OCT and ODT images simultaneously in real time at 120 fps for 1,024 pixels × 512 lateral A-scans. The computing time for the Doppler information depends on the size of the moving average window, but with a window size of 32 pixels the ODT computation time is only 8.3 ms, which is comparable to the data acquisition time. The phase noise also decreases significantly with increasing window size. The performance of a real-time display is very important for clinical applications of OCT/ODT that need immediate diagnosis for screening or biopsy, and intraoperative surgery can benefit greatly from real-time display of flow rate information. Moreover, the GPU is an attractive tool for clinical and commercial systems for functional OCT features as well. PMID:22969328
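
    To make the Doppler step concrete, the sketch below estimates the phase shift between adjacent A-scans with a Kasai-style complex autocorrelation over a moving window, the quantity from which the Doppler frequency shift and flow velocity follow. It is an illustrative CUDA kernel, not the authors' implementation, and the data layout is an assumption.

      #include <cuComplex.h>

      // After the per-A-scan FFT, each depth pixel holds a complex value; the
      // mean phase shift between adjacent A-scans over a moving window (e.g.,
      // 32 pixels) is the argument of a Kasai-style autocorrelation sum.
      __global__ void doppler_phase(const cuFloatComplex *fft, // [nlines][ndepth]
                                    float *dphi, int ndepth, int nlines, int win)
      {
          int z    = blockIdx.x * blockDim.x + threadIdx.x;  // depth index
          int line = blockIdx.y;                             // A-scan index
          if (z >= ndepth || line + 1 >= nlines) return;
          cuFloatComplex acc = make_cuFloatComplex(0.0f, 0.0f);
          for (int k = 0; k < win && z + k < ndepth; ++k) {
              cuFloatComplex a = fft[line * ndepth + z + k];
              cuFloatComplex b = fft[(line + 1) * ndepth + z + k];
              acc = cuCaddf(acc, cuCmulf(b, cuConjf(a)));    // autocorrelation
          }
          dphi[line * ndepth + z] = atan2f(cuCimagf(acc), cuCrealf(acc));
      }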

  10. A performance comparison of different graphics processing units running direct N-body simulations

    NASA Astrophysics Data System (ADS)

    Capuzzo-Dolcetta, R.; Spera, M.

    2013-11-01

    Hybrid computational architectures based on the joint power of Central Processing Units (CPUs) and Graphics Processing Units (GPUs) are becoming popular and powerful hardware tools for a wide range of simulations in biology, chemistry, engineering, physics, etc. In this paper we present a performance comparison of various GPUs available on the market when applied to the numerical integration of the classic, gravitational, N-body problem. To do this, we developed an OpenCL version of the parallel code HiGPUs for these tests, because this portable version is the only one able to work on GPUs of different makes. The main general result is that we confirm the reliability, speed and low cost of GPUs when applied to the examined kind of problems (i.e., when the forces to evaluate depend on the mutual distances, as happens in gravitational physics and molecular dynamics). More specifically, we find that even the cheap GPUs built for gaming applications deliver high computing speed in scientific applications and, although with some limitations concerning on-board memory, can be a good choice to build a cheap and efficient machine for scientific applications.
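
    Although the benchmarked code (HiGPUs) is OpenCL, the structure of a direct-summation N-body kernel is easiest to show in CUDA: particles are staged through shared memory in tiles so that each global-memory load is reused by every thread in the block. The sketch below assumes the particle count is a multiple of the block size and softens close encounters with eps2; it is a textbook pattern, not the HiGPUs code.

      // Direct-summation acceleration with shared-memory tiling: every global
      // load of a particle is reused by all threads in the block. Launch with
      // blockDim.x * sizeof(float4) bytes of dynamic shared memory.
      __global__ void nbody_accel(const float4 *pos,   // xyz = position, w = mass
                                  float3 *acc, int n, float eps2)
      {
          extern __shared__ float4 tile[];
          int i = blockIdx.x * blockDim.x + threadIdx.x;
          float4 pi = pos[i];
          float3 a = {0.0f, 0.0f, 0.0f};
          for (int base = 0; base < n; base += blockDim.x) {
              tile[threadIdx.x] = pos[base + threadIdx.x];   // stage one tile
              __syncthreads();
              for (int j = 0; j < blockDim.x; ++j) {
                  float4 pj = tile[j];
                  float dx = pj.x - pi.x, dy = pj.y - pi.y, dz = pj.z - pi.z;
                  float r2  = dx*dx + dy*dy + dz*dz + eps2;  // softened r^2
                  float inv = rsqrtf(r2);
                  float s   = pj.w * inv * inv * inv;        // m_j / r^3
                  a.x += dx * s; a.y += dy * s; a.z += dz * s;
              }
              __syncthreads();
          }
          acc[i] = a;
      }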

  11. Graphics processing units as tools to predict mechanisms of biological signaling pathway regulation

    NASA Astrophysics Data System (ADS)

    McCarter, Patrick; Elston, Timothy; Nagiek, Michal; Dohlman, Henrik

    2013-04-01

    Biochemical and genomic studies have revealed protein components of S. cerevisiae (yeast) signal transduction networks. These networks allow the transmission of extracellular signals to the cell nucleus through coordinated biochemical interactions, resulting in direct responses to specific external stimuli. The coordination and regulation mechanisms of proteins in these networks have not been fully characterized. Thus, in this work we develop systems of ordinary differential equations to characterize processes that regulate signaling pathways. We employ graphics processing units (GPUs) in high performance computing environments to search in parallel through substantially more comprehensive parameter sets than allowed by personal computers. As a result, we are able to parameterize larger models with experimental data, leading to an increase in our model prediction capabilities. Thus far these models have helped to identify specific mechanisms such as positive and negative feedback loops that control network protein activity. We ultimately believe that the use of GPUs in biochemical signal transduction pathway modeling will help to discern how regulation mechanisms allow cells to respond to multiple external stimuli.
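
    A common GPU pattern for such parameter searches is one thread per parameter set, each integrating the same small ODE model independently. The sketch below integrates a hypothetical two-state activation model with classic RK4; the model and all names are invented for illustration and are not the authors' pathway equations.

      // One thread per parameter set: a grid of threads sweeps the rate-constant
      // space in parallel, each integrating the same two-state model.
      __device__ void deriv(float kon, float koff, float s,
                            const float y[2], float dy[2])
      {
          dy[0] = -kon * s * y[0] + koff * y[1];   // inactive species
          dy[1] =  kon * s * y[0] - koff * y[1];   // active species
      }

      __global__ void sweep_rk4(const float *kon, const float *koff,
                                float stimulus, float *active_end,
                                int nsets, float dt, int nsteps)
      {
          int p = blockIdx.x * blockDim.x + threadIdx.x;
          if (p >= nsets) return;
          float y[2] = {1.0f, 0.0f}, k1[2], k2[2], k3[2], k4[2], t[2];
          for (int step = 0; step < nsteps; ++step) {        // classic RK4
              deriv(kon[p], koff[p], stimulus, y, k1);
              for (int j = 0; j < 2; ++j) t[j] = y[j] + 0.5f * dt * k1[j];
              deriv(kon[p], koff[p], stimulus, t, k2);
              for (int j = 0; j < 2; ++j) t[j] = y[j] + 0.5f * dt * k2[j];
              deriv(kon[p], koff[p], stimulus, t, k3);
              for (int j = 0; j < 2; ++j) t[j] = y[j] + dt * k3[j];
              deriv(kon[p], koff[p], stimulus, t, k4);
              for (int j = 0; j < 2; ++j)
                  y[j] += dt * (k1[j] + 2.0f*k2[j] + 2.0f*k3[j] + k4[j]) / 6.0f;
          }
          active_end[p] = y[1];   // compared against experiment on the host
      }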

  12. Mobile Monitoring Data Processing & Analysis Strategies

    EPA Science Inventory

    The development of portable, high-time resolution instruments for measuring the concentrations of a variety of air pollutants has made it possible to collect data while in motion. This strategy, known as mobile monitoring, involves mounting air sensors on variety of different pla...

  13. Mobile Monitoring Data Processing and Analysis Strategies

    EPA Science Inventory

    The development of portable, high-time resolution instruments for measuring the concentrations of a variety of air pollutants has made it possible to collect data while in motion. This strategy, known as mobile monitoring, involves mounting air sensors on variety of different pla...

  14. VACTIV: A graphical dialog based program for an automatic processing of line and band spectra

    NASA Astrophysics Data System (ADS)

    Zlokazov, V. B.

    2013-05-01

    The program VACTIV (Visual ACTIV) has been developed for the automatic analysis of spectrum-like distributions, in particular gamma-ray or alpha spectra, and is a standard graphical-dialog-based Windows application, driven by menu, mouse and keyboard. On the one hand, it was a conversion of an existing Fortran program ACTIV [1] to the DELPHI language; on the other hand, it is a transformation of the sequential syntax of Fortran programming to a new object-oriented style, based on the organization of event interactions. New features implemented in the algorithms of both versions are the following: as the peak model, both an analytical function and a graphical curve could be used; the peak search algorithm was able to recognize not only Gauss peaks but also peaks with an irregular form, both narrow peaks (2-4 channels) and broad ones (50-100 channels); and the regularization technique in the fitting guaranteed a stable solution in the most complicated cases of strongly overlapping or weak peaks. The graphical dialog interface of VACTIV is much more convenient than the batch mode of ACTIV. [1] V.B. Zlokazov, Computer Physics Communications 28 (1982) 27-37. New version program summary: Program Title: VACTIV. Catalogue identifier: ABAC_v2_0. Licensing provisions: none. Programming language: DELPHI 5-7 Pascal. Computer: IBM PC series. Operating system: Windows. RAM: 1 MB. Keywords: Nuclear physics, spectrum decomposition, least squares analysis, graphical dialog, object-oriented programming. Classification: 17.6. Catalogue identifier of previous version: ABAC_v1_0. Journal reference of previous version: Comput. Phys. Commun. 28 (1982) 27. Does the new version supersede the previous version?: Yes. Nature of problem: VACTIV is intended for the precise analysis of arbitrary spectrum-like distributions, e.g. gamma-ray and X-ray spectra, and allows the user to carry out the full cycle of automatic processing of such spectra, i.e. calibration, automatic peak search

  16. Large-scale analytical Fourier transform of photomask layouts using graphics processing units

    NASA Astrophysics Data System (ADS)

    Sakamoto, Julia A.

    2015-10-01

    Compensation of lens-heating effects during the exposure scan in an optical lithographic system requires knowledge of the heating profile in the pupil of the projection lens. A necessary component in the accurate estimation of this profile is the total integrated distribution of light, relying on the squared modulus of the Fourier transform (FT) of the photomask layout for individual process layers. Requiring a layout representation in pixelated image format, the most common approach is to compute the FT numerically via the fast Fourier transform (FFT). However, the file size for a standard 26-mm × 33-mm mask with 5-nm pixels is an overwhelming 137 TB in single precision; the data importing process alone, prior to FFT computation, can render this method highly impractical. A more feasible solution is to handle layout data in a highly compact format with vertex locations of mask features (polygons), which correspond to elements in an integrated circuit, as well as pattern symmetries and repetitions (e.g., GDSII format). Provided the polygons can decompose into shapes for which analytical FT expressions are possible, the analytical approach dramatically reduces computation time and alleviates the burden of importing extensive mask data. Algorithms have been developed for importing and interpreting hierarchical layout data and computing the analytical FT on a graphics processing unit (GPU) for rapid parallel processing, not assuming incoherent imaging. Testing was performed on the active layer of a 392-μm × 297-μm virtual chip test structure with 43 substructures distributed over six hierarchical levels. The factor of improvement in the analytical versus numerical approach for importing layout data, performing CPU-GPU memory transfers, and executing the FT on a single NVIDIA Tesla K20X GPU was 1.6×10^4, 4.9×10^3, and 3.8×10^3, respectively. Various ideas for algorithm enhancements will be discussed.
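
    The analytical approach rests on closed-form transforms of primitive shapes: for an axis-aligned rectangle of width w and height h centered at (cx, cy), the FT is w·h·sinc(w·fx)·sinc(h·fy) times a phase factor from the shift theorem. The CUDA sketch below sums this expression over a flat rectangle list for each frequency sample, ignoring the hierarchy and repetition handling of the real algorithm; all names are illustrative.

      #include <cuComplex.h>

      __device__ float sincn(float x)   // normalized sinc: sin(pi x)/(pi x)
      {
          const float pi = 3.14159265f;
          return (fabsf(x) < 1e-8f) ? 1.0f : sinf(pi * x) / (pi * x);
      }

      // One thread per frequency sample sums the closed-form rectangle FT over
      // a flat rectangle list; |F|^2 then contributes to the integrated light
      // distribution.
      __global__ void layout_ft(const float4 *rect,  // x=cx, y=cy, z=w, w=h
                                int nrect, const float *fx, const float *fy,
                                cuFloatComplex *F, int nfreq)
      {
          const float twopi = 6.2831853f;
          int s = blockIdx.x * blockDim.x + threadIdx.x;
          if (s >= nfreq) return;
          cuFloatComplex acc = make_cuFloatComplex(0.0f, 0.0f);
          for (int r = 0; r < nrect; ++r) {
              float4 q  = rect[r];
              float mag = q.z * q.w * sincn(q.z * fx[s]) * sincn(q.w * fy[s]);
              float ph  = -twopi * (fx[s] * q.x + fy[s] * q.y);  // shift theorem
              acc = cuCaddf(acc, make_cuFloatComplex(mag * cosf(ph),
                                                     mag * sinf(ph)));
          }
          F[s] = acc;
      }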

  17. Process and Object Interpretations of Vector Magnitude Mediated by Use of the Graphics Calculator.

    ERIC Educational Resources Information Center

    Forster, Patricia

    2000-01-01

    Analyzes the development of one student's understanding of vector magnitude and how her problem solving was mediated by use of the absolute value graphics calculator function. (Contains 35 references.) (Author/ASK)

  18. Parallel design of JPEG-LS encoder on graphics processing units

    NASA Astrophysics Data System (ADS)

    Duan, Hao; Fang, Yong; Huang, Bormin

    2012-01-01

    With recent technical advances in graphics processing units (GPUs), GPUs have outperformed CPUs in terms of compute capability and memory bandwidth, and many successful GPU applications to high performance computing have been reported. JPEG-LS is an ISO/IEC standard for lossless image compression which utilizes adaptive context modeling and run-length coding to improve the compression ratio. However, adaptive context modeling causes data dependency among adjacent pixels, and the run-length coding has to be performed in a sequential way. Hence, using JPEG-LS to compress large-volume hyperspectral image data is quite time-consuming. We implement an efficient parallel JPEG-LS encoder for lossless hyperspectral compression on an NVIDIA GPU using the compute unified device architecture (CUDA) programming technology. We use a block-parallel strategy, as well as such CUDA techniques as coalesced global memory access, parallel prefix sum, and asynchronous data transfer. We also show the relation between GPU speedup and AVIRIS block size, as well as the relation between compression ratio and AVIRIS block size. When AVIRIS images are divided into blocks, each with 64×64 pixels, we gain the best GPU performance with a 26.3x speedup over the original CPU code.

  19. The application of projected conjugate gradient solvers on graphical processing units

    SciTech Connect

    Lin, Youzuo; Renaut, Rosemary

    2011-01-26

    Graphical processing units introduce the capability for large-scale computation at the desktop. The presented numerical results verify that the efficiency and accuracy of basic linear algebra subroutines of all levels are comparable when implemented in CUDA and Jacket, but experimental results demonstrate that the level-three basic linear algebra subroutines offer the greatest potential for improving the efficiency of basic numerical algorithms. We consider the solution of a set of linear equations with multiple right-hand sides using Krylov subspace-based solvers. For the multiple right-hand-side case, it is more efficient to make use of a block implementation of the conjugate gradient algorithm rather than to solve each system independently. Jacket is used for the implementation. Furthermore, including projection from one system to another improves efficiency. A relevant example, for which simulated results are provided, is the reconstruction of a three-dimensional medical image volume acquired from a positron emission tomography scanner. Efficiency of the reconstruction is improved by using projection across nearby slices.

  20. Simulation of Coarse-Grained Protein-Protein Interactions with Graphics Processing Units.

    PubMed

    Tunbridge, Ian; Best, Robert B; Gain, James; Kuttel, Michelle M

    2010-11-01

    We report a hybrid parallel central and graphics processing units (CPU-GPU) implementation of a coarse-grained model for replica exchange Monte Carlo (REMC) simulations of protein assemblies. We describe the design, optimization, validation, and benchmarking of our algorithms, particularly the parallelization strategy, which is specific to the requirements of GPU hardware. Performance evaluation of our hybrid implementation shows scaled speedup as compared to a single-core CPU; reference simulations of small 100 residue proteins have a modest speedup of 4, while large simulations with thousands of residues are up to 1400 times faster. Importantly, the combination of coarse-grained models with highly parallel GPU hardware vastly increases the length- and time-scales accessible for protein simulation, making it possible to simulate much larger systems of interacting proteins than have previously been attempted. As a first step toward the simulation of the assembly of an entire viral capsid, we have demonstrated that the chosen coarse-grained model, together with REMC sampling, is capable of identifying the correctly bound structure, for a pair of fragments from the human hepatitis B virus capsid. Our parallel solution can easily be generalized to other interaction functions and other types of macromolecules and has implications for the parallelization of similar N-body problems that require random access lookups. PMID:26617104

  1. Graphics processing unit accelerated one-dimensional blood flow computation in the human arterial tree.

    PubMed

    Itu, Lucian; Sharma, Puneet; Kamen, Ali; Suciu, Constantin; Comaniciu, Dorin

    2013-12-01

    One-dimensional blood flow models have been used extensively for computing pressure and flow waveforms in the human arterial circulation. We propose an improved numerical implementation based on a graphics processing unit (GPU) to accelerate the execution time of the one-dimensional model. A novel parallel hybrid CPU-GPU algorithm with compact copy operations (PHCGCC) and a parallel GPU only (PGO) algorithm are developed, which are compared against previously introduced PHCG versions, a single-threaded CPU only algorithm and a multi-threaded CPU only algorithm. Different second-order numerical schemes (Lax-Wendroff and Taylor series) are evaluated for the numerical solution of the one-dimensional model, and the computational setups include physiologically motivated non-periodic (Windkessel) and periodic (structured tree) boundary conditions (BC) and elastic and viscoelastic wall laws. Both the PHCGCC and the PGO implementations improved the execution time significantly. The speed-up values over the single-threaded CPU only implementation range from 5.26 to 8.10×, whereas the speed-up values over the multi-threaded CPU only implementation range from 1.84 to 4.02×. The PHCGCC algorithm performs best for an elastic wall law with non-periodic BC and for viscoelastic wall laws, whereas the PGO algorithm performs best for an elastic wall law with periodic BC. PMID:24009129
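
    As an illustration of the numerical kernel involved, the sketch below applies a Richtmyer two-step Lax-Wendroff update to a scalar conservation law, one thread per grid point. The actual model evolves the coupled area/flow pair with source terms and Windkessel or structured-tree boundaries, so this is only the stencil skeleton, with a stand-in flux.

      // Richtmyer two-step Lax-Wendroff for u_t + f(u)_x = 0; boundaries are
      // handled separately. The Burgers flux stands in for the blood flow
      // system's flux vector.
      __device__ float flux(float u) { return 0.5f * u * u; }

      __global__ void lax_wendroff_step(const float *u, float *unew,
                                        int n, float r)   // r = dt/dx
      {
          int i = blockIdx.x * blockDim.x + threadIdx.x;
          if (i < 1 || i >= n - 1) return;
          // half-step states at the interfaces i-1/2 and i+1/2
          float um = 0.5f*(u[i-1] + u[i]) - 0.5f*r*(flux(u[i])   - flux(u[i-1]));
          float up = 0.5f*(u[i] + u[i+1]) - 0.5f*r*(flux(u[i+1]) - flux(u[i]));
          // full step using the interface fluxes
          unew[i] = u[i] - r * (flux(up) - flux(um));
      }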

  2. Seismic interpretation using Support Vector Machines implemented on Graphics Processing Units

    SciTech Connect

    Kuzma, H A; Rector, J W; Bremer, D

    2006-06-22

    Support Vector Machines (SVMs) estimate lithologic properties of rock formations from seismic data by interpolating between known models using synthetically generated model/data pairs. SVMs are related to kriging and radial basis function neural networks. In our study, we train an SVM to approximate an inverse to the Zoeppritz equations. Training models are sampled from distributions constructed from well-log statistics. Training data is computed via a physically realistic forward modeling algorithm. In our experiments, each training data vector is a set of seismic traces similar to a 2-d image. The SVM returns a model given by a weighted comparison of the new data to each training data vector. The method of comparison is given by a kernel function which implicitly transforms data into a high-dimensional feature space and performs a dot-product. The feature space of a Gaussian kernel is made up of sines and cosines and so is appropriate for band-limited seismic problems. Training an SVM involves estimating a set of weights from the training model/data pairs. It is designed to be an easy problem; at worst it is a quadratic programming problem on the order of the size of the training set. By implementing the slowest part of our SVM algorithm on a graphics processing unit (GPU), we improve the speed of the algorithm by two orders of magnitude. Our SVM/GPU combination achieves results that are similar to those of conventional iterative inversion in fractions of the time.
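
    The GPU-accelerated step is essentially dense kernel evaluation: each prediction requires a weighted sum of Gaussian kernel values between the new data vector and every training vector. The sketch below is a generic CUDA version of that step with a block-level reduction; it is not the authors' code and all names are assumptions.

      // Dense kernel evaluation, the dominant cost of SVM prediction: each
      // thread forms the squared distance between one training vector and the
      // new data vector and weights the Gaussian kernel value; a tree reduction
      // (256 threads per block) yields per-block partial sums.
      __global__ void svm_gauss(const float *train,  // [ntrain][dim], row-major
                                const float *alpha,  // learned weights
                                const float *x,      // new data vector [dim]
                                float *partial, int ntrain, int dim, float gamma)
      {
          __shared__ float sh[256];
          int tid = threadIdx.x;
          int i   = blockIdx.x * blockDim.x + tid;
          float v = 0.0f;
          if (i < ntrain) {
              float d2 = 0.0f;
              for (int k = 0; k < dim; ++k) {
                  float d = train[i * dim + k] - x[k];
                  d2 += d * d;
              }
              v = alpha[i] * expf(-gamma * d2);      // Gaussian (RBF) kernel
          }
          sh[tid] = v;
          __syncthreads();
          for (int s = blockDim.x / 2; s > 0; s >>= 1) {
              if (tid < s) sh[tid] += sh[tid + s];
              __syncthreads();
          }
          if (tid == 0) partial[blockIdx.x] = sh[0]; // sum + bias on the host
      }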

  3. High-throughput Characterization of Porous Materials Using Graphics Processing Units

    SciTech Connect

    Kim, Jihan; Martin, Richard L.; Ruebel, Oliver; Haranczyk, Maciej; Smit, Berend

    2012-03-19

    We have developed a high-throughput graphics processing units (GPU) code that can characterize a large database of crystalline porous materials. In our algorithm, the GPU is utilized to accelerate energy grid calculations where the grid values represent interactions (i.e., Lennard-Jones + Coulomb potentials) between gas molecules (i.e., CH₄ and CO₂) and the material's framework atoms. Using a parallel flood fill CPU algorithm, inaccessible regions inside the framework structures are identified and blocked based on their energy profiles. Finally, we compute the Henry coefficients and heats of adsorption through statistical Widom insertion Monte Carlo moves in the domain restricted to the accessible space. The code offers significant speedup over a single core CPU code and allows us to characterize a set of porous materials at least an order of magnitude larger than ones considered in earlier studies. For structures selected from such a prescreening algorithm, full adsorption isotherms can be calculated by conducting multiple grand canonical Monte Carlo simulations concurrently within the GPU.
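
    The energy grid computation maps naturally to one thread per grid point, each summing pairwise interactions over all framework atoms. The sketch below shows this for a single-site probe with fixed Lennard-Jones parameters and no periodic images, so it is a simplified illustration of the kernel described, with invented names.

      // One thread per grid point sums the Lennard-Jones and Coulomb
      // interaction of a single-site probe with every framework atom.
      __global__ void energy_grid(const float4 *atom, // xyz = position, w = charge
                                  int natom, float3 origin, float spacing,
                                  int3 ng, float eps, float sigma, float qprobe,
                                  float *energy)
      {
          int idx  = blockIdx.x * blockDim.x + threadIdx.x;
          int ntot = ng.x * ng.y * ng.z;
          if (idx >= ntot) return;
          int ix = idx % ng.x;
          int iy = (idx / ng.x) % ng.y;
          int iz = idx / (ng.x * ng.y);
          float gx = origin.x + ix * spacing;
          float gy = origin.y + iy * spacing;
          float gz = origin.z + iz * spacing;
          const float coul = 332.0636f;   // e^2/Angstrom -> kcal/mol
          float u = 0.0f;
          for (int a = 0; a < natom; ++a) {
              float dx = atom[a].x - gx, dy = atom[a].y - gy, dz = atom[a].z - gz;
              float r2 = dx*dx + dy*dy + dz*dz;
              float s6 = powf(sigma * sigma / r2, 3.0f);
              u += 4.0f * eps * (s6 * s6 - s6);             // Lennard-Jones
              u += coul * qprobe * atom[a].w * rsqrtf(r2);  // Coulomb
          }
          energy[idx] = u;
      }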

  4. Accelerated Molecular Dynamics Simulations with the AMOEBA Polarizable Force Field on Graphics Processing Units

    PubMed Central

    2013-01-01

    The accelerated molecular dynamics (aMD) method has recently been shown to enhance the sampling of biomolecules in molecular dynamics (MD) simulations, often by several orders of magnitude. Here, we describe an implementation of the aMD method for the OpenMM application layer that takes full advantage of graphics processing unit (GPU) computing. The aMD method is shown to work in combination with the AMOEBA polarizable force field (AMOEBA-aMD), allowing the simulation of long time-scale events with a polarizable force field. Benchmarks are provided to show that the AMOEBA-aMD method is efficiently implemented and produces accurate results in its standard parametrization. For the BPTI protein, we demonstrate that the protein structure described with AMOEBA remains stable even on the extended time scales accessed at high levels of acceleration. For the DNA repair metalloenzyme endonuclease IV, we show that the use of the AMOEBA force field is a significant improvement over fixed-charge models for describing the enzyme active site. The new AMOEBA-aMD method is publicly available (http://wiki.simtk.org/openmm/VirtualRepository) and promises to be interesting for studying complex systems that can benefit from both the use of a polarizable force field and enhanced sampling. PMID:24634618
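
    For reference, the standard aMD boost applies when the potential energy V falls below a threshold E: the modified potential is V* = V + (E - V)^2/(alpha + E - V), which rescales the forces by alpha^2/(alpha + E - V)^2. The kernel below is a minimal sketch of that force rescaling, not the OpenMM plugin itself; dual boost, statistics collection, and reweighting are omitted.

      // Apply the aMD force scale factor alpha^2 / (alpha + E - V)^2 to every
      // atom when the current potential energy V lies below the threshold E.
      __global__ void amd_boost(float3 *force, int n, float V, float E, float alpha)
      {
          int i = blockIdx.x * blockDim.x + threadIdx.x;
          if (i >= n || V >= E) return;        // no boost above the threshold
          float u = E - V;
          float s = (alpha * alpha) / ((alpha + u) * (alpha + u));
          force[i].x *= s; force[i].y *= s; force[i].z *= s;
      }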

  5. Accelerating image reconstruction in three-dimensional optoacoustic tomography on graphics processing units

    PubMed Central

    Wang, Kun; Huang, Chao; Kao, Yu-Jiun; Chou, Cheng-Ying; Oraevsky, Alexander A.; Anastasio, Mark A.

    2013-01-01

    Purpose: Optoacoustic tomography (OAT) is inherently a three-dimensional (3D) inverse problem. However, most studies of OAT image reconstruction still employ two-dimensional imaging models. One important reason is because 3D image reconstruction is computationally burdensome. The aim of this work is to accelerate existing image reconstruction algorithms for 3D OAT by use of parallel programming techniques. Methods: Parallelization strategies are proposed to accelerate a filtered backprojection (FBP) algorithm and two different pairs of projection/backprojection operations that correspond to two different numerical imaging models. The algorithms are designed to fully exploit the parallel computing power of graphics processing units (GPUs). In order to evaluate the parallelization strategies for the projection/backprojection pairs, an iterative image reconstruction algorithm is implemented. Computer simulation and experimental studies are conducted to investigate the computational efficiency and numerical accuracy of the developed algorithms. Results: The GPU implementations improve the computational efficiency by factors of 1000, 125, and 250 for the FBP algorithm and the two pairs of projection/backprojection operators, respectively. Accurate images are reconstructed by use of the FBP and iterative image reconstruction algorithms from both computer-simulated and experimental data. Conclusions: Parallelization strategies for 3D OAT image reconstruction are proposed for the first time. These GPU-based implementations significantly reduce the computational time for 3D image reconstruction, complementing our earlier work on 3D OAT iterative image reconstruction. PMID:23387778

  6. Graphics processing unit (GPU)-based computation of heat conduction in thermally anisotropic solids

    NASA Astrophysics Data System (ADS)

    Nahas, C. A.; Balasubramaniam, Krishnan; Rajagopal, Prabhu

    2013-01-01

    Numerical modeling of anisotropic media is a computationally intensive task, since the physical properties differ along different directions, which adds complexity to the field problem. Largely used in the aerospace industry because of their lightweight nature, composite materials are a very good example of thermally anisotropic media. With advancements in video gaming technology, parallel processors are much cheaper today, and accessibility to higher-end graphics processing devices has increased dramatically over the past couple of years. Since these massively parallel GPUs are very good at handling floating point arithmetic, they provide a new platform for engineers and scientists to accelerate their numerical models using commodity hardware. In this paper we implement a parallel finite difference model of thermal diffusion through anisotropic media using NVIDIA CUDA (Compute Unified Device Architecture). We use the NVIDIA GeForce GTX 560 Ti as our primary computing device, which consists of 384 CUDA cores clocked at 1645 MHz, with a standard desktop PC as the host platform. We compare the results against a standard CPU implementation for accuracy and speed and draw implications for simulation using the GPU paradigm.
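
    In the grid-aligned case, anisotropy simply gives the two second-difference terms different coefficients. The kernel below is a minimal explicit finite-difference step of that form, one thread per interior node; it is an illustration under that simplifying assumption, not the paper's code.

      // Explicit step for 2D heat conduction with principal conductivity axes
      // aligned to the grid: ax = alpha_x*dt/dx^2 and ay = alpha_y*dt/dy^2.
      __global__ void heat_step(const float *T, float *Tnew,
                                int nx, int ny, float ax, float ay)
      {
          int i = blockIdx.x * blockDim.x + threadIdx.x;
          int j = blockIdx.y * blockDim.y + threadIdx.y;
          if (i < 1 || i >= nx - 1 || j < 1 || j >= ny - 1) return;
          int c = j * nx + i;
          Tnew[c] = T[c]
                  + ax * (T[c - 1]  - 2.0f * T[c] + T[c + 1])     // x direction
                  + ay * (T[c - nx] - 2.0f * T[c] + T[c + nx]);   // y direction
      }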

  7. Density-fitted singles and doubles coupled cluster on graphics processing units

    SciTech Connect

    Sherrill, David; Sumpter, Bobby G; DePrince, III, A. Eugene

    2014-01-01

    We adapt an algorithm for singles and doubles coupled cluster (CCSD) that uses density fitting (DF) or Cholesky decomposition (CD) in the construction and contraction of all electron repulsion integrals (ERIs) for use on heterogeneous compute nodes consisting of a multicore CPU and at least one graphics processing unit (GPU). The use of approximate 3-index ERIs ameliorates two of the major difficulties in designing scientific algorithms for GPUs: (i) the extremely limited global memory on the devices and (ii) the overhead associated with data motion across the PCI bus. For the benzene trimer described by an aug-cc-pVDZ basis set, the use of a single NVIDIA Tesla C2070 (Fermi) GPU accelerates a CD-CCSD computation by a factor of 2.1, relative to the multicore CPU-only algorithm that uses 6 highly efficient Intel Core i7-3930K CPU cores. The use of two Fermis provides an acceleration of 2.89, which is comparable to that observed when using a single NVIDIA Kepler K20c GPU (2.73).

  8. Graphics processing unit (GPU)-accelerated particle filter framework for positron emission tomography image reconstruction.

    PubMed

    Yu, Fengchao; Liu, Huafeng; Hu, Zhenghui; Shi, Pengcheng

    2012-04-01

    As a consequence of the random nature of photon emissions and detections, the data collected by a positron emission tomography (PET) imaging system can be shown to be Poisson distributed. Meanwhile, there have been considerable efforts within the tracer kinetic modeling communities aimed at establishing the relationship between the PET data and physiological parameters that affect the uptake and metabolism of the tracer. Both statistical and physiological models are important to PET reconstruction. The majority of previous efforts are based on simplified, nonphysical mathematical expressions, such as Poisson modeling of the measured data, which is, on the whole, completed without consideration of the underlying physiology. In this paper, we propose a graphics processing unit (GPU)-accelerated reconstruction strategy that can take both the statistical model and the physiological model into consideration with the aid of state-space evolution equations. The proposed strategy formulates the organ activity distribution through tracer kinetics models and the photon-counting measurements through observation equations, thus making it possible to unify these two constraints into a general framework. In order to accelerate reconstruction, GPU-based parallel computing is introduced. Experiments with Zubal thorax phantom data, Monte Carlo simulated phantom data, and real phantom data show the power of the method. Furthermore, thanks to the computing power of the GPU, the reconstruction time is practical for clinical application. PMID:22472843

  9. Acceleration of High Angular Momentum Electron Repulsion Integrals and Integral Derivatives on Graphics Processing Units.

    PubMed

    Miao, Yipu; Merz, Kenneth M

    2015-04-14

    We present an efficient implementation of ab initio self-consistent field (SCF) energy and gradient calculations that run on Compute Unified Device Architecture (CUDA) enabled graphical processing units (GPUs) using recurrence relations. We first discuss the machine-generated code that calculates the electron-repulsion integrals (ERIs) for different ERI types. Next we describe the porting of the SCF gradient calculation to GPUs, which results in an acceleration of the computation of the first-order derivative of the ERIs. However, only s, p, and d ERIs and s and p derivatives could be executed simultaneously on GPUs using the current version of CUDA and generation of NVidia GPUs with a previously described algorithm [Miao and Merz J. Chem. Theory Comput. 2013, 9, 965-976]. Hence, we developed an algorithm to compute f type ERIs and d type ERI derivatives on GPUs. Our benchmarks show that GPU-enabled ERI and ERI derivative computation yielded speedups of 10-18 times relative to traditional CPU execution. An accuracy analysis using double-precision calculations demonstrates that the overall accuracy is satisfactory for most applications. PMID:26574356

  10. Graphic processing unit accelerated real-time partially coherent beam generator

    NASA Astrophysics Data System (ADS)

    Ni, Xiaolong; Liu, Zhi; Chen, Chunyi; Jiang, Huilin; Fang, Hanhan; Song, Lujun; Zhang, Su

    2016-07-01

    A method of using liquid crystals (LCs) to generate a partially coherent beam in real time is described. An expression for generating a partially coherent beam is given and calculated using a graphics processing unit (GPU), i.e., the GeForce GTX 680. A liquid crystal on silicon (LCOS) with 256 × 256 pixels is used as the partially coherent beam generator (PCBG). An optimization method with partition convolution is used to improve the generating speed of our LC PCBG. The total time needed to generate a random phase map with a coherence width ranging from 0.015 mm to 1.5 mm is less than 2.4 ms for calculation and readout with the GPU; adding the time needed for the CPU to read and send data to the LCOS and the response time of the LC PCBG, the real-time partially coherent beam (PCB) generation frequency of our LC PCBG is up to 312 Hz. To our knowledge, it is the first real-time partially coherent beam generator. A series of experiments based on double pinhole interference were performed. The results show that to generate a laser beam with a coherence width of 0.9 mm or 1.5 mm with a mean error of approximately 1%, the required RMS values were 0.021306 and 0.020883 and the required PV values were 0.073576 and 0.072998, respectively.

  11. Fast, multi-channel real-time processing of signals with microsecond latency using graphics processing units

    NASA Astrophysics Data System (ADS)

    Rath, N.; Kato, S.; Levesque, J. P.; Mauel, M. E.; Navratil, G. A.; Peng, Q.

    2014-04-01

    Fast, digital signal processing (DSP) has many applications. Typical hardware options for performing DSP are field-programmable gate arrays (FPGAs), application-specific integrated DSP chips, or general purpose personal computer systems. This paper presents a novel DSP platform that has been developed for feedback control on the HBT-EP tokamak device. The system runs all signal processing exclusively on a Graphics Processing Unit (GPU) to achieve real-time performance with latencies below 8 μs. Signals are transferred into and out of the GPU using PCI Express peer-to-peer direct-memory-access transfers without involvement of the central processing unit or host memory. Tests were performed on the feedback control system of the HBT-EP tokamak using forty 16-bit floating point inputs and outputs each and a sampling rate of up to 250 kHz. Signals were digitized by a D-TACQ ACQ196 module, processing done on an NVIDIA GTX 580 GPU programmed in CUDA, and analog output was generated by D-TACQ AO32CPCI modules.
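
    A representative per-sample workload for such a system is a bank of per-channel FIR filters, sketched below in CUDA with one thread per channel. This is an illustrative kernel only; the actual HBT-EP control chain, its buffer layout, and the names here are not from the paper.

      // One thread per channel applies that channel's FIR filter to its most
      // recent samples; a control cycle would launch this (and similar kernels)
      // in a single CUDA stream each sampling period.
      __global__ void fir_channels(const float *hist, // [nchan][ntaps] samples
                                   const float *coef, // [nchan][ntaps] taps
                                   float *out, int nchan, int ntaps)
      {
          int ch = blockIdx.x * blockDim.x + threadIdx.x;
          if (ch >= nchan) return;
          float acc = 0.0f;
          for (int k = 0; k < ntaps; ++k)         // dot product = FIR output
              acc += coef[ch * ntaps + k] * hist[ch * ntaps + k];
          out[ch] = acc;
      }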

  12. Fast, multi-channel real-time processing of signals with microsecond latency using graphics processing units.

    PubMed

    Rath, N; Kato, S; Levesque, J P; Mauel, M E; Navratil, G A; Peng, Q

    2014-04-01

    Fast, digital signal processing (DSP) has many applications. Typical hardware options for performing DSP are field-programmable gate arrays (FPGAs), application-specific integrated DSP chips, or general purpose personal computer systems. This paper presents a novel DSP platform that has been developed for feedback control on the HBT-EP tokamak device. The system runs all signal processing exclusively on a Graphics Processing Unit (GPU) to achieve real-time performance with latencies below 8 μs. Signals are transferred into and out of the GPU using PCI Express peer-to-peer direct-memory-access transfers without involvement of the central processing unit or host memory. Tests were performed on the feedback control system of the HBT-EP tokamak using forty 16-bit floating point inputs and outputs each and a sampling rate of up to 250 kHz. Signals were digitized by a D-TACQ ACQ196 module, processing done on an NVIDIA GTX 580 GPU programmed in CUDA, and analog output was generated by D-TACQ AO32CPCI modules. PMID:24784666

  13. Fast, multi-channel real-time processing of signals with microsecond latency using graphics processing units

    SciTech Connect

    Rath, N.; Levesque, J. P.; Mauel, M. E.; Navratil, G. A.; Peng, Q.; Kato, S.

    2014-04-15

    Fast, digital signal processing (DSP) has many applications. Typical hardware options for performing DSP are field-programmable gate arrays (FPGAs), application-specific integrated DSP chips, or general purpose personal computer systems. This paper presents a novel DSP platform that has been developed for feedback control on the HBT-EP tokamak device. The system runs all signal processing exclusively on a Graphics Processing Unit (GPU) to achieve real-time performance with latencies below 8 μs. Signals are transferred into and out of the GPU using PCI Express peer-to-peer direct-memory-access transfers without involvement of the central processing unit or host memory. Tests were performed on the feedback control system of the HBT-EP tokamak using forty 16-bit floating point inputs and outputs each and a sampling rate of up to 250 kHz. Signals were digitized by a D-TACQ ACQ196 module, processing done on an NVIDIA GTX 580 GPU programmed in CUDA, and analog output was generated by D-TACQ AO32CPCI modules.

  14. Simulating data processing for an Advanced Ion Mobility Mass Spectrometer

    SciTech Connect

    Chavarría-Miranda, Daniel; Clowers, Brian H.; Anderson, Gordon A.; Belov, Mikhail E.

    2007-11-03

    We have designed and implemented a Cray XD-1-based simulation of data capture and signal processing for an advanced Ion Mobility mass spectrometer (Hadamard transform Ion Mobility). Our simulation is a hybrid application that uses both an FPGA component and a CPU-based software component to simulate Ion Mobility mass spectrometry data processing. The FPGA component includes data capture and accumulation, as well as a more sophisticated deconvolution algorithm based on a PNNL-developed enhancement to standard Hadamard transform Ion Mobility spectrometry. The software portion is in charge of streaming data to the FPGA and collecting results. We expect the computational and memory addressing logic of the FPGA component to be portable to an instrument-attached FPGA board that can be interfaced with a Hadamard transform Ion Mobility mass spectrometer.
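
    The mathematical core of recovering a spectrum from a Hadamard-multiplexed acquisition is the fast Walsh-Hadamard transform, an O(n log n) butterfly structure much like the FFT. A plain host-side sketch is shown below for clarity; the PNNL deconvolution enhancement mentioned above goes beyond this basic transform.

      // In-place fast Walsh-Hadamard transform; n must be a power of two.
      // Divide by n afterwards for the normalized inverse.
      void fwht(float *data, int n)
      {
          for (int half = 1; half < n; half <<= 1)
              for (int start = 0; start < n; start += 2 * half)
                  for (int k = start; k < start + half; ++k) {
                      float a = data[k];
                      float b = data[k + half];
                      data[k]        = a + b;   // butterfly
                      data[k + half] = a - b;
                  }
      }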

  15. Interactive Computing and Graphics in Undergraduate Digital Signal Processing. Microcomputing Working Paper Series F 84-9.

    ERIC Educational Resources Information Center

    Onaral, Banu; And Others

    This report describes the development of a Drexel University electrical and computer engineering course on digital filter design that used interactive computing and graphics, and was one of three courses in a senior-level sequence on digital signal processing (DSP). Interactive and digital analysis/design routines and the interconnection of these…

  16. Note: Quasi-real-time analysis of dynamic near field scattering data using a graphics processing unit

    NASA Astrophysics Data System (ADS)

    Cerchiari, G.; Croccolo, F.; Cardinaux, F.; Scheffold, F.

    2012-10-01

    We present an implementation of the analysis of dynamic near field scattering (NFS) data using a graphics processing unit. We introduce an optimized data management scheme, thereby limiting the number of operations required. Overall, we reduce the processing time from hours to minutes for typical experimental conditions. The processing time, previously the limiting step in such experiments, is now comparable to the data acquisition time. Our approach is applicable to various dynamic NFS methods, including shadowgraph, Schlieren and differential dynamic microscopy.

  17. Mobile Ultrasound Plane Wave Beamforming on iPhone or iPad using Metal- based GPU Processing

    NASA Astrophysics Data System (ADS)

    Hewener, Holger J.; Tretbar, Steffen H.

    Mobile and cost-effective ultrasound devices are being used in point-of-care scenarios and the trauma room. To reduce the costs of such devices we have already presented the possibilities of consumer devices like the Apple iPad for full signal processing of raw data for ultrasound image generation. By using technologies like plane wave imaging to generate a full image with only one excitation/reception event, the acquisition times and power consumption of ultrasound imaging can be reduced on low-power mobile devices based on consumer electronics, realizing the transition from FPGA- or ASIC-based beamforming to more flexible software beamforming. The massively parallel beamforming processing can be done with the Apple framework "Metal" for advanced graphics and general-purpose GPU processing on the iOS platform. We were able to integrate the beamforming reconstruction into our mobile ultrasound processing application with imaging rates of up to 70 Hz on iPad Air 2 hardware.
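
    The per-pixel arithmetic of plane-wave delay-and-sum beamforming is compact: the echo delay is the transmit travel time of the plane wave to the pixel plus the return time to each element. The paper implements this in Apple's Metal; the CUDA kernel below is only an illustrative restatement of the same arithmetic, without apodization or sub-sample interpolation, and with invented names.

      // Delay-and-sum for one pixel after a 0-degree plane-wave transmission.
      __global__ void das_planewave(const float *rf,    // [nelem][nsamp]
                                    const float *elemx, // element x positions, m
                                    float *img, int nx, int nz, float dx, float dz,
                                    int nelem, int nsamp, float fs, float c)
      {
          int px = blockIdx.x * blockDim.x + threadIdx.x;
          int pz = blockIdx.y * blockDim.y + threadIdx.y;
          if (px >= nx || pz >= nz) return;
          float x = px * dx, z = pz * dz;
          float t_tx = z / c;                       // plane-wave travel time
          float sum = 0.0f;
          for (int e = 0; e < nelem; ++e) {
              float ex   = x - elemx[e];
              float t_rx = sqrtf(ex * ex + z * z) / c;   // return path
              int s = (int)((t_tx + t_rx) * fs + 0.5f);  // nearest sample
              if (s < nsamp) sum += rf[e * nsamp + s];
          }
          img[pz * nx + px] = sum;
      }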

  18. In-Situ Statistical Analysis of Autotune Simulation Data using Graphical Processing Units

    SciTech Connect

    Ranjan, Niloo; Sanyal, Jibonananda; New, Joshua Ryan

    2013-08-01

    Developing accurate building energy simulation models to assist energy efficiency at speed and scale is one of the research goals of the Whole-Building and Community Integration group, which is a part of the Building Technologies Research and Integration Center (BTRIC) at Oak Ridge National Laboratory (ORNL). The aim of the Autotune project is to speed up the automated calibration of building energy models to match measured utility or sensor data. The workflow of this project takes input parameters and runs EnergyPlus simulations on Oak Ridge Leadership Computing Facility's (OLCF) computing resources such as Titan, the world's second fastest supercomputer. Multiple simulations run in parallel on nodes having 16 processors each and a Graphics Processing Unit (GPU). Each node produces a 5.7 GB output file comprising 256 files from 64 simulations. Four types of output data, covering monthly, daily, hourly, and 15-minute time steps for each annual simulation, are produced. A total of 270 TB+ of data has been produced. In this project, the simulation data is statistically analyzed in-situ using GPUs while annual simulations are being computed on the traditional processors. Titan, with its recent addition of 18,688 Compute Unified Device Architecture (CUDA) capable NVIDIA GPUs, has greatly extended its capability for massively parallel data processing. CUDA is used along with C/MPI to calculate statistical metrics such as sum, mean, variance, and standard deviation leveraging GPU acceleration. The workflow developed in this project produces statistical summaries of the data, which reduces by multiple orders of magnitude the time and amount of data that needs to be stored. These statistical capabilities are anticipated to be useful for sensitivity analysis of EnergyPlus simulations.
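
    The in-situ statistics reduce to parallel accumulation of sums and sums of squares, from which mean, variance, and standard deviation follow on the host. The kernel below is a minimal CUDA sketch of such a reduction with one atomic update per block; it is illustrative, not the project's code.

      // Block-level accumulation of sum and sum of squares (256 threads per
      // block); the host derives mean = sum/n and variance = sumsq/n - mean^2.
      __global__ void stats_reduce(const float *x, int n, float *sum, float *sumsq)
      {
          __shared__ float s1[256], s2[256];
          int tid = threadIdx.x;
          int i   = blockIdx.x * blockDim.x + tid;
          float v = (i < n) ? x[i] : 0.0f;
          s1[tid] = v;
          s2[tid] = v * v;
          __syncthreads();
          for (int s = blockDim.x / 2; s > 0; s >>= 1) {
              if (tid < s) { s1[tid] += s1[tid + s]; s2[tid] += s2[tid + s]; }
              __syncthreads();
          }
          if (tid == 0) {                 // one atomic per block keeps traffic low
              atomicAdd(sum,   s1[0]);
              atomicAdd(sumsq, s2[0]);
          }
      }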

  19. A GRAPHICS PROCESSING UNIT-ENABLED, HIGH-RESOLUTION COSMOLOGICAL MICROLENSING PARAMETER SURVEY

    SciTech Connect

    Bate, N. F.; Fluke, C. J.

    2012-01-10

    In the era of synoptic surveys, the number of known gravitationally lensed quasars is set to increase by over an order of magnitude. These new discoveries will enable a move from single-quasar studies to investigations of statistical samples, presenting new opportunities to test theoretical models for the structure of quasar accretion disks and broad emission line regions (BELRs). As one crucial step in preparing for this influx of new lensed systems, a large-scale exploration of microlensing convergence-shear parameter space is warranted, requiring the computation of O(10^5) high-resolution magnification maps. Based on properties of known lensed quasars, and expectations from accretion disk/BELR modeling, we identify regions of convergence-shear parameter space, map sizes, smooth matter fractions, and pixel resolutions that should be covered. We describe how the computationally time-consuming task of producing ~290,000 magnification maps with sufficient resolution (10,000^2 pixels per map) to probe scales from the inner edge of the accretion disk to the BELR can be achieved in ~400 days on a 100 teraflop/s high-performance computing facility, where the processing performance is achieved with graphics processing units. We illustrate a use-case for the parameter survey by investigating the effects of varying the lens macro-model on accretion disk constraints in the lensed quasar Q2237+0305. We find that although all constraints are consistent within their current error bars, models with more densely packed microlenses tend to predict shallower accretion disk radial temperature profiles. With a large parameter survey such as the one described here, such systematics on microlensing measurements could be fully explored.
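
    Magnification maps of this kind are typically built by inverse ray shooting: image-plane rays are mapped through the lens equation and counted on a source-plane grid, where the ray count per pixel is proportional to magnification. The sketch below shows that mapping for point-mass microlenses with smooth convergence ks and external shear g; it is a generic illustration, not the survey's code, and all names are assumptions.

      // Inverse ray shooting: one thread per image-plane ray.
      __global__ void shoot_rays(const float2 *lens, const float *mass, int nlens,
                                 float ks, float g, unsigned int *map, int npix,
                                 float halfw, int nrays, float step,
                                 float x0, float y0)
      {
          int ix = blockIdx.x * blockDim.x + threadIdx.x;
          int iy = blockIdx.y * blockDim.y + threadIdx.y;
          if (ix >= nrays || iy >= nrays) return;
          float x1 = x0 + ix * step, x2 = y0 + iy * step;
          float y1 = (1.0f - g - ks) * x1;          // macro deflection
          float y2 = (1.0f + g - ks) * x2;
          for (int l = 0; l < nlens; ++l) {         // point-mass deflections
              float d1 = x1 - lens[l].x, d2 = x2 - lens[l].y;
              float inv = 1.0f / (d1 * d1 + d2 * d2);
              y1 -= mass[l] * d1 * inv;
              y2 -= mass[l] * d2 * inv;
          }
          int px = (int)((y1 + halfw) / (2.0f * halfw) * npix);
          int py = (int)((y2 + halfw) / (2.0f * halfw) * npix);
          if (px >= 0 && px < npix && py >= 0 && py < npix)
              atomicAdd(&map[py * npix + px], 1u);  // ray count ~ magnification
      }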

  20. Fast Analysis of Molecular Dynamics Trajectories with Graphics Processing Units-Radial Distribution Function Histogramming.

    PubMed

    Levine, Benjamin G; Stone, John E; Kohlmeyer, Axel

    2011-05-01

    The calculation of radial distribution functions (RDFs) from molecular dynamics trajectory data is a common and computationally expensive analysis task. The rate limiting step in the calculation of the RDF is building a histogram of the distance between atom pairs in each trajectory frame. Here we present an implementation of this histogramming scheme for multiple graphics processing units (GPUs). The algorithm features a tiling scheme to maximize the reuse of data at the fastest levels of the GPU's memory hierarchy and dynamic load balancing to allow high performance on heterogeneous configurations of GPUs. Several versions of the RDF algorithm are presented, utilizing the specific hardware features found on different generations of GPUs. We take advantage of larger shared memory and atomic memory operations available on state-of-the-art GPUs to accelerate the code significantly. The use of atomic memory operations allows the fast, limited-capacity on-chip memory to be used much more efficiently, resulting in a fivefold increase in performance compared to the version of the algorithm without atomic operations. The ultimate version of the algorithm running in parallel on four NVIDIA GeForce GTX 480 (Fermi) GPUs was found to be 92 times faster than a multithreaded implementation running on an Intel Xeon 5550 CPU. On this multi-GPU hardware, the RDF between two selections of 1,000,000 atoms each can be calculated in 26.9 seconds per frame. The multi-GPU RDF algorithms described here are implemented in VMD, a widely used and freely available software package for molecular dynamics visualization and analysis. PMID:21547007
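
    The histogramming strategy described can be sketched compactly: each block keeps its own histogram in fast shared memory, updated with atomic adds, and merges it into the global histogram once at the end. The CUDA kernel below shows this pattern for two atom selections, omitting periodic boundaries and the paper's multi-GPU load balancing; names are illustrative.

      // Launch with nbins * sizeof(unsigned int) bytes of dynamic shared memory.
      __global__ void rdf_hist(const float4 *a, int na, const float4 *b, int nb,
                               unsigned int *hist, int nbins, float rmax)
      {
          extern __shared__ unsigned int h[];
          for (int k = threadIdx.x; k < nbins; k += blockDim.x) h[k] = 0u;
          __syncthreads();
          float scale = nbins / rmax;
          int i = blockIdx.x * blockDim.x + threadIdx.x;
          if (i < na) {
              float4 pi = a[i];
              for (int j = 0; j < nb; ++j) {        // all pairs with selection b
                  float dx = b[j].x - pi.x, dy = b[j].y - pi.y, dz = b[j].z - pi.z;
                  float r  = sqrtf(dx*dx + dy*dy + dz*dz);
                  int bin  = (int)(r * scale);
                  if (bin < nbins) atomicAdd(&h[bin], 1u);
              }
          }
          __syncthreads();
          for (int k = threadIdx.x; k < nbins; k += blockDim.x)
              atomicAdd(&hist[k], h[k]);            // merge block histogram
      }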

  1. Fast analysis of molecular dynamics trajectories with graphics processing units-Radial distribution function histogramming

    SciTech Connect

    Levine, Benjamin G.; Stone, John E.; Kohlmeyer, Axel

    2011-05-01

    The calculation of radial distribution functions (RDFs) from molecular dynamics trajectory data is a common and computationally expensive analysis task. The rate limiting step in the calculation of the RDF is building a histogram of the distance between atom pairs in each trajectory frame. Here we present an implementation of this histogramming scheme for multiple graphics processing units (GPUs). The algorithm features a tiling scheme to maximize the reuse of data at the fastest levels of the GPU's memory hierarchy and dynamic load balancing to allow high performance on heterogeneous configurations of GPUs. Several versions of the RDF algorithm are presented, utilizing the specific hardware features found on different generations of GPUs. We take advantage of larger shared memory and atomic memory operations available on state-of-the-art GPUs to accelerate the code significantly. The use of atomic memory operations allows the fast, limited-capacity on-chip memory to be used much more efficiently, resulting in a fivefold increase in performance compared to the version of the algorithm without atomic operations. The ultimate version of the algorithm running in parallel on four NVIDIA GeForce GTX 480 (Fermi) GPUs was found to be 92 times faster than a multithreaded implementation running on an Intel Xeon 5550 CPU. On this multi-GPU hardware, the RDF between two selections of 1,000,000 atoms each can be calculated in 26.9 s per frame. The multi-GPU RDF algorithms described here are implemented in VMD, a widely used and freely available software package for molecular dynamics visualization and analysis.

  2. Quantum Chemistry for Solvated Molecules on Graphical Processing Units Using Polarizable Continuum Models.

    PubMed

    Liu, Fang; Luehr, Nathan; Kulik, Heather J; Martínez, Todd J

    2015-07-14

    The conductor-like polarization model (C-PCM) with switching/Gaussian smooth discretization is a widely used implicit solvation model in chemical simulations. However, its application in quantum mechanical calculations of large-scale biomolecular systems can be limited by computational expense of both the gas phase electronic structure and the solvation interaction. We have previously used graphical processing units (GPUs) to accelerate the first of these steps. Here, we extend the use of GPUs to accelerate electronic structure calculations including C-PCM solvation. Implementation on the GPU leads to significant acceleration of the generation of the required integrals for C-PCM. We further propose two strategies to improve the solution of the required linear equations: a dynamic convergence threshold and a randomized block-Jacobi preconditioner. These strategies are not specific to GPUs and are expected to be beneficial for both CPU and GPU implementations. We benchmark the performance of the new implementation using over 20 small proteins in solvent environment. Using a single GPU, our method evaluates the C-PCM related integrals and their derivatives more than 10× faster than that with a conventional CPU-based implementation. Our improvements to the linear solver provide a further 3× acceleration. The overall calculations including C-PCM solvation require, typically, 20-40% more effort than that for their gas phase counterparts for a moderate basis set and molecule surface discretization level. The relative cost of the C-PCM solvation correction decreases as the basis sets and/or cavity radii increase. Therefore, description of solvation with this model should be routine. We also discuss applications to the study of the conformational landscape of an amyloid fibril. PMID:26575750

  3. FLOCKING-BASED DOCUMENT CLUSTERING ON THE GRAPHICS PROCESSING UNIT [Book Chapter

    SciTech Connect

    Charles, J S; Patton, R M; Potok, T E; Cui, X

    2008-01-01

    Analyzing and grouping documents by content is a complex problem. One explored method of solving this problem borrows from nature, imitating the flocking behavior of birds. Each bird represents a single document and flies toward other documents that are similar to it. One limitation of this method of document clustering is its complexity O(n^2). As the number of documents grows, it becomes increasingly difficult to receive results in a reasonable amount of time. However, flocking behavior, along with most naturally inspired algorithms such as ant colony optimization and particle swarm optimization, is highly parallel and has experienced improved performance on expensive cluster computers. In the last few years, the graphics processing unit (GPU) has received attention for its ability to solve highly parallel and semi-parallel problems much faster than the traditional sequential processor. Some applications see a huge increase in performance on this new platform. The cost of these high-performance devices is also marginal when compared with the price of cluster machines. In this paper, we have conducted research to exploit this architecture and apply its strengths to the document flocking problem. Our results highlight the potential benefit the GPU brings to all naturally inspired algorithms. Using the CUDA platform from NVIDIA®, we developed a document flocking implementation to be run on the NVIDIA® GeForce 8800. Additionally, we developed a similar but sequential implementation of the same algorithm to be run on a desktop CPU. We tested the performance of each on groups of news articles ranging in size from 200 to 3,000 documents. The results of these tests were very significant. Performance gains ranged from three to nearly five times improvement of the GPU over the CPU implementation. This dramatic improvement in runtime makes the GPU a potentially revolutionary platform for document clustering algorithms.

  4. Parallel flow accumulation algorithms for graphical processing units with application to RUSLE model

    NASA Astrophysics Data System (ADS)

    Sten, Johan; Lilja, Harri; Hyväluoma, Jari; Westerholm, Jan; Aspnäs, Mats

    2016-04-01

    Digital elevation models (DEMs) are widely used in the modeling of surface hydrology, which typically includes the determination of flow directions and flow accumulation. The use of high-resolution DEMs increases the accuracy of flow accumulation computation, but as a drawback, the computational time may become excessively long if large areas are analyzed. In this paper we investigate the use of graphical processing units (GPUs) for efficient flow accumulation calculations. We present two new parallel flow accumulation algorithms based on dependency transfer and topological sorting and compare them to previously published flow transfer and indegree-based algorithms. We benchmark the GPU implementations against industry standards, ArcGIS and SAGA. With the flow-transfer D8 flow routing model and binary input data, a speedup of 19× is achieved compared to ArcGIS and 15× compared to SAGA. We show that on GPUs the topological sort-based flow accumulation algorithm leads on average to a speedup by a factor of 7 over the flow-transfer algorithm. Thus a total speedup of the order of 100× is achieved. We test the algorithms by applying them to the Revised Universal Soil Loss Equation (RUSLE) erosion model. For this purpose we present parallel versions of the slope, LS factor and RUSLE algorithms and show that the RUSLE erosion results for an area of 12 km × 24 km containing 72 million cells can be calculated in less than a second. Since flow accumulation is needed in many hydrological models, the developed algorithms may find use in many applications beyond RUSLE modeling. The algorithm based on topological sorting is particularly promising for dynamic hydrological models where flow accumulations are repeatedly computed over an unchanged DEM.
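
    A topological-sort flow accumulation can be sketched with Kahn's algorithm: cells with no upstream contributors are processed first, and each cell forwards its accumulated flow to its single D8 receiver. The following NumPy version is a serial illustration of that dependency structure, with hypothetical names, not the paper's GPU implementation.

```python
import numpy as np
from collections import deque

def flow_accumulation(downstream, weights):
    """downstream[i]: index of cell i's D8 receiver, or -1 at outlets."""
    n = downstream.size
    acc = weights.astype(np.float64).copy()   # each cell contributes itself
    indegree = np.zeros(n, dtype=np.int64)
    for c in downstream:
        if c >= 0:
            indegree[c] += 1
    ready = deque(np.flatnonzero(indegree == 0))  # ridge cells first
    while ready:
        i = ready.popleft()
        j = downstream[i]
        if j >= 0:
            acc[j] += acc[i]                  # transfer accumulated flow
            indegree[j] -= 1
            if indegree[j] == 0:              # all upstream inputs arrived
                ready.append(j)
    return acc
```

    Because the ordering depends only on the DEM, it can be computed once and reused across repeated accumulations, which is why the approach suits dynamic models.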

  5. Fast Analysis of Molecular Dynamics Trajectories with Graphics Processing Units—Radial Distribution Function Histogramming

    PubMed Central

    Stone, John E.; Kohlmeyer, Axel

    2011-01-01

    The calculation of radial distribution functions (RDFs) from molecular dynamics trajectory data is a common and computationally expensive analysis task. The rate limiting step in the calculation of the RDF is building a histogram of the distance between atom pairs in each trajectory frame. Here we present an implementation of this histogramming scheme for multiple graphics processing units (GPUs). The algorithm features a tiling scheme to maximize the reuse of data at the fastest levels of the GPU’s memory hierarchy and dynamic load balancing to allow high performance on heterogeneous configurations of GPUs. Several versions of the RDF algorithm are presented, utilizing the specific hardware features found on different generations of GPUs. We take advantage of larger shared memory and atomic memory operations available on state-of-the-art GPUs to accelerate the code significantly. The use of atomic memory operations allows the fast, limited-capacity on-chip memory to be used much more efficiently, resulting in a fivefold increase in performance compared to the version of the algorithm without atomic operations. The ultimate version of the algorithm running in parallel on four NVIDIA GeForce GTX 480 (Fermi) GPUs was found to be 92 times faster than a multithreaded implementation running on an Intel Xeon 5550 CPU. On this multi-GPU hardware, the RDF between two selections of 1,000,000 atoms each can be calculated in 26.9 seconds per frame. The multi-GPU RDF algorithms described here are implemented in VMD, a widely used and freely available software package for molecular dynamics visualization and analysis. PMID:21547007
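
    The rate-limiting histogramming step reduces to the following arithmetic, shown here as a serial NumPy sketch for one frame with two disjoint atom selections; the GPU version adds the tiling through fast on-chip memory and the atomic updates described above.

```python
import numpy as np

def rdf_histogram(sel1, sel2, r_max, n_bins):
    """Histogram pair distances between two disjoint (N, 3) selections."""
    edges = np.linspace(0.0, r_max, n_bins + 1)
    hist = np.zeros(n_bins, dtype=np.int64)
    for a in sel1:                            # O(N1*N2) distance evaluations
        d = np.linalg.norm(sel2 - a, axis=1)
        h, _ = np.histogram(d[d < r_max], bins=edges)
        hist += h
    return hist, edges
```

    Normalizing each bin count by the ideal-gas shell expectation (4πr²Δr times the particle density) then yields g(r).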

  6. Porting ONETEP to graphical processing unit-based coprocessors. 1. FFT box operations.

    PubMed

    Wilkinson, Karl; Skylaris, Chris-Kriton

    2013-10-30

    We present the first graphical processing unit (GPU) coprocessor-enabled version of the Order-N Electronic Total Energy Package (ONETEP) code for linear-scaling first principles quantum mechanical calculations on materials. This work focuses on porting to the GPU the parts of the code that involve atom-localized fast Fourier transform (FFT) operations. These are among the most computationally intensive parts of the code and are used in core algorithms such as the calculation of the charge density, the local potential integrals, the kinetic energy integrals, and the nonorthogonal generalized Wannier function gradient. We have found that direct porting of the isolated FFT operations did not provide any benefit. Instead, it was necessary to tailor the port to each of the aforementioned algorithms to optimize data transfer to and from the GPU. A detailed discussion of the methods used and tests of the resulting performance are presented, which show that individual steps in the relevant algorithms are accelerated by a significant amount. However, the transfer of data between the GPU and host machine is a significant bottleneck in the reported version of the code. In addition, an initial investigation into a dynamic precision scheme for the ONETEP energy calculation has been performed to take advantage of the enhanced single precision capabilities of GPUs. The methods used here result in no disruption to the existing code base. Furthermore, as the developments reported here concern the core algorithms, they will benefit the full range of ONETEP functionality. Our use of a directive-based programming model ensures portability to other forms of coprocessors and will allow this work to form the basis of future developments to the code designed to support emerging high-performance computing platforms. PMID:24038140

  7. Increasing the economy of design and preparation for manufacturing by integrated and graphic data processing: CAD/CAM - Phase III

    NASA Astrophysics Data System (ADS)

    Grupe, U.

    1986-04-01

    The development of CAD/CAM techniques and equipment for aircraft production at Dornier and MBB during the period 1983-1986 is reviewed. The topics discussed include geometry processing, structural mechanics, design of fabrication equipment, NC techniques, production planning, and processing of production orders. Consideration is given to the increased use of color graphics, the change from vector to scanning screens, and software-related problems encountered in shifting functions from the mainframe computer to terminals.

  8. Real-time Graphics Processing Unit Based Fourier Domain Optical Coherence Tomography and Surgical Applications

    NASA Astrophysics Data System (ADS)

    Zhang, Kang

    2011-12-01

    In this dissertation, real-time Fourier domain optical coherence tomography (FD-OCT) capable of multi-dimensional micrometer-resolution imaging targeted specifically for microsurgical intervention applications was developed and studied. As a part of this work several ultra-high speed real-time FD-OCT imaging and sensing systems were proposed and developed. A real-time 4D (3D+time) OCT system platform using the graphics processing unit (GPU) to accelerate OCT signal processing, imaging reconstruction, visualization, and volume rendering was developed. Several GPU based algorithms such as non-uniform fast Fourier transform (NUFFT), numerical dispersion compensation, and multi-GPU implementation were developed to improve the impulse response, SNR roll-off and stability of the system. Full-range complex-conjugate-free FD-OCT was also implemented on the GPU architecture to achieve doubled image range and improved SNR. These technologies overcome the imaging reconstruction and visualization bottlenecks that are widespread in current ultra-high speed FD-OCT systems and open the way to interventional OCT imaging for applications in guided microsurgery. A hand-held common-path optical coherence tomography (CP-OCT) distance-sensor based microsurgical tool was developed and validated. Through real-time signal processing, edge detection and feedback control, the tool was shown to be capable of tracking the target surface and compensating for motion. A micro-incision test on a phantom was performed using the CP-OCT-sensor-integrated hand-held tool, which showed an incision error of less than ±5 microns, compared to errors of >100 microns for free-hand incision. The CP-OCT distance sensor has also been utilized to enhance the accuracy and safety of optical nerve stimulation. Finally, several experiments were conducted to validate the system for surgical applications. One of them involved 4D OCT guided micro-manipulation using a phantom. Multiple volume renderings of one 3D data set were
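
    The per-A-scan processing chain at the heart of such a pipeline can be sketched as below (background subtraction, windowing, FFT, log compression); the NUFFT resampling, dispersion compensation, and volume rendering of the actual GPU pipeline are omitted, and all names are illustrative.

```python
import numpy as np

def process_ascans(spectra):
    """spectra: (n_ascans, n_samples) raw spectrometer lines."""
    bg = spectra.mean(axis=0)                 # fixed-pattern background
    interf = spectra - bg
    window = np.hanning(spectra.shape[1])     # suppress FFT side lobes
    depth = np.fft.rfft(interf * window, axis=1)
    return 20.0 * np.log10(np.abs(depth) + 1e-12)   # dB-scaled A-scans
```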

  9. Large eddy simulations of turbulent flows on graphics processing units: Application to film-cooling flows

    NASA Astrophysics Data System (ADS)

    Shinn, Aaron F.

    Computational Fluid Dynamics (CFD) simulations can be very computationally expensive, especially for Large Eddy Simulations (LES) and Direct Numerical Simulations (DNS) of turbulent flows. In LES the large, energy containing eddies are resolved by the computational mesh, but the smaller (sub-grid) scales are modeled. In DNS, all scales of turbulence are resolved, including the smallest dissipative (Kolmogorov) scales. Clusters of CPUs have been the standard approach for such simulations, but an emerging approach is the use of Graphics Processing Units (GPUs), which deliver impressive computing performance compared to CPUs. Recently there has been great interest in the scientific computing community to use GPUs for general-purpose computation (such as the numerical solution of PDEs) rather than graphics rendering. To explore the use of GPUs for CFD simulations, an incompressible Navier-Stokes solver was developed for a GPU. This solver is capable of simulating unsteady laminar flows or performing an LES or DNS of turbulent flows. The Navier-Stokes equations are solved via a fractional-step method and are spatially discretized using the finite volume method on a Cartesian mesh. An immersed boundary method based on a ghost cell treatment was developed to handle flow past complex geometries. The implementation of these numerical methods had to suit the architecture of the GPU, which is designed for massive multithreading. The details of this implementation will be described, along with strategies for performance optimization. Validation of the GPU-based solver was performed for fundamental benchmark problems, and a performance assessment indicated that the solver was over an order-of-magnitude faster compared to a CPU. The GPU-based Navier-Stokes solver was used to study film-cooling flows via Large Eddy Simulation. In modern gas turbine engines, the film-cooling method is used to protect turbine blades from hot combustion gases. Therefore, understanding the physics of
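
    The fractional-step method referred to above can be illustrated on a 2D periodic grid: predict an intermediate velocity without pressure, solve a pressure Poisson equation, then project onto a divergence-free field. This NumPy sketch uses a few Jacobi sweeps for the Poisson solve and is purely illustrative of the scheme's structure, not the GPU solver.

```python
import numpy as np

def step(u, v, p, dt, dx, nu, n_jacobi=50):
    """One fractional-step update on a periodic 2D grid."""
    lap = lambda f: (np.roll(f, 1, 0) + np.roll(f, -1, 0) +
                     np.roll(f, 1, 1) + np.roll(f, -1, 1) - 4 * f) / dx**2
    ddx = lambda f: (np.roll(f, -1, 0) - np.roll(f, 1, 0)) / (2 * dx)
    ddy = lambda f: (np.roll(f, -1, 1) - np.roll(f, 1, 1)) / (2 * dx)
    # predictor: advection + diffusion, no pressure gradient
    us = u + dt * (-u * ddx(u) - v * ddy(u) + nu * lap(u))
    vs = v + dt * (-u * ddx(v) - v * ddy(v) + nu * lap(v))
    rhs = (ddx(us) + ddy(vs)) / dt
    for _ in range(n_jacobi):                 # Poisson: lap(p) = rhs
        p = (np.roll(p, 1, 0) + np.roll(p, -1, 0) +
             np.roll(p, 1, 1) + np.roll(p, -1, 1) - dx**2 * rhs) / 4.0
    return us - dt * ddx(p), vs - dt * ddy(p), p   # projection step
```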

  10. Graphic Arts: The Press and Finishing Processes. Fourth Edition. Teacher Edition [and] Student Edition.

    ERIC Educational Resources Information Center

    Farajollahi, Karim; Ogle, Gary; Reed, William; Woodcock, Kenneth

    Part of a series of instructional materials for courses on graphic communication, this packet contains both teacher and student materials for seven units that cover the following topics: (1) offset press systems; (2) offset inks and dampening chemistry; (3) offset press operating procedures; (4) preventive maintenance and troubleshooting; (5) job…

  11. REIONIZATION SIMULATIONS POWERED BY GRAPHICS PROCESSING UNITS. I. ON THE STRUCTURE OF THE ULTRAVIOLET RADIATION FIELD

    SciTech Connect

    Aubert, Dominique; Teyssier, Romain

    2010-11-20

    We present a set of cosmological simulations with radiative transfer in order to model the reionization history of the universe from z = 18 down to z = 6. Galaxy formation and the associated star formation are followed self-consistently with gas and dark matter dynamics using the RAMSES code, while radiative transfer is performed as a post-processing step using a moment-based method with the M1 closure relation in the ATON code. The latter has been ported to a multiple Graphics Processing Unit (GPU) architecture using the CUDA language together with the MPI library, resulting in an overall acceleration that allows us to tackle radiative transfer problems at a significantly higher resolution than previously reported: 1024³ + 2 levels of refinement for the hydrodynamic adaptive grid and 1024³ for the radiative transfer Cartesian grid. We reach a typical acceleration factor close to 100× when compared to the CPU version, allowing us to perform 1/4 million time steps in less than 3000 GPU hr. We observe good convergence properties between our different resolution runs for various volume- and mass-averaged quantities such as neutral fraction, UV background, and Thomson optical depth, as long as the effects of finite resolution on the star formation history are properly taken into account. We also show that the neutral fraction depends on the total mass density, in a way close to the predictions of photoionization equilibrium, as long as the effects of self-shielding are included in the background radiation model. Although our simulation suite has reached unprecedented mass and spatial resolution, we still fail to reproduce the z ≈ 6 constraints on the neutral fraction of hydrogen and the intensity of the UV background. In order to account for unresolved density fluctuations, we have modified our chemistry solver with a simple clumping factor model. Using our most spatially resolved simulation (12.5 Mpc h⁻¹ with 1024³ particles) to

  12. Creating Interactive Graphical Overlays in the Advanced Weather Interactive Processing System (AWIPS) Using Shapefiles and DGM Files

    NASA Technical Reports Server (NTRS)

    Barrett, Joe H., III; Lafosse, Richard; Hood, Doris; Hoeth, Brian

    2007-01-01

    Graphical overlays can be created in real-time in the Advanced Weather Interactive Processing System (AWIPS) using shapefiles or DARE Graphics Metafile (DGM) files. This presentation describes how to create graphical overlays on-the-fly for AWIPS, by using two examples of AWIPS applications that were created by the Applied Meteorology Unit (AMU). The first example is the Anvil Threat Corridor Forecast Tool, which produces a shapefile that depicts a graphical threat corridor of the forecast movement of thunderstorm anvil clouds, based on the observed or forecast upper-level winds. This tool is used by the Spaceflight Meteorology Group (SMG) and 45th Weather Squadron (45 WS) to analyze the threat of natural or space vehicle-triggered lightning over a location. The second example is a launch and landing trajectory tool that produces a DGM file that plots the ground track of space vehicles during launch or landing. The trajectory tool can be used by SMG and the 45 WS forecasters to analyze weather radar imagery along a launch or landing trajectory. Advantages of both file types will be listed.

  13. Efficient particle-in-cell simulation of auroral plasma phenomena using a CUDA enabled graphics processing unit

    NASA Astrophysics Data System (ADS)

    Sewell, Stephen

    This thesis introduces a software framework that effectively utilizes low-cost commercially available Graphic Processing Units (GPUs) to simulate complex scientific plasma phenomena that are modeled using the Particle-In-Cell (PIC) paradigm. The software framework that was developed conforms to the Compute Unified Device Architecture (CUDA), a standard for general purpose graphic processing that was introduced by NVIDIA Corporation. This framework has been verified for correctness and applied to advance the state of understanding of the electromagnetic aspects of the development of the Aurora Borealis and Aurora Australis. For each phase of the PIC methodology, this research has identified one or more methods to exploit the problem's natural parallelism and effectively map it for execution on the graphic processing unit and its host processor. The sources of overhead that can reduce the effectiveness of parallelization for each of these methods have also been identified. One of the novel aspects of this research was the utilization of particle sorting during the grid interpolation phase. The final representation resulted in simulations that executed about 38 times faster than simulations that were run on a single-core general-purpose processing system. The scalability of this framework to larger problem sizes and future generation systems has also been investigated.
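
    The particle-sorting idea used during the grid interpolation phase can be sketched in a few lines: binning particles by cell index before deposition makes the memory traffic contiguous, which is what makes the step GPU-friendly. Nearest-grid-point weighting and all names below are illustrative.

```python
import numpy as np

def deposit_charge(x, q, dx, n_cells):
    """Deposit particle charges q at positions x onto a 1D grid."""
    cell = np.clip((x / dx).astype(np.int64), 0, n_cells - 1)
    order = np.argsort(cell)                  # sort particles by cell index
    cell, q = cell[order], q[order]
    rho = np.bincount(cell, weights=q, minlength=n_cells)
    return rho / dx                           # charge density per cell
```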

  14. Computer Graphics.

    ERIC Educational Resources Information Center

    Halpern, Jeanne W.

    1970-01-01

    Computer graphics have been called the most exciting development in computer technology. At the University of Michigan, three kinds of graphics output equipment are now being used: symbolic printers, line plotters or drafting devices, and cathode-ray tubes (CRT). Six examples are given that demonstrate the range of graphics use at the University.…

  15. Interactive image processing for mobile devices

    NASA Astrophysics Data System (ADS)

    Shaw, Rodney

    2009-01-01

    As the number of consumer digital images escalates by tens of billions each year, an increasing proportion of these images are being acquired using the latest generations of sophisticated mobile devices. The characteristics of the cameras embedded in these devices now yield image-quality outcomes that approach those of the parallel generations of conventional digital cameras, and all aspects of the management and optimization of these vast new image populations become of utmost importance in providing ultimate consumer satisfaction. However, this satisfaction is still limited by the fact that a substantial proportion of all images are perceived to have inadequate image quality, and a lesser proportion of these to be completely unacceptable (for sharing, archiving, printing, etc.). In past years at this same conference, the author has described various aspects of a consumer digital-image interface based entirely on an intuitive image-choice-only operation. Demonstrations have been given of this facility in operation, essentially allowing critical-path navigation through approximately a million possible image-quality states within a matter of seconds. This was made possible by the definition of a set of orthogonal image vectors, and by defining all excursions in terms of a fixed linear visual-pixel model, independent of the image attribute. During recent months this methodology has been extended to yield specific user-interactive image-quality solutions in the form of custom software, which at less than 100 kB is readily embedded in the latest generations of unlocked portable devices. This has also necessitated the design of new user interfaces and controls, as well as streamlined and more intuitive versions of the user quality-choice hierarchy. The technical challenges and details will be described for these modified versions of the enhancement methodology, and initial practical experience with typical images will be described.

  16. Discontinuous Galerkin method with Gaussian artificial viscosity on graphical processing units for nonlinear acoustics

    NASA Astrophysics Data System (ADS)

    Tripathi, Bharat B.; Marchiano, Régis; Baskar, Sambandam; Coulouvrat, François

    2015-10-01

    Propagation of acoustical shock waves in complex geometry is a topic of interest in the field of nonlinear acoustics. For instance, simulation of Buzz Saw Noise requires the treatment of shock waves generated by the turbofan through the engines of aeroplanes with complex geometries and wall liners. Nevertheless, from a numerical point of view it remains a challenge. The two main hurdles are to take into account the complex geometry of the domain and to deal with the spurious oscillations (Gibbs phenomenon) near the discontinuities. In this work, we first derive the conservative hyperbolic system of nonlinear acoustics (up to quadratic nonlinear terms) using the fundamental equations of fluid dynamics. Then, we propose to adapt the classical nodal discontinuous Galerkin method to develop a high fidelity solver for nonlinear acoustics. The discontinuous Galerkin method is a hybrid of finite element and finite volume methods and is very versatile in handling complex geometry. In order to obtain better performance, the method is parallelized on Graphical Processing Units. Like other numerical methods, the discontinuous Galerkin method suffers from the Gibbs phenomenon near the shock, which is a numerical artifact. Among the various ways to manage these spurious oscillations, we choose the method of parabolic regularization. Although the introduction of artificial viscosity into the system is a popular way of managing shocks, we propose a new approach of introducing smooth artificial viscosity locally in each element, wherever needed. Firstly, a shock sensor using the linear coefficients of the spectral solution is used to locate the position of the discontinuities. Then, a viscosity coefficient depending on the shock sensor is introduced into the hyperbolic system of equations, only in the elements near the shock. The viscosity is applied as a two-dimensional Gaussian patch with its shape parameters depending on the element dimensions, referred here as Element
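
    A shock sensor built from the modal (spectral) coefficients together with an element-local Gaussian viscosity patch might look like the following sketch; the sensor form and all constants are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def shock_sensor(coeffs):
    """coeffs: modal coefficients of one element's solution."""
    total = np.sum(coeffs**2) + 1e-30
    top = coeffs[-1]**2                       # energy in the highest mode
    return np.log10(top / total)              # large => oscillatory element

def element_viscosity(coeffs, x, y, xc, yc, hx, hy, s0=-4.0, nu_max=1e-3):
    """Gaussian viscosity patch centred on element (xc, yc) of size hx, hy."""
    if shock_sensor(coeffs) < s0:             # smooth element: no viscosity
        return np.zeros_like(x)
    return nu_max * np.exp(-(((x - xc) / hx)**2 + ((y - yc) / hy)**2))
```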

  17. Real-time 2D parallel windowed Fourier transform for fringe pattern analysis using Graphics Processing Unit.

    PubMed

    Gao, Wenjing; Huyen, Nguyen Thi Thanh; Loi, Ho Sy; Kemao, Qian

    2009-12-01

    In optical interferometers, fringe projection systems, and synthetic aperture radars, fringe patterns are common outcomes and usually degraded by unavoidable noises. The presence of noises makes the phase extraction and phase unwrapping challenging. Windowed Fourier transform (WFT) based algorithms have been proven to be effective for fringe pattern analysis in various applications. However, the WFT-based algorithms are computationally expensive, prohibiting them from real-time applications. In this paper, we propose a fast parallel WFT-based library using graphics processing units and the compute unified device architecture (CUDA). Real-time WFT-based algorithms are achieved at 4 frames per second in processing 256×256 fringe patterns. Up to 132× speedup is obtained for WFT-based algorithms on an NVIDIA GTX295 graphics card compared with sequential C code on a quad-core 2.5 GHz Intel Xeon E5420 CPU. PMID:20052242
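
    A much-simplified serial sketch of windowed Fourier filtering conveys the per-window work that the GPU library parallelizes: transform each local window, hard-threshold the spectrum, and accumulate the inverse transforms. Window size, step, and threshold below are illustrative.

```python
import numpy as np

def wff_denoise(fringe, win=32, step=16, thresh=10.0):
    """Denoise a fringe pattern by thresholded local Fourier spectra."""
    h, w = fringe.shape
    out = np.zeros((h, w), dtype=np.complex128)
    norm = np.zeros((h, w))
    g = np.outer(np.hanning(win), np.hanning(win))     # window function
    for i in range(0, h - win + 1, step):
        for j in range(0, w - win + 1, step):
            block = fringe[i:i+win, j:j+win] * g
            spec = np.fft.fft2(block)
            spec[np.abs(spec) < thresh] = 0.0          # suppress noise terms
            out[i:i+win, j:j+win] += np.fft.ifft2(spec) * g
            norm[i:i+win, j:j+win] += g * g
    return (out / np.maximum(norm, 1e-12)).real
```

    Every window is independent of every other, which is why the method maps so cleanly onto one GPU thread block per window.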

  18. Application of computer generated color graphic techniques to the processing and display of three dimensional fluid dynamic data

    NASA Technical Reports Server (NTRS)

    Anderson, B. H.; Putt, C. W.; Giamati, C. C.

    1981-01-01

    Color coding techniques used in the processing of remote sensing imagery were adapted and applied to the fluid dynamics problems associated with turbofan mixer nozzles. The computer generated color graphics were found to be useful in reconstructing the measured flow field from low resolution experimental data to give more physical meaning to this information and in scanning and interpreting the large volume of computer generated data from the three dimensional viscous computer code used in the analysis.

  19. Compressed sensing reconstruction for whole-heart imaging with 3D radial trajectories: a graphics processing unit implementation.

    PubMed

    Nam, Seunghoon; Akçakaya, Mehmet; Basha, Tamer; Stehning, Christian; Manning, Warren J; Tarokh, Vahid; Nezafat, Reza

    2013-01-01

    A disadvantage of three-dimensional (3D) isotropic acquisition in whole-heart coronary MRI is the prolonged data acquisition time. Isotropic 3D radial trajectories allow undersampling of k-space data in all three spatial dimensions, enabling accelerated acquisition of the volumetric data. Compressed sensing (CS) reconstruction can provide further acceleration in the acquisition by removing the incoherent artifacts due to undersampling and improving the image quality. However, the heavy computational overhead of the CS reconstruction has been a limiting factor for its application. In this article, a parallelized implementation of an iterative CS reconstruction method for 3D radial acquisitions using a commercial graphics processing unit is presented. The execution time of the graphics processing unit-implemented CS reconstruction was compared with that of the C++ implementation, and the efficacy of the undersampled 3D radial acquisition with CS reconstruction was investigated in both phantom and whole-heart coronary data sets. Subsequently, the efficacy of CS in suppressing streaking artifacts in 3D whole-heart coronary MRI with 3D radial imaging and its convergence properties were studied. The CS reconstruction provides improved image quality (in terms of vessel sharpness and suppression of noise-like artifacts) compared with the conventional 3D gridding algorithm, and the graphics processing unit implementation greatly reduces the execution time of CS reconstruction, yielding a 34-54× speedup compared with the C++ implementation. PMID:22392604
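
    An iterative CS reconstruction of this kind alternates a data-consistency projection with a sparsity-promoting threshold. The sketch below uses a Cartesian undersampling mask and an image-domain soft threshold as stand-ins for the 3D radial gridding operators and transform-domain regularization of the actual method; it shows only the loop structure.

```python
import numpy as np

def cs_recon(kspace, mask, lam=0.01, n_iter=50):
    """kspace: zero-filled measured data; mask: True where sampled."""
    img = np.fft.ifft2(kspace)                 # zero-filled starting point
    for _ in range(n_iter):
        # data consistency: re-insert the measured k-space samples
        k = np.fft.fft2(img)
        k[mask] = kspace[mask]
        img = np.fft.ifft2(k)
        # sparsity: soft-threshold image magnitudes (stand-in for a
        # wavelet-domain threshold)
        mag = np.abs(img)
        img = np.exp(1j * np.angle(img)) * np.maximum(mag - lam, 0.0)
    return img
```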

  20. Phase transitions in contagion processes mediated by recurrent mobility patterns

    NASA Astrophysics Data System (ADS)

    Balcan, Duygu; Vespignani, Alessandro

    2011-07-01

    Human mobility and activity patterns mediate contagion on many levels, including the spatial spread of infectious diseases, diffusion of rumours, and emergence of consensus. These patterns however are often dominated by specific locations and recurrent flows and poorly modelled by the random diffusive dynamics generally used to study them. Here we develop a theoretical framework to analyse contagion within a network of locations where individuals recall their geographic origins. We find a phase transition between a regime in which the contagion affects a large fraction of the system and one in which only a small fraction is affected. This transition cannot be uncovered by continuous deterministic models because of the stochastic features of the contagion process and defines an invasion threshold that depends on mobility parameters, providing guidance for controlling contagion spread by constraining mobility processes. We recover the threshold behaviour by analysing diffusion processes mediated by real human commuting data.

  1. International Student Mobility and the Bologna Process

    ERIC Educational Resources Information Center

    Teichler, Ulrich

    2012-01-01

    The Bologna Process is the newest of a chain of activities stimulated by supra-national actors since the 1950s to challenge national borders in higher education in Europe. Now, the ministers in charge of higher education of the individual European countries have agreed to promote a similar cycle-structure of study programmes and programmes based…

  2. Interpretation of Medical Imaging Data with a Mobile Application: A Mobile Digital Imaging Processing Environment

    PubMed Central

    Lin, Meng Kuan; Nicolini, Oliver; Waxenegger, Harald; Galloway, Graham J.; Ullmann, Jeremy F. P.; Janke, Andrew L.

    2013-01-01

    Digital Imaging Processing (DIP) requires data extraction and output from a visualization tool to be consistent. Data handling and transmission between the server and a user is a systematic process in service interpretation. The use of integrated medical services for management and viewing of imaging data in combination with a mobile visualization tool can be greatly facilitated by data analysis and interpretation. This paper presents an integrated mobile application and DIP service, called M-DIP. The objective of the system is to (1) automate the direct data tiling, conversion, and pre-tiling of brain images from Medical Imaging NetCDF (MINC) and Neuroimaging Informatics Technology Initiative (NIFTI) to RAW formats; (2) speed up querying of imaging measurements; and (3) display high-level images in three dimensions in real-world coordinates. In addition, M-DIP provides the ability to work on a mobile or tablet device without any software installation, using web-based protocols. M-DIP implements three levels of architecture: a relational middle-layer database, a stand-alone DIP server, and a mobile application logic middle level realizing user interpretation for direct querying and communication. This imaging software has the ability to display biological imaging data at multiple zoom levels and to increase its quality to meet users' expectations. Interpretation of bioimaging data is facilitated by an interface analogous to online mapping services using real-world coordinate browsing. This allows mobile devices to display multiple datasets simultaneously from a remote site. M-DIP can be used as a measurement repository that can be accessed from any network environment, such as a portable mobile or tablet device. In addition, this system, in combination with mobile applications, establishes a virtualization tool in the neuroinformatics field to speed interpretation services. PMID:23847587
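
    The pre-tiling step can be sketched as building a zoom pyramid by repeated 2×2 mean-pooling and cutting each level into fixed-size tiles, the same scheme online map services use; tile size and pyramid depth below are illustrative, not M-DIP's actual parameters.

```python
import numpy as np

def build_tiles(img, tile=256, levels=4):
    """Return {(zoom, row, col): tile}; zoom = levels-1 is full resolution."""
    tiles = {}
    level_img = img.astype(np.float32)
    for z in range(levels - 1, -1, -1):
        h, w = level_img.shape
        for ty in range(0, h, tile):
            for tx in range(0, w, tile):
                tiles[(z, ty // tile, tx // tile)] = \
                    level_img[ty:ty + tile, tx:tx + tile].copy()
        # 2x2 mean-pool down to the next-coarser zoom level
        h2, w2 = h // 2 * 2, w // 2 * 2
        level_img = (level_img[:h2, :w2]
                     .reshape(h2 // 2, 2, w2 // 2, 2).mean(axis=(1, 3)))
    return tiles
```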

  3. Interpretation of medical imaging data with a mobile application: a mobile digital imaging processing environment.

    PubMed

    Lin, Meng Kuan; Nicolini, Oliver; Waxenegger, Harald; Galloway, Graham J; Ullmann, Jeremy F P; Janke, Andrew L

    2013-01-01

    Digital Imaging Processing (DIP) requires data extraction and output from a visualization tool to be consistent. Data handling and transmission between the server and a user is a systematic process in service interpretation. The use of integrated medical services for management and viewing of imaging data in combination with a mobile visualization tool can be greatly facilitated by data analysis and interpretation. This paper presents an integrated mobile application and DIP service, called M-DIP. The objective of the system is to (1) automate the direct data tiling, conversion, and pre-tiling of brain images from Medical Imaging NetCDF (MINC) and Neuroimaging Informatics Technology Initiative (NIFTI) to RAW formats; (2) speed up querying of imaging measurements; and (3) display high-level images in three dimensions in real-world coordinates. In addition, M-DIP provides the ability to work on a mobile or tablet device without any software installation, using web-based protocols. M-DIP implements three levels of architecture: a relational middle-layer database, a stand-alone DIP server, and a mobile application logic middle level realizing user interpretation for direct querying and communication. This imaging software has the ability to display biological imaging data at multiple zoom levels and to increase its quality to meet users' expectations. Interpretation of bioimaging data is facilitated by an interface analogous to online mapping services using real-world coordinate browsing. This allows mobile devices to display multiple datasets simultaneously from a remote site. M-DIP can be used as a measurement repository that can be accessed from any network environment, such as a portable mobile or tablet device. In addition, this system, in combination with mobile applications, establishes a virtualization tool in the neuroinformatics field to speed interpretation services. PMID:23847587

  4. Real-time imaging implementation of the Army Research Laboratory synchronous impulse reconstruction radar on a graphics processing unit architecture

    NASA Astrophysics Data System (ADS)

    Park, Song Jun; Nguyen, Lam H.; Shires, Dale R.; Henz, Brian J.

    2009-05-01

    High computing requirements for the synchronous impulse reconstruction (SIRE) radar algorithm present a challenge for near real-time processing, particularly the calculations involved in output image formation. Forming an image requires a large number of parallel and independent floating-point computations. To reduce the processing time and exploit the abundant parallelism of image processing, a graphics processing unit (GPU) architecture is considered for the imaging algorithm. Widely available off the shelf, high-end GPUs offer inexpensive technology that packs a great deal of computing power into one card. To address the parallel nature of graphics processing, the GPU architecture is designed for high computational throughput realized through multiple computing resources to target data parallel applications. Due to leveled or in some cases reduced clock frequencies in mainstream single and multi-core general-purpose central processing units (CPUs), GPU computing is becoming a competitive option for compute-intensive radar imaging algorithm prototyping. We describe the translation and implementation of the SIRE radar backprojection image formation algorithm on a GPU platform. The programming model for GPU's parallel computing and hardware-specific memory optimizations are discussed in the paper. A considerable level of speedup is available from the GPU implementation resulting in processing at real-time acquisition speeds.
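
    Time-domain backprojection itself is compact: for every output pixel, sum each pulse's return at the computed round-trip delay. The 2D geometry and names in this serial sketch are illustrative; the GPU implementation distributes the pixel loop across threads.

```python
import numpy as np

def backproject(pulses, tx_pos, grid_x, grid_y, fs, c=3e8):
    """pulses: (n_pulses, n_samples); tx_pos: (n_pulses, 2) positions."""
    image = np.zeros((grid_y.size, grid_x.size))
    X, Y = np.meshgrid(grid_x, grid_y)
    n_samples = pulses.shape[1]
    for p in range(pulses.shape[0]):
        rng = np.hypot(X - tx_pos[p, 0], Y - tx_pos[p, 1])
        idx = np.round(2.0 * rng / c * fs).astype(np.int64)  # delay bin
        valid = idx < n_samples
        image[valid] += pulses[p, idx[valid]]  # coherent sum per pixel
    return image
```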

  5. Using wesBench to Study the Rendering Performance of Graphics Processing Units

    SciTech Connect

    Bethel, Edward W

    2010-01-08

    Graphics operations fall into two broad classes. The first, which we refer to here as vertex operations, consists of transformation, lighting, primitive assembly, and so forth. The second, which we refer to as pixel or fragment operations, consists of rasterization, texturing, scissoring, blending, and fill. Overall GPU rendering performance is a function of throughput of both these interdependent stages: if one stage is slower than the other, the faster stage will be forced to run more slowly and overall rendering performance will be adversely affected. This relationship is commutative: if the later stage has a greater workload than the earlier stage, the earlier stage will be forced to 'slow down.' For example, a large triangle that covers many screen pixels will incur a very small amount of work in the vertex stage while at the same time incurring a relatively large amount of work in the fragment stage. Rendering performance of a scene consisting of many large-area triangles will be limited by throughput of the fragment stage, which will have relatively more work than the vertex stage. There are two main objectives for this document. First, we introduce a new graphics benchmark, wesBench, which is useful for measuring performance of both stages of the rendering pipeline under varying conditions. Second, we present its methodology for measuring performance and show results of several performance measurement studies aimed at producing better understanding of GPU rendering performance characteristics and limits under varying configurations. First, in Section 2, we explore the 'crossover' point between geometry and rasterization. Second, in Section 3, we explore additional performance characteristics, some of which are ill-documented or undocumented. Lastly, several appendices provide additional material concerning problems with the gfxbench benchmark, and details about the new wesBench graphics benchmark.

  6. NATURAL graphics

    NASA Technical Reports Server (NTRS)

    Jones, R. H.

    1984-01-01

    The hardware and software developments in computer graphics are discussed. Major topics include: system capabilities, hardware design, system compatibility, and software interface with the data base management system.

  7. [Dynamic Pulse Signal Processing and Analyzing in Mobile System].

    PubMed

    Chou, Yongxin; Zhang, Aihua; Ou, Jiqing; Qi, Yusheng

    2015-09-01

    In order to derive a dynamic pulse rate variability (DPRV) signal from the dynamic pulse signal in real time, a method for extracting the DPRV signal was proposed and a portable mobile monitoring system was designed. The system consists of a front end for collecting and wirelessly transmitting the pulse signal, and a mobile terminal. The proposed method is employed to extract the DPRV from the dynamic pulse signal in the mobile terminal, and the DPRV signal is analyzed in real time, both in the time domain and the frequency domain as well as with non-linear methods. The results show that the proposed method can accurately derive the DPRV signal in real time, and the system can be used for processing and analyzing the DPRV signal in real time. PMID:26904868
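
    A minimal version of such an extraction could run as follows: detect pulse peaks, form successive peak-to-peak intervals, and summarize them in the time and frequency domains. The threshold and refractory period below are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def pulse_rate_variability(sig, fs):
    """sig: pulse waveform samples; fs: sampling rate in Hz."""
    thr = sig.mean() + 0.5 * sig.std()         # adaptive peak threshold
    min_gap = int(0.3 * fs)                    # refractory period between beats
    peaks = []
    for i in range(1, len(sig) - 1):
        if sig[i] > thr and sig[i] >= sig[i - 1] and sig[i] > sig[i + 1]:
            if not peaks or i - peaks[-1] >= min_gap:
                peaks.append(i)
    intervals = np.diff(peaks) / fs            # seconds between beats
    sdnn = intervals.std()                     # simple time-domain index
    spectrum = np.abs(np.fft.rfft(intervals - intervals.mean()))**2
    return intervals, sdnn, spectrum
```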

  8. Image processing for navigation on a mobile embedded platform: design of an autonomous mobile robot

    NASA Astrophysics Data System (ADS)

    Loose, Harald; Lemke, Christiane; Papazov, Chavdar

    2006-02-01

    This paper deals with intelligent mobile platforms connected to a camera controlled by a small hardware platform called RCUBE. This platform is able to provide features of a typical actuator-sensor board with various inputs and outputs as well as computing power and image recognition capabilities. Several intelligent autonomous RCUBE devices can be equipped and programmed to participate in the BOSPORUS network. These components form an intelligent network for gathering sensor and image data, sensor data fusion, navigation and control of mobile platforms. The RCUBE platform provides a standalone solution for image processing, which will be explained and presented. It plays a major role for several components in a reference implementation of the BOSPORUS system. On the one hand, intelligent cameras will be positioned in the environment, analyzing the events from a fixed point of view and sharing their perceptions with other components in the system. On the other hand, image processing results will contribute to a reliable navigation of a mobile system, which is crucially important. Fixed landmarks and other objects appropriate for determining the position of a mobile system can be recognized. For navigation, other methods are added, namely GPS calculations and odometry.

  9. Real-time display on SD-OCT using a linear-in-wavenumber spectrometer and a graphics processing unit

    NASA Astrophysics Data System (ADS)

    Watanabe, Yuuki; Itagaki, Toshiki

    2010-02-01

    We demonstrated a real-time display of processed OCT images using a linear-in-wavenumber (linear-k) spectrometer and a graphics processing unit (GPU). We used the linear-k spectrometer, with an optimal combination of a 1200 lines/mm diffraction grating and an F2 equilateral prism in the 840 nm spectral region, to avoid the spectral re-sampling step. The calculations of the FFT (fast Fourier transform) were accelerated by the low cost GPU with many stream processors, which realized highly parallel processing. A display rate of 27.9 frames per second for processed images (2048 FFT size × 1000 lateral A-scans) was achieved in our OCT system using a line scan CCD camera operated at 27.9 kHz.

  10. Graphic Storytelling

    ERIC Educational Resources Information Center

    Thompson, John

    2009-01-01

    Graphic storytelling is a medium that allows students to make and share stories, while developing their art communication skills. American comics today are more varied in genre, approach, and audience than ever before. When considering the impact of Japanese manga on the youth, graphic storytelling emerges as a powerful player in pop culture. In…

  11. Business Graphics

    NASA Technical Reports Server (NTRS)

    1987-01-01

    Genigraphics Corporation's Masterpiece 8770 FilmRecorder is an advanced high resolution system designed to improve and expand a company's in-house graphics production. GRAFTIME/software package was designed to allow office personnel with minimal training to produce professional level graphics for business communications and presentations. Products are no longer being manufactured.

  12. High mobility solution-processed hybrid light emitting transistors

    NASA Astrophysics Data System (ADS)

    Walker, Bright; Ullah, Mujeeb; Chae, Gil Jo; Burn, Paul L.; Cho, Shinuk; Kim, Jin Young; Namdas, Ebinazar B.; Seo, Jung Hwa

    2014-11-01

    We report the design, fabrication, and characterization of high-performance, solution-processed hybrid (inorganic-organic) light emitting transistors (HLETs). The devices employ a high-mobility, solution-processed cadmium sulfide layer as the switching and transport layer, with the conjugated polymer Super Yellow as an emissive material in a non-planar source/drain transistor geometry. We demonstrate HLETs with electron mobilities of up to 19.5 cm²/V s, current on/off ratios of >10⁷, and an external quantum efficiency of 10⁻²% at 2100 cd/m². This combined optical and electrical performance exceeds that reported to date for HLETs. Furthermore, we provide a full analysis of the charge injection, charge transport, and recombination mechanisms of the HLETs. The high brightness coupled with a high on/off ratio and low-cost solution processing makes this type of hybrid device attractive from a manufacturing perspective.

  13. High mobility solution-processed hybrid light emitting transistors

    SciTech Connect

    Walker, Bright; Kim, Jin Young; Ullah, Mujeeb; Burn, Paul L.; Namdas, Ebinazar B.; Chae, Gil Jo; Cho, Shinuk; Seo, Jung Hwa E-mail: seojh@dau.ac.kr

    2014-11-03

    We report the design, fabrication, and characterization of high-performance, solution-processed hybrid (inorganic-organic) light emitting transistors (HLETs). The devices employ a high-mobility, solution-processed cadmium sulfide layer as the switching and transport layer, with the conjugated polymer Super Yellow as an emissive material in a non-planar source/drain transistor geometry. We demonstrate HLETs with electron mobilities of up to 19.5 cm²/V s, current on/off ratios of >10⁷, and an external quantum efficiency of 10⁻²% at 2100 cd/m². This combined optical and electrical performance exceeds that reported to date for HLETs. Furthermore, we provide a full analysis of the charge injection, charge transport, and recombination mechanisms of the HLETs. The high brightness coupled with a high on/off ratio and low-cost solution processing makes this type of hybrid device attractive from a manufacturing perspective.

  14. Adaptive step ODE algorithms for the 3D simulation of electric heart activity with graphics processing units.

    PubMed

    Garcia-Molla, V M; Liberos, A; Vidal, A; Guillem, M S; Millet, J; Gonzalez, A; Martinez-Zaldivar, F J; Climent, A M

    2014-01-01

    In this paper we studied the implementation and performance of adaptive step methods for large systems of ordinary differential equations on graphics processing units, focusing on the simulation of three-dimensional electric cardiac activity. The Rush-Larsen method was applied in all the implemented solvers to improve efficiency. We compared the adaptive methods with the fixed step methods, and we found that the fixed step methods can be faster while the adaptive step methods are better in terms of accuracy and robustness. PMID:24377685
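
    The Rush-Larsen method mentioned above exploits the form dy/dt = (y_inf(V) - y)/tau(V) of cardiac gating equations and advances each gate exactly over a step, which stays stable at far larger time steps than explicit Euler. A minimal sketch:

```python
import numpy as np

def rush_larsen_step(y, y_inf, tau, dt):
    """Exponential integration of a gating variable over one time step."""
    return y_inf + (y - y_inf) * np.exp(-dt / tau)
```

    In a full solver this update is applied per gate per cell at each step, with the adaptive controller choosing dt for the remaining, non-gating state variables.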

  15. An atomic orbital-based formulation of the complete active space self-consistent field method on graphical processing units

    SciTech Connect

    Hohenstein, Edward G.; Luehr, Nathan; Ufimtsev, Ivan S.; Martínez, Todd J.

    2015-06-14

    Despite its importance, state-of-the-art algorithms for performing complete active space self-consistent field (CASSCF) computations have lagged far behind those for single reference methods. We develop an algorithm for the CASSCF orbital optimization that uses sparsity in the atomic orbital (AO) basis set to increase the applicability of CASSCF. Our implementation of this algorithm uses graphical processing units (GPUs) and has allowed us to perform CASSCF computations on molecular systems containing more than one thousand atoms. Additionally, we have implemented analytic gradients of the CASSCF energy; the gradients also benefit from GPU acceleration as well as sparsity in the AO basis.

  16. Towards real-time wavefront sensorless adaptive optics using a graphical processing unit (GPU) in a line scanning system

    NASA Astrophysics Data System (ADS)

    Biss, David P.; Patel, Ankit H.; Ferguson, R. Daniel; Mujat, Mircea; Iftimia, Nicusor; Hammer, Daniel X.

    2011-03-01

    Adaptive optics ophthalmic imaging systems that rely on a standalone wavefront sensor can be costly to build and difficult for non-technical personnel to operate. As an alternative, we present a simplified wavefront sensorless adaptive optics laser scanning ophthalmoscope. This sensorless system is based on deterministic search algorithms that use the image's spatial frequency content as an optimization metric. We implement this algorithm on an NVIDIA video card to take advantage of the graphics processing unit (GPU)'s parallel architecture to reduce algorithm computation times and approach real-time correction.
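
    A deterministic, metric-driven search of this kind can be sketched as follows: score each image by its high-spatial-frequency energy and hill-climb one deformable-mirror mode at a time. Here apply_mode and grab_image are hypothetical hardware hooks, and the mode amplitudes are illustrative.

```python
import numpy as np

def sharpness(img):
    """High-spatial-frequency energy as an image-quality metric."""
    spec = np.abs(np.fft.fftshift(np.fft.fft2(img)))**2
    h, w = img.shape
    yy, xx = np.indices((h, w))
    r = np.hypot(yy - h / 2, xx - w / 2)
    return spec[r > min(h, w) / 8].sum()      # ignore the low-frequency core

def optimize(apply_mode, grab_image, n_modes,
             amps=(-0.2, -0.1, 0.0, 0.1, 0.2)):
    coeffs = np.zeros(n_modes)
    for m in range(n_modes):                  # deterministic per-mode search
        scores = []
        for a in amps:
            apply_mode(m, a)
            scores.append(sharpness(grab_image()))
        coeffs[m] = amps[int(np.argmax(scores))]
        apply_mode(m, coeffs[m])              # keep the best amplitude
    return coeffs
```

    The metric evaluation (a 2D FFT per trial image) dominates the loop, which is the part that benefits from the GPU.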

  17. Stable image acquisition for mobile image processing applications

    NASA Astrophysics Data System (ADS)

    Henning, Kai-Fabian; Fritze, Alexander; Gillich, Eugen; Mönks, Uwe; Lohweg, Volker

    2015-02-01

    Today, mobile devices (smartphones, tablets, etc.) are widespread and of high importance for their users. Their performance as well as versatility increases over time. This leads to the opportunity to use such devices for more specific tasks like image processing in an industrial context. For the analysis of images, requirements like image quality (blur, illumination, etc.) as well as a defined relative position of the object to be inspected are crucial. Since mobile devices are handheld and used in constantly changing environments, the challenge is to fulfill these requirements. We present an approach to overcome the obstacles and stabilize the image capturing process such that image analysis becomes significantly improved on mobile devices. Therefore, image processing methods are combined with sensor fusion concepts. The approach consists of three main parts. First, pose estimation methods are used to guide a user moving the device to a defined position. Second, the sensor data and the pose information are combined for relative motion estimation. Finally, the image capturing process is automated. It is triggered depending on the alignment of the device and the object as well as the image quality that can be achieved under consideration of motion and environmental effects.
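
    The automated trigger can be reduced to a predicate over focus, pose, and motion estimates. The variance-of-Laplacian focus measure below is a common stand-in for the paper's quality checks, and all thresholds are illustrative.

```python
import numpy as np

def variance_of_laplacian(gray):
    """Focus measure: low variance of the Laplacian indicates blur."""
    lap = (np.roll(gray, 1, 0) + np.roll(gray, -1, 0) +
           np.roll(gray, 1, 1) + np.roll(gray, -1, 1) - 4 * gray)
    return lap.var()

def should_capture(gray, pose_err, motion, blur_min=50.0,
                   pose_tol=0.02, motion_tol=0.01):
    return (variance_of_laplacian(gray) > blur_min
            and pose_err < pose_tol           # device aligned with target
            and motion < motion_tol)          # device nearly still
```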

  18. The Metropolis Monte Carlo method with CUDA enabled Graphic Processing Units

    NASA Astrophysics Data System (ADS)

    Hall, Clifford; Ji, Weixiao; Blaisten-Barojas, Estela

    2014-02-01

    We present a CPU-GPU system for runtime acceleration of large molecular simulations using GPU computation and memory swaps. The memory architecture of the GPU can be used both as container for simulation data stored on the graphics card and as floating-point code target, providing an effective means for the manipulation of atomistic or molecular data on the GPU. To fully take advantage of this mechanism, efficient GPU realizations of algorithms used to perform atomistic and molecular simulations are essential. Our system implements a versatile molecular engine, including inter-molecule interactions and orientational variables for performing the Metropolis Monte Carlo (MMC) algorithm, which is one type of Markov chain Monte Carlo. By combining memory objects with floating-point code fragments we have implemented an MMC parallel engine that entirely avoids the communication time of molecular data at runtime. Our runtime acceleration system is a forerunner of a new class of CPU-GPU algorithms exploiting memory concepts combined with threading for avoiding bus bandwidth and communication. The testbed molecular system used here is a condensed phase system of oligopyrrole chains. A benchmark shows a size scaling speedup of 60 for systems with 210,000 pyrrole monomers. Our implementation can easily be combined with MPI to connect in parallel several CPU-GPU duets.
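
    The core of the MMC engine is the acceptance test; the rest of the paper concerns keeping the molecular data resident on the GPU. Below is a minimal sketch of one trial move, with an illustrative rigid translation and a caller-supplied energy function; it shows only the Metropolis logic, not the paper's engine.

```python
import numpy as np

def mmc_move(coords, energy_fn, beta, step=0.1,
             rng=np.random.default_rng()):
    """One Metropolis trial: translate a random molecule, accept or reject."""
    i = rng.integers(len(coords))             # pick a random molecule
    old = coords[i].copy()
    e_old = energy_fn(coords)
    coords[i] += rng.uniform(-step, step, size=old.shape)
    delta = energy_fn(coords) - e_old
    if delta > 0 and rng.random() >= np.exp(-beta * delta):
        coords[i] = old                       # reject: restore position
        return False
    return True                               # accept the move
```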

  19. r-Java: An r-process Code and Graphical User Interface for Heavy-Element Nucleosynthesis

    NASA Astrophysics Data System (ADS)

    Charignon, Camille; Kostka, Mathew; Konin, Nico; Jaikumar, Prashanth; Ouyed, Rachid

    2011-04-01

    We present r-Java, an r-process code for open use that performs r-process nucleosynthesis calculations. Equipped with a simple graphical user interface, r-Java is capable of carrying out nuclear statistical equilibrium (NSE) as well as static and dynamic r-process calculations for a wide range of input parameters. In this introductory paper, we present the motivation and details behind r-Java and results from our static and dynamic simulations. Static simulations are explored for a range of neutron irradiation and temperatures. Dynamic simulations are studied with a parameterized expansion formula. Our code generates the resulting abundance pattern based on a general entropy expression that can be applied to degenerate as well as non-degenerate matter, allowing us to track the rapid density and temperature evolution of the ejecta during the initial stages of ejecta expansion. At present, our calculations are limited to the waiting-point approximation.

  20. The Metropolis Monte Carlo method with CUDA enabled Graphic Processing Units

    SciTech Connect

    Hall, Clifford; Ji, Weixiao; Blaisten-Barojas, Estela

    2014-02-01

    We present a CPU–GPU system for runtime acceleration of large molecular simulations using GPU computation and memory swaps. The memory architecture of the GPU can be used both as container for simulation data stored on the graphics card and as floating-point code target, providing an effective means for the manipulation of atomistic or molecular data on the GPU. To fully take advantage of this mechanism, efficient GPU realizations of algorithms used to perform atomistic and molecular simulations are essential. Our system implements a versatile molecular engine, including inter-molecule interactions and orientational variables for performing the Metropolis Monte Carlo (MMC) algorithm, which is one type of Markov chain Monte Carlo. By combining memory objects with floating-point code fragments we have implemented an MMC parallel engine that entirely avoids the communication time of molecular data at runtime. Our runtime acceleration system is a forerunner of a new class of CPU–GPU algorithms exploiting memory concepts combined with threading for avoiding bus bandwidth and communication. The testbed molecular system used here is a condensed phase system of oligopyrrole chains. A benchmark shows a size scaling speedup of 60 for systems with 210,000 pyrrole monomers. Our implementation can easily be combined with MPI to connect in parallel several CPU–GPU duets. -- Highlights: •We parallelize the Metropolis Monte Carlo (MMC) algorithm on one CPU—GPU duet. •The Adaptive Tempering Monte Carlo employs MMC and profits from this CPU—GPU implementation. •Our benchmark shows a size scaling-up speedup of 62 for systems with 225,000 particles. •The testbed involves a polymeric system of oligopyrroles in the condensed phase. •The CPU—GPU parallelization includes dipole—dipole and Mie—Jones classic potentials.

  1. Parallelized multi–graphics processing unit framework for high-speed Gabor-domain optical coherence microscopy

    PubMed Central

    Tankam, Patrice; Santhanam, Anand P.; Lee, Kye-Sung; Won, Jungeun; Canavesi, Cristina; Rolland, Jannick P.

    2014-01-01

    Gabor-domain optical coherence microscopy (GD-OCM) is a volumetric high-resolution technique capable of acquiring three-dimensional (3-D) skin images with histological resolution. Real-time image processing is needed to enable GD-OCM imaging in a clinical setting. We present a parallelized and scalable multi-graphics processing unit (GPU) computing framework for real-time GD-OCM image processing. A parallelized control mechanism was developed to individually assign computation tasks to each of the GPUs. For each GPU, the optimal number of amplitude-scans (A-scans) to be processed in parallel was selected to maximize GPU memory usage and core throughput. We investigated five computing architectures for computational speed-up in processing 1000×1000 A-scans. The proposed parallelized multi-GPU computing framework enables processing at a computational speed faster than the GD-OCM image acquisition, thereby facilitating high-speed GD-OCM imaging in a clinical setting. Using two parallelized GPUs, the image processing of a 1×1×0.6 mm³ skin sample was performed in about 13 s, and the performance was benchmarked at 6.5 s with four GPUs. This work thus demonstrates that 3-D GD-OCM data may be displayed in real-time to the examiner using parallelized GPU processing. PMID:24695868

  2. Enabling customer self service through image processing on mobile devices

    NASA Astrophysics Data System (ADS)

    Kliche, Ingmar; Hellmann, Sascha; Kreutel, Jörn

    2013-03-01

    Our paper will outline the results of a research project that employs image processing for the automatic diagnosis of technical devices whose internal state is communicated through visual displays. In particular, we developed a method for detecting exceptional states of retail wireless routers, analysing the state and blinking behaviour of the LEDs that make up most routers' user interface. The method was made configurable by means of abstracting away from a particular device's display properties, thus being able to analyse a whole range of different devices whose displays are covered by our abstraction. The method of analysis and its configuration mechanism were implemented as a native mobile application for the Android Platform. It employs the local camera of mobile devices for capturing a router's state, and uses overlaid visual hints for guiding the user toward that perspective from where an analysis is possible.
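
    The LED analysis can be sketched as averaging brightness inside configured LED regions over a short frame sequence, thresholding to on/off, and classifying steady versus blinking; the region coordinates would come from the per-device configuration the abstraction layer provides, and the threshold is illustrative.

```python
import numpy as np

def led_states(frames, regions, on_thresh=0.5):
    """frames: (t, h, w) grayscale in [0, 1]; regions: (y0, y1, x0, x1)."""
    states = []
    for (y0, y1, x0, x1) in regions:
        series = frames[:, y0:y1, x0:x1].mean(axis=(1, 2)) > on_thresh
        if series.all():
            states.append("on")
        elif not series.any():
            states.append("off")
        else:
            states.append("blinking")         # toggles within the window
    return states
```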

  3. Large scale neural circuit mapping data analysis accelerated with the graphical processing unit (GPU)

    PubMed Central

    Shi, Yulin; Veidenbaum, Alexander V.; Nicolau, Alex; Xu, Xiangmin

    2014-01-01

    Background: Modern neuroscience research demands computing power. Neural circuit mapping studies such as those using laser scanning photostimulation (LSPS) produce large amounts of data and require intensive computation for post-hoc processing and analysis. New Method: Here we report on the design and implementation of a cost-effective desktop computer system for accelerated experimental data processing with recent GPU computing technology. A new version of Matlab software with GPU enabled functions is used to develop programs that run on Nvidia GPUs to harness their parallel computing power. Results: We evaluated both the central processing unit (CPU) and GPU-enabled computational performance of our system in benchmark testing and practical applications. The experimental results show that the GPU-CPU co-processing of simulated data and actual LSPS experimental data clearly outperformed the multi-core CPU with up to a 22× speedup, depending on computational tasks. Further, we present a comparison of numerical accuracy between GPU and CPU computation to verify the precision of GPU computation. In addition, we show how GPUs can be effectively adapted to improve the performance of commercial image processing software such as Adobe Photoshop. Comparison with Existing Method(s): To the best of our knowledge, this is the first demonstration of GPU application in neural circuit mapping and electrophysiology-based data processing. Conclusions: Together, GPU enabled computation enhances our ability to process large-scale data sets derived from neural circuit mapping studies, allowing for increased processing speeds while retaining data precision. PMID:25277633

  4. Distributed cooperating processes in a mobile robot control system

    NASA Technical Reports Server (NTRS)

    Skillman, Thomas L., Jr.

    1988-01-01

    A mobile inspection robot has been proposed for the NASA Space Station. It will be a free flying autonomous vehicle that will leave a berthing unit to accomplish a variety of inspection tasks around the Space Station, and then return to its berth to recharge, refuel, and transfer information. The Flying Eye robot will receive voice communication to change its attitude, move at a constant velocity, and move to a predefined location along a self generated path. This mobile robot control system requires integration of traditional command and control techniques with a number of AI technologies. Speech recognition, natural language understanding, task and path planning, sensory abstraction and pattern recognition are all required for successful implementation. The interface between the traditional numeric control techniques and the symbolic processing to the AI technologies must be developed, and a distributed computing approach will be needed to meet the real time computing requirements. To study the integration of the elements of this project, a novel mobile robot control architecture and simulation based on the blackboard architecture was developed. The control system operation and structure is discussed.

  5. Real-Space Density Functional Theory on Graphical Processing Units: Computational Approach and Comparison to Gaussian Basis Set Methods.

    PubMed

    Andrade, Xavier; Aspuru-Guzik, Alán

    2013-10-01

    We discuss the application of graphical processing units (GPUs) to accelerate real-space density functional theory (DFT) calculations. To make our implementation efficient, we have developed a scheme to expose the data parallelism available in the DFT approach; this is applied to the different procedures required for a real-space DFT calculation. We present results for current-generation GPUs from AMD and Nvidia, which show that our scheme, implemented in the free code Octopus, can reach a sustained performance of up to 90 GFlops for a single GPU, representing a significant speed-up when compared to the CPU version of the code. Moreover, for some systems, our implementation can outperform a GPU Gaussian basis set code, showing that the real-space approach is a competitive alternative for DFT simulations on GPUs. PMID:26589153
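
    The data parallelism exposed by the real-space approach comes largely from grid operations such as the finite-difference kinetic-energy stencil, which touch every grid point independently. A minimal CuPy sketch of such a stencil follows; the grid size, spacing, and periodic boundaries are illustrative assumptions, and the actual kernels in Octopus are far more sophisticated.

        import cupy as cp  # assumes a CUDA-capable GPU; numpy works identically on CPU

        def laplacian(f, h):
            """Second-order finite-difference Laplacian on a periodic 3-D grid."""
            out = -6.0 * f
            for axis in range(3):
                out += cp.roll(f, 1, axis=axis) + cp.roll(f, -1, axis=axis)
            return out / h**2

        orbital = cp.random.rand(64, 64, 64)        # illustrative grid function
        kinetic = -0.5 * laplacian(orbital, h=0.3)  # kinetic-energy operator applied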

  6. Robot graphic simulation testbed

    NASA Technical Reports Server (NTRS)

    Cook, George E.; Sztipanovits, Janos; Biegl, Csaba; Karsai, Gabor; Springfield, James F.

    1991-01-01

    The objective of this research was twofold. First, the basic capabilities of ROBOSIM (graphical simulation system) were improved and extended by taking advantage of advanced graphic workstation technology and artificial intelligence programming techniques. Second, the scope of the graphic simulation testbed was extended to include general problems of Space Station automation. Hardware support for 3-D graphics and high processing performance make high resolution solid modeling, collision detection, and simulation of structural dynamics computationally feasible. The Space Station is a complex system with many interacting subsystems. Design and testing of automation concepts demand modeling of the affected processes, their interactions, and that of the proposed control systems. The automation testbed was designed to facilitate studies in Space Station automation concepts.

  7. Mobil uses two-layer coating process on replacement pipe

    SciTech Connect

    Not Available

    1991-03-01

    Mobil Oil's West Coast Pipe Line, as part of an ongoing program, has replaced sections of its crude oil pipe line that crosses Southern California's San Joaquin Valley. The significant aspect of the replacement project was the use of a new two-part coating process that makes cathodic protection more effective while not deteriorating in service. Mobil's crude line extends from the company's San Joaquin Valley oil field in Kern County to the Torrance, Calif., refinery on the south side of Los Angeles. It crosses a variety of terrain including desert, foothills, and urban development. Crude oil from the San Joaquin Valley is heavy and requires heating for efficient flow; normal operating temperature is about 180°F. Due to moisture in the soil surrounding the hot line, the risk of corrosion is constant. Additionally, soil stress on such a line extending through the California hills inflicts damage on the protective coating. Under these conditions, coatings can soften, bake out, and eventually become brittle; the ultimate result is separation from the pipe. The coating system employs a two-part process: each of the two coatings is tailored to the other in a patented process, forming a chemical bond between the layers. This enhances the pipe protection both mechanically and electrically.

  8. Programmer's Guide for FFORM. Physical Processes in Terrestrial and Aquatic Ecosystems, Computer Programs and Graphics Capabilities.

    ERIC Educational Resources Information Center

    Anderson, Lougenia; Gales, Larry

    This module is part of a series designed to be used by life science students for instruction in the application of physical theory to ecosystem operation. Most modules contain computer programs which are built around a particular application of a physical process. FFORM is a portable format-free input subroutine package written in ANSI Fortran IV…

  9. Effects of Graphic Organizers on Student Achievement in the Writing Process

    ERIC Educational Resources Information Center

    Brown, Marjorie

    2011-01-01

    Writing at the high school level requires higher cognitive and literacy skills. Educators must decide the strategies best suited for the varying skills of each process. Compounding this issue is the need to instruct students with learning disabilities. Writing for students with learning disabilities is a struggle at minimum; teachers have to find…

  10. Graphic Novels, Web Comics, and Creator Blogs: Examining Product and Process

    ERIC Educational Resources Information Center

    Carter, James Bucky

    2011-01-01

    Young adult literature (YAL) of the late 20th and early 21st century is exploring hybrid forms with growing regularity by embracing textual conventions from sequential art, video games, film, and more. As well, Web-based technologies have given those who consume YAL more immediate access to authors, their metacognitive creative processes, and…

  11. Accelerating the performance of a novel meshless method based on collocation with radial basis functions by employing a graphical processing unit as a parallel coprocessor

    NASA Astrophysics Data System (ADS)

    Owusu-Banson, Derek

    In recent times, a variety of industries, applications, and numerical methods, including the meshless method, have enjoyed a great deal of success by utilizing the graphical processing unit (GPU) as a parallel coprocessor. The benefits often include performance improvements over previous implementations. Furthermore, applications running on graphics processors enjoy superior performance per dollar and performance per watt compared to implementations built exclusively on traditional central processing technologies. The GPU was originally designed for graphics acceleration, but the modern GPU, known as the General Purpose Graphical Processing Unit (GPGPU), can be used for scientific and engineering calculations. The GPGPU consists of a massively parallel array of integer and floating-point processors; there are typically hundreds of processors per graphics card, with dedicated high-speed memory. This work describes an application written by the author, titled GaussianRBF, to show the implementation and results of a novel meshless method that incorporates collocation with the Gaussian radial basis function by utilizing the GPU as a parallel co-processor. Key phases of the proposed meshless method have been executed on the GPU using the NVIDIA CUDA software development kit. In particular, the matrix fill and solution phases have been carried out on the GPU, along with some post-processing. This approach resulted in a decreased processing time compared to a similar algorithm implemented on the CPU while maintaining the same accuracy.
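
    The two GPU-accelerated phases named above have simple serial counterparts: filling the dense collocation matrix of Gaussian RBF values and solving the resulting linear system. A NumPy sketch of those phases is given below as a CPU reference; the shape parameter, node count, and test field are illustrative, not values from the thesis.

        import numpy as np

        def gaussian_rbf_weights(nodes, values, eps=2.0):
            """Matrix-fill phase: A[i, j] = exp(-(eps*r_ij)^2); then solve for weights."""
            r = np.linalg.norm(nodes[:, None, :] - nodes[None, :, :], axis=-1)
            A = np.exp(-(eps * r) ** 2)          # dense collocation matrix
            return np.linalg.solve(A, values)    # solution phase

        nodes = np.random.rand(200, 2)                   # scattered nodes, no mesh
        f = np.sin(nodes[:, 0]) * np.cos(nodes[:, 1])    # field sampled at the nodes
        w = gaussian_rbf_weights(nodes, f)               # weights reproduce f at nodes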

  12. The LHEA PDP 11/70 graphics processing facility users guide

    NASA Technical Reports Server (NTRS)

    1978-01-01

    A compilation of the information necessary to allow the inexperienced user to program on the PDP 11/70. Information regarding the use of editing and file manipulation utilities, as well as operational procedures, is included. The inexperienced user is taken through the process of creating, editing, compiling, task building, and debugging his/her FORTRAN program. Documentation on additional software is also included.

  13. Perception in statistical graphics

    NASA Astrophysics Data System (ADS)

    VanderPlas, Susan Ruth

    There has been quite a bit of research on statistical graphics and visualization, generally focused on new types of graphics, new software to create graphics, interactivity, and usability studies. Our ability to interpret and use statistical graphics hinges on the interface between the graph itself and the brain that perceives and interprets it, yet there is substantially less research than is needed on the interplay between graph, eye, brain, and mind to understand the nature of these relationships. The goal of the work presented here is to further explore the interplay between a static graph, the translation of that graph from paper to mental representation (the journey from eye to brain), and the mental processes that operate on that graph once it is transferred into memory (mind). Understanding the perception of statistical graphics should allow researchers to create more effective graphs that produce fewer distortions and viewer errors while reducing the cognitive load necessary to understand the information presented in the graph. Taken together, these experiments should lay a foundation for exploring the perception of statistical graphics. There has been considerable research into the accuracy of the numerical judgments viewers make from graphs, and these studies are useful, but it is more effective to understand how errors in these judgments occur so that the root cause of the error can be addressed directly. Understanding how visual reasoning relates to the ability to make judgments from graphs allows us to tailor graphics to particular target audiences. In addition, understanding the hierarchy of salient features in statistical graphics allows us to clearly communicate the important message from data or statistical models by constructing graphics designed specifically for the perceptual system.

  14. Conceptual Learning with Multiple Graphical Representations: Intelligent Tutoring Systems Support for Sense-Making and Fluency-Building Processes

    ERIC Educational Resources Information Center

    Rau, Martina A.

    2013-01-01

    Most learning environments in the STEM disciplines use multiple graphical representations along with textual descriptions and symbolic representations. Multiple graphical representations are powerful learning tools because they can emphasize complementary aspects of complex learning contents. However, to benefit from multiple graphical…

  15. Graphical expression of thermodynamic characteristics of absorption process in ammonia-water system

    NASA Astrophysics Data System (ADS)

    Pospíšil, Jiří; Fortelný, Zdeněk

    2012-04-01

    Adiabatic sorption is a very interesting phenomenon that occurs when vapor of a refrigerant is in contact with an unsaturated liquid absorbent-refrigerant mixture and no heat is exchanged between the system and the environment. This contribution introduces new auxiliary lines that enable correct determination of the position of the adiabatic sorption process in the p-T-x diagram of the ammonia-water system. The presented auxiliary lines were obtained from common functions for fast calculation of water-ammonia system properties. Designers of absorption cycles often utilize p-T-x diagrams of working mixtures for the first draft of new absorption cycles. The p-T-x diagrams enable fast and correct determination of the saturated states of liquid (and gaseous) mixtures of refrigerants and absorbents. However, the working mixture is not always in a saturated state during a real working cycle. If the pressure and temperature of an unsaturated mixture are known, its exact position can also be determined in the p-T-x diagram.

  16. r-Java: an r-process code and graphical user interface for heavy-element nucleosynthesis

    NASA Astrophysics Data System (ADS)

    Charignon, C.; Kostka, M.; Koning, N.; Jaikumar, P.; Ouyed, R.

    2011-07-01

    We present r-Java, an r-process code for open use that performs r-process nucleosynthesis calculations. Equipped with a simple graphical user interface, r-Java is capable of carrying out nuclear statistical equilibrium (NSE) calculations as well as static and dynamic r-process calculations for a wide range of input parameters. In this introductory paper, we present the motivation and details behind r-Java and results from our static and dynamic simulations. Static simulations are explored for a range of neutron irradiations and temperatures. Dynamic simulations are studied with a parameterized expansion formula. Our code generates the resulting abundance pattern based on a general entropy expression that can be applied to both degenerate and non-degenerate matter, allowing us to track the rapid density and temperature evolution of the ejecta during the initial stages of ejecta expansion. At present, our calculations are limited to the waiting-point approximation. We encourage the nuclear astrophysics community to provide feedback on the code and related documentation, which is available for download from the website of the Quark-Nova Project: http://quarknova.ucalgary.ca/.

  17. Genetic algorithm supported by graphical processing unit improves the exploration of effective connectivity in functional brain imaging

    PubMed Central

    Chan, Lawrence Wing Chi; Pang, Bin; Shyu, Chi-Ren; Chan, Tao; Khong, Pek-Lan

    2015-01-01

    Brain regions of human subjects exhibit certain levels of associated activation upon specific environmental stimuli. Functional Magnetic Resonance Imaging (fMRI) detects regional signals, based on which we can infer the direct or indirect neuronal connectivity between the regions. Structural Equation Modeling (SEM) is an appropriate mathematical approach for analyzing the effective connectivity using fMRI data. A maximum likelihood (ML) discrepancy function is minimized against some constrained coefficients of a path model. The minimization is an iterative process, and the computing time grows very long because the number of iterations increases geometrically with the number of path coefficients. On a regular quad-core Central Processing Unit (CPU) platform, up to 3 months is required for the iterations as the model grows from 0 to 30 path coefficients. This study demonstrates the application of a Graphical Processing Unit (GPU) with a parallel Genetic Algorithm (GA) that replaces the Powell minimization in the standard program code of the analysis software package. In the same example, the GA on the GPU reduced the duration to 20 h and provided a more accurate solution than the standard program code on the CPU. PMID:25999846
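
    The GA's appeal on a GPU is that an entire population of candidate path-coefficient vectors can be evaluated in one vectorized pass. The sketch below shows that pattern with a deliberately simplified quadratic stand-in for the SEM ML discrepancy function; the population size, operators, and all names are illustrative assumptions, and swapping numpy for cupy would move the arithmetic to the GPU.

        import numpy as np   # swapping in cupy would run the arithmetic on the GPU

        rng = np.random.default_rng(0)

        def discrepancy(pop, target):
            """Simplified quadratic stand-in for the SEM ML discrepancy function."""
            return ((pop - target) ** 2).sum(axis=1)     # one value per candidate

        target = rng.normal(size=30)                     # 30 "path coefficients"
        pop = rng.normal(size=(512, 30))                 # whole population at once
        for _ in range(200):
            fit = discrepancy(pop, target)               # vectorized evaluation
            parents = pop[np.argsort(fit)[:256]]         # selection
            children = (parents + parents[::-1]) / 2     # midpoint crossover
            children += rng.normal(scale=0.05, size=children.shape)  # mutation
            pop = np.vstack([parents, children])
        best = pop[np.argmin(discrepancy(pop, target))]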

  18. Optimization of Parallel Legendre Transform using Graphics Processing Unit (GPU) for a Geodynamo Code

    NASA Astrophysics Data System (ADS)

    Lokavarapu, H. V.; Matsui, H.

    2015-12-01

    Convection and magnetic field of the Earth's outer core are expected to have vast length scales. To resolve these flows, geodynamo simulations using the spherical harmonic transform (SHT) require high performance computing; a significant portion of the execution time is spent on the Legendre transform. Calypso is a geodynamo code designed to model magnetohydrodynamics of a Boussinesq fluid in a rotating spherical shell, such as the outer core of the Earth. The code has been shown to scale well on computer clusters capable of computing on the order of 10⁵ cores using Message Passing Interface (MPI) and Open Multi-Processing (OpenMP) parallelization for CPUs. To optimize further, we investigate three different algorithms for the SHT using GPUs. The first is to precompute the Legendre polynomials on the CPU before executing the SHT on the GPU within the time integration loop. In the second approach, both the Legendre polynomials and the SHT are computed on the GPU. In the third approach, we partition the radial grid for the forward transform and the harmonic order for the backward transform between the CPU and GPU; the partitioned work is then computed simultaneously in the time integration loop. We examine the trade-offs between space and time, memory bandwidth, and GPU computation on Maverick, a Texas Advanced Computing Center (TACC) supercomputer. We have observed improved performance using a GPU-enabled Legendre transform. Furthermore, we will compare and contrast the different algorithms in the context of GPUs.
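
    At a fixed harmonic order m, the backward Legendre transform reduces to a dense matrix-vector product between precomputed associated Legendre functions and spectral coefficients, which is why it maps naturally onto GPU matrix units. The NumPy/SciPy sketch below shows that structure only; the truncation, grid size, and order are illustrative assumptions, and Calypso's implementation differs.

        import numpy as np
        from scipy.special import lpmv  # associated Legendre functions P_l^m

        ntheta, lmax, m = 96, 63, 4
        x = np.polynomial.legendre.leggauss(ntheta)[0]   # Gauss nodes, x = cos(theta)
        # Precompute P_l^m(x) for l = m..lmax; one column per degree l.
        P = np.stack([lpmv(m, l, x) for l in range(m, lmax + 1)], axis=1)

        spec = np.random.rand(P.shape[1])   # spectral coefficients at this order m
        grid = P @ spec                     # backward transform: one dense matmul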

  19. Real-time 4D signal processing and visualization using graphics processing unit on a regular nonlinear-k Fourier-domain OCT system.

    PubMed

    Zhang, Kang; Kang, Jin U

    2010-05-24

    We realized graphics processing unit (GPU) based real-time 4D (3D+time) signal processing and visualization on a regular Fourier-domain optical coherence tomography (FD-OCT) system with a nonlinear k-space spectrometer. An ultra-high speed linear spline interpolation (LSI) method for lambda-to-k spectral re-sampling is implemented in the GPU architecture, which gives average interpolation speeds of >3,000,000 lines/s for 1024-pixel OCT (1024-OCT) and >1,400,000 lines/s for 2048-pixel OCT (2048-OCT). The complete FD-OCT signal processing, including lambda-to-k spectral re-sampling, fast Fourier transform (FFT), and post-FFT processing, has been implemented on a GPU. The maximum complete A-scan processing speeds are found to be 680,000 lines/s for 1024-OCT and 320,000 lines/s for 2048-OCT, which corresponds to roughly a 1 GByte/s processing bandwidth. In our experiment, a 2048-pixel CMOS camera running at up to 70 kHz is used as the acquisition device, so the actual imaging speed is camera-limited to 128,000 lines/s for 1024-OCT or 70,000 lines/s for 2048-OCT. 3D data sets are continuously acquired in real time in 1024-OCT mode and immediately processed and visualized at up to 10 volumes/second (12,500 A-scans/volume) by either en face slice extraction or ray-casting based volume rendering from a 3D texture mapped in graphics memory. For standard FD-OCT systems, a GPU is the only additional hardware needed to realize this improvement and no optical modification is needed. This technique is highly cost-effective and can be easily integrated into most ultrahigh speed FD-OCT systems to overcome the 3D data processing and visualization bottlenecks. PMID:20589038
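
    The lambda-to-k step is ordinary linear interpolation of each spectrum onto a uniform wavenumber grid, followed by an FFT. A NumPy sketch of one A-scan follows; the wavelength range is an illustrative assumption, and the paper's version runs this per-line in CUDA rather than NumPy.

        import numpy as np

        n = 1024
        lam = np.linspace(800e-9, 880e-9, n)       # illustrative spectrometer grid
        k = 2 * np.pi / lam                        # wavenumbers: nonlinear, decreasing
        k_uniform = np.linspace(k.min(), k.max(), n)

        def a_scan(spectrum):
            """lambda-to-k linear re-sampling, then FFT to a depth profile."""
            resampled = np.interp(k_uniform, k[::-1], spectrum[::-1])  # ascending k
            return np.abs(np.fft.fft(resampled))[: n // 2]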

  20. Accelerating Monte Carlo simulations of photon transport in a voxelized geometry using a massively parallel graphics processing unit

    SciTech Connect

    Badal, Andreu; Badano, Aldo

    2009-11-15

    Purpose: Monte Carlo simulations of radiation transport are computationally intensive and may require long computing times. The authors introduce a new paradigm for the acceleration of Monte Carlo simulations: the use of a graphics processing unit (GPU) as the main computing device instead of a central processing unit (CPU). Methods: A GPU-based Monte Carlo code that simulates photon transport in a voxelized geometry with the accurate physics models from PENELOPE has been developed using the CUDA programming model (NVIDIA Corporation, Santa Clara, CA). Results: An outline of the new code and a sample x-ray imaging simulation with an anthropomorphic phantom are presented. A remarkable 27-fold speed-up factor was obtained using a GPU compared to a single-core CPU. Conclusions: The reported results show that GPUs are currently a good alternative to CPUs for the simulation of radiation transport. Since the performance of GPUs is currently increasing at a faster pace than that of CPUs, the advantages of GPU-based software are likely to be more pronounced in the future.
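
    The core of such a photon-transport loop, one photon per GPU thread in the paper, is sampling a free path from the attenuation coefficient, moving the photon, and deciding between absorption and scattering. The vectorized NumPy sketch below illustrates only that skeleton, with made-up homogeneous coefficients; PENELOPE's physics models are far richer.

        import numpy as np

        rng = np.random.default_rng(1)
        n = 100_000                    # one photon per GPU thread in the paper
        mu = 0.5                       # made-up attenuation coefficient (1/cm)
        p_absorb = 0.2                 # made-up probability an event is absorption

        pos = np.zeros((n, 3))
        dirs = np.tile([0.0, 0.0, 1.0], (n, 1))
        alive = np.ones(n, dtype=bool)
        for _ in range(50):                              # transport events
            step = -np.log(rng.random(n)) / mu           # sampled free path
            pos[alive] += dirs[alive] * step[alive, None]
            alive &= rng.random(n) >= p_absorb           # absorbed photons terminate
            phi = rng.uniform(0, 2 * np.pi, n)           # survivors scatter isotropically
            ct = rng.uniform(-1, 1, n)
            st = np.sqrt(1 - ct**2)
            dirs[alive] = np.stack([st * np.cos(phi), st * np.sin(phi), ct], axis=1)[alive]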

  1. Grid-based algorithm to search critical points, in the electron density, accelerated by graphics processing units.

    PubMed

    Hernández-Esparza, Raymundo; Mejía-Chica, Sol-Milena; Zapata-Escobar, Andy D; Guevara-García, Alfredo; Martínez-Melchor, Apolinar; Hernández-Pérez, Julio-M; Vargas, Rubicelia; Garza, Jorge

    2014-12-01

    Using a grid-based method to search for the critical points in the electron density, we show how to accelerate such a method with graphics processing units (GPUs). When the GPU implementation is contrasted with the CPU one, we found a large difference in the time elapsed by the two implementations: the smallest time is observed when GPUs are used. We tested two GPUs, one intended for video games and the other for high-performance computing (HPC). On the CPU side, two processors were tested, one used in common personal computers and the other in HPC, both of the latest generation. Although our parallel algorithm scales quite well on CPUs, the same implementation on GPUs runs around 10× faster than 16 CPUs, for any of the tested GPUs and CPUs. We found that the GPU intended for video games can be used for our application without any problem, delivering remarkable performance; in fact, this GPU competes with the HPC GPU, in particular when single precision is used. PMID:25345784
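
    The grid-based search itself can be sketched simply: evaluate the density gradient on the grid and flag cells where all three components change sign, which are the candidate critical points to refine. The NumPy sketch below uses a toy two-Gaussian density; the grid size and density model are illustrative assumptions.

        import numpy as np

        g = np.linspace(-2.0, 2.0, 80)
        x, y, z = np.meshgrid(g, g, g, indexing="ij")
        # Toy density: two Gaussians, giving maxima plus a saddle point in between.
        rho = np.exp(-(x**2 + y**2 + z**2)) + np.exp(-((x - 1)**2 + y**2 + z**2))

        gx, gy, gz = np.gradient(rho, g, g, g)

        def zero_crossing(grad, axis):
            """True where a gradient component changes sign along its own axis."""
            flip = np.diff(np.sign(grad), axis=axis) != 0
            pad = [(0, 1) if a == axis else (0, 0) for a in range(3)]
            return np.pad(flip, pad)

        candidates = zero_crossing(gx, 0) & zero_crossing(gy, 1) & zero_crossing(gz, 2)
        print(np.argwhere(candidates))   # voxels hosting candidate critical points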

  2. Performance evaluation for volumetric segmentation of multiple sclerosis lesions using MATLAB and computing engine in the graphical processing unit (GPU)

    NASA Astrophysics Data System (ADS)

    Le, Anh H.; Park, Young W.; Ma, Kevin; Jacobs, Colin; Liu, Brent J.

    2010-03-01

    Multiple Sclerosis (MS) is a progressive neurological disease affecting myelin pathways in the brain. Multiple lesions in the white matter can cause paralysis and severe motor disabilities of the affected patient. To address the inconsistency and user-dependency of manual lesion measurement in MRI, we have proposed a 3-D automated lesion quantification algorithm to enable objective and efficient lesion volume tracking. The computer-aided detection (CAD) of MS, written in MATLAB, utilizes the K-Nearest Neighbors (KNN) method to compute the probability of lesions on a per-voxel basis. Despite the highly optimized image processing algorithms used in CAD development, integrating and evaluating MS CAD in the clinical workflow is technically challenging because of the high computation rates and memory bandwidth demanded by the recursive nature of the algorithm. In this paper, we present the development and evaluation of a computing engine in the graphical processing unit (GPU) with MATLAB for segmentation of MS lesions. The paper investigates the utilization of a high-end GPU for parallel computing of KNN in the MATLAB environment to improve algorithm performance. The integration is accomplished using NVIDIA's CUDA developmental toolkit for MATLAB. The results of this study will validate the practicality and effectiveness of the prototype MS CAD in a clinical setting. The GPU method may allow MS CAD to integrate rapidly into an electronic patient record or any disease-centric health care system.
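
    The per-voxel KNN step is straightforward to express; the sketch below uses scikit-learn on synthetic voxel features as a stand-in for the MATLAB/GPU implementation. The feature choice, neighbor count, and toy labeling rule are all illustrative assumptions.

        import numpy as np
        from sklearn.neighbors import KNeighborsClassifier

        rng = np.random.default_rng(2)
        # Toy training set: one row per voxel, features = (intensity, x, y, z).
        X_train = rng.normal(size=(5000, 4))
        y_train = (X_train[:, 0] > 1.0).astype(int)      # 1 = lesion (toy rule)

        knn = KNeighborsClassifier(n_neighbors=15).fit(X_train, y_train)
        X_scan = rng.normal(size=(100_000, 4))           # every voxel of a new scan
        p_lesion = knn.predict_proba(X_scan)[:, 1]       # per-voxel lesion probability
        lesion_mask = p_lesion > 0.5                     # binary segmentation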

  3. Graphics processing unit-assisted real-time three-dimensional measurement using speckle-embedded fringe.

    PubMed

    Feng, Shijie; Chen, Qian; Zuo, Chao

    2015-08-01

    This paper presents a novel two-frame fringe projection technique for real-time, accurate, and unambiguous three-dimensional (3D) measurement. One of the frames is a digital speckle pattern, and the other is a composite image generated by fusing that speckle image with sinusoidal fringes. The sinusoidal component is used to obtain a wrapped phase map by Fourier transform profilometry, and the speckle image helps determine the fringe order for phase unwrapping. Compared with traditional methods, the proposed pattern scheme enables measurements of discontinuous surfaces with only two frames, greatly reducing the number of required patterns and thus the sensitivity to movement. This merit makes the method very suitable for inspecting dynamic scenes. Moreover, in our experiments its measurement accuracy is close to that of the phase-shifting method. To process data in real time, a Compute Unified Device Architecture-enabled graphics processing unit is adopted to accelerate some time-consuming computations. With our system, measurements can be performed at 21 frames per second with a resolution of 307,000 points per frame. PMID:26368103
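
    Fourier transform profilometry recovers the wrapped phase by isolating the fringe carrier in the spectrum and taking the angle of the resulting analytic signal. A one-scanline NumPy sketch follows; the carrier frequency, band width, and synthetic fringe are illustrative assumptions, and the speckle-based fringe-order step is omitted.

        import numpy as np

        def wrapped_phase(row, f0, halfwidth):
            """Band-pass the +carrier and return the wrapped phase of one image row."""
            spec = np.fft.fft(row - row.mean())
            band = np.zeros_like(spec)
            band[f0 - halfwidth : f0 + halfwidth] = spec[f0 - halfwidth : f0 + halfwidth]
            return np.angle(np.fft.ifft(band))   # wrapped to (-pi, pi]

        n = 1024
        t = np.arange(n) / n
        row = 128 + 100 * np.cos(2 * np.pi * 50 * t + 0.3 * np.sin(2 * np.pi * t))
        phi = wrapped_phase(row, f0=50, halfwidth=20)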

  4. Graphics processing unit accelerated three-dimensional model for the simulation of pulsed low-temperature plasmas

    SciTech Connect

    Fierro, Andrew; Dickens, James; Neuber, Andreas

    2014-12-15

    A 3-dimensional particle-in-cell/Monte Carlo collision simulation that is fully implemented on a graphics processing unit (GPU) is described and used to determine low-temperature plasma characteristics at high reduced electric field, E/n, in nitrogen gas. Details of the implementation on the GPU using the NVIDIA Compute Unified Device Architecture framework are discussed with respect to efficient code execution. The software is capable of tracking around 10 × 10⁶ particles with dynamic weighting and a total mesh size larger than 10⁸ cells. Verification of the simulation is performed by comparing the electron energy distribution function and plasma transport parameters to known Boltzmann equation (BE) solvers. Under the assumption of a uniform electric field and neglecting the build-up of positive ion space charge, the simulation agrees well with the BE solvers. The model is utilized to calculate plasma characteristics of a pulsed, parallel plate discharge. A photoionization model provides the simulation with additional electrons after the initial seeded electron density has drifted towards the anode. The performance of the GPU implementation is compared with a CPU implementation, and a speed-up factor of 13 is obtained for a 3D relaxation Poisson solver. Furthermore, a factor of 60 speed-up is realized for the parallelization of the electron processes.

  5. Solution of the direct problem in turbid media with inclusions using Monte Carlo simulations implemented in graphics processing units: new criterion for processing transmittance data

    NASA Astrophysics Data System (ADS)

    Carbone, Nicolas; di Rocco, Hector; Iriarte, Daniela I.; Pomarico, Juan A.

    2010-05-01

    The study of light propagation in diffusive media requires solving the radiative transfer equation or, eventually, the diffusion approximation. Except for some cases involving simple geometries, the problem with immersed inclusions has not been solved. Monte Carlo (MC) calculations have become a gold standard for simulating photon migration in turbid media, although they have the drawback of large processing times. The purpose of this work is two-fold. First, we introduce a new processing criterion to retrieve information about the location and shape of absorbing inclusions, based on normalization to the background intensity when no inhomogeneities are present. Second, we demonstrate the feasibility of including inhomogeneities in MC simulations implemented in graphics processing units, achieving large acceleration factors (~10³) and thus providing an important tool for iteratively solving the forward problem to retrieve the optical properties of the inclusion. Results using a cw source are compared with MC outcomes, showing very good agreement.

  6. Computer graphics and the graphic artist

    NASA Technical Reports Server (NTRS)

    Taylor, N. L.; Fedors, E. G.; Pinelli, T. E.

    1985-01-01

    A centralized computer graphics system is being developed at the NASA Langley Research Center. This system was required to satisfy multiuser needs, ranging from presentation quality graphics prepared by a graphic artist to 16-mm movie simulations generated by engineers and scientists. While the major thrust of the central graphics system was directed toward engineering and scientific applications, hardware and software capabilities to support the graphic artists were integrated into the design. This paper briefly discusses the importance of computer graphics in research; the central graphics system in terms of systems, software, and hardware requirements; the application of computer graphics to graphic arts, discussed in terms of the requirements for a graphic arts workstation; and the problems encountered in applying computer graphics to the graphic arts. The paper concludes by presenting the status of the central graphics system.

  7. Data Processing: Use of a Mobile Classroom in the Teaching of Data Processing

    ERIC Educational Resources Information Center

    Ruge, Gerald D.

    1970-01-01

    A data processing van is a 12-foot by 60-foot mobile home containing 10 keypunch simulators, six keypunches, a verifier, sorter, accounting machine, 1622 card read-punch, and 1620 computer. The van is at each of four high schools during one quarter of the school year. (DM)

  8. PO*WW*ER mobile treatment unit process hazards analysis

    SciTech Connect

    Richardson, R.B.

    1996-06-01

    The objective of this report is to demonstrate that a thorough assessment of the risks associated with the operation of the Rust Geotech patented PO*WW*ER mobile treatment unit (MTU) has been performed and documented. The MTU was developed to treat aqueous mixed wastes at the US Department of Energy (DOE) Albuquerque Operations Office sites. The MTU uses evaporation to separate organics and water from radionuclides and solids, and catalytic oxidation to convert the hazardous constituents into byproducts. This process hazards analysis evaluated a number of accident scenarios not directly related to the operation of the MTU, such as natural phenomena damage and mishandling of chemical containers. Worst-case accident scenarios were further evaluated to determine the risk potential to the MTU and to workers, the public, and the environment. The overall risk to any group from operation of the MTU was determined to be very low; the MTU is classified as a Radiological Facility with low hazards.

  9. Optimization of image processing algorithms on mobile platforms

    NASA Astrophysics Data System (ADS)

    Poudel, Pramod; Shirvaikar, Mukul

    2011-03-01

    This work presents a technique to optimize popular image processing algorithms on mobile platforms such as cell phones, netbooks, and personal digital assistants (PDAs). The increasing demand for video applications like context-aware computing on mobile embedded systems requires the use of computationally intensive image processing algorithms, and the system engineer must optimize them to meet real-time deadlines. A methodology to take advantage of an asymmetric dual-core processor, which includes an ARM and a DSP core supported by shared memory, is presented with implementation details. The target platform chosen is the popular OMAP 3530 processor for embedded media systems. It has an asymmetric dual-core architecture with an ARM Cortex-A8 and a TMS320C64x Digital Signal Processor (DSP). The development platform was the BeagleBoard with 256 MB of NAND RAM and 256 MB of SDRAM memory. The basic image correlation algorithm is chosen for benchmarking as it finds widespread application in various template matching tasks such as face recognition. The basic algorithm prototypes conform to OpenCV, a popular computer vision library. OpenCV algorithms can be easily ported to the ARM core, which runs a popular operating system such as Linux or Windows CE. However, the DSP is architecturally more efficient at handling DFT algorithms. The algorithms are tested on a variety of images, and performance results are presented measuring the speedup obtained due to the dual-core implementation. A major advantage of this approach is that it allows the ARM processor to perform important real-time tasks while the DSP addresses performance-hungry algorithms.
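
    Image correlation of the kind benchmarked here is typically computed through the DFT, which is exactly the workload that favors the DSP core. The SciPy sketch below locates a template by FFT-based cross-correlation; the image, template, and function name are illustrative, not the paper's OMAP code.

        import numpy as np
        from scipy.signal import fftconvolve

        def locate_template(image, template):
            """Find a template in an image via FFT-based cross-correlation."""
            t = template - template.mean()              # zero-mean template
            score = fftconvolve(image, t[::-1, ::-1], mode="same")
            return np.unravel_index(np.argmax(score), score.shape)

        img = np.random.rand(480, 640)
        tpl = img[100:132, 200:232].copy()   # a 32x32 patch standing in for a face
        print(locate_template(img, tpl))     # peak near the patch centre, (116, 216)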

  10. A Physics-Based Modeling and Real-Time Simulation of Biomechanical Diffusion Process Through Optical Imaged Alveolar Tissues on Graphical Processing Units

    NASA Astrophysics Data System (ADS)

    Kaya, Ilhan; Santhanam, Anand P.; Lee, Kye-Sung; Meemon, Panomsak; Papp, Nicolene; Rolland, Jannick P.

    Tissue engineering has broad applications, from creating much-needed engineered tissue and organ structures for regenerative medicine to providing in vitro testbeds for drug testing. In the latter application domain, creating alveolar lung tissue and simulating the diffusion of oxygen and other possible agents from the air into the blood stream, as well as modeling the removal of carbon dioxide and other possible entities from the blood stream, are of critical importance to simulating lung functions in various environments. In this chapter, we propose a physics-based model to simulate the alveolar gas exchange and the alveolar diffusion process. Tissue engineers may, for the first time, utilize these simulation results to better understand the underlying gas exchange process and properly adjust the tissue growing cycles. In this work, alveolar tissues are imaged by means of an optical coherence microscopy (OCM) system developed in our laboratory. As a consequence, 3D alveolar tissue data with its inherent complex boundary is taken as input to the simulation system, which is based on computational fluid mechanics, to simulate the alveolar gas exchange. The visualization and the simulation of the diffusion of air into the blood through the alveolar tissue are performed using a state-of-the-art graphics processing unit (GPU). Results show the real-time simulation of the gas exchange through the 2D alveolar tissue.
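
    The diffusion update behind such a simulation is a per-voxel stencil, which is why it fits the GPU so well. A minimal 2D CuPy sketch follows; the diffusivity, time step, boundary condition, and random stand-in for the OCM-imaged geometry are all illustrative assumptions, and the chapter's computational-fluid-mechanics model is far more complete.

        import cupy as cp  # assumes a CUDA-capable GPU; numpy works identically on CPU

        n = 256
        conc = cp.zeros((n, n))                    # agent concentration field
        conc[:, 0] = 1.0                           # air side held at fixed concentration
        tissue = cp.random.rand(n, n) > 0.2        # stand-in for the OCM-imaged geometry
        D, dt, h = 1e-3, 0.1, 1.0                  # made-up diffusivity, step, spacing

        for _ in range(1000):                      # one explicit update per rendered frame
            lap = (cp.roll(conc, 1, 0) + cp.roll(conc, -1, 0) +
                   cp.roll(conc, 1, 1) + cp.roll(conc, -1, 1) - 4 * conc) / h**2
            conc += dt * D * lap * tissue          # no transport outside the tissue mask
            conc[:, 0] = 1.0                       # re-impose the boundary condition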